The rjsoncons answer
illustrates several approaches. The approach most cleanly separating
-data transformation and data.table construction using JMESPath create an array of objects
+data transformation and data.table construction using JMESPath creates an array of objects
where each version_id is associated with vectors
of review_id, etc., corresponding to that version.
@Manual{,
title = {rjsoncons: Query, Pivot, Patch, and Validate 'JSON' and 'NDJSON'},
author = {Martin Morgan and Marcel Ramos and Daniel Parker},
year = {2024},
- note = {R package version 1.3.1.9001},
+ note = {R package version 1.3.1.9100},
url = {https://mtmorgan.github.io/rjsoncons/},
}
diff --git a/index.html b/index.html
index ee2bc22..dd620f2 100644
--- a/index.html
+++ b/index.html
@@ -22,7 +22,7 @@
rjsoncons
- 1.3.1.9001
+ 1.3.1.9100
diff --git a/news/index.html b/news/index.html
index 2f6afc3..db67f03 100644
--- a/news/index.html
+++ b/news/index.html
@@ -7,7 +7,7 @@
rjsoncons
- 1.3.1.9001
+ 1.3.1.9100
diff --git a/pkgdown.yml b/pkgdown.yml
index 2521a42..cfbb59e 100644
--- a/pkgdown.yml
+++ b/pkgdown.yml
@@ -5,7 +5,7 @@ articles:
a_rjsoncons: a_rjsoncons.html
articles/b_ndjson_extended: b_ndjson_extended.html
articles/c_examples: c_examples.html
-last_built: 2024-09-10T17:05Z
+last_built: 2024-09-10T17:45Z
urls:
reference: https://mtmorgan.github.io/rjsoncons/reference
article: https://mtmorgan.github.io/rjsoncons/articles
diff --git a/reference/as_r.html b/reference/as_r.html
index ba59862..120e466 100644
--- a/reference/as_r.html
+++ b/reference/as_r.html
@@ -7,7 +7,7 @@
rjsoncons
- 1.3.1.9001
+ 1.3.1.9100
diff --git a/reference/flatten.html b/reference/flatten.html
index b9c8e4c..db0e37b 100644
--- a/reference/flatten.html
+++ b/reference/flatten.html
@@ -31,7 +31,7 @@
rjsoncons
- 1.3.1.9001
+ 1.3.1.9100
diff --git a/reference/index.html b/reference/index.html
index 4ce2415..1c0dacd 100644
--- a/reference/index.html
+++ b/reference/index.html
@@ -7,7 +7,7 @@
rjsoncons
- 1.3.1.9001
+ 1.3.1.9100
diff --git a/reference/j_data_type.html b/reference/j_data_type.html
index 7ce9dcf..81ccdd4 100644
--- a/reference/j_data_type.html
+++ b/reference/j_data_type.html
@@ -15,7 +15,7 @@
rjsoncons
- 1.3.1.9001
+ 1.3.1.9100
diff --git a/reference/patch.html b/reference/patch.html
index 4b88302..cf93377 100644
--- a/reference/patch.html
+++ b/reference/patch.html
@@ -21,7 +21,7 @@
rjsoncons
- 1.3.1.9001
+ 1.3.1.9100
diff --git a/reference/rquerypivot.html b/reference/rquerypivot.html
index b4b7cc4..ff15a80 100644
--- a/reference/rquerypivot.html
+++ b/reference/rquerypivot.html
@@ -17,7 +17,7 @@
rjsoncons
- 1.3.1.9001
+ 1.3.1.9100
diff --git a/reference/schema.html b/reference/schema.html
index 6270d6c..4fec06a 100644
--- a/reference/schema.html
+++ b/reference/schema.html
@@ -19,7 +19,7 @@
rjsoncons
- 1.3.1.9001
+ 1.3.1.9100
diff --git a/reference/version.html b/reference/version.html
index cea048e..83ceb4e 100644
--- a/reference/version.html
+++ b/reference/version.html
@@ -9,7 +9,7 @@
rjsoncons
- 1.3.1.9001
+ 1.3.1.9100
diff --git a/reference/zzz_paths_and_pointer.html b/reference/zzz_paths_and_pointer.html
index 422a77d..e01e4dc 100644
--- a/reference/zzz_paths_and_pointer.html
+++ b/reference/zzz_paths_and_pointer.html
@@ -17,7 +17,7 @@
rjsoncons
- 1.3.1.9001
+ 1.3.1.9100
diff --git a/search.json b/search.json
index 7f3a726..f9a0c5d 100644
--- a/search.json
+++ b/search.json
@@ -1 +1 @@
-[{"path":"https://mtmorgan.github.io/rjsoncons/articles/a_rjsoncons.html","id":"introduction-installation","dir":"Articles","previous_headings":"","what":"Introduction & installation","title":"Transform and Validate JSON and NDJSON","text":"Use rjsoncons querying, transforming, searching JSON, NDJSON, R objects using JMESpath, JSONpath, JSONpointer. rjsoncons supports JSON patch document editing, JSON schema validation. Link package direct access additional features jsoncons C++ library. Install released package version CRAN Install development version Attach installed package R session, check version C++ library use","code":"install.packages(\"rjsoncons\", repos = \"https://CRAN.R-project.org\") if (!requireNamespace(\"remotes\", quiety = TRUE)) install.packages(\"remotes\", repos = \"https://CRAN.R-project.org\") remotes::install_github(\"mtmorgan/rjsoncons\") library(rjsoncons) rjsoncons::version() ## [1] \"0.173.4 [+57967655d]\""},{"path":"https://mtmorgan.github.io/rjsoncons/articles/a_rjsoncons.html","id":"query-and-pivot","dir":"Articles","previous_headings":"","what":"Query and pivot","title":"Transform and Validate JSON and NDJSON","text":"Functions package work JSON NDJSON character vectors, file paths URLs JSON NDJSON documents, R objects can transformed JSON string.","code":""},{"path":"https://mtmorgan.github.io/rjsoncons/articles/a_rjsoncons.html","id":"select-filter-and-transform-with-j_query","dir":"Articles","previous_headings":"Query and pivot","what":"Select, filter and transform with j_query()","title":"Transform and Validate JSON and NDJSON","text":"simple JSON example document several common use cases. Use rjsoncons query JSON string using JSONpath, JMESpath JSONpointer syntax filter larger documents records interest, e.g., cities New York state, using ‘JMESpath’ syntax. Use = \"R\" argument extract deeply nested elements R objects, e.g., character vector city names Washington state. 
JSON Pointer specification simpler, indexing single object document. JSON arrays 0-based. examples use j_query(), automatically infers query specification form path using j_path_type(). may useful indicate query specification explicitly using jsonpointer(), jsonpath(), jmespath(); examples illustrating features available query specification help pages ?jsonpointer, ?jsonpath, ?jmespath.","code":"json <- '{ \"locations\": [ {\"name\": \"Seattle\", \"state\": \"WA\"}, {\"name\": \"New York\", \"state\": \"NY\"}, {\"name\": \"Bellevue\", \"state\": \"WA\"}, {\"name\": \"Olympia\", \"state\": \"WA\"} ] }' j_query(json, \"locations[?state == 'NY']\") |> cat(\"\\n\") ## [{\"name\":\"New York\",\"state\":\"NY\"}] j_query(json, \"locations[?state == 'WA'].name\", as = \"R\") ## [1] \"Seattle\" \"Bellevue\" \"Olympia\" j_query(json, \"/locations/0/state\") ## [1] \"WA\""},{"path":"https://mtmorgan.github.io/rjsoncons/articles/a_rjsoncons.html","id":"array-of-objects-to-r-data-frame-with-j_pivot","dir":"Articles","previous_headings":"Query and pivot","what":"Array-of-objects to R data.frame with j_pivot()","title":"Transform and Validate JSON and NDJSON","text":"following transforms nested JSON document format can incorporated directly R data.frame. 
transformation JSON ‘array--objects’ ‘object--arrays’ suitable direct representation data.frame common, implemented directly j_pivot() j_pivot() also support = \"tibble\" dplyr package installed.","code":"path <- '{ name: locations[].name, state: locations[].state }' j_query(json, path, as = \"R\") |> data.frame() ## name state ## 1 Seattle WA ## 2 New York NY ## 3 Bellevue WA ## 4 Olympia WA j_pivot(json, \"locations\", as = \"data.frame\") ## name state ## 1 Seattle WA ## 2 New York NY ## 3 Bellevue WA ## 4 Olympia WA"},{"path":"https://mtmorgan.github.io/rjsoncons/articles/a_rjsoncons.html","id":"ndjson-support","dir":"Articles","previous_headings":"Query and pivot","what":"NDJSON support","title":"Transform and Validate JSON and NDJSON","text":"rjsoncons supports NDJSON (new-line delimited JSON). NDJSON consists file character vector line / element represents JSON record. example uses data GitHub Archive project recording actions public GitHub repositories. data included package first 10 lines https://data.gharchive.org/2023-02-08-0.json.gz. NDJSON can read R (ndjson <- readLines(ndjson_file)) used j_query() / j_pivot(), often better leave full NDJSON files disk. Thus first argument j_query() j_pivot() usually (text gz-compressed) file path URL. Two additional options available working NDJSON. n_records limits number records processed. Using n_records can useful exploring data. instance, first record file can viewed interactively option verbose = TRUE adds progress indicator, provides confidence progress made parsing large files. progress bar requires cli package. j_query() provides one--one mapping NDJSON lines / elements return value, e.g., j_query(ndjson_file, \"@\", = \"string\") NDJSON file 1000 lines return character vector 1000 elements, j_query(ndjson, \"@\", = \"R\") R list length 1000. j_pivot() transforms NDJSON file character vector objects format convenient input R. 
j_pivot() NDJSON files JMESpath paths work particularly well together, JMESpath provides flexibility creating JSON objects pivoted. Filtering NDJSON files can require relatively complicated paths, e.g., filter ‘PushEvent’ types organizations, construct query acts NDJSON record return array single object, apply filter replace uninteresting elements 0-length arrays (using = \"tibble\" often transforms R list--vectors tibble pleasing robust manner compared = \"data.frame\"). complete example used NDJSON extended vignette","code":"ndjson_file <- system.file(package = \"rjsoncons\", \"extdata\", \"2023-02-08-0.json\") j_query(ndjson_file, n_records = 1) |> listviewer::jsonedit() j_query(ndjson_file, \"{id: id, type: type}\", n_records = 5) ## [1] \"{\\\"id\\\":\\\"26939254345\\\",\\\"type\\\":\\\"DeleteEvent\\\"}\" ## [2] \"{\\\"id\\\":\\\"26939254358\\\",\\\"type\\\":\\\"PushEvent\\\"}\" ## [3] \"{\\\"id\\\":\\\"26939254361\\\",\\\"type\\\":\\\"CreateEvent\\\"}\" ## [4] \"{\\\"id\\\":\\\"26939254365\\\",\\\"type\\\":\\\"CreateEvent\\\"}\" ## [5] \"{\\\"id\\\":\\\"26939254366\\\",\\\"type\\\":\\\"PushEvent\\\"}\" j_pivot(ndjson_file, \"{id: id, type: type}\", as = \"data.frame\") ## id type ## 1 26939254345 DeleteEvent ## 2 26939254358 PushEvent ## 3 26939254361 CreateEvent ## 4 26939254365 CreateEvent ## 5 26939254366 PushEvent ## 6 26939254367 PushEvent ## 7 26939254379 PushEvent ## 8 26939254380 IssuesEvent ## 9 26939254382 PushEvent ## 10 26939254383 PushEvent path <- \"[{id: id, type: type, org: org}] [?@.type == 'PushEvent' && @.org != null] | [0]\" j_pivot(ndjson_file, path, as = \"data.frame\") ## id type org.id org.login org.gravatar_id ## 1 26939254358 PushEvent 123667276 johnbieren-testing ## 2 26939254382 PushEvent 123667276 johnbieren-testing ## org.url ## 1 https://api.github.com/orgs/johnbieren-testing ## 2 https://api.github.com/orgs/johnbieren-testing ## org.avatar_url org.id.1 org.login.1 ## 1 https://avatars.githubusercontent.com/u/123667276? 
120284018 mornystannit ## 2 https://avatars.githubusercontent.com/u/123667276? 120284018 mornystannit ## org.gravatar_id.1 org.url.1 ## 1 https://api.github.com/orgs/mornystannit ## 2 https://api.github.com/orgs/mornystannit ## org.avatar_url.1 ## 1 https://avatars.githubusercontent.com/u/120284018? ## 2 https://avatars.githubusercontent.com/u/120284018?"},{"path":"https://mtmorgan.github.io/rjsoncons/articles/a_rjsoncons.html","id":"r-objects-as-input","dir":"Articles","previous_headings":"Query and pivot","what":"R objects as input","title":"Transform and Validate JSON and NDJSON","text":"rjsoncons can filter transform R objects. converted JSON using jsonlite::toJSON() queries made; toJSON() arguments like auto_unbox = TRUE can added function call.","code":"## `lst` is an *R* list lst <- jsonlite::fromJSON(json, simplifyVector = FALSE) j_query(lst, \"locations[?state == 'WA'].name | sort(@)\", auto_unbox = TRUE) |> cat(\"\\n\") ## [\"Bellevue\",\"Olympia\",\"Seattle\"]"},{"path":"https://mtmorgan.github.io/rjsoncons/articles/a_rjsoncons.html","id":"patch","dir":"Articles","previous_headings":"","what":"Patch","title":"Transform and Validate JSON and NDJSON","text":"JSON Patch provides simple way edit transform JSON document using JSON commands.","code":""},{"path":"https://mtmorgan.github.io/rjsoncons/articles/a_rjsoncons.html","id":"applying-a-patch-with-j_patch_apply","dir":"Articles","previous_headings":"Patch","what":"Applying a patch with j_patch_apply()","title":"Transform and Validate JSON and NDJSON","text":"Starting JSON document one can \"add\" another biscuit, copy favorite biscuit new locations using following patch paths specified using JSONpointer notation; remember JSON arrays 0-based, compared 1-based R arrays. Applying patch results new JSON document. Patches can also created R objects helper function j_patch_op(). 
j_patch_op() takes care unboxing op=, path=, =, care must taken ‘unboxing’ value= argument operations ‘add’; may also appropriate unbox specific fields, e.g., JSON patch web site, available operations example JSON : add – add elements existing document. remove – remove elements document. replace – replace one element another copy – copy path another location. move – move path another location. test – test existence path; path exist, apply patch. Formal description operations provided Section 4 RFC6902. patch command always array, even single operation involved.","code":"json <- '{ \"biscuits\": [ { \"name\": \"Digestive\" }, { \"name\": \"Choco Leibniz\" } ] }' patch <- '[ {\"op\": \"add\", \"path\": \"/biscuits/1\", \"value\": { \"name\": \"Ginger Nut\" }}, {\"op\": \"copy\", \"from\": \"/biscuits/2\", \"path\": \"/best_biscuit\"} ]' j_patch_apply(json, patch) ## [1] \"{\\\"biscuits\\\":[{\\\"name\\\":\\\"Digestive\\\"},{\\\"name\\\":\\\"Ginger Nut\\\"},{\\\"name\\\":\\\"Choco Leibniz\\\"}],\\\"best_biscuit\\\":{\\\"name\\\":\\\"Choco Leibniz\\\"}}\" ops <- c( j_patch_op( \"add\", \"/biscuits/1\", value = list(name = \"Ginger Nut\"), auto_unbox = TRUE ), j_patch_op(\"copy\", \"/best_biscuit\", from = \"/biscuits/2\") ) identical(j_patch_apply(json, patch), j_patch_apply(json, ops)) ## [1] TRUE value <- list(name = jsonlite::unbox(\"Ginger Nut\")) j_patch_op(\"add\", \"/biscuits/1\", value = value) ## [ ## {\"op\": \"add\", \"path\": \"/biscuits/1\", \"value\": {\"name\": \"Ginger Nut\"}} ## ] {\"op\": \"add\", \"path\": \"/biscuits/1\", \"value\": {\"name\": \"Ginger Nut\"}} {\"op\": \"remove\", \"path\": \"/biscuits/0\"} { \"op\": \"replace\", \"path\": \"/biscuits/0/name\", \"value\": \"Chocolate Digestive\" } {\"op\": \"copy\", \"from\": \"/biscuits/0\", \"path\": \"/best_biscuit\"} {\"op\": \"move\", \"from\": \"/biscuits\", \"path\": \"/cookies\"} {\"op\": \"test\", \"path\": \"/best_biscuit/name\", \"value\": \"Choco 
Leibniz\"}"},{"path":"https://mtmorgan.github.io/rjsoncons/articles/a_rjsoncons.html","id":"difference-between-documents-with-j_patch_from","dir":"Articles","previous_headings":"Patch","what":"Difference between documents with j_patch_from()","title":"Transform and Validate JSON and NDJSON","text":"j_patch_from() function constructs patch difference two documents","code":"j_patch_from(j_patch_apply(json, patch), json) ## [1] \"[{\\\"op\\\":\\\"replace\\\",\\\"path\\\":\\\"/biscuits/1/name\\\",\\\"value\\\":\\\"Choco Leibniz\\\"},{\\\"op\\\":\\\"remove\\\",\\\"path\\\":\\\"/biscuits/2\\\"},{\\\"op\\\":\\\"remove\\\",\\\"path\\\":\\\"/best_biscuit\\\"}]\""},{"path":"https://mtmorgan.github.io/rjsoncons/articles/a_rjsoncons.html","id":"schema-validation","dir":"Articles","previous_headings":"","what":"Schema validation","title":"Transform and Validate JSON and NDJSON","text":"JSON schema provides structure JSON documents. j_schema_is_valid() checks JSON document valid specified schema, j_schema_validate() tries illustrate document deviates schema. example consider j_patch_op(), operation supposed conform JSON patch schema. convenience, copy schema available rjsoncons. well-formed ‘op’ valid, j_schema_validate() produces output Introduce invalid ‘op’, \"op\": \"invalid_op\", schema longer valid. reason can understood (careful!) consideration output j_schema_validate(), reference schema . validation indicates schema evaluationPath ‘/items/oneOf’ satisfied, error ‘schema [.e., ’oneOf’ elements] matched, …’. 
‘details’ column summarizes 3 elements /items/oneOf fails schema specification; use = \"details\" extract directly indicates first item schema rejected ‘invalid_op’ valid enum Reasons rejecting items can explored using similar steps.","code":"## alternatively: schema <- \"https://json.schemastore.org/json-patch\" schema <- system.file(package = \"rjsoncons\", \"extdata\", \"json-patch.json\") cat(readLines(schema), sep = \"\\n\") ## { ## \"$schema\": \"http://json-schema.org/draft-04/schema#\", ## \"definitions\": { ## \"path\": { ## \"description\": \"A JSON Pointer path.\", ## \"type\": \"string\" ## } ## }, ## \"id\": \"https://json.schemastore.org/json-patch.json\", ## \"items\": { ## \"oneOf\": [ ## { ## \"additionalProperties\": false, ## \"required\": [\"value\", \"op\", \"path\"], ## \"properties\": { ## \"path\": { ## \"$ref\": \"#/definitions/path\" ## }, ## \"op\": { ## \"description\": \"The operation to perform.\", ## \"type\": \"string\", ## \"enum\": [\"add\", \"replace\", \"test\"] ## }, ## \"value\": { ## \"description\": \"The value to add, replace or test.\" ## } ## } ## }, ## { ## \"additionalProperties\": false, ## \"required\": [\"op\", \"path\"], ## \"properties\": { ## \"path\": { ## \"$ref\": \"#/definitions/path\" ## }, ## \"op\": { ## \"description\": \"The operation to perform.\", ## \"type\": \"string\", ## \"enum\": [\"remove\"] ## } ## } ## }, ## { ## \"additionalProperties\": false, ## \"required\": [\"from\", \"op\", \"path\"], ## \"properties\": { ## \"path\": { ## \"$ref\": \"#/definitions/path\" ## }, ## \"op\": { ## \"description\": \"The operation to perform.\", ## \"type\": \"string\", ## \"enum\": [\"move\", \"copy\"] ## }, ## \"from\": { ## \"$ref\": \"#/definitions/path\", ## \"description\": \"A JSON Pointer path pointing to the location to move/copy from.\" ## } ## } ## } ## ] ## }, ## \"title\": \"JSON schema for JSONPatch files\", ## \"type\": \"array\" ## } op <- '[{ \"op\": \"add\", \"path\": \"/biscuits/1\", 
\"value\": { \"name\": \"Ginger Nut\" } }]' j_schema_is_valid(op, schema) ## [1] TRUE j_schema_validate(op, schema) ## [1] \"[]\" op <- '[{ \"op\": \"invalid_op\", \"path\": \"/biscuits/1\", \"value\": { \"name\": \"Ginger Nut\" } }]' j_schema_is_valid(op, schema) ## [1] FALSE j_schema_validate(op, schema, as = \"tibble\") |> tibble::glimpse() ## Rows: 1 ## Columns: 6 ## $ valid FALSE ## $ evaluationPath \"/items/oneOf\" ## $ schemaLocation \"https://json.schemastore.org/json-patch.json#/items/… ## $ instanceLocation \"/0\" ## $ error \"No schema matched, but exactly one of them is requir… ## $ details [[FALSE, \"/items/oneOf/0/properties/op/enum\", \"https:… j_schema_validate(op, schema, as = \"details\") |> tibble::glimpse() ## Rows: 6 ## Columns: 5 ## $ valid FALSE, FALSE, FALSE, FALSE, FALSE, FALSE ## $ evaluationPath \"/items/oneOf/0/properties/op/enum\", \"/items/oneOf/1/… ## $ schemaLocation \"https://json.schemastore.org/json-patch.json#/items/… ## $ instanceLocation \"/0/op\", \"/0/op\", \"/0/value\", \"/0\", \"/0/op\", \"/0/valu… ## $ error \"'invalid_op' is not a valid enum value.\", \"'invalid_… j_query(schema, \"/items/oneOf/0/properties/op/enum\") |> noquote() ## [1] [\"add\",\"replace\",\"test\"]"},{"path":"https://mtmorgan.github.io/rjsoncons/articles/a_rjsoncons.html","id":"flatten-and-find","dir":"Articles","previous_headings":"","what":"Flatten and find","title":"Transform and Validate JSON and NDJSON","text":"can sometimes helpful explore JSON documents ‘flattening’ JSON object path / value pairs, path JSONpointer path corresponding value. straight-forward search flattened object , e.g., path known field value. example, consider object ‘flat’ JSON can represented named list (using str() provide compact visual representation) names list JSONpointer (default) JSONpath, can used j_query() j_pivot() appropriate two ways find known keys values. first use exact matching one keys values, e.g., also possible match using regular expression. 
Keys always character vectors, values can different type; j_find_values() supports searches . common operation might find path know value, query original JSON find object value contained. JSONpointer JSONpath supported; advantage latter path distinguishes integer-valued (unquoted) string-valued (quoted) keys first argument j_find_*() can R object, JSON NDJSON string, file, URL. Using j_find_values() R object JSONpath path_type leads path easily converted R index: double [ ] path increment numerical index 1: NDJSON files flattened character vectors, element flattened version corresponding NDJSON record.","code":"codes <- '{ \"discards\": { \"1000\": \"Record does not exist\", \"1004\": \"Queue limit exceeded\", \"1010\": \"Discarding timed-out partial msg\" }, \"warnings\": { \"0\": \"Phone number missing country code\", \"1\": \"State code missing\", \"2\": \"Zip code missing\" } }' j_flatten(codes, as = \"R\") |> str() ## List of 6 ## $ /discards/1000: chr \"Record does not exist\" ## $ /discards/1004: chr \"Queue limit exceeded\" ## $ /discards/1010: chr \"Discarding timed-out partial msg\" ## $ /warnings/0 : chr \"Phone number missing country code\" ## $ /warnings/1 : chr \"State code missing\" ## $ /warnings/2 : chr \"Zip code missing\" j_query(codes, \"/discards/1010\") ## [1] \"Discarding timed-out partial msg\" j_find_values( codes, c(\"Record does not exist\", \"State code missing\"), as = \"tibble\" ) ## # A tibble: 2 × 2 ## path value ## ## 1 /discards/1000 Record does not exist ## 2 /warnings/1 State code missing j_find_keys(codes, \"warnings\", as = \"tibble\") ## # A tibble: 3 × 2 ## path value ## ## 1 /warnings/0 Phone number missing country code ## 2 /warnings/1 State code missing ## 3 /warnings/2 Zip code missing j_find_values_grep(codes, \"missing\", as = \"tibble\") ## # A tibble: 3 × 2 ## path value ## ## 1 /warnings/0 Phone number missing country code ## 2 /warnings/1 State code missing ## 3 /warnings/2 Zip code missing j_find_keys_grep(codes, 
\"card.*/100\", as = \"tibble\") # span key delimiters ## # A tibble: 2 × 2 ## path value ## ## 1 /discards/1000 Record does not exist ## 2 /discards/1004 Queue limit exceeded j <- '{\"x\":[1,[2, 3]],\"y\":{\"a\":4}}' j_flatten(j, as = \"R\") |> str() ## List of 4 ## $ /x/0 : int 1 ## $ /x/1/0: int 2 ## $ /x/1/1: int 3 ## $ /y/a : int 4 j_find_values(j, c(2, 4), as = \"tibble\") ## # A tibble: 2 × 2 ## path value ## ## 1 /x/1/0 2 ## 2 /y/a 4 j_find_values(j, 3, as = \"tibble\") ## # A tibble: 1 × 2 ## path value ## ## 1 /x/1/1 3 ## path to '3' is '/x/1/1', so containing object is at '/x/1' j_query(j, \"/x/1\") ## [1] \"[2,3]\" j_query(j, \"/x/1\", as = \"R\") ## [1] 2 3 j_find_values(j, 3, as = \"tibble\", path_type = \"JSONpath\") ## # A tibble: 1 × 2 ## path value ## ## 1 $['x'][1][1] 3 l <- j |> as_r() j_find_values(l, 3, auto_unbox = TRUE, path_type = \"JSONpath\", as = \"tibble\") ## # A tibble: 1 × 2 ## path value ## ## 1 $['x'][1][1] 3 l[['x']][[2]] # siblings ## [1] 2 3"},{"path":"https://mtmorgan.github.io/rjsoncons/articles/a_rjsoncons.html","id":"the-json-parser","dir":"Articles","previous_headings":"","what":"The JSON parser","title":"Transform and Validate JSON and NDJSON","text":"package includes JSON parser, used argument = \"R\" directly as_r() main rules transformation outlined . JSON arrays single type (boolean, integer, double, string) transformed R vectors length corresponding type. JSON arrays mixing integer double values transformed R numeric vectors. JSON integer array contains value larger R’s 32-bit integer representation, array transformed R numeric vector. NOTE results loss precision JSON integer values greater 2^53. JSON objects transformed R named lists. several additional details. JSON scalar JSON vector length 1 represented way R. JSON arrays mixing types integer double transformed R lists JSON null values represented R NULL values; arrays null transformed lists Ordering object members controlled object_names= argument. 
default preserves names appear JSON definition; use \"sort\" sort names alphabetically. argument applied recursively. parser corresponds approximately jsonlite::fromJSON() arguments simplifyVector = TRUE, simplifyDataFrame = FALSE, simplifyMatrix = FALSE). Unit tests (using tinytest framework) providing additional details available ","code":"as_r('{\"a\": 1.0, \"b\": [2, 3, 4]}') |> str() #> List of 2 #> $ a: num 1 #> $ b: int [1:3] 2 3 4 as_r('[true, false, true]') # boolean -> logical ## [1] TRUE FALSE TRUE as_r('[1, 2, 3]') # integer -> integer ## [1] 1 2 3 as_r('[1.0, 2.0, 3.0]') # double -> numeric ## [1] 1 2 3 as_r('[\"a\", \"b\", \"c\"]') # string -> character ## [1] \"a\" \"b\" \"c\" as_r('[1, 2.0]') |> class() # numeric ## [1] \"numeric\" as_r('[1, 2147483648]') |> class() # 64-bit integers -> numeric ## [1] \"numeric\" as_r('{}') ## named list() as_r('{\"a\": 1.0, \"b\": [2, 3, 4]}') |> str() ## List of 2 ## $ a: num 1 ## $ b: int [1:3] 2 3 4 identical(as_r(\"3.14\"), as_r(\"[3.14]\")) ## [1] TRUE as_r('[true, 1, \"a\"]') |> str() ## List of 3 ## $ : logi TRUE ## $ : int 1 ## $ : chr \"a\" as_r('null') # NULL ## NULL as_r('[null]') |> str() # list(NULL) ## List of 1 ## $ : NULL as_r('[null, null]') |> str() # list(NULL, NULL) ## List of 2 ## $ : NULL ## $ : NULL json <- '{\"b\": 1, \"a\": {\"d\": 2, \"c\": 3}}' as_r(json) |> str() ## List of 2 ## $ b: int 1 ## $ a:List of 2 ## ..$ d: int 2 ## ..$ c: int 3 as_r(json, object_names = \"sort\") |> str() ## List of 2 ## $ a:List of 2 ## ..$ c: int 3 ## ..$ d: int 2 ## $ b: int 1 system.file(package = \"rjsoncons\", \"tinytest\", \"test_as_r.R\")"},{"path":"https://mtmorgan.github.io/rjsoncons/articles/a_rjsoncons.html","id":"using-jsonlitefromjson","dir":"Articles","previous_headings":"The JSON parser","what":"Using jsonlite::fromJSON()","title":"Transform and Validate JSON and NDJSON","text":"built-parser can replaced alternative parsers returning query JSON string, e.g., using fromJSON() jsonlite package. 
rjsoncons package particularly useful accessing elements might otherwise require complicated application nested lapply(), purrr expressions, tidyr unnest_*() (see R Data Science chapter ‘Hierarchical data’).","code":"json <- '{ \"locations\": [ {\"name\": \"Seattle\", \"state\": \"WA\"}, {\"name\": \"New York\", \"state\": \"NY\"}, {\"name\": \"Bellevue\", \"state\": \"WA\"}, {\"name\": \"Olympia\", \"state\": \"WA\"} ] }' j_query(json, \"locations[?state == 'WA']\") |> ## `fromJSON()` simplifies list-of-objects to data.frame jsonlite::fromJSON() ## name state ## 1 Seattle WA ## 2 Bellevue WA ## 3 Olympia WA"},{"path":"https://mtmorgan.github.io/rjsoncons/articles/a_rjsoncons.html","id":"c-library-use-in-other-packages","dir":"Articles","previous_headings":"","what":"C++ library use in other packages","title":"Transform and Validate JSON and NDJSON","text":"package includes complete ‘jsoncons’ C++ header-library, available R packages adding DESCRIPTION file. Typical use R package also include LinkingTo: specifications cpp11 Rcpp (package uses cpp11) packages provide C / C++ interface R C++ ‘jsoncons’ library.","code":"LinkingTo: rjsoncons SystemRequirements: C++11"},{"path":"https://mtmorgan.github.io/rjsoncons/articles/a_rjsoncons.html","id":"session-information","dir":"Articles","previous_headings":"","what":"Session information","title":"Transform and Validate JSON and NDJSON","text":"vignette compiled using following software versions","code":"sessionInfo() ## R version 4.4.1 (2024-06-14) ## Platform: x86_64-pc-linux-gnu ## Running under: Ubuntu 22.04.4 LTS ## ## Matrix products: default ## BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 ## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so; LAPACK version 3.10.0 ## ## locale: ## [1] LC_CTYPE=C.UTF-8 LC_NUMERIC=C LC_TIME=C.UTF-8 ## [4] LC_COLLATE=C.UTF-8 LC_MONETARY=C.UTF-8 LC_MESSAGES=C.UTF-8 ## [7] LC_PAPER=C.UTF-8 LC_NAME=C LC_ADDRESS=C ## [10] LC_TELEPHONE=C 
LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C ## ## time zone: UTC ## tzcode source: system (glibc) ## ## attached base packages: ## [1] stats graphics grDevices utils datasets methods base ## ## other attached packages: ## [1] rjsoncons_1.3.1.9001 BiocStyle_2.32.1 ## ## loaded via a namespace (and not attached): ## [1] vctrs_0.6.5 cli_3.6.3 knitr_1.48 ## [4] rlang_1.1.4 xfun_0.47 textshaping_0.4.0 ## [7] jsonlite_1.8.8 glue_1.7.0 htmltools_0.5.8.1 ## [10] ragg_1.3.2 sass_0.4.9 fansi_1.0.6 ## [13] rmarkdown_2.28 evaluate_0.24.0 jquerylib_0.1.4 ## [16] tibble_3.2.1 fastmap_1.2.0 yaml_2.3.10 ## [19] lifecycle_1.0.4 bookdown_0.40 BiocManager_1.30.25 ## [22] compiler_4.4.1 fs_1.6.4 pkgconfig_2.0.3 ## [25] systemfonts_1.1.0 digest_0.6.37 R6_2.5.1 ## [28] utf8_1.2.4 pillar_1.9.0 magrittr_2.0.3 ## [31] bslib_0.8.0 tools_4.4.1 pkgdown_2.1.0 ## [34] cachem_1.1.0 desc_1.4.3"},{"path":"https://mtmorgan.github.io/rjsoncons/articles/b_ndjson_extended.html","id":"installation-setup","dir":"Articles","previous_headings":"","what":"Installation & setup","title":"Processing NDJSON","text":"article assumes rjsoncons, listviewer (interactively exploring JSON), dplyr (manipulating results tibble) tidyr (unnesting columns tibble) cli (providing progress indicator) installed. Start loading rjsoncons dplyr packages current session. use data GH Archive, project record activity public GitHub repositories. Create location file system-wide ‘cache’ directory rjsoncons package. 
necessary, download single file (1 hour activity, 170,000 events, 100 Mb) GH Archive.","code":"pkgs <- c(\"rjsoncons\", \"dplyr\", \"tidyr\", \"cli\") needed <- pkgs[!pkgs %in% rownames(installed.packages())] install.packages(needed, repos = \"https://CRAN.R-project.org\") library(rjsoncons) library(dplyr) cache <- tools::R_user_dir(\"rjsoncons\", \"cache\") if (!dir.exists(cache)) dir.create(cache, recursive = TRUE) archive_file <- \"https://data.gharchive.org/2023-02-08-0.json.gz\" ndjson_file <- file.path(cache, \"2023-02-08-0.json.gz\") if (!file.exists(ndjson_file)) download.file(archive_file, ndjson_file)"},{"path":"https://mtmorgan.github.io/rjsoncons/articles/b_ndjson_extended.html","id":"data-exploration","dir":"Articles","previous_headings":"","what":"Data exploration","title":"Processing NDJSON","text":"Ensure ndjson_file defined exists get sense data, read visualize first record query uses default path = \"@\", JMESpath expression returns current element. n_records = argument available processing NDJSON, restricts number records input. useful exploring data. record contains information . Records general structure, information can differ, e.g., actions org field. 
work \"id\" \"type\" top-level fields, available using JMESpath elaborate query might combine , nested, elements, e.g., Note records 3-5 organization.","code":"stopifnot( file.exists(ndjson_file) ) j_query(ndjson_file, n_records = 1) |> listviewer::jsonedit() { \"id\": \"26939254345\", \"type\": \"DeleteEvent\", \"actor\": { \"id\": 19908762, \"login\": \"lucianHymer\", \"display_login\": \"lucianHymer\", \"gravatar_id\": \"\", \"url\": \"https://api.github.com/users/lucianHymer\", \"avatar_url\": \"https://avatars.githubusercontent.com/u/19908762?\" }, \"repo\": { \"id\": 469847426, \"name\": \"gitcoinco/passport\", \"url\": \"https://api.github.com/repos/gitcoinco/passport\" }, \"payload\": { \"ref\": \"format-alert-messages\", \"ref_type\": \"branch\", \"pusher_type\": \"user\" }, \"public\": true, \"created_at\": \"2023-02-08T00:00:00Z\", \"org\": { \"id\": 30044474, \"login\": \"gitcoinco\", \"gravatar_id\": \"\", \"url\": \"https://api.github.com/orgs/gitcoinco\", \"avatar_url\": \"https://avatars.githubusercontent.com/u/30044474?\" } } j_query(ndjson_file, '{id: id, type: type}', n_records = 5) ## [1] \"{\\\"id\\\":\\\"26939254345\\\",\\\"type\\\":\\\"DeleteEvent\\\"}\" ## [2] \"{\\\"id\\\":\\\"26939254358\\\",\\\"type\\\":\\\"PushEvent\\\"}\" ## [3] \"{\\\"id\\\":\\\"26939254361\\\",\\\"type\\\":\\\"CreateEvent\\\"}\" ## [4] \"{\\\"id\\\":\\\"26939254365\\\",\\\"type\\\":\\\"CreateEvent\\\"}\" ## [5] \"{\\\"id\\\":\\\"26939254366\\\",\\\"type\\\":\\\"PushEvent\\\"}\" j_query(ndjson_file, '{id: id, type: type, \"org.id\": org.id}', n_records = 5) ## [1] \"{\\\"id\\\":\\\"26939254345\\\",\\\"type\\\":\\\"DeleteEvent\\\",\\\"org.id\\\":30044474}\" ## [2] \"{\\\"id\\\":\\\"26939254358\\\",\\\"type\\\":\\\"PushEvent\\\",\\\"org.id\\\":123667276}\" ## [3] \"{\\\"id\\\":\\\"26939254361\\\",\\\"type\\\":\\\"CreateEvent\\\",\\\"org.id\\\":null}\" ## [4] \"{\\\"id\\\":\\\"26939254365\\\",\\\"type\\\":\\\"CreateEvent\\\",\\\"org.id\\\":null}\" ## [5] 
\"{\\\"id\\\":\\\"26939254366\\\",\\\"type\\\":\\\"PushEvent\\\",\\\"org.id\\\":null}\""},{"path":"https://mtmorgan.github.io/rjsoncons/articles/b_ndjson_extended.html","id":"use-jmespath-for-queries","dir":"Articles","previous_headings":"Data exploration","what":"Use JMESpath for queries","title":"Processing NDJSON","text":"JMESpath seems appropriate working NDJSON files. ’s JMESpath query extracting just org information; query processes five records returns five results; records 3-5 key, \"null\". JSONpointer path used, error key exist, third record processed Also, JSONpointer allow one create new objects components data, one assemble id type keys original object new object. JSONpath allows missing keys straight-forward assemble new objects, e.g., placing top-level \"id\" \"type\" keys single object.","code":"j_query(ndjson_file, 'org', n_records = 5) ## [1] \"{\\\"id\\\":30044474,\\\"login\\\":\\\"gitcoinco\\\",\\\"gravatar_id\\\":\\\"\\\",\\\"url\\\":\\\"https://api.github.com/orgs/gitcoinco\\\",\\\"avatar_url\\\":\\\"https://avatars.githubusercontent.com/u/30044474?\\\"}\" ## [2] \"{\\\"id\\\":123667276,\\\"login\\\":\\\"johnbieren-testing\\\",\\\"gravatar_id\\\":\\\"\\\",\\\"url\\\":\\\"https://api.github.com/orgs/johnbieren-testing\\\",\\\"avatar_url\\\":\\\"https://avatars.githubusercontent.com/u/123667276?\\\"}\" ## [3] \"null\" ## [4] \"null\" ## [5] \"null\" try( ## fails: 'b' does not exist j_query('{\"a\": 1}', '/b') ) ## Error : Key not found try( ## fails: record 3 does not have 'org' key j_query(ndjson_file, '/org', n_records = 5) ) ## Error : Key not found j_query(ndjson_file, \"$.org\", n_records = 5) ## [1] \"[{\\\"id\\\":30044474,\\\"login\\\":\\\"gitcoinco\\\",\\\"gravatar_id\\\":\\\"\\\",\\\"url\\\":\\\"https://api.github.com/orgs/gitcoinco\\\",\\\"avatar_url\\\":\\\"https://avatars.githubusercontent.com/u/30044474?\\\"}]\" ## [2] 
\"[{\\\"id\\\":123667276,\\\"login\\\":\\\"johnbieren-testing\\\",\\\"gravatar_id\\\":\\\"\\\",\\\"url\\\":\\\"https://api.github.com/orgs/johnbieren-testing\\\",\\\"avatar_url\\\":\\\"https://avatars.githubusercontent.com/u/123667276?\\\"}]\" ## [3] \"[]\" ## [4] \"[]\" ## [5] \"[]\""},{"path":"https://mtmorgan.github.io/rjsoncons/articles/b_ndjson_extended.html","id":"use-tibble-with-j_pivot","dir":"Articles","previous_headings":"Data exploration","what":"Use tibble with j_pivot()","title":"Processing NDJSON","text":"j_pivot() useful extracting tabular data JSON NDJSON representations. Recall j_pivot() transforms JSON array file records objects object arrays can represented data.frame tibble ‘hood’, j_pivot() simply calling = \"R\" .data.frame() result. Unfortunately, .data.frame() fails keys translated NULL, e.g., org absent coercion R representation tibble robust missing data Hierarchical data chapter R Data Science suggests using tidyr::unnest_wider() tidyr::unnest_longer()` working nested data. result pivot can flattened one interested keys nested org element, incorporated directly path. Note keys containing . need quoted \"org.id\": org.id.","code":"path <- '{id: id, type: type}' j_pivot(ndjson_file, path, n_records = 5, as = \"R\") |> str() ## List of 2 ## $ id : chr [1:5] \"26939254345\" \"26939254358\" \"26939254361\" \"26939254365\" ... ## $ type: chr [1:5] \"DeleteEvent\" \"PushEvent\" \"CreateEvent\" \"CreateEvent\" ... 
j_pivot(ndjson_file, path, n_records = 5, as = \"data.frame\") ## id type ## 1 26939254345 DeleteEvent ## 2 26939254358 PushEvent ## 3 26939254361 CreateEvent ## 4 26939254365 CreateEvent ## 5 26939254366 PushEvent path <- '{id: id, type: type, org: org}' try( j_pivot(ndjson_file, path, n_records = 5, as = \"data.frame\") ) ## Error in (function (..., row.names = NULL, check.rows = FALSE, check.names = TRUE, : ## arguments imply differing number of rows: 1, 0 tbl <- j_pivot(ndjson_file, path, n_records = 5, as = \"tibble\") tbl ## # A tibble: 5 × 3 ## id type org ## ## 1 26939254345 DeleteEvent ## 2 26939254358 PushEvent ## 3 26939254361 CreateEvent ## 4 26939254365 CreateEvent ## 5 26939254366 PushEvent tbl |> tidyr::unnest_wider(\"org\", names_sep = \".\") ## # A tibble: 5 × 7 ## id type org.id org.login org.gravatar_id org.url org.avatar_url ## ## 1 26939254345 DeleteEv… 3.00e7 gitcoinco \"\" https:… https://avata… ## 2 26939254358 PushEvent 1.24e8 johnbier… \"\" https:… https://avata… ## 3 26939254361 CreateEv… NA NA NA NA NA ## 4 26939254365 CreateEv… NA NA NA NA NA ## 5 26939254366 PushEvent NA NA NA NA NA path <- '{id: id, type: type, \"org.id\": org.id}' j_pivot(ndjson_file, path, n_records = 5, as = \"tibble\") ## # A tibble: 5 × 3 ## id type org.id ## ## 1 26939254345 DeleteEvent ## 2 26939254358 PushEvent ## 3 26939254361 CreateEvent ## 4 26939254365 CreateEvent ## 5 26939254366 PushEvent "},{"path":"https://mtmorgan.github.io/rjsoncons/articles/b_ndjson_extended.html","id":"filters-with-jmespath","dir":"Articles","previous_headings":"Data exploration","what":"Filters with JMESpath","title":"Processing NDJSON","text":"strategy filtering NDJSON JMESpath create length 1 array containing object interest, filter array. Thus discover PushEvents organizations, form array object containing relevant information [{id: id, type: type, org: org}] filter array using JMESpath’s query syntax [?@.type == 'PushEvent' && org != null]. 
type quotation (single-quote, ') important query, use double quotes define path j_pivot() removes empty records","code":"path <- \"[{id: id, type: type, org: org}] [?@.type == 'PushEvent' && org != null] | [0]\" j_query(ndjson_file, path, n_records = 5) ## [1] \"null\" ## [2] \"{\\\"id\\\":\\\"26939254358\\\",\\\"type\\\":\\\"PushEvent\\\",\\\"org\\\":{\\\"id\\\":123667276,\\\"login\\\":\\\"johnbieren-testing\\\",\\\"gravatar_id\\\":\\\"\\\",\\\"url\\\":\\\"https://api.github.com/orgs/johnbieren-testing\\\",\\\"avatar_url\\\":\\\"https://avatars.githubusercontent.com/u/123667276?\\\"}}\" ## [3] \"null\" ## [4] \"null\" ## [5] \"null\" path <- \"[{id: id, type: type, org: org}] [?@.type == 'PushEvent' && org != null] | [0]\" j_pivot(ndjson_file, path, n_records = 5, as = \"tibble\") ## # A tibble: 1 × 3 ## id type org ## ## 1 26939254358 PushEvent "},{"path":"https://mtmorgan.github.io/rjsoncons/articles/b_ndjson_extended.html","id":"performance","dir":"Articles","previous_headings":"","what":"Performance","title":"Processing NDJSON","text":"rjsoncons relatively performant processing large files. Use verbose = TRUE get progress indicators. system, takes approximately 13s. Memory use extensive, R level file processed chunks final result represented R data structures. performance rjsoncons comparable purpose-built jq command-line tool. jq takes 9s run command line. additional 3s required input command-line output R. jq provides greater flexibility JMESpath, widely used. CRAN package jqr provides R interface jq library. Linux macOS users required jq library installed. straight-forward use library takes 22 seconds; additional steps required translate result R data.frame. use case outlined compares favorably performance ndjson CRAN package, took 600s complete task . ndjson reads entire data set R, whereas rjsoncons represents final object columns id type R. DuckDB offers CRAN package supports SQL interface JSON, performant. 
following code takes just 3.7s deliver data.frame R. DuckDB SQL interface allows flexible selection, filtering, data summary. also treats collection JSON files single ‘database’, scales favorably automatically number files processed. DuckDB require additional software, duckdb CRAN package. blog post provides additional details comparison solutions, including discussion design decisions rjsoncons adopted achieve reasonable performance.","code":"system.time({ tbl <- j_pivot( ndjson_file, '{id: id, type: type}', as = \"tibble\", verbose = TRUE ) }) ## processing 33511 records ## processing 68615 records ## processing 106749 records ## processing 142963 records ## user system elapsed ## 13.833 0.110 13.944 tbl ## # A tibble: 172,049 × 2 ## id type ## ## 1 26939254345 DeleteEvent ## 2 26939254358 PushEvent ## 3 26939254361 CreateEvent ## 4 26939254365 CreateEvent ## 5 26939254366 PushEvent ## 6 26939254367 PushEvent ## 7 26939254379 PushEvent ## 8 26939254380 IssuesEvent ## 9 26939254382 PushEvent ## 10 26939254383 PushEvent ## # ℹ 172,039 more rows tbl |> count(type, sort = TRUE) ## # A tibble: 15 × 2 ## type n ## ## 1 PushEvent 90250 ## 2 CreateEvent 25311 ## 3 PullRequestEvent 18326 ## 4 IssueCommentEvent 9610 ## 5 DeleteEvent 9065 ## 6 WatchEvent 5620 ## 7 PullRequestReviewEvent 3823 ## 8 IssuesEvent 2744 ## 9 PullRequestReviewCommentEvent 2098 ## 10 ForkEvent 1900 ## 11 CommitCommentEvent 1257 ## 12 ReleaseEvent 917 ## 13 PublicEvent 491 ## 14 MemberEvent 388 ## 15 GollumEvent 249 system.time({ jqr <- jqr::jq(gzfile(ndjson_file), '{id, type}') |> j_pivot(as = \"tibble\") }) ## user system elapsed ## 20.887 0.032 20.920 library(glue) library(duckdb) library(DBI) con <- dbConnect(duckdb()) dbExecute(con, \"INSTALL 'json';\") # only required once dbExecute(con, \"LOAD 'json';\") sql <- glue( \"SELECT id, type FROM read_ndjson_auto('{ndjson_file}');\" ) system.time({ res <- dbGetQuery(con, sql) }) # 3.7 
seconds!"},{"path":"https://mtmorgan.github.io/rjsoncons/articles/b_ndjson_extended.html","id":"other-packages","dir":"Articles","previous_headings":"","what":"Other packages","title":"Processing NDJSON","text":"two fast JSON parsers available via CRAN, RcppSimdJson yyjsonr. RcppSimdJson supports JSONpointer queries, noted NDJSON useful records contain endpoint. yyjsonr support queries NDJSON time writing (18 February, 2024).","code":""},{"path":"https://mtmorgan.github.io/rjsoncons/articles/b_ndjson_extended.html","id":"session-information","dir":"Articles","previous_headings":"","what":"Session information","title":"Processing NDJSON","text":"","code":"sessionInfo() ## R version 4.4.1 (2024-06-14) ## Platform: x86_64-pc-linux-gnu ## Running under: Ubuntu 22.04.4 LTS ## ## Matrix products: default ## BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 ## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so; LAPACK version 3.10.0 ## ## locale: ## [1] LC_CTYPE=C.UTF-8 LC_NUMERIC=C LC_TIME=C.UTF-8 ## [4] LC_COLLATE=C.UTF-8 LC_MONETARY=C.UTF-8 LC_MESSAGES=C.UTF-8 ## [7] LC_PAPER=C.UTF-8 LC_NAME=C LC_ADDRESS=C ## [10] LC_TELEPHONE=C LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C ## ## time zone: UTC ## tzcode source: system (glibc) ## ## attached base packages: ## [1] stats graphics grDevices utils datasets methods base ## ## other attached packages: ## [1] dplyr_1.1.4 rjsoncons_1.3.1.9001 BiocStyle_2.32.1 ## ## loaded via a namespace (and not attached): ## [1] jsonlite_1.8.8 compiler_4.4.1 BiocManager_1.30.25 ## [4] tidyselect_1.2.1 tidyr_1.3.1 jquerylib_0.1.4 ## [7] systemfonts_1.1.0 textshaping_0.4.0 yaml_2.3.10 ## [10] fastmap_1.2.0 R6_2.5.1 generics_0.1.3 ## [13] knitr_1.48 tibble_3.2.1 bookdown_0.40 ## [16] desc_1.4.3 bslib_0.8.0 pillar_1.9.0 ## [19] rlang_1.1.4 utf8_1.2.4 cachem_1.1.0 ## [22] xfun_0.47 fs_1.6.4 sass_0.4.9 ## [25] lazyeval_0.2.2 cli_3.6.3 pkgdown_2.1.0 ## [28] withr_3.0.1 magrittr_2.0.3 digest_0.6.37 ## [31] 
lifecycle_1.0.4 jqr_1.3.4 vctrs_0.6.5 ## [34] evaluate_0.24.0 glue_1.7.0 ragg_1.3.2 ## [37] fansi_1.0.6 rmarkdown_2.28 purrr_1.0.2 ## [40] tools_4.4.1 pkgconfig_2.0.3 htmltools_0.5.8.1"},{"path":[]},{"path":"https://mtmorgan.github.io/rjsoncons/articles/c_examples.html","id":"data-frame-columns-as-ndjson","dir":"Articles","previous_headings":"Query and pivot","what":"Data frame columns as NDJSON","title":"Examples","text":"https://stackoverflow.com/questions/76447100 question presented tibble one column contained JSON expressions. goal extract fields P0_Q0 element new columns. response column can viewed NDJSON, can use pivot(df$response, \"P0_Q0\") hard work, bind_cols() prepend subject initial response early package development, motivated j_pivot() easier way perform common operation transforming JSON array--objects R data frame.","code":"df #> # A tibble: 2 × 2 #> subject response #> * #> 1 dtv85251vucquc45 \"{\\\"P0_Q0\\\":{\\\"aktiv\\\":2,\\\"bekümmert\\\":3,\\\"interessiert\\\":4,… #> 2 mcj8vdqz7sxmjcr0 \"{\\\"P0_Q0\\\":{\\\"aktiv\\\":1,\\\"bekümmert\\\":3,\\\"interessiert\\\":1,… bind_cols( df |> select(subject), df |> pull(response) |> j_pivot(\"P0_Q0\", as = \"tibble\") ) #> # A tibble: 2 × 11 #> subject aktiv bekümmert interessiert `freudig erregt` verärgert stark schuldig #> #> 1 dtv852… 2 3 4 2 2 0 1 #> 2 mcj8vd… 1 3 1 1 0 0 2 #> # ℹ 3 more variables: erschrocken , feindselig , angeregt "},{"path":"https://mtmorgan.github.io/rjsoncons/articles/c_examples.html","id":"constructing-a-pivot-object-using-jmespath","dir":"Articles","previous_headings":"Query and pivot","what":"Constructing a pivot object using JMESPath","title":"Examples","text":"https://stackoverflow.com/questions/78029215 question array objects, single unique key-value pair. goal create tibble keys one column, values another. jsonlite::fromJSON() j_pivot(json, \"encrypted_values\") simplify result tibble column object key, desired. 
Instead, write JMESPath query extracts object keys one element, values another. uses @ represent current node, keys() values() functions extract associated elements. trailing [] converts array--arrays keys (example) simple array keys.","code":"json <- '{ \"encrypted_values\":[ {\"name_a\":\"value_a\"}, {\"name_b\":\"value_b\"}, {\"name_c\":\"value_c\"} ] }' jsonlite::fromJSON(json)$encrypted_values #> name_a name_b name_c #> 1 value_a #> 2 value_b #> 3 value_c j_pivot(json, \"encrypted_values\", as = \"tibble\") #> # A tibble: 3 × 3 #> name_a name_b name_c #> #> 1 #> 2 #> 3 query <- '{ name : encrypted_values[].keys(@)[], value: encrypted_values[].values(@)[] }' j_pivot(json, query, as = \"tibble\") #> # A tibble: 3 × 2 #> name value #> #> 1 name_a value_a #> 2 name_b value_b #> 3 name_c value_c"},{"path":"https://mtmorgan.github.io/rjsoncons/articles/c_examples.html","id":"constructing-a-pivot-object-using-jmespath-a-second-example","dir":"Articles","previous_headings":"Query and pivot","what":"Constructing a pivot object using JMESPath: a second example","title":"Examples","text":"https://stackoverflow.com/questions/78727724 question asks creating tibble complex JSON data structure. reproducible example retrieve JSON; unfortunately host resolve run GitHub action, example fully evaluated. Explore JSON using listviewer. Write query extracts, directly JSON, fields interest. queries written using JMESPath. simple example extracts ‘date’ path return.BcIntertie.Allocations[].date Expand querying several different fields, re-formatting query new JSON object. Develop code querying / viewing things look like JSON array--objects Finally, run j_pivot() transform JSON tibble. 
result ","code":"start_date <- end_date <- Sys.Date() res <- httr::GET( url = \"https://itc.aeso.ca/itc/public/api/v2/interchange\", query = list( beginDate = format(start_date, \"%Y%m%d\"), endDate = format(end_date, \"%Y%m%d\"), Accept = \"application/json\" ) ) json <- httr::content(res, as = \"text\", encoding = \"UTF-8\") listviewer::jsonedit(json) path <- 'return.BcIntertie.Allocations[].date' j_query(json, path) |> str() ## chr \"[\\\"2024-07-10\\\",\\\"2024-07-10\\\",\\\"2024-07-10\\\",\\\"2024-07-10\\\",\\\"2024-07-10\\\",\\\"2024-07-10\\\",\\\"2024-07-10\\\",\\\"202\"| __truncated__ path <- paste0( 'return.{', 'date: BcIntertie.Allocations[].date,', 'he: BcIntertie.Allocations[].he,', 'bc_import: BcIntertie.Allocations[].import.atc,', 'bc_export: BcIntertie.Allocations[].export.atc,', 'matl_import: MatlIntertie.Allocations[].import.atc,', 'matl_export: MatlIntertie.Allocations[].export.atc', '}' ) j_query(json, path) |> listviewer::jsonedit() j_pivot(json, path, as = \"tibble\") ## # A tibble: 48 × 6 ## date he bc_import bc_export matl_import matl_export ## ## 1 2024-07-10 4 750 950 295 300 ## 2 2024-07-10 5 750 950 295 300 ## 3 2024-07-10 6 750 950 295 300 ## 4 2024-07-10 7 750 950 295 300 ## 5 2024-07-10 8 750 950 295 300 ## 6 2024-07-10 9 750 950 295 300 ## 7 2024-07-10 10 750 950 295 300 ## 8 2024-07-10 11 750 950 295 300 ## 9 2024-07-10 12 750 950 295 300 ## 10 2024-07-10 13 750 950 295 300 ## # ℹ 38 more rows ## # ℹ Use `print(n = ...)` to see more rows"},{"path":"https://mtmorgan.github.io/rjsoncons/articles/c_examples.html","id":"reshaping-nested-records","dir":"Articles","previous_headings":"Query and pivot","what":"Reshaping nested records","title":"Examples","text":"https://stackoverflow.com/questions/78952424 question wants transform JSON data.table. previous answer uses rbindlist() (similar dplyr::bind_rows()) transform structured lists data.tables. 
sample data desired data.table flattened include version_id field data[] columns table, complication version_id needs replicated element data[]. rjsoncons answer illustrates several approaches. approach cleanly separating data transformation data.table construction using JMESPath create array objects version_id associated vectors review_id, etc., corresponding version. R object, exactly handled rbindlist(). Note pivot involved. dplyr’s bind_rows() behaves similarly:","code":"json <- '[ { \"version_id\": \"123456\", \"data\": [ { \"review_id\": \"1\", \"rating\": 5, \"review\": \"This app is great\", \"date\": \"2024-09-01\" }, { \"review_id\": \"2\", \"rating\": 1, \"review\": \"This app is terrible\", \"date\": \"2024-09-01\" } ] }, { \"version_id\": \"789101\", \"data\": [ { \"review_id\": \"3\", \"rating\": 3, \"review\": \"This app is OK\", \"date\": \"2024-09-01\" } ] } ]' query <- \"[].{ version_id: version_id, review_id: data[].review_id, rating: data[].rating, review: data[].review, date: data[].date }\" records <- j_query(json, query, as = \"R\") data.table::rbindlist(records) #> version_id review_id rating review date #> #> 1: 123456 1 5 This app is great 2024-09-01 #> 2: 123456 2 1 This app is terrible 2024-09-01 #> 3: 789101 3 3 This app is OK 2024-09-01 dplyr::bind_rows(records) #> # A tibble: 3 × 5 #> version_id review_id rating review date #> #> 1 123456 1 5 This app is great 2024-09-01 #> 2 123456 2 1 This app is terrible 2024-09-01 #> 3 789101 3 3 This app is OK 2024-09-01"},{"path":"https://mtmorgan.github.io/rjsoncons/articles/c_examples.html","id":"reading-from-urls","dir":"Articles","previous_headings":"Query and pivot","what":"Reading from URLs","title":"Examples","text":"https://stackoverflow.com/questions/78023560 question illustrates rjsoncons ability read URLs; query extracts fixtures array objects specific nested elements, similar previous question. practice, used json <- readLines(url) create local copy data use developing query. 
easiest path general answer (extract members ‘homeTeam’ ‘awayTeam’ tibble) might, like posted answer, combine JSON extraction tidyr.","code":"url <- \"https://www.nrl.com/draw//data?competition=111&season=2024\" query <- 'fixtures[].{ homeTeam: homeTeam.nickName, awayTeam: awayTeam.nickName }' j_pivot(url, query, as = \"tibble\") #> # A tibble: 4 × 2 #> homeTeam awayTeam #> #> 1 Panthers Roosters #> 2 Storm Sharks #> 3 Cowboys Knights #> 4 Bulldogs Sea Eagles query <- 'fixtures[].{ homeTeam: homeTeam, awayTeam: awayTeam }' j_pivot(url, query, as = \"tibble\") |> tidyr::unnest_wider(c(\"homeTeam\", \"awayTeam\"), names_sep = \"_\") #> # A tibble: 4 × 8 #> homeTeam_teamId homeTeam_nickName homeTeam_teamPosition homeTeam_theme #> #> 1 500014 Panthers 2nd #> 2 500021 Storm 1st #> 3 500012 Cowboys 5th #> 4 500010 Bulldogs 6th #> # ℹ 4 more variables: awayTeam_teamId , awayTeam_nickName , #> # awayTeam_teamPosition , awayTeam_theme "},{"path":"https://mtmorgan.github.io/rjsoncons/articles/c_examples.html","id":"deeply-nested-objects","dir":"Articles","previous_headings":"Query and pivot","what":"Deeply nested objects","title":"Examples","text":"https://stackoverflow.com/questions/77998013 details question StackOverflow, following code chunks evaluated directly. example several interesting elements: JSON quite large (90 Mb), processing immediate. developing query, focused subset data interactive experience. JSON array indexing 0-based, contrast 1-based R indexing. developing JSON query, spent quite bit time viewing results using listviewer::jsonedit(), e.g., objects interest polygon coordinates nested deeply JSON, location implied JMESPath. One polygon 3618 polygons, extracted using wild-card * place index 0 particular place path. tidyr::unnest() used create ‘long’ version result, illustrated response. 
exploring data, JMESPath function length() suggested two anomalies data (paths Crime2013 just one polygon; path used identify polygons 2022 elements, 3618 polygons); well misunderstanding part.","code":"Crime2013 <- j_query(json, \"x.calls[9].args\") listviewer::jsonedit(j_query(Crime2013, \"[0][*]\")) query <- \"x.calls[9].args[0][0][0][0]\" j_pivot(json, query, as = \"tibble\") ## # A tibble: 27 × 2 ## lng lat ## ## 1 -43.3 -22.9 ## 2 -43.3 -22.9 ## 3 -43.3 -22.9 ## 4 -43.3 -22.9 ## 5 -43.3 -22.9 ## 6 -43.3 -22.9 ## 7 -43.3 -22.9 ## 8 -43.3 -22.9 ## 9 -43.3 -22.9 ## 10 -43.3 -22.9 ## # ℹ 17 more rows ## # ℹ Use `print(n = ...)` to see more rows query <- \"x.calls[9].args[0][*][0][0]\" j_pivot(json, query, as = \"tibble\") ## # A tibble: 3,618 × 2 ## lng lat ## ## 1 ## 2 ## 3 ## 4 ## 5 ## 6 ## 7 ## 8 ## 9 ## 10 ## # ℹ 3,608 more rows ## # ℹ Use `print(n = ...)` to see more rows"},{"path":"https://mtmorgan.github.io/rjsoncons/articles/c_examples.html","id":"jsonpath-wildcards","dir":"Articles","previous_headings":"Query and pivot","what":"JSONPath wildcards","title":"Examples","text":"https://stackoverflow.com/questions/78029215 question retrieves JSON representation hierarchy departments within German research institutions. interest finding path two departments. answer posted StackOverflow translates JSON list--lists structure describing hierarchy R list--lists uses series complicated manipulations form tibble suitable querying. approach recognizes graph-based problem. goal construct graph JSON, use graph algorithms identify shortest path nodes. followed @margusl extract JSON web page. used listviewer explore JSON interactively, get feel structure. Obtain root institutional hierarchy simple path traversal, using JSONPath syntax. JSONPath provides ‘wild-card’ syntax, convenient querying hierarchical data representations. looks like institution (directed, acyclic) graph, nodes representing division institution. nodes easy reconstruct. 
Look keys id, name.de, name.en using wild-card .., meaning ‘depth’. queries returns vector. Use construct tibble. Although id key integer JSON, seems appropriate think character-valued. 497 nodes institution. edges little tricky reconstruct nested structure JSON. Start id number children node. developed edgelist() function (folded code chunk ) transform id children_per_node two-column matrix -relations. function, parent_id n_children stacks used capture hierarchical structure data. level represents current level hierarchy. algorithm walks along id input, records /relationship implied n, completes level hierarchy (n_children[level] == 0), pushes next level onto stack, continues next id. matrix edges, columns , computation tricky, inefficient. two queries JSON object (id children_per_node) R iteration elements id extensive. used tidygraph package represent graph nodes edges. used convert() find graph shortest path two nodes, extracted tibble nodes. answer question posed StackOverflow post. can fun try visualize graph, e.g., using ggraph","code":"library(rvest) html <- read_html(\"https://www.gerit.org/en/institutiondetail/10282\") ## scrape JSON xpath <- '//script[contains(text(),\"window.__PRELOADED_STATE__\")]' json <- html |> html_element(xpath = xpath) |> html_text() |> sub(pattern = \"window.__PRELOADED_STATE__ = \", replacement = \"\", fixed = TRUE) listviewer::jsonedit(json) library(rjsoncons) library(dplyr) tree <- j_query(json, \"$.institutionDetail.institution.tree\") ## these are all nodes, including the query node; each node has an ## 'id' and german ('de') and english ('en') name. nodes <- tibble( ## all keys 'id' , 'name.de', and 'name.en', under 'tree' with ## wild-card '..' 
matching id = j_query(tree, \"$..id\", as = \"R\") |> as.character(), de = j_query(tree, \"$..name.de\", as = \"R\"), en = j_query(tree, \"$..name.en\", as = \"R\") ) nodes #> # A tibble: 497 × 3 #> id de en #> #> 1 10282 \"Universität zu Köln\" Univ… #> 2 14036 \"Fakultät 1: Wirtschafts- und Sozialwissenschaftliche Fakult… Facu… #> 3 176896857 \"Cologne Graduate School in Management, \\rEconomics and Soci… Colo… #> 4 555102855 \"Cologne Institute for Information Systems (CIIS)\" Colo… #> 5 555936524 \"Chair of Business Analytics\" Chai… #> 6 537375237 \"ECONtribute: Markets & Public Policy\" ECON… #> 7 439201502 \"Fachbereich Volkswirtschaftslehre\" Econ… #> 8 202057753 \"Center for Macroeconomic Research (CMR)\" Cent… #> 9 228480345 \"Seminar für Energiewissenschaft\" Chai… #> 10 237641982 \"Seminar für Experimentelle Wirtschafts- und Verhaltensforsc… Semi… #> # ℹ 487 more rows id <- j_query(tree, \"$..id\", as = \"R\") |> as.character() children_per_node <- j_query(tree, \"$..children.length\", as = \"R\") edgelist <- function(id, n) { stopifnot(identical(length(id), length(n))) parent_id <- n_children <- integer() from <- to <- integer(length(id) - 1L) level <- 0L for (i in seq_along(id)) { if (i > 1) { ## record link from parent to child from[i - 1] <- tail(parent_id, 1L) to[i - 1] <- id[i] n_children[level] <- n_children[level] - 1L } if (level > 0 && n_children[level] == 0L) { ## 'pop' level level <- level - 1L parent_id <- head(parent_id, -1L) n_children <- head(n_children, -1L) } if (n[i] != 0) { ## 'push' level level <- level + 1L parent_id <- c(parent_id, id[i]) n_children <- c(n_children, n[i]) } } tibble(from, to) } edges <- edgelist(id, children_per_node) edges #> # A tibble: 496 × 2 #> from to #> #> 1 10282 14036 #> 2 14036 176896857 #> 3 14036 555102855 #> 4 555102855 555936524 #> 5 14036 537375237 #> 6 14036 439201502 #> 7 439201502 202057753 #> 8 439201502 228480345 #> 9 439201502 237641982 #> 10 439201502 352878120 #> # ℹ 486 more rows tg <- 
tidygraph::tbl_graph(nodes = nodes, edges = edges, node_key = \"id\") tg #> # A tbl_graph: 497 nodes and 496 edges #> # #> # A rooted tree #> # #> # Node Data: 497 × 3 (active) #> id de en #> #> 1 10282 \"Universität zu Köln\" Univ… #> 2 14036 \"Fakultät 1: Wirtschafts- und Sozialwissenschaftliche Fakult… Facu… #> 3 176896857 \"Cologne Graduate School in Management, \\rEconomics and Soci… Colo… #> 4 555102855 \"Cologne Institute for Information Systems (CIIS)\" Colo… #> 5 555936524 \"Chair of Business Analytics\" Chai… #> 6 537375237 \"ECONtribute: Markets & Public Policy\" ECON… #> 7 439201502 \"Fachbereich Volkswirtschaftslehre\" Econ… #> 8 202057753 \"Center for Macroeconomic Research (CMR)\" Cent… #> 9 228480345 \"Seminar für Energiewissenschaft\" Chai… #> 10 237641982 \"Seminar für Experimentelle Wirtschafts- und Verhaltensforsc… Semi… #> # ℹ 487 more rows #> # #> # Edge Data: 496 × 2 #> from to #> #> 1 1 2 #> 2 2 3 #> 3 2 4 #> # ℹ 493 more rows tg |> ## find the shortest path between two nodes... 
tidygraph::convert( tidygraph::to_shortest_path, de == \"Fakultät 1: Wirtschafts- und Sozialwissenschaftliche Fakultät\", de == \"Professur Hölzl\" ) |> ## extract the nodes from the resulting graph as_tibble(\"nodes\") |> select(id, de) #> # A tibble: 5 × 2 #> id de #> #> 1 14036 Fakultät 1: Wirtschafts- und Sozialwissenschaftliche Fakultät #> 2 222785602 Fakultätsbereich Soziologie und Sozialpsychologie #> 3 18065 Institut für Soziologie und Sozialpsychologie (ISS) #> 4 18068 Lehrstuhl für Wirtschafts- und Sozialpsychologie #> 5 309061914 Professur Hölzl ggraph::ggraph(tg, \"tree\", circular = TRUE) + ggraph::geom_edge_elbow()"},{"path":[]},{"path":"https://mtmorgan.github.io/rjsoncons/articles/c_examples.html","id":"moving-elements","dir":"Articles","previous_headings":"Patch","what":"Moving elements","title":"Examples","text":"https://stackoverflow.com/questions/78047988 example completely reproducible, challenge igraph package produces JSON like desired data moves ‘directed’ attribute top-level field. 
JSON patch documentation, patch single ‘move’ operation /attributes/directed top-level /: patch accomplished JSON string visualized listviewer::jsonedit(patched_data), patched_data |> as_r() |> str(), ","code":"data <- ' { \"nodes\": [ { \"name\": \"something\" }, { \"name\": \"something_else\" } ], \"links\": [ { \"source\": \"something\", \"target\": \"something_else\" } ], \"attributes\": { \"directed\": false } }' patch <- '[ {\"op\": \"move\", \"from\": \"/attributes/directed\", \"path\": \"/directed\"} ]' patched_data <- j_patch_apply(data, patch) patched_data |> jsonlite::prettify() #> { #> \"nodes\": [ #> { #> \"name\": \"something\" #> }, #> { #> \"name\": \"something_else\" #> } #> ], #> \"links\": [ #> { #> \"source\": \"something\", #> \"target\": \"something_else\" #> } #> ], #> \"attributes\": { #> #> }, #> \"directed\": false #> } #>"},{"path":"https://mtmorgan.github.io/rjsoncons/articles/c_examples.html","id":"session-information","dir":"Articles","previous_headings":"","what":"Session information","title":"Examples","text":"","code":"sessionInfo() #> R version 4.4.1 (2024-06-14) #> Platform: x86_64-pc-linux-gnu #> Running under: Ubuntu 22.04.4 LTS #> #> Matrix products: default #> BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 #> LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so; LAPACK version 3.10.0 #> #> locale: #> [1] LC_CTYPE=C.UTF-8 LC_NUMERIC=C LC_TIME=C.UTF-8 #> [4] LC_COLLATE=C.UTF-8 LC_MONETARY=C.UTF-8 LC_MESSAGES=C.UTF-8 #> [7] LC_PAPER=C.UTF-8 LC_NAME=C LC_ADDRESS=C #> [10] LC_TELEPHONE=C LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C #> #> time zone: UTC #> tzcode source: system (glibc) #> #> attached base packages: #> [1] stats graphics grDevices utils datasets methods base #> #> other attached packages: #> [1] rvest_1.0.4 dplyr_1.1.4 rjsoncons_1.3.1.9001 #> #> loaded via a namespace (and not attached): #> [1] viridis_0.6.5 sass_0.4.9 utf8_1.2.4 generics_0.1.3 #> [5] tidyr_1.3.1 xml2_1.3.6 
digest_0.6.37 magrittr_2.0.3 #> [9] evaluate_0.24.0 grid_4.4.1 fastmap_1.2.0 jsonlite_1.8.8 #> [13] ggrepel_0.9.6 gridExtra_2.3 httr_1.4.7 purrr_1.0.2 #> [17] fansi_1.0.6 viridisLite_0.4.2 scales_1.3.0 tweenr_2.0.3 #> [21] textshaping_0.4.0 jquerylib_0.1.4 cli_3.6.3 graphlayouts_1.1.1 #> [25] rlang_1.1.4 polyclip_1.10-7 tidygraph_1.3.1 munsell_0.5.1 #> [29] withr_3.0.1 cachem_1.1.0 yaml_2.3.10 tools_4.4.1 #> [33] memoise_2.0.1 colorspace_2.1-1 ggplot2_3.5.1 curl_5.2.2 #> [37] vctrs_0.6.5 R6_2.5.1 lifecycle_1.0.4 fs_1.6.4 #> [41] MASS_7.3-60.2 ragg_1.3.2 ggraph_2.2.1 pkgconfig_2.0.3 #> [45] desc_1.4.3 pkgdown_2.1.0 pillar_1.9.0 bslib_0.8.0 #> [49] gtable_0.3.5 data.table_1.16.0 glue_1.7.0 Rcpp_1.0.13 #> [53] ggforce_0.4.2 systemfonts_1.1.0 highr_0.11 xfun_0.47 #> [57] tibble_3.2.1 tidyselect_1.2.1 knitr_1.48 farver_2.1.2 #> [61] htmltools_0.5.8.1 igraph_2.0.3 labeling_0.4.3 rmarkdown_2.28 #> [65] compiler_4.4.1"},{"path":"https://mtmorgan.github.io/rjsoncons/authors.html","id":null,"dir":"","previous_headings":"","what":"Authors","title":"Authors and Citation","text":"Martin Morgan. Author, maintainer. Marcel Ramos. Author. Daniel Parker. Author, copyright holder. jsoncons C++ library maintainer","code":""},{"path":"https://mtmorgan.github.io/rjsoncons/authors.html","id":"citation","dir":"","previous_headings":"","what":"Citation","title":"Authors and Citation","text":"Morgan M, Ramos M, Parker D (2024). rjsoncons: Query, Pivot, Patch, Validate 'JSON' 'NDJSON'. 
R package version 1.3.1.9001, https://mtmorgan.github.io/rjsoncons/.","code":"@Manual{, title = {rjsoncons: Query, Pivot, Patch, and Validate 'JSON' and 'NDJSON'}, author = {Martin Morgan and Marcel Ramos and Daniel Parker}, year = {2024}, note = {R package version 1.3.1.9001}, url = {https://mtmorgan.github.io/rjsoncons/}, }"},{"path":"https://mtmorgan.github.io/rjsoncons/index.html","id":"rjsoncons","dir":"","previous_headings":"","what":"Query, Pivot, Patch, and Validate JSON and NDJSON","title":"Query, Pivot, Patch, and Validate JSON and NDJSON","text":"package provides functions query (filter transform), pivot (convert array--objects object--arrays, easy import ‘R’ data frame), search, patch (edit), validate (JSON Schema) ‘JSON’ ‘NDJSON’ strings, files, URLs. Query pivot support JSONpointer, JSONpath JMESpath expressions. implementation uses jsoncons header-library; library easily linked packages direct access ‘C++’ functionality implemented .","code":""},{"path":"https://mtmorgan.github.io/rjsoncons/index.html","id":"installation-and-loading","dir":"","previous_headings":"","what":"Installation and loading","title":"Query, Pivot, Patch, and Validate JSON and NDJSON","text":"Install released package version CRAN Install development version Attach installed package R session ","code":"install.packages(\"rjsoncons\", repos = \"https://CRAN.R-project.org\") if (!requireNamespace(\"remotes\", quietly = TRUE)) install.packages(\"remotes\", repos = \"https://CRAN.R-project.org\") remotes::install_github(\"mtmorgan/rjsoncons\") library(rjsoncons)"},{"path":"https://mtmorgan.github.io/rjsoncons/index.html","id":"use-cases","dir":"","previous_headings":"","what":"Use cases","title":"Query, Pivot, Patch, and Validate JSON and NDJSON","text":"introductory vignette outlines common use cases, including: Filter large JSON NDJSON documents extract records elements interest. Extract deeply nested elements. Transform data direct incorporation R data structures. 
‘Patch’ JSON strings programmatically, e.g., update HTTP request payloads. Validate JSON documents JSON schemas jsoncons C++ header-library useful starting point advanced JSON manipulation. vignette outlines rjsoncons can used R packages wishing access C++ library.","code":""},{"path":"https://mtmorgan.github.io/rjsoncons/index.html","id":"next-steps","dir":"","previous_headings":"","what":"Next steps","title":"Query, Pivot, Patch, and Validate JSON and NDJSON","text":"See introductory vignette additional details.","code":""},{"path":"https://mtmorgan.github.io/rjsoncons/reference/as_r.html","id":null,"dir":"Reference","previous_headings":"","what":"Parse JSON or NDJSON to R — as_r","title":"Parse JSON or NDJSON to R — as_r","text":"as_r() transforms JSON NDJSON R object.","code":""},{"path":"https://mtmorgan.github.io/rjsoncons/reference/as_r.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Parse JSON or NDJSON to R — as_r","text":"","code":"as_r( data, object_names = \"asis\", ..., n_records = Inf, verbose = FALSE, data_type = j_data_type(data) )"},{"path":"https://mtmorgan.github.io/rjsoncons/reference/as_r.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Parse JSON or NDJSON to R — as_r","text":"data character() JSON string NDJSON records, name file URL containing JSON NDJSON, R object parsed JSON string using jsonlite::toJSON(). object_names character(1) order data object elements \"asis\" (default) \"sort\" filtering path. ... passed jsonlite::toJSON data R object. n_records numeric(1) maximum number NDJSON records parsed. verbose logical(1) report progress parsing large NDJSON files. 
data_type character(1) type data; one \"json\", \"ndjson\", value returned j_data_type().","code":""},{"path":"https://mtmorgan.github.io/rjsoncons/reference/as_r.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Parse JSON or NDJSON to R — as_r","text":"as_r() returns R object.","code":""},{"path":"https://mtmorgan.github.io/rjsoncons/reference/as_r.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"Parse JSON or NDJSON to R — as_r","text":"= \"R\" argument j_query(), j_pivot(), as_r() function transform JSON NDJSON R object. JSON NDJSON can character vector, file, url, R object (first translated JSON string). Main rules : JSON arrays single type (boolean, integer, double, string) transformed R vectors length corresponding type. JSON scalar JSON vector length 1 represented way R. JSON 64-bit integer array contains value larger R's 32-bit integer representation, array transformed R numeric vector. NOTE results loss precision 64-bit integer values greater 2^53. JSON arrays mixing integer double values transformed R numeric vectors. JSON objects transformed R named lists. 
vignette reiterates information provides additional details.","code":""},{"path":"https://mtmorgan.github.io/rjsoncons/reference/as_r.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Parse JSON or NDJSON to R — as_r","text":"","code":"## as_r() as_r('[1, 2, 3]') # JSON integer array -> R integer vector #> [1] 1 2 3 as_r('[1, 2.0, 3]') # JSON integer and double array -> R numeric vector #> [1] 1 2 3 as_r('[1, 2.0, \"3\"]') # JSON mixed array -> R list #> [[1]] #> [1] 1 #> #> [[2]] #> [1] 2 #> #> [[3]] #> [1] \"3\" #> as_r('[1, 2147483648]') # JSON integer > R integer max -> R numeric vector #> [1] 1 2147483648 json <- '{\"b\": 1, \"a\": [\"c\", \"d\"], \"e\": true, \"f\": [true], \"g\": {}}' as_r(json) |> str() # parsing complex objects #> List of 5 #> $ b: int 1 #> $ a: chr [1:2] \"c\" \"d\" #> $ e: logi TRUE #> $ f: logi TRUE #> $ g: Named list() identical( # JSON scalar and length 1 array identical in R as_r('{\"a\": 1}'), as_r('{\"a\": [1]}') ) #> [1] TRUE"},{"path":"https://mtmorgan.github.io/rjsoncons/reference/flatten.html","id":null,"dir":"Reference","previous_headings":"","what":"Flatten and find keys or values in JSON or NDJSON documents — j_flatten","title":"Flatten and find keys or values in JSON or NDJSON documents — j_flatten","text":"j_flatten() transforms JSON document list names JSONpointer 'paths' elements corresponding 'values' JSON document. j_find_values() finds paths exactly matching values. j_find_values_grep() finds paths values matching regular expression. j_find_keys() finds paths exactly matching keys. j_find_keys_grep() finds paths keys matching regular expression. 
NDJSON documents, result either character vector (= \"string\") list R objects, one element NDJSON record.","code":""},{"path":"https://mtmorgan.github.io/rjsoncons/reference/flatten.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Flatten and find keys or values in JSON or NDJSON documents — j_flatten","text":"","code":"j_flatten( data, object_names = \"asis\", as = \"string\", ..., n_records = Inf, verbose = FALSE, data_type = j_data_type(data), path_type = \"JSONpointer\" ) j_find_values( data, values, object_names = \"asis\", as = \"R\", ..., n_records = Inf, verbose = FALSE, data_type = j_data_type(data), path_type = \"JSONpointer\" ) j_find_values_grep( data, pattern, object_names = \"asis\", as = \"R\", ..., grep_args = list(), n_records = Inf, verbose = FALSE, data_type = j_data_type(data), path_type = \"JSONpointer\" ) j_find_keys( data, keys, object_names = \"asis\", as = \"R\", ..., n_records = Inf, verbose = FALSE, data_type = j_data_type(data), path_type = \"JSONpointer\" ) j_find_keys_grep( data, pattern, object_names = \"asis\", as = \"R\", ..., grep_args = list(), n_records = Inf, verbose = FALSE, data_type = j_data_type(data), path_type = \"JSONpointer\" )"},{"path":"https://mtmorgan.github.io/rjsoncons/reference/flatten.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Flatten and find keys or values in JSON or NDJSON documents — j_flatten","text":"data character() JSON string NDJSON records, name file URL containing JSON NDJSON, R object parsed JSON string using jsonlite::toJSON(). object_names character(1) order data object elements \"asis\" (default) \"sort\" filtering path. character(1) describing return type. j_flatten(), either \"string\" \"R\". functions page, one \"R\", \"data.frame\", \"tibble\". ... passed jsonlite::toJSON data R object. n_records numeric(1) maximum number NDJSON records parsed. verbose logical(1) report progress parsing large NDJSON files. 
data_type character(1) type data; one \"json\", \"ndjson\", value returned j_data_type(). path_type character(1) type 'path' returned; one '\"JSONpointer\"', '\"JSONpath\"'; '\"JMESpath\"' supported. values vector one values matched exactly values JSON document. pattern character(1) regular expression match values paths. grep_args list() additional arguments passed grepl() searching values paths. keys character() vector one keys matched exactly path elements.","code":""},{"path":"https://mtmorgan.github.io/rjsoncons/reference/flatten.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Flatten and find keys or values in JSON or NDJSON documents — j_flatten","text":"j_flatten(= \"string\") (default) returns JSON string representation flattened document, .e., object keys JSONpointer paths values value corresponding path original document. j_flatten(= \"R\") returns named list, names() JSONpointer paths element JSON document list elements corresponding values. j_find_values() j_find_values_grep() return list names JSONpointer paths list elements matching values, data.frame tibble columns path value. Values coerced common type data.frame tibble. j_find_keys() j_find_keys_grep() returns list, data.frame, tibble similar j_find_values() j_find_values_grep(). NDJSON documents, result vector paralleling NDJSON document, j_flatten() applied element NDJSON document.","code":""},{"path":"https://mtmorgan.github.io/rjsoncons/reference/flatten.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"Flatten and find keys or values in JSON or NDJSON documents — j_flatten","text":"Functions documented page expand data path / value pairs. suitable large JSON documents. j_find_keys(), key must exactly match one consecutive keys JSONpointer path returned j_flatten(). 
j_find_keys_grep(), key can define pattern spans across JSONpointer JSONpath elements.","code":""},{"path":"https://mtmorgan.github.io/rjsoncons/reference/flatten.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Flatten and find keys or values in JSON or NDJSON documents — j_flatten","text":"","code":"json <- '{ \"discards\": { \"1000\": \"Record does not exist\", \"1004\": \"Queue limit exceeded\", \"1010\": \"Discarding timed-out partial msg\" }, \"warnings\": { \"0\": \"Phone number missing country code\", \"1\": \"State code missing\", \"2\": \"Zip code missing\" } }' ## JSONpointer j_flatten(json) |> cat(\"\\n\") #> {\"/discards/1000\":\"Record does not exist\",\"/discards/1004\":\"Queue limit exceeded\",\"/discards/1010\":\"Discarding timed-out partial msg\",\"/warnings/0\":\"Phone number missing country code\",\"/warnings/1\":\"State code missing\",\"/warnings/2\":\"Zip code missing\"} ## JSONpath j_flatten(json, as = \"R\", path_type = \"JSONpath\") |> str() #> List of 6 #> $ $['discards']['1000']: chr \"Record does not exist\" #> $ $['discards']['1004']: chr \"Queue limit exceeded\" #> $ $['discards']['1010']: chr \"Discarding timed-out partial msg\" #> $ $['warnings']['0'] : chr \"Phone number missing country code\" #> $ $['warnings']['1'] : chr \"State code missing\" #> $ $['warnings']['2'] : chr \"Zip code missing\" j_find_values(json, \"Zip code missing\", as = \"tibble\") #> # A tibble: 1 × 2 #> path value #> #> 1 /warnings/2 Zip code missing j_find_values( json, c(\"Queue limit exceeded\", \"Zip code missing\"), as = \"tibble\" ) #> # A tibble: 2 × 2 #> path value #> #> 1 /discards/1004 Queue limit exceeded #> 2 /warnings/2 Zip code missing j_find_values_grep(json, \"missing\", as = \"tibble\") #> # A tibble: 3 × 2 #> path value #> #> 1 /warnings/0 Phone number missing country code #> 2 /warnings/1 State code missing #> 3 /warnings/2 Zip code missing ## JSONpath j_find_values_grep(json, \"missing\", as = 
\"tibble\", path_type = \"JSONpath\") #> # A tibble: 3 × 2 #> path value #> #> 1 $['warnings']['0'] Phone number missing country code #> 2 $['warnings']['1'] State code missing #> 3 $['warnings']['2'] Zip code missing j_find_keys(json, \"discards\", as = \"tibble\") #> # A tibble: 3 × 2 #> path value #> #> 1 /discards/1000 Record does not exist #> 2 /discards/1004 Queue limit exceeded #> 3 /discards/1010 Discarding timed-out partial msg j_find_keys(json, \"1\", as = \"tibble\") #> # A tibble: 1 × 2 #> path value #> #> 1 /warnings/1 State code missing j_find_keys(json, c(\"discards\", \"warnings\"), as = \"tibble\") #> # A tibble: 6 × 2 #> path value #> #> 1 /discards/1000 Record does not exist #> 2 /discards/1004 Queue limit exceeded #> 3 /discards/1010 Discarding timed-out partial msg #> 4 /warnings/0 Phone number missing country code #> 5 /warnings/1 State code missing #> 6 /warnings/2 Zip code missing ## JSONpath j_find_keys(json, \"discards\", as = \"tibble\", path_type = \"JSONpath\") #> # A tibble: 3 × 2 #> path value #> #> 1 $['discards']['1000'] Record does not exist #> 2 $['discards']['1004'] Queue limit exceeded #> 3 $['discards']['1010'] Discarding timed-out partial msg j_find_keys_grep(json, \"discard\", as = \"tibble\") #> # A tibble: 3 × 2 #> path value #> #> 1 /discards/1000 Record does not exist #> 2 /discards/1004 Queue limit exceeded #> 3 /discards/1010 Discarding timed-out partial msg j_find_keys_grep(json, \"1\", as = \"tibble\") #> # A tibble: 4 × 2 #> path value #> #> 1 /discards/1000 Record does not exist #> 2 /discards/1004 Queue limit exceeded #> 3 /discards/1010 Discarding timed-out partial msg #> 4 /warnings/1 State code missing j_find_keys_grep(json, \"car.*/101\", as = \"tibble\") #> # A tibble: 1 × 2 #> path value #> #> 1 /discards/1010 Discarding timed-out partial msg ## JSONpath j_find_keys_grep(json, \"car.*\\\\['101\", as = \"tibble\", path_type = \"JSONpath\") #> # A tibble: 1 × 2 #> path value #> #> 1 $['discards']['1010'] 
Discarding timed-out partial msg ## NDJSON ndjson_file <- system.file(package = \"rjsoncons\", \"extdata\", \"example.ndjson\") j_flatten(ndjson_file) |> noquote() #> [1] {\"/name\":\"Seattle\",\"/state\":\"WA\"} {\"/name\":\"New York\",\"/state\":\"NY\"} #> [3] {\"/name\":\"Bellevue\",\"/state\":\"WA\"} {\"/name\":\"Olympia\",\"/state\":\"WA\"} j_find_values_grep(ndjson_file, \"e\") |> str() #> List of 4 #> $ :List of 1 #> ..$ /name: chr \"Seattle\" #> $ :List of 1 #> ..$ /name: chr \"New York\" #> $ :List of 1 #> ..$ /name: chr \"Bellevue\" #> $ : Named list()"},{"path":"https://mtmorgan.github.io/rjsoncons/reference/j_data_type.html","id":null,"dir":"Reference","previous_headings":"","what":"Detect JSON and NDJSON data and path types — j_data_type","title":"Detect JSON and NDJSON data and path types — j_data_type","text":"j_data_type() uses simple rules determine whether 'data' JSON, NDJSON, file, url, R. j_path_type() uses simple rules identify whether path JSONpointer, JSONpath, JMESpath expression.","code":""},{"path":"https://mtmorgan.github.io/rjsoncons/reference/j_data_type.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Detect JSON and NDJSON data and path types — j_data_type","text":"","code":"j_data_type(data) j_path_type(path)"},{"path":"https://mtmorgan.github.io/rjsoncons/reference/j_data_type.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Detect JSON and NDJSON data and path types — j_data_type","text":"data character() JSON string NDJSON records, name file URL containing JSON NDJSON, R object parsed JSON string using jsonlite::toJSON(). 
path character(1) JSONpointer, JSONpath JMESpath query string.","code":""},{"path":"https://mtmorgan.github.io/rjsoncons/reference/j_data_type.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"Detect JSON and NDJSON data and path types — j_data_type","text":"j_data_type() without arguments reports possible return values: \"json\", \"ndjson\", \"file\", \"url\", \"R\". provided argument, j_data_type() infers (validate) type data based following rules: scalar (length 1) character data, either \"url\" (matching regular expression \"^https?://\", \"file\" (file.exists(data) returns TRUE), \"json\". \"file\" \"url\" inferred, return value length 2 vector, first element inferred type data (\"json\" \"ndjson\") obtained first 2 lines file. character data length(data) > 1, \"ndjson\" elements start square bracket curly brace, consistently (.e., agreeing start first record), otherwise \"json\". \"R\" non-character data. j_path_type() without argument reports possible values: \"JSONpointer\", \"JSONpath\", \"JMESpath\". provided argument, j_path_type() infers type path using simple incomplete classification: \"JSONpointer\" inferred path \"\" starts \"/\". \"JSONpath\" expressions start \"$\". \"JMESpath\" expressions satisfy neither JSONpointer JSONpath criteria. 
rules, valid JSONpointer path \"@\" interpreted JMESpath; use jsonpointer() JSONpointer behavior required.","code":""},{"path":"https://mtmorgan.github.io/rjsoncons/reference/j_data_type.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Detect JSON and NDJSON data and path types — j_data_type","text":"","code":"j_data_type() # available types #> [[1]] #> [1] \"json\" #> #> [[2]] #> [1] \"ndjson\" #> #> [[3]] #> [1] \"json\" \"file\" #> #> [[4]] #> [1] \"ndjson\" \"file\" #> #> [[5]] #> [1] \"json\" \"url\" #> #> [[6]] #> [1] \"ndjson\" \"url\" #> #> [[7]] #> [1] \"R\" #> j_data_type(\"\") # json #> [1] \"R\" j_data_type('{\"a\": 1}') # json #> [1] \"json\" j_data_type(c('[{\"a\": 1}', '{\"a\": 2}]')) # json #> [1] \"json\" j_data_type(c('{\"a\": 1}', '{\"a\": 2}')) # ndjson #> [1] \"ndjson\" j_data_type(list(a = 1, b = 2)) # R #> [1] \"R\" fl <- system.file(package = \"rjsoncons\", \"extdata\", \"example.json\") j_data_type(fl) # c('json', 'file') #> [1] \"json\" \"file\" j_data_type(readLines(fl)) # json #> [1] \"json\" j_path_type() # available types #> [1] \"JSONpointer\" \"JSONpath\" \"JMESpath\" j_path_type(\"\") # JSONpointer #> [1] \"JSONpointer\" j_path_type(\"/locations/0/name\") # JSONpointer #> [1] \"JSONpointer\" j_path_type(\"$.locations[0].name\") # JSONpath #> [1] \"JSONpath\" j_path_type(\"locations[0].name\") # JMESpath #> [1] \"JMESpath\""},{"path":"https://mtmorgan.github.io/rjsoncons/reference/patch.html","id":null,"dir":"Reference","previous_headings":"","what":"Patch or compute the difference between two JSON documents — j_patch_apply","title":"Patch or compute the difference between two JSON documents — j_patch_apply","text":"j_patch_apply() uses JSON Patch https://jsonpatch.com transform JSON 'data' according rules JSON 'patch'. j_patch_from() computes JSON patch describing difference two JSON documents. 
j_patch_op() translates R arguments JSON representation patch, validating 'unboxing' arguments necessary.","code":""},{"path":"https://mtmorgan.github.io/rjsoncons/reference/patch.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Patch or compute the difference between two JSON documents — j_patch_apply","text":"","code":"j_patch_apply(data, patch, as = \"string\", ...) j_patch_from(data_x, data_y, as = \"string\", ...) j_patch_op(op, path, ...) # Default S3 method j_patch_op(op, path, ..., from = NULL, value = NULL) # S3 method for class 'j_patch_op' j_patch_op(op, ...) # S3 method for class 'j_patch_op' c(..., recursive = FALSE) # S3 method for class 'j_patch_op' print(x, ...)"},{"path":"https://mtmorgan.github.io/rjsoncons/reference/patch.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Patch or compute the difference between two JSON documents — j_patch_apply","text":"data JSON character vector, file, URL, R object converted JSON using jsonlite::toJSON(data, ...). patch JSON 'patch' character vector, file, URL, R object, result j_patch_op(). character(1) return type; \"string\" returns JSON string, \"R\" returns R object using rules as_r(). ... j_patch_apply() j_patch_from(), arguments passed jsonlite::toJSON data, patch, data_x, / data_y R object. appropriate add jsonlite::toJSON() argument auto_unbox = TRUE patch R object 'value' fields JSON scalars; complicated scenarios 'value' fields marked jsonlite::unbox() passed j_patch_*(). j_patch_op() ... additional arguments patch operation, e.g., path = ', value = '. data_x data. data_y data. op patch operation (\"add\", \"remove\", \"replace\", \"copy\", \"move\", \"test\"), 'piping' object created j_patch_op(). path character(1) JSONPointer path location patched. character(1) JSONPointer path location object copied moved . value R object translated JSON used add, replace, test. recursive Ignored. 
x object produced j_patch_op().","code":""},{"path":"https://mtmorgan.github.io/rjsoncons/reference/patch.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Patch or compute the difference between two JSON documents — j_patch_apply","text":"j_patch_apply() returns JSON string R object representing 'data' patched according 'patch'. j_patch_from() returns JSON string R object representing difference 'data_x' 'data_y'. j_patch_op() returns character vector subclass can used j_patch_apply().","code":""},{"path":"https://mtmorgan.github.io/rjsoncons/reference/patch.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"Patch or compute the difference between two JSON documents — j_patch_apply","text":"j_patch_apply(), 'patch' JSON array objects. object describes patch applied. Simple examples available https://jsonpatch.com, verbs 'add', 'remove', 'replace', 'copy' 'test'. 'path' element operation JSON pointer; remember JSON arrays 0-based. add – add elements existing document. remove – remove elements document. replace – replace one element another copy – copy path another location. move – move path another location. test – test existence path; path exist, apply patch. examples illustrate patch one (JSON array single object) several (JSON array several arguments) operations. j_patch_apply() fits naturally pipeline composed |> transform JSON representations. j_patch_op() function takes care ensure op, path, arguments 'unboxed' (represented JSON scalars rather arrays). user must ensure value represented correctly applying jsonlite::unbox() individual elements adding auto_unbox = TRUE .... 
Examples illustrate different scenarios.","code":"{\"op\": \"add\", \"path\": \"/biscuits/1\", \"value\": {\"name\": \"Ginger Nut\"}} {\"op\": \"remove\", \"path\": \"/biscuits/0\"} { \"op\": \"replace\", \"path\": \"/biscuits/0/name\", \"value\": \"Chocolate Digestive\" } {\"op\": \"copy\", \"path\": \"/best_biscuit\", \"from\": \"/biscuits/0\"} {\"op\": \"move\", \"path\": \"/cookies\", \"from\": \"/biscuits\"} {\"op\": \"test\", \"path\": \"/best_biscuit/name\", \"value\": \"Choco Leibniz\"}"},{"path":"https://mtmorgan.github.io/rjsoncons/reference/patch.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Patch or compute the difference between two JSON documents — j_patch_apply","text":"","code":"data_file <- system.file(package = \"rjsoncons\", \"extdata\", \"patch_data.json\") ## add a biscuit patch <- '[ {\"op\": \"add\", \"path\": \"/biscuits/1\", \"value\": {\"name\": \"Ginger Nut\"}} ]' j_patch_apply(data_file, patch, as = \"R\") |> str() #> List of 1 #> $ biscuits:List of 3 #> ..$ :List of 1 #> .. ..$ name: chr \"Digestive\" #> ..$ :List of 1 #> .. ..$ name: chr \"Ginger Nut\" #> ..$ :List of 1 #> .. ..$ name: chr \"Choco Leibniz\" ## add a biscuit and choose a favorite patch <- '[ {\"op\": \"add\", \"path\": \"/biscuits/1\", \"value\": {\"name\": \"Ginger Nut\"}}, {\"op\": \"copy\", \"path\": \"/best_biscuit\", \"from\": \"/biscuits/2\"} ]' biscuits <- j_patch_apply(data_file, patch) as_r(biscuits) |> str() #> List of 2 #> $ biscuits :List of 3 #> ..$ :List of 1 #> .. ..$ name: chr \"Digestive\" #> ..$ :List of 1 #> .. ..$ name: chr \"Ginger Nut\" #> ..$ :List of 1 #> .. 
..$ name: chr \"Choco Leibniz\" #> $ best_biscuit:List of 1 #> ..$ name: chr \"Choco Leibniz\" j_patch_from(biscuits, data_file, as = \"R\") |> str() #> List of 3 #> $ :List of 3 #> ..$ op : chr \"replace\" #> ..$ path : chr \"/biscuits/1/name\" #> ..$ value: chr \"Choco Leibniz\" #> $ :List of 2 #> ..$ op : chr \"remove\" #> ..$ path: chr \"/biscuits/2\" #> $ :List of 2 #> ..$ op : chr \"remove\" #> ..$ path: chr \"/best_biscuit\" if (requireNamespace(\"jsonlite\", quietly = TRUE)) { ## helper for constructing patch operations from R objects j_patch_op( \"add\", path = \"/biscuits/1\", value = list(name = \"Ginger Nut\"), ## 'Ginger Nut' is a JSON scalar, so auto-unbox the 'value' argument auto_unbox = TRUE ) j_patch_op(\"remove\", \"/biscuits/0\") j_patch_op( \"replace\", \"/biscuits/0/name\", ## also possible to unbox arguments explicitly value = jsonlite::unbox(\"Chocolate Digestive\") ) j_patch_op(\"copy\", \"/best_biscuit\", from = \"/biscuits/0\") j_patch_op(\"move\", \"/cookies\", from = \"/biscuits\") j_patch_op( \"test\", \"/best_biscuit/name\", value = \"Choco Leibniz\", auto_unbox = TRUE ) ## several operations value <- list(name = jsonlite::unbox(\"Ginger Nut\")) ops <- c( j_patch_op(\"add\", \"/biscuits/1\", value = value), j_patch_op(\"copy\", path = \"/best_biscuit\", from = \"/biscuits/0\") ) ops ops <- j_patch_op(\"add\", \"/biscuits/1\", value = value) |> j_patch_op(\"copy\", path = \"/best_biscuit\", from = \"/biscuits/0\") ops } #> [ #> {\"op\": \"add\", \"path\": \"/biscuits/1\", \"value\": {\"name\": \"Ginger Nut\"}}, #> {\"op\": \"copy\", \"path\": \"/best_biscuit\", \"from\": \"/biscuits/0\"} #> ]"},{"path":"https://mtmorgan.github.io/rjsoncons/reference/rquerypivot.html","id":null,"dir":"Reference","previous_headings":"","what":"Query and pivot JSON and NDJSON documents — j_query","title":"Query and pivot JSON and NDJSON documents — j_query","text":"j_query() executes query JSON NDJSON document, automatically inferring type data path. 
j_pivot() transforms JSON array--objects object--arrays; can useful forming column-based tibble row-oriented JSON / NDJSON.","code":""},{"path":"https://mtmorgan.github.io/rjsoncons/reference/rquerypivot.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Query and pivot JSON and NDJSON documents — j_query","text":"","code":"j_query( data, path = \"\", object_names = \"asis\", as = \"string\", ..., n_records = Inf, verbose = FALSE, data_type = j_data_type(data), path_type = j_path_type(path) ) j_pivot( data, path = \"\", object_names = \"asis\", as = \"string\", ..., n_records = Inf, verbose = FALSE, data_type = j_data_type(data), path_type = j_path_type(path) )"},{"path":"https://mtmorgan.github.io/rjsoncons/reference/rquerypivot.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Query and pivot JSON and NDJSON documents — j_query","text":"data character() JSON string NDJSON records, name file URL containing JSON NDJSON, R object parsed JSON string using jsonlite::toJSON(). path character(1) JSONpointer, JSONpath JMESpath query string. object_names character(1) order data object elements \"asis\" (default) \"sort\" filtering path. character(1) return type. j_query(), \"string\" returns JSON / NDJSON strings; \"R\" parses JSON / NDJSON R using rules as_r(). j_pivot() (JSON ), use = \"data.frame\" = \"tibble\" coerce result data.frame tibble. ... passed jsonlite::toJSON data R object. n_records numeric(1) maximum number NDJSON records parsed. verbose logical(1) report progress parsing large NDJSON files. data_type character(1) type data; one \"json\", \"ndjson\", value returned j_data_type(). path_type character(1) type path; one \"JSONpointer\", \"JSONpath\", \"JMESpath\". 
Inferred path using j_path_type().","code":""},{"path":"https://mtmorgan.github.io/rjsoncons/reference/rquerypivot.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"Query and pivot JSON and NDJSON documents — j_query","text":"j_pivot() transforms 'array--objects' (typical JSON row-oriented representation table) 'object--arrays'. simple example transforms array two objects three fields '[{\"\": 1, \"b\": 2, \"c\": 3}, {\"\": 4, \"b\": 5, \"c\": 6}]' object three fields, vector length 2 '{\"\": [1, 4], \"b\": [2, 5], \"c\": [3, 6]}'. object--arrays representation corresponds closely R data.frame tibble, illustrated examples. j_pivot() JMESpath paths especially useful transforming NDJSON data.frame tibble","code":""},{"path":"https://mtmorgan.github.io/rjsoncons/reference/rquerypivot.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Query and pivot JSON and NDJSON documents — j_query","text":"","code":"json <- '{ \"locations\": [ {\"name\": \"Seattle\", \"state\": \"WA\"}, {\"name\": \"New York\", \"state\": \"NY\"}, {\"name\": \"Bellevue\", \"state\": \"WA\"}, {\"name\": \"Olympia\", \"state\": \"WA\"} ] }' j_query(json, \"/locations/0/name\") # JSONpointer #> [1] \"Seattle\" j_query(json, \"$.locations[*].name\", as = \"R\") # JSONpath #> [1] \"Seattle\" \"New York\" \"Bellevue\" \"Olympia\" j_query(json, \"locations[].state\", as = \"R\") # JMESpath #> [1] \"WA\" \"NY\" \"WA\" \"WA\" ## a few NDJSON records from ndjson_file <- system.file(package = \"rjsoncons\", \"extdata\", \"2023-02-08-0.json\") j_query(ndjson_file, \"{id: id, type: type}\") #> [1] \"{\\\"id\\\":\\\"26939254345\\\",\\\"type\\\":\\\"DeleteEvent\\\"}\" #> [2] \"{\\\"id\\\":\\\"26939254358\\\",\\\"type\\\":\\\"PushEvent\\\"}\" #> [3] \"{\\\"id\\\":\\\"26939254361\\\",\\\"type\\\":\\\"CreateEvent\\\"}\" #> [4] \"{\\\"id\\\":\\\"26939254365\\\",\\\"type\\\":\\\"CreateEvent\\\"}\" #> [5] 
\"{\\\"id\\\":\\\"26939254366\\\",\\\"type\\\":\\\"PushEvent\\\"}\" #> [6] \"{\\\"id\\\":\\\"26939254367\\\",\\\"type\\\":\\\"PushEvent\\\"}\" #> [7] \"{\\\"id\\\":\\\"26939254379\\\",\\\"type\\\":\\\"PushEvent\\\"}\" #> [8] \"{\\\"id\\\":\\\"26939254380\\\",\\\"type\\\":\\\"IssuesEvent\\\"}\" #> [9] \"{\\\"id\\\":\\\"26939254382\\\",\\\"type\\\":\\\"PushEvent\\\"}\" #> [10] \"{\\\"id\\\":\\\"26939254383\\\",\\\"type\\\":\\\"PushEvent\\\"}\" j_pivot(json, \"$.locations[?@.state=='WA']\", as = \"string\") #> [1] \"{\\\"name\\\":[\\\"Seattle\\\",\\\"Bellevue\\\",\\\"Olympia\\\"],\\\"state\\\":[\\\"WA\\\",\\\"WA\\\",\\\"WA\\\"]}\" j_pivot(json, \"locations[?@.state=='WA']\", as = \"R\") #> $name #> [1] \"Seattle\" \"Bellevue\" \"Olympia\" #> #> $state #> [1] \"WA\" \"WA\" \"WA\" #> j_pivot(json, \"locations[?@.state=='WA']\", as = \"data.frame\") #> name state #> 1 Seattle WA #> 2 Bellevue WA #> 3 Olympia WA j_pivot(json, \"locations[?@.state=='WA']\", as = \"tibble\") #> # A tibble: 3 × 2 #> name state #> #> 1 Seattle WA #> 2 Bellevue WA #> 3 Olympia WA ## use 'path' to pivot ndjson one record at at time j_pivot(ndjson_file, \"{id: id, type: type}\", as = \"data.frame\") #> id type #> 1 26939254345 DeleteEvent #> 2 26939254358 PushEvent #> 3 26939254361 CreateEvent #> 4 26939254365 CreateEvent #> 5 26939254366 PushEvent #> 6 26939254367 PushEvent #> 7 26939254379 PushEvent #> 8 26939254380 IssuesEvent #> 9 26939254382 PushEvent #> 10 26939254383 PushEvent ## 'org' is a nested element; extract it j_pivot(ndjson_file, \"org\", as = \"data.frame\") #> id login gravatar_id #> 1 30044474 gitcoinco #> 2 123667276 johnbieren-testing #> 3 123277977 CMPUT404-W23 #> 4 120284018 mornystannit #> url #> 1 https://api.github.com/orgs/gitcoinco #> 2 https://api.github.com/orgs/johnbieren-testing #> 3 https://api.github.com/orgs/CMPUT404-W23 #> 4 https://api.github.com/orgs/mornystannit #> avatar_url #> 1 https://avatars.githubusercontent.com/u/30044474? 
#> 2 https://avatars.githubusercontent.com/u/123667276? #> 3 https://avatars.githubusercontent.com/u/123277977? #> 4 https://avatars.githubusercontent.com/u/120284018? ## use j_pivot() to filter 'PushEvent' for organizations path <- \"[{id: id, type: type, org: org}] [?@.type == 'PushEvent' && @.org != null] | [0]\" j_pivot(ndjson_file, path, as = \"data.frame\") #> id type org.id org.login org.gravatar_id #> 1 26939254358 PushEvent 123667276 johnbieren-testing #> 2 26939254382 PushEvent 123667276 johnbieren-testing #> org.url #> 1 https://api.github.com/orgs/johnbieren-testing #> 2 https://api.github.com/orgs/johnbieren-testing #> org.avatar_url org.id.1 org.login.1 #> 1 https://avatars.githubusercontent.com/u/123667276? 120284018 mornystannit #> 2 https://avatars.githubusercontent.com/u/123667276? 120284018 mornystannit #> org.gravatar_id.1 org.url.1 #> 1 https://api.github.com/orgs/mornystannit #> 2 https://api.github.com/orgs/mornystannit #> org.avatar_url.1 #> 1 https://avatars.githubusercontent.com/u/120284018? #> 2 https://avatars.githubusercontent.com/u/120284018? ## try also ## ## j_pivot(ndjson_file, path, as = \"tibble\") |> ## tidyr::unnest_wider(\"org\", names_sep = \".\")"},{"path":"https://mtmorgan.github.io/rjsoncons/reference/schema.html","id":null,"dir":"Reference","previous_headings":"","what":"Validate JSON documents against JSON Schema — j_schema_is_valid","title":"Validate JSON documents against JSON Schema — j_schema_is_valid","text":"j_schema_is_valid() uses JSON Schema https://json-schema.org/ validate JSON 'data' according 'schema'. j_schema_validate() returns JSON R object, data.frame, tibble, describing data conform schema. 
See \"Using 'jsoncons' R\" vignette help interpreting validation results.","code":""},{"path":"https://mtmorgan.github.io/rjsoncons/reference/schema.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Validate JSON documents against JSON Schema — j_schema_is_valid","text":"","code":"j_schema_is_valid( data, schema, ..., data_type = j_data_type(data), schema_type = j_data_type(schema) ) j_schema_validate( data, schema, as = \"string\", ..., data_type = j_data_type(data), schema_type = j_data_type(schema) )"},{"path":"https://mtmorgan.github.io/rjsoncons/reference/schema.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Validate JSON documents against JSON Schema — j_schema_is_valid","text":"data JSON character vector, file, URL defining document validated. NDJSON data schema supported. schema JSON character vector, file, URL defining schema data validated. ... passed jsonlite::toJSON data character-valued. data_type character(1) type data; one \"json\" value returned j_data_type(); schema validation support \"ndjson\" data. schema_type character(1) type schema; see data_type. j_schema_validate(), one \"string\", \"R\", \"data.frame\", \"tibble\", \"details\", determine representation return value.","code":""},{"path":"https://mtmorgan.github.io/rjsoncons/reference/schema.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Validate JSON documents against JSON Schema — j_schema_is_valid","text":"","code":"## Allowable `data_type=` and `schema_type` -- excludes 'ndjson' j_data_type() |> Filter(\\(type) !\"ndjson\" %in% type, x = _) |> str() #> List of 4 #> $ : chr \"json\" #> $ : chr [1:2] \"json\" \"file\" #> $ : chr [1:2] \"json\" \"url\" #> $ : chr \"R\" ## compare JSON patch to specification. 
'op' key should have value ## 'add'; 'paths' key should be key 'path' ## schema <- \"https://json.schemastore.org/json-patch.json\" schema <- system.file(package = \"rjsoncons\", \"extdata\", \"json-patch.json\") op <- '[{ \"op\": \"adds\", \"paths\": \"/biscuits/1\", \"value\": { \"name\": \"Ginger Nut\" } }]' j_schema_is_valid(op, schema) #> [1] FALSE j_schema_validate(op, schema, as = \"details\") #> # A tibble: 10 × 5 #> valid evaluationPath schemaLocation instanceLocation error #> #> 1 FALSE /items/oneOf/0/required https://json.… /0 Requ… #> 2 FALSE /items/oneOf/0/properties/op/enum https://json.… /0/op 'add… #> 3 FALSE /items/oneOf/0/additionalPropert… https://json.… /0/paths Addi… #> 4 FALSE /items/oneOf/1/required https://json.… /0 Requ… #> 5 FALSE /items/oneOf/1/properties/op/enum https://json.… /0/op 'add… #> 6 FALSE /items/oneOf/1/additionalPropert… https://json.… /0/paths Addi… #> 7 FALSE /items/oneOf/2/required https://json.… /0 Requ… #> 8 FALSE /items/oneOf/2/required https://json.… /0 Requ… #> 9 FALSE /items/oneOf/2/properties/op/enum https://json.… /0/op 'add… #> 10 FALSE /items/oneOf/2/additionalPropert… https://json.… /0/paths Addi…"},{"path":"https://mtmorgan.github.io/rjsoncons/reference/version.html","id":null,"dir":"Reference","previous_headings":"","what":"Version of jsoncons C++ library — version","title":"Version of jsoncons C++ library — version","text":"version() reports version C++ jsoncons library use.","code":""},{"path":"https://mtmorgan.github.io/rjsoncons/reference/version.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Version of jsoncons C++ library — version","text":"","code":"version()"},{"path":"https://mtmorgan.github.io/rjsoncons/reference/version.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Version of jsoncons C++ library — version","text":"version() returns character(1) major.minor.patch version string, possibly git hash -release 
version.","code":""},{"path":"https://mtmorgan.github.io/rjsoncons/reference/version.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Version of jsoncons C++ library — version","text":"","code":"version() #> [1] \"0.173.4 [+57967655d]\""},{"path":"https://mtmorgan.github.io/rjsoncons/reference/zzz_paths_and_pointer.html","id":null,"dir":"Reference","previous_headings":"","what":"JSONpath, JMESpath, or JSONpointer query of JSON / NDJSON documents; use j_query() instead — jsonpath","title":"JSONpath, JMESpath, or JSONpointer query of JSON / NDJSON documents; use j_query() instead — jsonpath","text":"jsonpath() executes query JSON string vector NDJSON entries using 'JSONpath' specification. jmespath() executes query JSON string using 'JMESpath' specification. jsonpointer() extracts element JSON string using 'JSON pointer' specification.","code":""},{"path":"https://mtmorgan.github.io/rjsoncons/reference/zzz_paths_and_pointer.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"JSONpath, JMESpath, or JSONpointer query of JSON / NDJSON documents; use j_query() instead — jsonpath","text":"","code":"jsonpath(data, path, object_names = \"asis\", as = \"string\", ...) jmespath(data, path, object_names = \"asis\", as = \"string\", ...) jsonpointer(data, path, object_names = \"asis\", as = \"string\", ...)"},{"path":"https://mtmorgan.github.io/rjsoncons/reference/zzz_paths_and_pointer.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"JSONpath, JMESpath, or JSONpointer query of JSON / NDJSON documents; use j_query() instead — jsonpath","text":"data character() JSON string NDJSON records, name file URL containing JSON NDJSON, R object parsed JSON string using jsonlite::toJSON(). path character(1) JSONpointer, JSONpath JMESpath query string. object_names character(1) order data object elements \"asis\" (default) \"sort\" filtering path. 
character(1) return type. \"string\" returns single JSON string; \"R\" returns R object following rules outlined as_r(). ... arguments parsing NDJSON, passed jsonlite::toJSON data character-valued. NDJSON, Use n_records = 2 parse just first two records NDJSON document. Use verbose = TRUE obtain progress bar reading connection (file URL). Requires cli package. example use jsonlite::toJSON() use auto_unbox = TRUE automatically 'unbox' vectors length 1 JSON scalar values.","code":""},{"path":"https://mtmorgan.github.io/rjsoncons/reference/zzz_paths_and_pointer.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"JSONpath, JMESpath, or JSONpointer query of JSON / NDJSON documents; use j_query() instead — jsonpath","text":"jsonpath(), jmespath() jsonpointer() return character(1) JSON string (= \"string\", default) R object (= \"R\") representing result query.","code":""},{"path":"https://mtmorgan.github.io/rjsoncons/reference/zzz_paths_and_pointer.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"JSONpath, JMESpath, or JSONpointer query of JSON / NDJSON documents; use j_query() instead — jsonpath","text":"","code":"json <- '{ \"locations\": [ {\"name\": \"Seattle\", \"state\": \"WA\"}, {\"name\": \"New York\", \"state\": \"NY\"}, {\"name\": \"Bellevue\", \"state\": \"WA\"}, {\"name\": \"Olympia\", \"state\": \"WA\"} ] }' ## return a JSON string jsonpath(json, \"$..name\") |> cat(\"\\n\") #> [\"Seattle\",\"New York\",\"Bellevue\",\"Olympia\"] ## return an R object jsonpath(json, \"$..name\", as = \"R\") #> [1] \"Seattle\" \"New York\" \"Bellevue\" \"Olympia\" ## create a list with state and name as scalar vectors lst <- as_r(json) if (requireNamespace(\"jsonlite\", quietly = TRUE)) { ## objects other than scalar character vectors are automatically ## coerced to JSON; use `auto_unbox = TRUE` to represent R scalar ## vectors in the object as JSON scalar vectors jsonpath(lst, \"$..name\", auto_unbox 
= TRUE) |> cat(\"\\n\") ## use I(\"Seattle\") to coerce to a JSON object [\"Seattle\"] jsonpath(I(\"Seattle\"), \"$[0]\") |> cat(\"\\n\") } #> [\"Seattle\",\"New York\",\"Bellevue\",\"Olympia\"] #> [\"Seattle\"] ## a scalar character vector like \"Seattle\" is not valid JSON... try(jsonpath(\"Seattle\", \"$\")) #> Error : JSON syntax_error at line 1 and column 1 ## ...but a double-quoted string is jsonpath('\"Seattle\"', \"$\") #> [1] \"[\\\"Seattle\\\"]\" ## different ordering of object names -- 'asis' (default) or 'sort' json_obj <- '{\"b\": \"1\", \"a\": \"2\"}' jsonpath(json_obj, \"$\") |> cat(\"\\n\") #> [{\"b\":\"1\",\"a\":\"2\"}] jsonpath(json_obj, \"$.*\") |> cat(\"\\n\") #> [\"1\",\"2\"] jsonpath(json_obj, \"$\", \"sort\") |> cat(\"\\n\") #> [{\"a\":\"2\",\"b\":\"1\"}] jsonpath(json_obj, \"$.*\", \"sort\") |> cat(\"\\n\") #> [\"2\",\"1\"] path <- \"locations[?state == 'WA'].name | sort(@)\" jmespath(json, path) |> cat(\"\\n\") #> [\"Bellevue\",\"Olympia\",\"Seattle\"] if (requireNamespace(\"jsonlite\", quietly = TRUE)) { ## original filter always fails, e.g., '[\"WA\"] != 'WA' jmespath(lst, path) # empty result set, '[]' ## filter with unboxed state, and return unboxed name jmespath(lst, \"locations[?state[0] == 'WA'].name[0] | sort(@)\") |> cat(\"\\n\") ## automatically unbox scalar values when creating the JSON string jmespath(lst, path, auto_unbox = TRUE) |> cat(\"\\n\") } #> [\"Bellevue\",\"Olympia\",\"Seattle\"] #> [\"Bellevue\",\"Olympia\",\"Seattle\"] ## jsonpointer 0-based arrays jsonpointer(json, \"/locations/0/name\") #> [1] \"Seattle\" ## document root \"\", sort selected element keys jsonpointer('{\"b\": 0, \"a\": 1}', \"\", \"sort\", as = \"R\") |> str() #> List of 2 #> $ a: int 1 #> $ b: int 0 ## 'Key not found' -- path '/' searches for a 0-length key try(jsonpointer('{\"b\": 0, \"a\": 1}', \"/\")) #> Error : Key not 
found"},{"path":"https://mtmorgan.github.io/rjsoncons/news/index.html","id":"rjsoncons-131","dir":"Changelog","previous_headings":"","what":"rjsoncons 1.3.1","title":"rjsoncons 1.3.1","text":"CRAN release: 2024-07-07 (1.3.0.9200) bug fix: NDJSON j_pivot('{\"\": [1,2]}') now pivots '{\"\": [[1,2]]}'. NDJSON j_pivot() records must always objects. (1.3.0.9100) add JSONPath examples article. (1.3.0.9100) make robust ‘noSuggests’ CRAN checks.","code":""},{"path":"https://mtmorgan.github.io/rjsoncons/news/index.html","id":"rjsoncons-130","dir":"Changelog","previous_headings":"","what":"rjsoncons 1.3.0","title":"rjsoncons 1.3.0","text":"CRAN release: 2024-04-18 (1.2.0.9900) better support j_pivot() keys differ objects. (1.2.0.9800) add JSON schema support j_schema_is_valid(), j_schema_validate(). (1.2.0.9704) add key value search j_flatten(), j_find_*() supporting JSONpointer JSONpath. (1.2.0.9602) compile Ubuntu 18.04 https://github.com/mtmorgan/rjsoncons/issues/3 (1.2.0.9503) add JSON patch support j_patch_apply(), j_patch_from(), j_patch_op(). (1.2.0.9401) internal C++ code cleanup refactoring. (1.2.0.9300) add ‘Examples’ web-vignette. (1.2.0.9201) restore progress bar NDJSON parsing. (1.2.0.9100) as_r() supports file url connections; improved connection implementation using C++ stream buffer. (1.2.0.9000) bug fix: support JSON j_pivot() file / url connections.","code":""},{"path":"https://mtmorgan.github.io/rjsoncons/news/index.html","id":"rjsoncons-120","dir":"Changelog","previous_headings":"","what":"rjsoncons 1.2.0","title":"rjsoncons 1.2.0","text":"CRAN release: 2024-01-26 (1.2.0) CRAN release. (1.1.0.9500) update documentation, include NDJSON-specific, web-vignette. (1.1.0.9400) support NDJSON file / url connections. (1.1.0.9300) implement j_query() (query without requiring path. specification), j_pivot(), j_path_type(). Remove jsonpivot(). (1.1.0.9200) implement jsonpointer() querying JSON documents. 
(1.1.0.9100) update jsoncons library 173.2, relaxing compiler requirements c++11. (1.1.0.9000) implement jsonpivot() transform JSON. array--objects object--arrays, common step representation data.frame.","code":""},{"path":"https://mtmorgan.github.io/rjsoncons/news/index.html","id":"rjsoncons-110","dir":"Changelog","previous_headings":"","what":"rjsoncons 1.1.0","title":"rjsoncons 1.1.0","text":"CRAN release: 2023-12-11 (1.1.0) CRAN release. (1.0.1.9100) using jsonlite (e.g., ‘toJSON()’ parsing R objects). requires separate installation jsonlite. (1.0.1.9000) update jsoncons library 0.172.1; addresses segfault ‘fedora’ CRAN builder.","code":""},{"path":"https://mtmorgan.github.io/rjsoncons/news/index.html","id":"rjsoncons-101","dir":"Changelog","previous_headings":"","what":"rjsoncons 1.0.1","title":"rjsoncons 1.0.1","text":"CRAN release: 2023-12-03 (1.0.1) CRAN release. (1.0.0.9200) use pkgdown. (1.0.0.9100) parse JSON R = \"R\" argument as_r().","code":""},{"path":"https://mtmorgan.github.io/rjsoncons/news/index.html","id":"rjsoncons-100","dir":"Changelog","previous_headings":"","what":"rjsoncons 1.0.0","title":"rjsoncons 1.0.0","text":"CRAN release: 2022-09-29 (1.0.0) initial CRAN release. (0.0.99) pre-release version. (0.0.3) support object names ordering ‘asis’ ‘sort’. (0.0.3) DESCRIPTION file updates: correct ‘Title:’ capitalization; avoid warnings misspellings. (0.0.3) Add GitHub action rebuild README.md vignettes/rjsoncons.Rmd. (0.0.2) jsoncons library update. (0.0.2) support R object query addition JSON string. (0.0.2) add unit tests. (0.0.2) R minor C++ code refactoring. (0.0.1) initial C++ / R implementation jmespath() / jsonpath().","code":""}]
+[{"path":"https://mtmorgan.github.io/rjsoncons/articles/a_rjsoncons.html","id":"introduction-installation","dir":"Articles","previous_headings":"","what":"Introduction & installation","title":"Transform and Validate JSON and NDJSON","text":"Use rjsoncons querying, transforming, searching JSON, NDJSON, R objects using JMESpath, JSONpath, JSONpointer. rjsoncons supports JSON patch document editing, JSON schema validation. Link package direct access additional features jsoncons C++ library. Install released package version CRAN Install development version Attach installed package R session, check version C++ library use","code":"install.packages(\"rjsoncons\", repos = \"https://CRAN.R-project.org\") if (!requireNamespace(\"remotes\", quietly = TRUE)) install.packages(\"remotes\", repos = \"https://CRAN.R-project.org\") remotes::install_github(\"mtmorgan/rjsoncons\") library(rjsoncons) rjsoncons::version() ## [1] \"0.173.4 [+57967655d]\""},{"path":"https://mtmorgan.github.io/rjsoncons/articles/a_rjsoncons.html","id":"query-and-pivot","dir":"Articles","previous_headings":"","what":"Query and pivot","title":"Transform and Validate JSON and NDJSON","text":"Functions package work JSON NDJSON character vectors, file paths URLs JSON NDJSON documents, R objects can transformed JSON string.","code":""},{"path":"https://mtmorgan.github.io/rjsoncons/articles/a_rjsoncons.html","id":"select-filter-and-transform-with-j_query","dir":"Articles","previous_headings":"Query and pivot","what":"Select, filter and transform with j_query()","title":"Transform and Validate JSON and NDJSON","text":"simple JSON example document several common use cases. Use rjsoncons query JSON string using JSONpath, JMESpath JSONpointer syntax filter larger documents records interest, e.g., cities New York state, using ‘JMESpath’ syntax. Use = \"R\" argument extract deeply nested elements R objects, e.g., character vector city names Washington state. 
JSON Pointer specification simpler, indexing single object document. JSON arrays 0-based. examples use j_query(), automatically infers query specification form path using j_path_type(). may useful indicate query specification explicitly using jsonpointer(), jsonpath(), jmespath(); examples illustrating features available query specification help pages ?jsonpointer, ?jsonpath, ?jmespath.","code":"json <- '{ \"locations\": [ {\"name\": \"Seattle\", \"state\": \"WA\"}, {\"name\": \"New York\", \"state\": \"NY\"}, {\"name\": \"Bellevue\", \"state\": \"WA\"}, {\"name\": \"Olympia\", \"state\": \"WA\"} ] }' j_query(json, \"locations[?state == 'NY']\") |> cat(\"\\n\") ## [{\"name\":\"New York\",\"state\":\"NY\"}] j_query(json, \"locations[?state == 'WA'].name\", as = \"R\") ## [1] \"Seattle\" \"Bellevue\" \"Olympia\" j_query(json, \"/locations/0/state\") ## [1] \"WA\""},{"path":"https://mtmorgan.github.io/rjsoncons/articles/a_rjsoncons.html","id":"array-of-objects-to-r-data-frame-with-j_pivot","dir":"Articles","previous_headings":"Query and pivot","what":"Array-of-objects to R data.frame with j_pivot()","title":"Transform and Validate JSON and NDJSON","text":"following transforms nested JSON document format can incorporated directly R data.frame. 
transformation JSON ‘array--objects’ ‘object--arrays’ suitable direct representation data.frame common, implemented directly j_pivot() j_pivot() also support = \"tibble\" dplyr package installed.","code":"path <- '{ name: locations[].name, state: locations[].state }' j_query(json, path, as = \"R\") |> data.frame() ## name state ## 1 Seattle WA ## 2 New York NY ## 3 Bellevue WA ## 4 Olympia WA j_pivot(json, \"locations\", as = \"data.frame\") ## name state ## 1 Seattle WA ## 2 New York NY ## 3 Bellevue WA ## 4 Olympia WA"},{"path":"https://mtmorgan.github.io/rjsoncons/articles/a_rjsoncons.html","id":"ndjson-support","dir":"Articles","previous_headings":"Query and pivot","what":"NDJSON support","title":"Transform and Validate JSON and NDJSON","text":"rjsoncons supports NDJSON (new-line delimited JSON). NDJSON consists file character vector line / element represents JSON record. example uses data GitHub Archive project recording actions public GitHub repositories. data included package first 10 lines https://data.gharchive.org/2023-02-08-0.json.gz. NDJSON can read R (ndjson <- readLines(ndjson_file)) used j_query() / j_pivot(), often better leave full NDJSON files disk. Thus first argument j_query() j_pivot() usually (text gz-compressed) file path URL. Two additional options available working NDJSON. n_records limits number records processed. Using n_records can useful exploring data. instance, first record file can viewed interactively option verbose = TRUE adds progress indicator, provides confidence progress made parsing large files. progress bar requires cli package. j_query() provides one--one mapping NDJSON lines / elements return value, e.g., j_query(ndjson_file, \"@\", = \"string\") NDJSON file 1000 lines return character vector 1000 elements, j_query(ndjson, \"@\", = \"R\") R list length 1000. j_pivot() transforms NDJSON file character vector objects format convenient input R. 
j_pivot() NDJSON files JMESpath paths work particularly well together, JMESpath provides flexibility creating JSON objects pivoted. Filtering NDJSON files can require relatively complicated paths, e.g., filter ‘PushEvent’ types organizations, construct query acts NDJSON record return array single object, apply filter replace uninteresting elements 0-length arrays (using = \"tibble\" often transforms R list--vectors tibble pleasing robust manner compared = \"data.frame\"). complete example used NDJSON extended vignette","code":"ndjson_file <- system.file(package = \"rjsoncons\", \"extdata\", \"2023-02-08-0.json\") j_query(ndjson_file, n_records = 1) |> listviewer::jsonedit() j_query(ndjson_file, \"{id: id, type: type}\", n_records = 5) ## [1] \"{\\\"id\\\":\\\"26939254345\\\",\\\"type\\\":\\\"DeleteEvent\\\"}\" ## [2] \"{\\\"id\\\":\\\"26939254358\\\",\\\"type\\\":\\\"PushEvent\\\"}\" ## [3] \"{\\\"id\\\":\\\"26939254361\\\",\\\"type\\\":\\\"CreateEvent\\\"}\" ## [4] \"{\\\"id\\\":\\\"26939254365\\\",\\\"type\\\":\\\"CreateEvent\\\"}\" ## [5] \"{\\\"id\\\":\\\"26939254366\\\",\\\"type\\\":\\\"PushEvent\\\"}\" j_pivot(ndjson_file, \"{id: id, type: type}\", as = \"data.frame\") ## id type ## 1 26939254345 DeleteEvent ## 2 26939254358 PushEvent ## 3 26939254361 CreateEvent ## 4 26939254365 CreateEvent ## 5 26939254366 PushEvent ## 6 26939254367 PushEvent ## 7 26939254379 PushEvent ## 8 26939254380 IssuesEvent ## 9 26939254382 PushEvent ## 10 26939254383 PushEvent path <- \"[{id: id, type: type, org: org}] [?@.type == 'PushEvent' && @.org != null] | [0]\" j_pivot(ndjson_file, path, as = \"data.frame\") ## id type org.id org.login org.gravatar_id ## 1 26939254358 PushEvent 123667276 johnbieren-testing ## 2 26939254382 PushEvent 123667276 johnbieren-testing ## org.url ## 1 https://api.github.com/orgs/johnbieren-testing ## 2 https://api.github.com/orgs/johnbieren-testing ## org.avatar_url org.id.1 org.login.1 ## 1 https://avatars.githubusercontent.com/u/123667276? 
120284018 mornystannit ## 2 https://avatars.githubusercontent.com/u/123667276? 120284018 mornystannit ## org.gravatar_id.1 org.url.1 ## 1 https://api.github.com/orgs/mornystannit ## 2 https://api.github.com/orgs/mornystannit ## org.avatar_url.1 ## 1 https://avatars.githubusercontent.com/u/120284018? ## 2 https://avatars.githubusercontent.com/u/120284018?"},{"path":"https://mtmorgan.github.io/rjsoncons/articles/a_rjsoncons.html","id":"r-objects-as-input","dir":"Articles","previous_headings":"Query and pivot","what":"R objects as input","title":"Transform and Validate JSON and NDJSON","text":"rjsoncons can filter transform R objects. converted JSON using jsonlite::toJSON() queries made; toJSON() arguments like auto_unbox = TRUE can added function call.","code":"## `lst` is an *R* list lst <- jsonlite::fromJSON(json, simplifyVector = FALSE) j_query(lst, \"locations[?state == 'WA'].name | sort(@)\", auto_unbox = TRUE) |> cat(\"\\n\") ## [\"Bellevue\",\"Olympia\",\"Seattle\"]"},{"path":"https://mtmorgan.github.io/rjsoncons/articles/a_rjsoncons.html","id":"patch","dir":"Articles","previous_headings":"","what":"Patch","title":"Transform and Validate JSON and NDJSON","text":"JSON Patch provides simple way edit transform JSON document using JSON commands.","code":""},{"path":"https://mtmorgan.github.io/rjsoncons/articles/a_rjsoncons.html","id":"applying-a-patch-with-j_patch_apply","dir":"Articles","previous_headings":"Patch","what":"Applying a patch with j_patch_apply()","title":"Transform and Validate JSON and NDJSON","text":"Starting JSON document one can \"add\" another biscuit, copy favorite biscuit new locations using following patch paths specified using JSONpointer notation; remember JSON arrays 0-based, compared 1-based R arrays. Applying patch results new JSON document. Patches can also created R objects helper function j_patch_op(). 
j_patch_op() takes care unboxing op=, path=, =, care must taken ‘unboxing’ value= argument operations ‘add’; may also appropriate unbox specific fields, e.g., JSON patch web site, available operations example JSON : add – add elements existing document. remove – remove elements document. replace – replace one element another copy – copy path another location. move – move path another location. test – test existence path; path exist, apply patch. Formal description operations provided Section 4 RFC6902. patch command always array, even single operation involved.","code":"json <- '{ \"biscuits\": [ { \"name\": \"Digestive\" }, { \"name\": \"Choco Leibniz\" } ] }' patch <- '[ {\"op\": \"add\", \"path\": \"/biscuits/1\", \"value\": { \"name\": \"Ginger Nut\" }}, {\"op\": \"copy\", \"from\": \"/biscuits/2\", \"path\": \"/best_biscuit\"} ]' j_patch_apply(json, patch) ## [1] \"{\\\"biscuits\\\":[{\\\"name\\\":\\\"Digestive\\\"},{\\\"name\\\":\\\"Ginger Nut\\\"},{\\\"name\\\":\\\"Choco Leibniz\\\"}],\\\"best_biscuit\\\":{\\\"name\\\":\\\"Choco Leibniz\\\"}}\" ops <- c( j_patch_op( \"add\", \"/biscuits/1\", value = list(name = \"Ginger Nut\"), auto_unbox = TRUE ), j_patch_op(\"copy\", \"/best_biscuit\", from = \"/biscuits/2\") ) identical(j_patch_apply(json, patch), j_patch_apply(json, ops)) ## [1] TRUE value <- list(name = jsonlite::unbox(\"Ginger Nut\")) j_patch_op(\"add\", \"/biscuits/1\", value = value) ## [ ## {\"op\": \"add\", \"path\": \"/biscuits/1\", \"value\": {\"name\": \"Ginger Nut\"}} ## ] {\"op\": \"add\", \"path\": \"/biscuits/1\", \"value\": {\"name\": \"Ginger Nut\"}} {\"op\": \"remove\", \"path\": \"/biscuits/0\"} { \"op\": \"replace\", \"path\": \"/biscuits/0/name\", \"value\": \"Chocolate Digestive\" } {\"op\": \"copy\", \"from\": \"/biscuits/0\", \"path\": \"/best_biscuit\"} {\"op\": \"move\", \"from\": \"/biscuits\", \"path\": \"/cookies\"} {\"op\": \"test\", \"path\": \"/best_biscuit/name\", \"value\": \"Choco 
Leibniz\"}"},{"path":"https://mtmorgan.github.io/rjsoncons/articles/a_rjsoncons.html","id":"difference-between-documents-with-j_patch_from","dir":"Articles","previous_headings":"Patch","what":"Difference between documents with j_patch_from()","title":"Transform and Validate JSON and NDJSON","text":"j_patch_from() function constructs patch difference two documents","code":"j_patch_from(j_patch_apply(json, patch), json) ## [1] \"[{\\\"op\\\":\\\"replace\\\",\\\"path\\\":\\\"/biscuits/1/name\\\",\\\"value\\\":\\\"Choco Leibniz\\\"},{\\\"op\\\":\\\"remove\\\",\\\"path\\\":\\\"/biscuits/2\\\"},{\\\"op\\\":\\\"remove\\\",\\\"path\\\":\\\"/best_biscuit\\\"}]\""},{"path":"https://mtmorgan.github.io/rjsoncons/articles/a_rjsoncons.html","id":"schema-validation","dir":"Articles","previous_headings":"","what":"Schema validation","title":"Transform and Validate JSON and NDJSON","text":"JSON schema provides structure JSON documents. j_schema_is_valid() checks JSON document valid specified schema, j_schema_validate() tries illustrate document deviates schema. example consider j_patch_op(), operation supposed conform JSON patch schema. convenience, copy schema available rjsoncons. well-formed ‘op’ valid, j_schema_validate() produces output Introduce invalid ‘op’, \"op\": \"invalid_op\", schema longer valid. reason can understood (careful!) consideration output j_schema_validate(), reference schema . validation indicates schema evaluationPath ‘/items/oneOf’ satisfied, error ‘schema [.e., ’oneOf’ elements] matched, …’. 
‘details’ column summarizes 3 elements /items/oneOf fails schema specification; use = \"details\" extract directly indicates first item schema rejected ‘invalid_op’ valid enum Reasons rejecting items can explored using similar steps.","code":"## alternatively: schema <- \"https://json.schemastore.org/json-patch\" schema <- system.file(package = \"rjsoncons\", \"extdata\", \"json-patch.json\") cat(readLines(schema), sep = \"\\n\") ## { ## \"$schema\": \"http://json-schema.org/draft-04/schema#\", ## \"definitions\": { ## \"path\": { ## \"description\": \"A JSON Pointer path.\", ## \"type\": \"string\" ## } ## }, ## \"id\": \"https://json.schemastore.org/json-patch.json\", ## \"items\": { ## \"oneOf\": [ ## { ## \"additionalProperties\": false, ## \"required\": [\"value\", \"op\", \"path\"], ## \"properties\": { ## \"path\": { ## \"$ref\": \"#/definitions/path\" ## }, ## \"op\": { ## \"description\": \"The operation to perform.\", ## \"type\": \"string\", ## \"enum\": [\"add\", \"replace\", \"test\"] ## }, ## \"value\": { ## \"description\": \"The value to add, replace or test.\" ## } ## } ## }, ## { ## \"additionalProperties\": false, ## \"required\": [\"op\", \"path\"], ## \"properties\": { ## \"path\": { ## \"$ref\": \"#/definitions/path\" ## }, ## \"op\": { ## \"description\": \"The operation to perform.\", ## \"type\": \"string\", ## \"enum\": [\"remove\"] ## } ## } ## }, ## { ## \"additionalProperties\": false, ## \"required\": [\"from\", \"op\", \"path\"], ## \"properties\": { ## \"path\": { ## \"$ref\": \"#/definitions/path\" ## }, ## \"op\": { ## \"description\": \"The operation to perform.\", ## \"type\": \"string\", ## \"enum\": [\"move\", \"copy\"] ## }, ## \"from\": { ## \"$ref\": \"#/definitions/path\", ## \"description\": \"A JSON Pointer path pointing to the location to move/copy from.\" ## } ## } ## } ## ] ## }, ## \"title\": \"JSON schema for JSONPatch files\", ## \"type\": \"array\" ## } op <- '[{ \"op\": \"add\", \"path\": \"/biscuits/1\", 
\"value\": { \"name\": \"Ginger Nut\" } }]' j_schema_is_valid(op, schema) ## [1] TRUE j_schema_validate(op, schema) ## [1] \"[]\" op <- '[{ \"op\": \"invalid_op\", \"path\": \"/biscuits/1\", \"value\": { \"name\": \"Ginger Nut\" } }]' j_schema_is_valid(op, schema) ## [1] FALSE j_schema_validate(op, schema, as = \"tibble\") |> tibble::glimpse() ## Rows: 1 ## Columns: 6 ## $ valid FALSE ## $ evaluationPath \"/items/oneOf\" ## $ schemaLocation \"https://json.schemastore.org/json-patch.json#/items/… ## $ instanceLocation \"/0\" ## $ error \"No schema matched, but exactly one of them is requir… ## $ details [[FALSE, \"/items/oneOf/0/properties/op/enum\", \"https:… j_schema_validate(op, schema, as = \"details\") |> tibble::glimpse() ## Rows: 6 ## Columns: 5 ## $ valid FALSE, FALSE, FALSE, FALSE, FALSE, FALSE ## $ evaluationPath \"/items/oneOf/0/properties/op/enum\", \"/items/oneOf/1/… ## $ schemaLocation \"https://json.schemastore.org/json-patch.json#/items/… ## $ instanceLocation \"/0/op\", \"/0/op\", \"/0/value\", \"/0\", \"/0/op\", \"/0/valu… ## $ error \"'invalid_op' is not a valid enum value.\", \"'invalid_… j_query(schema, \"/items/oneOf/0/properties/op/enum\") |> noquote() ## [1] [\"add\",\"replace\",\"test\"]"},{"path":"https://mtmorgan.github.io/rjsoncons/articles/a_rjsoncons.html","id":"flatten-and-find","dir":"Articles","previous_headings":"","what":"Flatten and find","title":"Transform and Validate JSON and NDJSON","text":"can sometimes helpful explore JSON documents ‘flattening’ JSON object path / value pairs, path JSONpointer path corresponding value. straight-forward search flattened object , e.g., path known field value. example, consider object ‘flat’ JSON can represented named list (using str() provide compact visual representation) names list JSONpointer (default) JSONpath, can used j_query() j_pivot() appropriate two ways find known keys values. first use exact matching one keys values, e.g., also possible match using regular expression. 
Keys always character vectors, values can different type; j_find_values() supports searches . common operation might find path know value, query original JSON find object value contained. JSONpointer JSONpath supported; advantage latter path distinguishes integer-valued (unquoted) string-valued (quoted) keys first argument j_find_*() can R object, JSON NDJSON string, file, URL. Using j_find_values() R object JSONpath path_type leads path easily converted R index: double [ ] path increment numerical index 1: NDJSON files flattened character vectors, element flattened version corresponding NDJSON record.","code":"codes <- '{ \"discards\": { \"1000\": \"Record does not exist\", \"1004\": \"Queue limit exceeded\", \"1010\": \"Discarding timed-out partial msg\" }, \"warnings\": { \"0\": \"Phone number missing country code\", \"1\": \"State code missing\", \"2\": \"Zip code missing\" } }' j_flatten(codes, as = \"R\") |> str() ## List of 6 ## $ /discards/1000: chr \"Record does not exist\" ## $ /discards/1004: chr \"Queue limit exceeded\" ## $ /discards/1010: chr \"Discarding timed-out partial msg\" ## $ /warnings/0 : chr \"Phone number missing country code\" ## $ /warnings/1 : chr \"State code missing\" ## $ /warnings/2 : chr \"Zip code missing\" j_query(codes, \"/discards/1010\") ## [1] \"Discarding timed-out partial msg\" j_find_values( codes, c(\"Record does not exist\", \"State code missing\"), as = \"tibble\" ) ## # A tibble: 2 × 2 ## path value ## ## 1 /discards/1000 Record does not exist ## 2 /warnings/1 State code missing j_find_keys(codes, \"warnings\", as = \"tibble\") ## # A tibble: 3 × 2 ## path value ## ## 1 /warnings/0 Phone number missing country code ## 2 /warnings/1 State code missing ## 3 /warnings/2 Zip code missing j_find_values_grep(codes, \"missing\", as = \"tibble\") ## # A tibble: 3 × 2 ## path value ## ## 1 /warnings/0 Phone number missing country code ## 2 /warnings/1 State code missing ## 3 /warnings/2 Zip code missing j_find_keys_grep(codes, 
\"card.*/100\", as = \"tibble\") # span key delimiters ## # A tibble: 2 × 2 ## path value ## ## 1 /discards/1000 Record does not exist ## 2 /discards/1004 Queue limit exceeded j <- '{\"x\":[1,[2, 3]],\"y\":{\"a\":4}}' j_flatten(j, as = \"R\") |> str() ## List of 4 ## $ /x/0 : int 1 ## $ /x/1/0: int 2 ## $ /x/1/1: int 3 ## $ /y/a : int 4 j_find_values(j, c(2, 4), as = \"tibble\") ## # A tibble: 2 × 2 ## path value ## ## 1 /x/1/0 2 ## 2 /y/a 4 j_find_values(j, 3, as = \"tibble\") ## # A tibble: 1 × 2 ## path value ## ## 1 /x/1/1 3 ## path to '3' is '/x/1/1', so containing object is at '/x/1' j_query(j, \"/x/1\") ## [1] \"[2,3]\" j_query(j, \"/x/1\", as = \"R\") ## [1] 2 3 j_find_values(j, 3, as = \"tibble\", path_type = \"JSONpath\") ## # A tibble: 1 × 2 ## path value ## ## 1 $['x'][1][1] 3 l <- j |> as_r() j_find_values(l, 3, auto_unbox = TRUE, path_type = \"JSONpath\", as = \"tibble\") ## # A tibble: 1 × 2 ## path value ## ## 1 $['x'][1][1] 3 l[['x']][[2]] # siblings ## [1] 2 3"},{"path":"https://mtmorgan.github.io/rjsoncons/articles/a_rjsoncons.html","id":"the-json-parser","dir":"Articles","previous_headings":"","what":"The JSON parser","title":"Transform and Validate JSON and NDJSON","text":"package includes JSON parser, used argument = \"R\" directly as_r() main rules transformation outlined . JSON arrays single type (boolean, integer, double, string) transformed R vectors length corresponding type. JSON arrays mixing integer double values transformed R numeric vectors. JSON integer array contains value larger R’s 32-bit integer representation, array transformed R numeric vector. NOTE results loss precision JSON integer values greater 2^53. JSON objects transformed R named lists. several additional details. JSON scalar JSON vector length 1 represented way R. JSON arrays mixing types integer double transformed R lists JSON null values represented R NULL values; arrays null transformed lists Ordering object members controlled object_names= argument. 
default preserves names appear JSON definition; use \"sort\" sort names alphabetically. argument applied recursively. parser corresponds approximately jsonlite::fromJSON() arguments simplifyVector = TRUE, simplifyDataFrame = FALSE, simplifyMatrix = FALSE). Unit tests (using tinytest framework) providing additional details available ","code":"as_r('{\"a\": 1.0, \"b\": [2, 3, 4]}') |> str() #> List of 2 #> $ a: num 1 #> $ b: int [1:3] 2 3 4 as_r('[true, false, true]') # boolean -> logical ## [1] TRUE FALSE TRUE as_r('[1, 2, 3]') # integer -> integer ## [1] 1 2 3 as_r('[1.0, 2.0, 3.0]') # double -> numeric ## [1] 1 2 3 as_r('[\"a\", \"b\", \"c\"]') # string -> character ## [1] \"a\" \"b\" \"c\" as_r('[1, 2.0]') |> class() # numeric ## [1] \"numeric\" as_r('[1, 2147483648]') |> class() # 64-bit integers -> numeric ## [1] \"numeric\" as_r('{}') ## named list() as_r('{\"a\": 1.0, \"b\": [2, 3, 4]}') |> str() ## List of 2 ## $ a: num 1 ## $ b: int [1:3] 2 3 4 identical(as_r(\"3.14\"), as_r(\"[3.14]\")) ## [1] TRUE as_r('[true, 1, \"a\"]') |> str() ## List of 3 ## $ : logi TRUE ## $ : int 1 ## $ : chr \"a\" as_r('null') # NULL ## NULL as_r('[null]') |> str() # list(NULL) ## List of 1 ## $ : NULL as_r('[null, null]') |> str() # list(NULL, NULL) ## List of 2 ## $ : NULL ## $ : NULL json <- '{\"b\": 1, \"a\": {\"d\": 2, \"c\": 3}}' as_r(json) |> str() ## List of 2 ## $ b: int 1 ## $ a:List of 2 ## ..$ d: int 2 ## ..$ c: int 3 as_r(json, object_names = \"sort\") |> str() ## List of 2 ## $ a:List of 2 ## ..$ c: int 3 ## ..$ d: int 2 ## $ b: int 1 system.file(package = \"rjsoncons\", \"tinytest\", \"test_as_r.R\")"},{"path":"https://mtmorgan.github.io/rjsoncons/articles/a_rjsoncons.html","id":"using-jsonlitefromjson","dir":"Articles","previous_headings":"The JSON parser","what":"Using jsonlite::fromJSON()","title":"Transform and Validate JSON and NDJSON","text":"built-parser can replaced alternative parsers returning query JSON string, e.g., using fromJSON() jsonlite package. 
rjsoncons package particularly useful accessing elements might otherwise require complicated application nested lapply(), purrr expressions, tidyr unnest_*() (see R Data Science chapter ‘Hierarchical data’).","code":"json <- '{ \"locations\": [ {\"name\": \"Seattle\", \"state\": \"WA\"}, {\"name\": \"New York\", \"state\": \"NY\"}, {\"name\": \"Bellevue\", \"state\": \"WA\"}, {\"name\": \"Olympia\", \"state\": \"WA\"} ] }' j_query(json, \"locations[?state == 'WA']\") |> ## `fromJSON()` simplifies list-of-objects to data.frame jsonlite::fromJSON() ## name state ## 1 Seattle WA ## 2 Bellevue WA ## 3 Olympia WA"},{"path":"https://mtmorgan.github.io/rjsoncons/articles/a_rjsoncons.html","id":"c-library-use-in-other-packages","dir":"Articles","previous_headings":"","what":"C++ library use in other packages","title":"Transform and Validate JSON and NDJSON","text":"package includes complete ‘jsoncons’ C++ header-library, available R packages adding DESCRIPTION file. Typical use R package also include LinkingTo: specifications cpp11 Rcpp (package uses cpp11) packages provide C / C++ interface R C++ ‘jsoncons’ library.","code":"LinkingTo: rjsoncons SystemRequirements: C++11"},{"path":"https://mtmorgan.github.io/rjsoncons/articles/a_rjsoncons.html","id":"session-information","dir":"Articles","previous_headings":"","what":"Session information","title":"Transform and Validate JSON and NDJSON","text":"vignette compiled using following software versions","code":"sessionInfo() ## R version 4.4.1 (2024-06-14) ## Platform: x86_64-pc-linux-gnu ## Running under: Ubuntu 22.04.4 LTS ## ## Matrix products: default ## BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 ## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so; LAPACK version 3.10.0 ## ## locale: ## [1] LC_CTYPE=C.UTF-8 LC_NUMERIC=C LC_TIME=C.UTF-8 ## [4] LC_COLLATE=C.UTF-8 LC_MONETARY=C.UTF-8 LC_MESSAGES=C.UTF-8 ## [7] LC_PAPER=C.UTF-8 LC_NAME=C LC_ADDRESS=C ## [10] LC_TELEPHONE=C 
LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C ## ## time zone: UTC ## tzcode source: system (glibc) ## ## attached base packages: ## [1] stats graphics grDevices utils datasets methods base ## ## other attached packages: ## [1] rjsoncons_1.3.1.9100 BiocStyle_2.32.1 ## ## loaded via a namespace (and not attached): ## [1] vctrs_0.6.5 cli_3.6.3 knitr_1.48 ## [4] rlang_1.1.4 xfun_0.47 textshaping_0.4.0 ## [7] jsonlite_1.8.8 glue_1.7.0 htmltools_0.5.8.1 ## [10] ragg_1.3.2 sass_0.4.9 fansi_1.0.6 ## [13] rmarkdown_2.28 evaluate_0.24.0 jquerylib_0.1.4 ## [16] tibble_3.2.1 fastmap_1.2.0 yaml_2.3.10 ## [19] lifecycle_1.0.4 bookdown_0.40 BiocManager_1.30.25 ## [22] compiler_4.4.1 fs_1.6.4 pkgconfig_2.0.3 ## [25] systemfonts_1.1.0 digest_0.6.37 R6_2.5.1 ## [28] utf8_1.2.4 pillar_1.9.0 magrittr_2.0.3 ## [31] bslib_0.8.0 tools_4.4.1 pkgdown_2.1.0 ## [34] cachem_1.1.0 desc_1.4.3"},{"path":"https://mtmorgan.github.io/rjsoncons/articles/b_ndjson_extended.html","id":"installation-setup","dir":"Articles","previous_headings":"","what":"Installation & setup","title":"Processing NDJSON","text":"article assumes rjsoncons, listviewer (interactively exploring JSON), dplyr (manipulating results tibble) tidyr (unnesting columns tibble) cli (providing progress indicator) installed. Start loading rjsoncons dplyr packages current session. use data GH Archive, project record activity public GitHub repositories. Create location file system-wide ‘cache’ directory rjsoncons package. 
necessary, download single file (1 hour activity, 170,000 events, 100 Mb) GH Archive.","code":"pkgs <- c(\"rjsoncons\", \"dplyr\", \"tidyr\", \"cli\") needed <- pkgs[!pkgs %in% rownames(installed.packages())] install.packages(needed, repos = \"https://CRAN.R-project.org\") library(rjsoncons) library(dplyr) cache <- tools::R_user_dir(\"rjsoncons\", \"cache\") if (!dir.exists(cache)) dir.create(cache, recursive = TRUE) archive_file <- \"https://data.gharchive.org/2023-02-08-0.json.gz\" ndjson_file <- file.path(cache, \"2023-02-08-0.json.gz\") if (!file.exists(ndjson_file)) download.file(archive_file, ndjson_file)"},{"path":"https://mtmorgan.github.io/rjsoncons/articles/b_ndjson_extended.html","id":"data-exploration","dir":"Articles","previous_headings":"","what":"Data exploration","title":"Processing NDJSON","text":"Ensure ndjson_file defined exists get sense data, read visualize first record query uses default path = \"@\", JMESpath expression returns current element. n_records = argument available processing NDJSON, restricts number records input. useful exploring data. record contains information . Records general structure, information can differ, e.g., actions org field. 
work \"id\" \"type\" top-level fields, available using JMESpath elaborate query might combine , nested, elements, e.g., Note records 3-5 organization.","code":"stopifnot( file.exists(ndjson_file) ) j_query(ndjson_file, n_records = 1) |> listviewer::jsonedit() { \"id\": \"26939254345\", \"type\": \"DeleteEvent\", \"actor\": { \"id\": 19908762, \"login\": \"lucianHymer\", \"display_login\": \"lucianHymer\", \"gravatar_id\": \"\", \"url\": \"https://api.github.com/users/lucianHymer\", \"avatar_url\": \"https://avatars.githubusercontent.com/u/19908762?\" }, \"repo\": { \"id\": 469847426, \"name\": \"gitcoinco/passport\", \"url\": \"https://api.github.com/repos/gitcoinco/passport\" }, \"payload\": { \"ref\": \"format-alert-messages\", \"ref_type\": \"branch\", \"pusher_type\": \"user\" }, \"public\": true, \"created_at\": \"2023-02-08T00:00:00Z\", \"org\": { \"id\": 30044474, \"login\": \"gitcoinco\", \"gravatar_id\": \"\", \"url\": \"https://api.github.com/orgs/gitcoinco\", \"avatar_url\": \"https://avatars.githubusercontent.com/u/30044474?\" } } j_query(ndjson_file, '{id: id, type: type}', n_records = 5) ## [1] \"{\\\"id\\\":\\\"26939254345\\\",\\\"type\\\":\\\"DeleteEvent\\\"}\" ## [2] \"{\\\"id\\\":\\\"26939254358\\\",\\\"type\\\":\\\"PushEvent\\\"}\" ## [3] \"{\\\"id\\\":\\\"26939254361\\\",\\\"type\\\":\\\"CreateEvent\\\"}\" ## [4] \"{\\\"id\\\":\\\"26939254365\\\",\\\"type\\\":\\\"CreateEvent\\\"}\" ## [5] \"{\\\"id\\\":\\\"26939254366\\\",\\\"type\\\":\\\"PushEvent\\\"}\" j_query(ndjson_file, '{id: id, type: type, \"org.id\": org.id}', n_records = 5) ## [1] \"{\\\"id\\\":\\\"26939254345\\\",\\\"type\\\":\\\"DeleteEvent\\\",\\\"org.id\\\":30044474}\" ## [2] \"{\\\"id\\\":\\\"26939254358\\\",\\\"type\\\":\\\"PushEvent\\\",\\\"org.id\\\":123667276}\" ## [3] \"{\\\"id\\\":\\\"26939254361\\\",\\\"type\\\":\\\"CreateEvent\\\",\\\"org.id\\\":null}\" ## [4] \"{\\\"id\\\":\\\"26939254365\\\",\\\"type\\\":\\\"CreateEvent\\\",\\\"org.id\\\":null}\" ## [5] 
\"{\\\"id\\\":\\\"26939254366\\\",\\\"type\\\":\\\"PushEvent\\\",\\\"org.id\\\":null}\""},{"path":"https://mtmorgan.github.io/rjsoncons/articles/b_ndjson_extended.html","id":"use-jmespath-for-queries","dir":"Articles","previous_headings":"Data exploration","what":"Use JMESpath for queries","title":"Processing NDJSON","text":"JMESpath seems appropriate working NDJSON files. Here’s JMESpath query extracting just org information; query processes five records returns five results; records 3-5 lack key, returning \"null\". JSONpointer path used, error key not exist, third record processed. Also, JSONpointer not allow one create new objects components data, one cannot assemble id type keys original object new object. JSONpath allows missing keys, not straight-forward assemble new objects, e.g., placing top-level \"id\" \"type\" keys single object.","code":"j_query(ndjson_file, 'org', n_records = 5) ## [1] \"{\\\"id\\\":30044474,\\\"login\\\":\\\"gitcoinco\\\",\\\"gravatar_id\\\":\\\"\\\",\\\"url\\\":\\\"https://api.github.com/orgs/gitcoinco\\\",\\\"avatar_url\\\":\\\"https://avatars.githubusercontent.com/u/30044474?\\\"}\" ## [2] \"{\\\"id\\\":123667276,\\\"login\\\":\\\"johnbieren-testing\\\",\\\"gravatar_id\\\":\\\"\\\",\\\"url\\\":\\\"https://api.github.com/orgs/johnbieren-testing\\\",\\\"avatar_url\\\":\\\"https://avatars.githubusercontent.com/u/123667276?\\\"}\" ## [3] \"null\" ## [4] \"null\" ## [5] \"null\" try( ## fails: 'b' does not exist j_query('{\"a\": 1}', '/b') ) ## Error : Key not found try( ## fails: record 3 does not have 'org' key j_query(ndjson_file, '/org', n_records = 5) ) ## Error : Key not found j_query(ndjson_file, \"$.org\", n_records = 5) ## [1] \"[{\\\"id\\\":30044474,\\\"login\\\":\\\"gitcoinco\\\",\\\"gravatar_id\\\":\\\"\\\",\\\"url\\\":\\\"https://api.github.com/orgs/gitcoinco\\\",\\\"avatar_url\\\":\\\"https://avatars.githubusercontent.com/u/30044474?\\\"}]\" ## [2] 
\"[{\\\"id\\\":123667276,\\\"login\\\":\\\"johnbieren-testing\\\",\\\"gravatar_id\\\":\\\"\\\",\\\"url\\\":\\\"https://api.github.com/orgs/johnbieren-testing\\\",\\\"avatar_url\\\":\\\"https://avatars.githubusercontent.com/u/123667276?\\\"}]\" ## [3] \"[]\" ## [4] \"[]\" ## [5] \"[]\""},{"path":"https://mtmorgan.github.io/rjsoncons/articles/b_ndjson_extended.html","id":"use-tibble-with-j_pivot","dir":"Articles","previous_headings":"Data exploration","what":"Use tibble with j_pivot()","title":"Processing NDJSON","text":"j_pivot() useful extracting tabular data JSON NDJSON representations. Recall j_pivot() transforms JSON array file records objects object arrays can represented data.frame tibble. ‘hood’, j_pivot() simply calling as = \"R\" as.data.frame() result. Unfortunately, as.data.frame() fails keys translated NULL, e.g., org absent. coercion R representation tibble robust missing data. Hierarchical data chapter R Data Science suggests using tidyr::unnest_wider() tidyr::unnest_longer() working nested data. result pivot can flattened. one interested keys nested org element, incorporated directly path. Note keys containing . need quoted \"org.id\": org.id.","code":"path <- '{id: id, type: type}' j_pivot(ndjson_file, path, n_records = 5, as = \"R\") |> str() ## List of 2 ## $ id : chr [1:5] \"26939254345\" \"26939254358\" \"26939254361\" \"26939254365\" ... ## $ type: chr [1:5] \"DeleteEvent\" \"PushEvent\" \"CreateEvent\" \"CreateEvent\" ... 
j_pivot(ndjson_file, path, n_records = 5, as = \"data.frame\") ## id type ## 1 26939254345 DeleteEvent ## 2 26939254358 PushEvent ## 3 26939254361 CreateEvent ## 4 26939254365 CreateEvent ## 5 26939254366 PushEvent path <- '{id: id, type: type, org: org}' try( j_pivot(ndjson_file, path, n_records = 5, as = \"data.frame\") ) ## Error in (function (..., row.names = NULL, check.rows = FALSE, check.names = TRUE, : ## arguments imply differing number of rows: 1, 0 tbl <- j_pivot(ndjson_file, path, n_records = 5, as = \"tibble\") tbl ## # A tibble: 5 × 3 ## id type org ## ## 1 26939254345 DeleteEvent ## 2 26939254358 PushEvent ## 3 26939254361 CreateEvent ## 4 26939254365 CreateEvent ## 5 26939254366 PushEvent tbl |> tidyr::unnest_wider(\"org\", names_sep = \".\") ## # A tibble: 5 × 7 ## id type org.id org.login org.gravatar_id org.url org.avatar_url ## ## 1 26939254345 DeleteEv… 3.00e7 gitcoinco \"\" https:… https://avata… ## 2 26939254358 PushEvent 1.24e8 johnbier… \"\" https:… https://avata… ## 3 26939254361 CreateEv… NA NA NA NA NA ## 4 26939254365 CreateEv… NA NA NA NA NA ## 5 26939254366 PushEvent NA NA NA NA NA path <- '{id: id, type: type, \"org.id\": org.id}' j_pivot(ndjson_file, path, n_records = 5, as = \"tibble\") ## # A tibble: 5 × 3 ## id type org.id ## ## 1 26939254345 DeleteEvent ## 2 26939254358 PushEvent ## 3 26939254361 CreateEvent ## 4 26939254365 CreateEvent ## 5 26939254366 PushEvent "},{"path":"https://mtmorgan.github.io/rjsoncons/articles/b_ndjson_extended.html","id":"filters-with-jmespath","dir":"Articles","previous_headings":"Data exploration","what":"Filters with JMESpath","title":"Processing NDJSON","text":"strategy filtering NDJSON JMESpath create length 1 array containing object interest, filter array. Thus discover PushEvents organizations, form array object containing relevant information [{id: id, type: type, org: org}] filter array using JMESpath’s query syntax [?@.type == 'PushEvent' && org != null]. 
type quotation (single-quote, ') important query, use double quotes define path j_pivot() removes empty records","code":"path <- \"[{id: id, type: type, org: org}] [?@.type == 'PushEvent' && org != null] | [0]\" j_query(ndjson_file, path, n_records = 5) ## [1] \"null\" ## [2] \"{\\\"id\\\":\\\"26939254358\\\",\\\"type\\\":\\\"PushEvent\\\",\\\"org\\\":{\\\"id\\\":123667276,\\\"login\\\":\\\"johnbieren-testing\\\",\\\"gravatar_id\\\":\\\"\\\",\\\"url\\\":\\\"https://api.github.com/orgs/johnbieren-testing\\\",\\\"avatar_url\\\":\\\"https://avatars.githubusercontent.com/u/123667276?\\\"}}\" ## [3] \"null\" ## [4] \"null\" ## [5] \"null\" path <- \"[{id: id, type: type, org: org}] [?@.type == 'PushEvent' && org != null] | [0]\" j_pivot(ndjson_file, path, n_records = 5, as = \"tibble\") ## # A tibble: 1 × 3 ## id type org ## ## 1 26939254358 PushEvent "},{"path":"https://mtmorgan.github.io/rjsoncons/articles/b_ndjson_extended.html","id":"performance","dir":"Articles","previous_headings":"","what":"Performance","title":"Processing NDJSON","text":"rjsoncons relatively performant processing large files. Use verbose = TRUE get progress indicators. system, takes approximately 13s. Memory use extensive, R level file processed chunks final result represented R data structures. performance rjsoncons comparable purpose-built jq command-line tool. jq takes 9s run command line. additional 3s required input command-line output R. jq provides greater flexibility JMESpath, widely used. CRAN package jqr provides R interface jq library. Linux macOS users required jq library installed. straight-forward use library takes 22 seconds; additional steps required translate result R data.frame. use case outlined compares favorably performance ndjson CRAN package, took 600s complete task . ndjson reads entire data set R, whereas rjsoncons represents final object columns id type R. DuckDB offers CRAN package supports SQL interface JSON, performant. 
following code takes just 3.7s deliver data.frame R. DuckDB SQL interface allows flexible selection, filtering, data summary. also treats collection JSON files single ‘database’, scales favorably automatically number files processed. DuckDB require additional software, duckdb CRAN package. blog post provides additional details comparison solutions, including discussion design decisions rjsoncons adopted achieve reasonable performance.","code":"system.time({ tbl <- j_pivot( ndjson_file, '{id: id, type: type}', as = \"tibble\", verbose = TRUE ) }) ## processing 33464 records ## processing 68962 records ## processing 107092 records ## processing 144248 records ## user system elapsed ## 13.656 0.110 13.766 tbl ## # A tibble: 172,049 × 2 ## id type ## ## 1 26939254345 DeleteEvent ## 2 26939254358 PushEvent ## 3 26939254361 CreateEvent ## 4 26939254365 CreateEvent ## 5 26939254366 PushEvent ## 6 26939254367 PushEvent ## 7 26939254379 PushEvent ## 8 26939254380 IssuesEvent ## 9 26939254382 PushEvent ## 10 26939254383 PushEvent ## # ℹ 172,039 more rows tbl |> count(type, sort = TRUE) ## # A tibble: 15 × 2 ## type n ##