- provide user option `jsonlite.pagesize` for `jsonlite::stream_{in,out}()`
- increase default value of `pagesize` for `jsonlite::stream_{in,out}()`
- Fix `docdb_query` for `src_sqlite` when using `$in` with strings in query
- Use `jsonb_tree` as available since `RSQLite` version 2.4.4
- Update database version requirements, cleanup version-dependent code
- Avoid time-costly `json_tree` for `src_duckdb` for `listfields = TRUE`
- Handle `docdb_update` for `ndjson` files with duplicate _id's or target rows
- Improve mangling DuckDB version number
- Correct `listfields = TRUE` for `src_duckdb`
- Prepare `docdb_query()` for `duckdb` 1.3.0 (e.g., use new `json_tree` function)
- Move warning for non-persistent connections to `src_sqlite`, `src_duckdb` calls
- Harmonise printing connections
- Use `duckdb` internal function for writing NDJSON to file
- Removing message about RSQLite handling NDJSON file name as value
- `docdb_query()` modified so that it returns a data frame, in which each column has just one type (atomic or list) across all the rows of the respective column (previously, e.g. a mix of single-item lists simplified to atomic values and of multi-item lists were returned)
- `docdb_create()` and `docdb_update()` for SQLite and PostgreSQL (only if on localhost) now import directly and fast from `ndjson` files, in analogy to DuckDB (needs RSQLite >= 2.3.7.9014)
- Refactored `docdb_update()` for `src_couchdb()`
- Add message from `docdb_create()` if a data frame has column names with a dot(s) since dots in `nodbi` are used for `JSON` dot paths
- Add code to check database backend version requirements
- Adding info if PostgreSQL database is not yet created
- Factored out further code
- uses new features of `duckdb` 1.11.0 for refactoring of `docdb_query()`, accelerating queries
- accelerated creating and updating from file
- partial refactoring of `docdb_query()`, accelerating queries up to 20-fold for SQLite, DuckDB, and accelerating `listfields = TRUE` several times for DuckDB
- address `docdb_query()` not working for cases when dot paths had no counts between fields
- address wrong database size printing
- stop if query is invalid even though JSON is valid
- print information also for MongoDB connection object
- code cleaning, parameters checking
- document that `$regex` in `docdb_query()` is case-sensitive
- re-adding field formatting for `docdb_query(src, key, query, listfields = TRUE, limit = <integer>)`
- minor fixes to `limit` in `docdb_query(src, key, query, listfields = TRUE, limit = <integer>)` and speed up
- added vignette
- added tests internal functions, verbose option
- added caching to GitHub action workflow
- added missing fields validity check for duckdb
- more robust parameter checks in `docdb_query` and `docdb_update`
- ensure `NULL` also for all MongoDB returns
- docTyp'ed src.R
- minify `JSON` with Elasticsearch in `docdb_update`
- moved local variable out of UseMethod in `docdb_query`
- make `docdb_get()` work again for `src_sqlite()` by casting `JSONB` back to `JSON`
- empty parameter `query` now triggers a warning as it should be a valid JSON string; change `query = ""` into `query = "{}"`
- adapted to use new, faster `JSONB` functions in `SQLite` 3.45.0 (`RSQLite` >= 2.3.4.9005)
- refactored parts of `docdb_create()` to speed up handling large data frames and lists
- made Elasticsearch to immediately refresh index after `docdb_create()` and other functions
- `docdb_update()` now reports which records failed to update and then continues
- `docdb_delete()` now returns harmonised success logical value across backends
- `docdb_query()` reimplementation to have the same functionality across all databases (DuckDB, SQLite, PostgreSQL, MongoDB, Elasticsearch, CouchDB); even though the API and unit tests remained, user provisions may break, e.g. to handle return values of databases that previously were incompletely implemented (in particular Elasticsearch and CouchDB). Details:
- `query` can now be complex (nested, various operators) and is digested with a Javascript helper
- `fields` can now be nested fields (e.g., `friends.name`) to directly return values lifted from the nested field
- `listfields` parameter newly implemented to return dot paths for all fields of all or selected documents in collection
- expanded use of `jq` via `jqr` for mangling parameters, selecting documents, filtering fields and lifting nested field values
- if no data are found, returns `NULL` (previously some backends returned an empty data frame)
- `docdb_query(src, key, query = "{}", fields = "{}")` now delegates to `docdb_get(src, key)`
- `_id` is always returned, unless specified with `"_id": 0` in parameter `fields`
- for `src_postgres`, only fewer than 50 fields if any can be specified in `fields`
- for `src_sqlite`, minimise the use of the time-costly `json_tree`
- workaround for path collisions of MongoDB
- some acceleration of `docdb_query()`
- factored out common code
- expanded testing
- updated docs
- escaping newline character within a JSON value, in `docdb_*()` functions
- changed `docdb_update()` to directly use NDJSON from file for duckdb
- cleaned up unnecessary code in `docdb_create()`
- no more using transactions with `src_duckdb()`
- regression error from not specifying top-level jq script
- corrected and improved field selection in `docdb_query()`
- corrected test exceptions for mongodb, updated GitHub Actions, expanded tests
- corrected marginal case in `docdb_query.src_duckdb()`
- corrected minimum R version
- replaced in tests `httpbin` with `webfakes`
- removed explicit UTF-8 encoding reference
- speed up in `docdb_query()`
- switched to v2 GitHub r-lib/action for R CMD check
- replaced a dependency, gained speed
- fix initialisation in `docdb_query()` with `src_duckdb()`
- `docdb_update()` now can do bulk updates when _id's are in `value` (for SQLite, DuckDB, PostgreSQL, MongoDB; not yet for CouchDB and Elastic)
- fix tests for value parameter to be a file or a URL
- `src_duckdb()` handles when json_type returns NULL for non-existing path
- `src_sqlite()` handles when text includes double quotation marks
- added warning if DuckDB's JSON extension is not available; improve instructions; see also issue #45
- minor simplification of `docdb_exists()` for `src_mongo()`, and of `docdb_query()` for SQL databases
- corrected closing connections to SQL database backends upon session restart
- improved provisions for parallel write access and corresponding tests
- capture marginal case of no rows in `docdb_query()`
- adding support for duckdb (R package version 0.6.0 or higher) as database backend
- suppressed warnings when checking if a string points to a file
- replaced `isa()` as not available with R version 3.x
- refactored `docdb_update.src_couchdb()` to use `jqr`
- adapted `docdb_create` to accept `jsonlite`, `jsonify`, `jqr` JSON
- added details to README
- testing (unset LANG, relocate open code, better cleaning up)
- fixed `docdb_query()` to account for change in SQLite 3.38.3 adding quotation of labels (closes issue #44), test added
- made `docdb_query()` work for PostgreSQL when a string used with the `$in` operator has a comma(s), test added
- `docdb_create()` now supports file names and http urls as argument `value` for importing data
- `docdb_create()` (and thus `docdb_update()`) now supports quantifiers (e.g., '[a-z]{2,3}') in regular expressions
- for SQLite, return `FALSE` like other backends when using `docdb_delete()` for a non-existing container (table, in the case of SQLite)
- better handle special characters and encodings under Windows
- full support for PostgreSQL (using jsonb)
- for SQLite add closing file references also on exit
- for SQLite under Windows ensure handling of special characters (avoiding encoding conversions with file operations that stream out / in NDJSON)
- identical API for `docdb_*()` functions so that `query` and `fields` parameters can be used across database backends
- identical return values across database backends
- re-factored recently added functions for RSQLite
- re-factored most functions to provide identical API
- performance (timing and memory use) profiled and optimised as far as possible
- testing now uses the same test file across databases
- currently, no more support for redis (no way was found to query and update specific documents in a container)
- `docdb_list()` added as function to list container in database
- Support for complex queries not yet implemented for Elasticsearch
- Only root fields (no subitems) returned by Elasticsearch and CouchDB
- made remaining `docdb_*()` functions return a logical indicating the success of the function (`docdb_create`, `docdb_delete`), or a data frame (`docdb_get`, `docdb_query`), or the number of documents affected by the function (`docdb_update`)
- change testing approach
- `docdb_get()` to not return '_id' field for `src_{sqlite,mongo}` since already used for row names
- `docdb_query.src_sqlite()` now handles JSON objects, returning nested lists (#40)
- `src_sqlite()` now uses transactions for relevant functions (#39)
- `docdb_update.src_mongo()` now returns the number of upserted or matched documents, irrespective of whether they were updated or not
- `docdb_get()` to not return '_id' field for `src_{sqlite,mongo}` since already used for row names
- change of maintainer agreed
- fix for `src_couchdb()`: we were not setting user and password correctly internally, was causing issues in CouchDB v3 (#35) thanks to @drtagkim for the pull request
- in `docdb_query()` and `docdb_get()`, for sqlite source, use a connection instead of a regular file path to avoid certain errors on Windows (#33) work by @rfhb
- in `docdb_query()` and `docdb_create()` for sqlite source, fix to handle mixed values of different types (#34) work by @rfhb
- some Sys.sleep's added to Elasticsearch eg's to make sure data is available after creation, and before a data request
- new author Ralf Herold, with contribution of new functions for working with SQLite/json1. new functions: `src_sqlite`, `print.src_sqlite`, `docdb_create.src_sqlite`, `docdb_delete.src_sqlite`, `docdb_exists.src_sqlite`, `docdb_get.src_sqlite`, `docdb_query.src_sqlite`, and `docdb_update.src_sqlite`. includes new dataset `contacts` (#25) (#27) (#28) (#29) (#30) (#31)
- `docdb_update` gains method for working with MongoDB, via (#27)
- added `.github` files in the source repository to facilitate contributions
- `src_mongo` changes, improved behavior, via (#27)
- `etcd` (via the `etseed` package) integration has been removed from this package as etcd doesn't really fit the main goal of the pkg. functions now defunct are: `src_etcd`, `docdb_create.src_etcd`, `docdb_delete.src_etcd`, `docdb_exists.src_etcd`, `docdb_get.src_etcd`, and `print.src_etcd` (#26)
- `docdb_get()` gains `limit` parameter to do pagination, for CouchDB, Elasticsearch and MongoDB only (#17) (#23)
- gains function `docdb_query()` to send queries to each backend (#18) (#22)
- gains function `docdb_exists()` to check if a database or equivalent exists (#21) (#22)
- Updated package for new version of `elastic`, which has slightly different setup for connecting to the Elasticsearch instance (#20)
- released to CRAN