Feature Proposal: JSON Projected Attributes on Secondary Indexes and MGET_IF #34
martinsumner
started this conversation in
Ideas
Background
The Query API in Riak 3.4 replaces the secondary index feature for handling range queries with overloaded terms. In the original Riak 2.0 2i solution, attributes projected onto overloaded terms had to be filtered using regular expressions; in the Query API they can be converted into a map of attributes using an eval expression, before being filtered with a filter expression.
The design of this feature was focused on being backwards compatible with existing index entries, but this is not the most obvious or easy way of working with projected attributes on queries.
Further, as part of 4.0, the intention is to retire the Map/Reduce feature. Although there were issues with the feature, it is hoped to retain some of the power of the solution, but with a significantly reduced maintenance overhead. The primary use for Map/Reduce was as a mechanism for supporting multiple conditional gets (i.e. MGET_IF). The expectation is to have a simple way of supplying a "stream" of multiple objects that have passed some filter: either a filter on an index, or a filter on the object metadata.
Proposal
There are three proposed extensions to the Query API:
- A _j64 index (i.e. rather than _bin). These indexes should be base64-encoded JSON with two elements in the JSON document, a $sort_key and an $attr_map, where the $sort_key is a binary string and the $attr_map is a map of Key/Value pairings. This will then be stored as an index key as a {binary(), maps:map()} tuple. Querying this will not require an eval expression to create an initial map of projected attributes, as the map will simply be deserialised. A filter expression can be applied on that map of metadata to filter the results during the fold.
- An accumulation_option of queue, where the output of a query (of object keys) is pushed to a queue (e.g. just as with range_repl and other aae_folds), so that a reference to the queue is returned to the client (not a list of results), and the client may then use the fetch API to fetch the resultant objects one by one.
- A merge_option will be added to the fetch API, so that the fetched object will be the result of the merge_strategy callback handle_get_response_body/2.
There are some potential further extensions:
- In a _j64 index, a key in the $attr_map can use a $b64. prefix, and be converted by the API into a binary for internal storage.
- $str. and $b64. filter expressions so that bespoke functions can be defined, and then used in filter expressions.
Design
Most of the scaffolding for the solution already exists within Riak, so the change in code footprint from adding these features should be small.
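As a rough illustration of the _j64 index format described in the Proposal, the sketch below encodes a $sort_key and $attr_map as base64-encoded JSON, deserialises the term into a (binary, map) pair, and applies a filter to the map during a range fold. It is written in Python purely for illustration and is not Riak's implementation; the function names (encode_j64, decode_j64, index_entry, fold_range) and the in-memory list of entries are assumptions made for this example.

```python
import base64
import json
from typing import Callable, Dict, List, Tuple


def encode_j64(sort_key: str, attr_map: Dict[str, object]) -> bytes:
    """Client side: build the base64-encoded JSON index term."""
    doc = {"$sort_key": sort_key, "$attr_map": attr_map}
    return base64.b64encode(json.dumps(doc).encode("utf-8"))


def decode_j64(term: bytes) -> Tuple[bytes, Dict[str, object]]:
    """Deserialise a term into a (sort key, attribute map) pair,
    loosely mirroring the {binary(), maps:map()} tuple in the proposal."""
    doc = json.loads(base64.b64decode(term))
    return doc["$sort_key"].encode("utf-8"), doc["$attr_map"]


def index_entry(term: bytes, object_key: str):
    """Hypothetical stored entry: the map is decoded once, at write time."""
    sort_key, attrs = decode_j64(term)
    return (sort_key, attrs, object_key)


def fold_range(entries: List[tuple], start: bytes, end: bytes,
               keep: Callable[[Dict[str, object]], bool]) -> List[str]:
    """Range query on the sort key, filtering on the attribute map
    directly; no eval step is needed to build the map first."""
    return [key for (sort_key, attrs, key) in entries
            if start <= sort_key <= end and keep(attrs)]
```

For example, folding over the range b"a" to b"c" with a predicate such as `lambda m: m["n"] > 2` would return only the object keys whose sort key is in range and whose projected attribute n exceeds 2.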
Using a fetch queue for the processing and receipt of a stream of objects should be simpler, from both an API and a Riak client perspective, than trying to stream objects.
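To show why a queue reference may be simpler for a client than a streamed response, here is a minimal in-memory sketch of the flow: a query pushes matching object keys onto a queue and returns a reference, and the client then fetches objects one at a time until the queue is empty. The FetchQueues class, its method names, and the merge callable (a stand-in for the proposed merge_option) are all hypothetical and do not correspond to an existing Riak client API.

```python
from collections import deque


class FetchQueues:
    """In-memory stand-in for server-side fetch queues (hypothetical)."""

    def __init__(self, objects):
        # objects maps an object key to a list of sibling values
        self._objects = objects
        self._queues = {}
        self._next_id = 0

    def query_to_queue(self, matching_keys):
        """Push query results (object keys) to a new queue and return
        a reference to the queue rather than the results themselves."""
        self._next_id += 1
        ref = "q%d" % self._next_id
        self._queues[ref] = deque(matching_keys)
        return ref

    def fetch(self, ref, merge=None):
        """Fetch the next object from the queue, or None if it is empty.
        If merge is given, it is applied to the siblings before returning."""
        queue = self._queues[ref]
        if not queue:
            return None
        siblings = self._objects[queue.popleft()]
        return merge(siblings) if merge else siblings


def drain(store, ref, merge=None):
    """Client loop: fetch objects one by one until the queue is empty."""
    results = []
    while True:
        obj = store.fetch(ref, merge)
        if obj is None:
            return results
        results.append(obj)
```

The client only ever issues simple request/response fetches, so no streaming protocol support is needed on the client side in this sketch.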
This should provide functionality similar to what is available via riak_pipe/map-reduce, but without some of the challenging side-effects:
There will be no capability to fail over work in the event of a node going down (i.e. a riak_pipe_vnode can do handoff); however, the usefulness of this feature is limited relative to the cost of implementing and maintaining tests for it.
Alternative Design Ideas
Testing
Caveats
Depends in part on the merge strategy behaviour bucket property.
Pull Request
Planned Release for Inclusion
Riak 4.0