Feature Proposal: Data Types #31
Replies: 4 comments 2 replies
-
With respect to the points on (i) modelling complex data objects and (ii) attempts to reduce cognitive load, I'll share my experiment with wrapping riak_dt_map in Elixir's Access behaviour, thus exposing CRDT updates via Access-based functions like `update_in`. (BTW, the code I'm sharing here doesn't interact with Ecto nor FDB, even though the rest of the repo does.)

```elixir
crdt =
  Crdt.Map.new()
  |> Crdt.Map.for_actor(0)
  |> Crdt.Map.Counter.new(["post_1", "comment_1", "reaction_1"])

crdt = update_in(crdt, ["post_1", "comment_1", "reaction_1"], &Crdt.Map.Counter.increment/1)
```

https://github.com/jessestimpson/ecto_foundationdb/blob/crdt/lib/crdt/map.ex

Of course this leverages some syntactical magic specific to Elixir. I don't have any ideas for similar Erlang operations, unfortunately.
-
I’m currently evaluating Riak as the core data store for an IoT platform, and native CRDT support is a decisive factor in that assessment. The points raised in this thread introduce uncertainty about the availability and long‑term support of CRDTs in OpenRiak. Could you clarify the project’s current stance and roadmap for CRDTs—specifically maintenance ownership, compatibility guarantees, and recommended upgrade paths? I’m comfortable adopting a niche solution when its direction aligns with our needs, but clarity on the above will determine whether Riak remains a fit for us.
-
I was not aware that no one is using CRDTs in Riak; I thought I had heard otherwise. Nor was I aware of any trend against the use of CRDTs. In my eyes they were exactly the kind of slightly obscure technology with a learning curve that one picks for competitive advantage over others who prefer a simpler, less innovative life. But maybe I missed something; it would be good to hear the reasons against CRDT usage and what to use instead. This might be slightly off topic, but since the discussion is about what to do about CRDTs I think it's relevant. I guess I will wait for this discussion to end before I decide on Riak for new projects. I hope the discussion doesn't drag on until I am forced to decide, because in its current state I would probably pick something else, which also has no CRDTs. To add something more strategic to the discussion: how would Riak position itself as a pure distributed KV store, without CRDTs, against the competition?
-
On 10/09/25 there was a community discussion on the challenges associated with the existing data-type implementation. Here are the rough notes used in that discussion.

The Riak data-types are essentially a prototype that was rushed into production for the 2.0 deadline. There were a number of caveats to the original implementation, and a long-term plan to replace it (with …). Many of these challenges are addressable, and in some cases the …

Challenges

The scale of a delta operation

The API for Riak data-types supports delta operations, but in the implementation every operation is a full re-write of the object, not an application of the delta. So to update any part of a data-type object, as part of the update cycle the whole object must be read, deserialised, updated, re-serialised and written back in full.
This means that any CRDT write operation increases in complexity (and hence response time) with the size of the object, not the size of the delta. This issue is exacerbated by the fact that the internal size of the object is bigger than the external representation, and it is not possible from the outside to discover how big the object is within Riak. For example, externally a counter is the size of an integer, but internally it is approximately 32 bytes in size for every vnode that has coordinated an update for that counter; and the number of coordinating vnodes will continuously increase over the lifecycle of a long-lived environment. In the current (3.2) version of Riak there is a non-linear increase in the cost of change with scale. For sets this may be addressable (reducing it to a linear increase) with a simple change to optimise the happy path (i.e. when no actual conflicting updates are discovered); but making latency predictable when conflict occurs is more challenging.

Specific technical debt in implementation

There is some specific technical debt in the implementation of data-types: …
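The full-rewrite cost described above can be illustrated with a toy state-based set (a sketch only, not Riak's implementation): the client submits a tiny delta, but the stored object must be decoded, updated and re-encoded in its entirety, so the work done tracks the object size rather than the delta size.

```python
import json

# Toy state-based G-set: the client sends a delta ("add one element"), but
# the server-side update still rewrites the entire serialised object.
def apply_add(stored_blob: bytes, element: str) -> bytes:
    state = set(json.loads(stored_blob))       # read + deserialise the WHOLE object
    state.add(element)                         # apply the (tiny) delta
    return json.dumps(sorted(state)).encode()  # re-serialise + rewrite the WHOLE object

# A large stored object: adding one element still touches all 10,000 entries.
blob = json.dumps([f"item_{i}" for i in range(10_000)]).encode()
new_blob = apply_add(blob, "item_new")

# The cost of the update is proportional to the object, not the delta.
assert len(new_blob) > len(blob)
```

A true delta implementation would transmit and apply only `"item_new"`; the sketch shows why response time grows with object size under the full-rewrite approach.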
Expensive update paths for inexpensive updates

It is reasonable to expect counter updates to be exceptionally fast, as an increment is a relatively simple change. However, even with the memory backend, a counter update takes 350 microseconds when the vnode update itself takes just 50 microseconds. All CRDT updates follow a common path with standard Riak object updates, but it is potentially the case that for the simplest of updates the common path has excessive overheads: e.g. bucket property checks, option checks, …

Object diversity

When using CRDTs there is a need to design for object diversity, i.e. to have activity split across multiple objects. This may be intuitive when all objects within a Riak database use data-types, but may be surprising when data-types are used to perform specific functions in addition to the normal usage of a database, e.g. adding a hit counter to an existing database. The object-diversity need is driven by the existence of a single queue for each vnode, meaning that if workload is not distributed over vnodes, queues will vary in size between vnodes, creating unpredictable latency across all operations. There are also some subtle consequences of non-diversity with anti-entropy operation, which can have a particular impact on inter-cluster reconciliation costs.

Inefficient queryability

There is no way to efficiently query a bucket of CRDTs, e.g. produce a leaderboard of counters, find counters outside of an expected range, or sum all counters in the bucket. All queries require every object in the bucket to be read (once), deserialised and processed into a result stream (mapped), before the result stream is reduced into an actual result. As well as being inefficient (as everything must be read and deserialised for every query), this Map/Reduce mechanism is scheduled to be deprecated in Riak 4.0 as: …
Adding secondary indexes to data-type objects is not supported at present, so there is no ability to use the Query API as with other Riak objects.

Lack of data-type-specific functions

There are some functions that might be considered natural requirements for specific types, which aren't provided: …
Is there a minimal necessary set of functions required to provide a useful counter or set service?

Action-at-a-distance and idempotency

None of the Riak data-type operations (except for gset updates) are idempotent, and this would be unexpected to a developer with experience of using CRDTs. This is a result of the action-at-a-distance approach to embedding CRDTs within Riak. There is therefore no documented advice for the application developer on handling failure, and no obvious clarity on what that advice would be. In many systems there may be obvious Actor IDs to use within an application, but being able to track changes based on those IDs requires additional overheads to be included in the data-type (for example, by adding a set of recent change history). For implementations that require the functionality of CRDTs to track whether changes have been applied, it is often preferable to implement their own CRDTs client-side rather than be constrained by server-side CRDTs.

Data types and schemas

Riak data types do not provide for schema management, and so they are an incomplete answer to the problem of modelling and managing data. There are also a number of potential conflicts when merging schema changes, and schema changes with data changes, where eventual consistency guarantees cannot be made.

API definition

APIs are hard, and the challenge is increased by it not being clear what the high-level target is: to expose data-types directly to those with experience of them, or to hide the existence of conflict resolution behind an API that is familiar to developers with no exposure to data types. In some cases the API appears unnecessarily convoluted, e.g. to increment a counter by 1:

```erlang
riakc_pb_socket:update_type(
    Client,
    {<<"SampleBucketType">>, <<"SampleBucketName">>},
    <<"SampleKey">>,
    riakc_counter:to_op(riakc_counter:increment(1, riakc_counter:new()))
).
```

There are also known bugs in the API.
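The idempotency gap above is usually worked around client-side. One common pattern (sketched below with hypothetical helper names; this is not part of any Riak API) is to tag each logical operation with a unique ID and embed an applied-ID set in the object, so a retried increment is applied at most once. It also shows the overhead the notes mention: the change-history set grows with every operation.

```python
# Client-side idempotency for a non-idempotent counter increment: each
# logical operation carries a unique ID, and the set of applied IDs is
# stored alongside the value so retries become no-ops.
def increment(state: dict, op_id: str, amount: int = 1) -> dict:
    if op_id in state["applied"]:
        return state                      # retry of an already-applied op: no-op
    return {
        "value": state["value"] + amount,
        "applied": state["applied"] | {op_id},
    }

state = {"value": 0, "applied": frozenset()}
state = increment(state, "op-1")
state = increment(state, "op-1")          # network retry: safely ignored
state = increment(state, "op-2")
assert state["value"] == 2                # not 3: the duplicate did not apply
```

This is precisely the "set of recent change history" overhead described in the notes, which is why some applications prefer to implement such CRDTs entirely client-side.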
-
Background
The availability of a variety of data-types in Riak, with automatic conflict resolution for eventual consistency, is rare within distributed databases. There are other examples (sets and counters in Redis clusters, AntidoteDB, and some support within the framework of Orleans), but it is still considered a differentiator in Riak. However, the data-type support has known limitations: …
The decision of whether to relaunch, pareto-replace or retire CRDTs in Riak 4.0 is complicated by the variety of options available. For the purpose of this discussion, the data-type support within Riak is split into three categories of use:
Rapidly mutating counters
Within Riak, counters (incrementing/decrementing integers), and maps of multiple counters, can be updated at very low cost to the client, as no context is required prior to updating the counter: i.e. the counter can be updated without reading it. As the objects are naturally small (the object requires two integers and the ID for every vnode that has coordinated a change to the counter, so O(100) bytes in size), this makes such data-types potentially suitable for requirements where very high volumes of mutations may be received in parallel (e.g. tracking page hits, character health status in a game, etc.).
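The per-vnode state described above can be sketched with a toy state-based PN-counter (an illustration of the general technique, not Riak's code): externally the value is a single integer, but internally the object holds an (increments, decrements) pair per actor, so its size grows with the number of actors (coordinating vnodes) rather than with the value.

```python
# Toy state-based PN-counter: one (increments, decrements) pair per actor.
# Here an "actor" stands in for a coordinating vnode.
def pn_new():
    return {}

def pn_increment(state, actor, n=1):
    p, m = state.get(actor, (0, 0))
    return {**state, actor: (p + n, m)}

def pn_value(state):
    return sum(p - m for p, m in state.values())

def pn_merge(a, b):
    # Per-actor pairwise max: merge is commutative, associative, idempotent,
    # which is why no context (no prior read) is needed to apply an update.
    out = dict(a)
    for actor, (p, m) in b.items():
        ap, am = out.get(actor, (0, 0))
        out[actor] = (max(ap, p), max(am, m))
    return out

c = pn_new()
for vnode in range(50):                   # 50 vnodes each coordinate one update
    c = pn_increment(c, f"vnode_{vnode}")
assert pn_value(c) == 50
assert len(c) == 50                       # internal entries track actors, not value
```

Note how the internal state (one entry per actor) keeps growing over the lifetime of a long-lived cluster even though the external value stays a single integer.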
There are a number of caveats to the use of counters in Riak:

… the memory_backend (which has significant other side effects, and is subject to potential deprecation).

There are potential improvements to address the caveats; a subset of them could be addressed by moving counters to a dedicated in-memory vnode type, which is discussed elsewhere.
Sets as indexes
Index support in Riak is based around secondary indexes; however, secondary indexes are eventually consistent r=1 operations, have additional costs as they require RingSize/N vnode operations rather than N operations, and are only available when a sorted backend is used. It is therefore part of the Riak design cookbook to store index entries for exact-term queries in specific objects, and set data-types are suited to this use case.
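The "sets as indexes" pattern can be sketched as follows (hypothetical helper names; not a Riak client API): each indexed term maps to one set object holding the keys of matching objects, so an exact-term query is a single object read, and the set type's union merge is what makes concurrent updates to the same index entry safe.

```python
# Sketch of the "set as index" pattern: term -> set of object keys.
# In Riak, each per-term set would be a set data-type object.
def index_add(index: dict, term: str, key: str) -> dict:
    entry = index.get(term, frozenset()) | {key}
    return {**index, term: entry}

def index_merge(a: dict, b: dict) -> dict:
    # Union per term: two replicas that indexed different objects under the
    # same term converge deterministically, with no lost entries.
    return {t: a.get(t, frozenset()) | b.get(t, frozenset())
            for t in a.keys() | b.keys()}

def query_exact(index: dict, term: str) -> list:
    # An exact-term query is a single object read, not a coverage query.
    return sorted(index.get(term, frozenset()))

replica_1 = index_add({}, "status:active", "user_1")
replica_2 = index_add({}, "status:active", "user_2")
merged = index_merge(replica_1, replica_2)
assert query_exact(merged, "status:active") == ["user_1", "user_2"]
```

The hard part, as the caveats note, is keeping the index object and the indexed objects coordinated, which is what libraries like Babel attempt to manage.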
There are a number of caveats to the use of sets as indexes in Riak: …
There exists a third-party library to help with coordination between objects and indexes using CRDTs, Leapsight's Babel, but this is an Erlang-only solution.
Most current implementations using sets as indexes are in cases where fixed-term queries are required, entries are added but rarely removed, and false positives are tolerable. In this case GSETs can be used, and the updates are easier to manage, as GSET updates are context-free and idempotent. GSETs still suffer from the same inefficient scaling as SETs.
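The GSET properties relied on here can be shown in a few lines (a generic grow-only set sketch, not Riak code): an add needs no prior read (context-free), a retried add converges to the same state (idempotent), and since removal is simply not supported, stale entries are the "false positives" the reader must tolerate.

```python
# Toy G-set (grow-only set): add and merge are both plain set union.
def gset_add(state: frozenset, element) -> frozenset:
    return state | {element}

def gset_merge(a: frozenset, b: frozenset) -> frozenset:
    return a | b

s = frozenset()
s = gset_add(s, "key_1")
assert gset_add(s, "key_1") == s                  # idempotent: a retry is safe

# Two replicas that added different keys concurrently converge either way
# they are merged; there is no operation to ever remove "key_1" again.
r1 = gset_add(s, "key_2")
r2 = gset_add(s, "key_3")
assert gset_merge(r1, r2) == gset_merge(r2, r1)   # merge is commutative
```

Sets that support removal (e.g. OR-sets) give up exactly these properties, which is why their updates need context and careful failure handling.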
The `kv_index_tictactree` anti-entropy system was designed with index coordination as a consideration, so it could in theory be extended to support the job of reconciling between object metadata and indexes; but the work necessary to implement this is non-trivial.

There may be other approaches to improving queryability in Riak that have lower overheads and greater rewards, e.g. generalising on existing customer-specific work to replicate and reconcile Riak changes into OpenSearch.
Modelling complex data objects as maps
One of the original intentions of implementing data-types was to support the modelling of complex data objects using maps, with nested maps, sets, registers, counters and flags, so that the application developer would be able to send operations to mutate the complex data-types knowing that the results will be eventually consistent. For example, the JSON object definition used in DynamoDB could be translated to a CRDT for all cases other than the List type, and such objects could be updated across clusters concurrently with deterministic merging of results.
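The deterministic-merge property this relies on can be sketched with a toy map of last-writer-wins registers (an illustration of the general technique, not the riak_dt_map implementation): two replicas updated concurrently merge field by field to the same result regardless of merge order.

```python
# Toy map-of-LWW-registers: each field is (value, timestamp); merge keeps
# the entry with the higher timestamp per field. (A real implementation
# also needs a deterministic tie-break for equal timestamps.)
def lww_merge(a: dict, b: dict) -> dict:
    merged = dict(a)
    for field, (value, ts) in b.items():
        if field not in merged or ts > merged[field][1]:
            merged[field] = (value, ts)
    return merged

# Two replicas of the same "user" object, updated concurrently:
replica_1 = {"name": ("Ada", 1), "email": ("ada@old.example", 1)}
replica_2 = {"name": ("Ada", 1), "email": ("ada@new.example", 2)}

# Merge order does not matter: both replicas converge on the same state.
assert lww_merge(replica_1, replica_2) == lww_merge(replica_2, replica_1)
assert lww_merge(replica_1, replica_2)["email"][0] == "ada@new.example"
```

A Riak map generalises this per-field merging to nested maps, sets, counters and flags, each with its own type-specific merge rule.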
The basic hypothesis is that this resolves a problem that is relatively hard (sibling resolution), by accepting a problem that is relatively easy (mapping application data to Riak data types). In reality though, this trade-off has not generally been perceived as positive by application developers. This might be because of a lack of documented examples, issues with the need to convert object components to/from binaries client-side, or specific problems with the unfamiliarity with or complexity of the API.
Further, when taking this approach, there is no direct in-built support for secondary indexes. So there is an additional hard problem on top of data modelling: integrating bespoke indexing and queryability through the application.
In Riak 3.4 the problem of sibling management is mitigated by the addition of strengthened conditional PUT logic. It is possible that simply switching to conditional PUTs is generally more convenient for developers, and also potentially more performant, although it leaves an unresolved problem with parallel multi-cluster updates.
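The conditional-PUT alternative amounts to a compare-and-swap loop: read the object with its version, modify it, and write back only if the version is unchanged, retrying on conflict. The sketch below is a generic in-memory illustration of that loop (the class and function names are hypothetical; this is not the Riak client API).

```python
import threading

# In-memory stand-in for a store supporting conditional PUTs.
class VersionedStore:
    def __init__(self):
        self._lock = threading.Lock()
        self._value, self._version = None, 0

    def get(self):
        with self._lock:
            return self._value, self._version

    def put_if_unmodified(self, value, expected_version) -> bool:
        with self._lock:
            if self._version != expected_version:
                return False              # concurrent write won: caller retries
            self._value, self._version = value, self._version + 1
            return True

def conditional_update(store, fn, max_retries=10):
    # Read-modify-write loop: the write succeeds only against the version
    # that was read, so concurrent writers cannot silently clobber each other.
    for _ in range(max_retries):
        value, version = store.get()
        if store.put_if_unmodified(fn(value), version):
            return True
    return False

store = VersionedStore()
assert conditional_update(store, lambda v: (v or 0) + 1)
assert store.get()[0] == 1
```

Within one cluster this resolves conflicts at write time instead of merge time; the unresolved case noted above is two clusters accepting conditional PUTs in parallel, where neither sees the other's version.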
Proposal
The proposal is to implement Merge Strategy Behaviour as a bucket property. The proposal is described on a prototype branch, and is also discussed in the riak_kv discussions.
The proposal is subject to the success of the prototype. The primary goals of the proposal are: …
Design
See proposal
Alternative Design Ideas
Optimising CRDTs through a dedicated vnode is not preferred, as: …

Simply continuing with existing CRDTs is not preferred, as: …
Testing
Caveats
The proposal is subject to the success of the prototype. The prototype scope is described within the proposal.
Pull Requests
Planned Release for Inclusion
Riak 4.0. It is assumed that as part of the 4.0 release much of the existing CRDT framework would be removed, but the `riak_dt` library would remain, to make it easier to write merge strategy behaviours that are compatible with pre-4.0 data within existing Riak stores.