Feature Proposal: Data Types #31
Replies: 4 comments 2 replies
-
With respect to the points on (i) modelling complex data objects and (ii) attempts to reduce cognitive load, I'll share my experiment with wrapping riak_dt_map in Elixir's Access behaviour, thus exposing CRDT updates via Access-based functions like `update_in`. (BTW, the code I'm sharing here doesn't interact with Ecto nor FDB, even though the rest of the repo does.)

```elixir
crdt =
  Crdt.Map.new()
  |> Crdt.Map.for_actor(0)
  |> Crdt.Map.Counter.new(["post_1", "comment_1", "reaction_1"])

crdt = update_in(crdt, ["post_1", "comment_1", "reaction_1"], &Crdt.Map.Counter.increment/1)
```

https://github.com/jessestimpson/ecto_foundationdb/blob/crdt/lib/crdt/map.ex

Of course this leverages some syntactical magic specific to Elixir. I don't have any ideas for similar Erlang operations, unfortunately.
-
I’m currently evaluating Riak as the core data store for an IoT platform, and native CRDT support is a decisive factor in that assessment. The points raised in this thread introduce uncertainty about the availability and long‑term support of CRDTs in OpenRiak. Could you clarify the project’s current stance and roadmap for CRDTs—specifically maintenance ownership, compatibility guarantees, and recommended upgrade paths? I’m comfortable adopting a niche solution when its direction aligns with our needs, but clarity on the above will determine whether Riak remains a fit for us.
-
I was not aware that no one is using CRDTs in Riak; I thought I had heard otherwise. Nor was I aware of any trend against the use of CRDTs. In my eyes they were exactly the kind of slightly obscure technology with a learning curve that one picks for competitive advantage over others who prefer a simpler, less innovative life. But maybe I missed something; it would be good to hear the reasons against CRDT usage and what to use instead. This might be slightly off topic, but since the discussion is about what to do about CRDTs I think it's relevant. I guess I will wait for this discussion to end before I decide on Riak for new projects. I hope the discussion doesn't drag on until I am forced to decide, because in its current state I would probably pick something else, which also has no CRDTs. To add something more strategic to the discussion: how would Riak position itself as a pure distributed KV store, without CRDTs, against the competition?
-
On 10/09/25 there was a community discussion on the challenges associated with the existing data-type implementation. Here are the rough notes used in that discussion.

The Riak data-types are essentially a prototype that was rushed into production for the 2.0 deadline. There were a number of caveats to the original implementation, and a long-term plan to replace it (with …). Many of these challenges are addressable, and in some cases the …

Challenges

The scale of a delta operation

The API for Riak data-types supports delta operations, but in the implementation every operation is a full re-write of the object, not an application of the delta. So to update any part of a data-type object, as part of the update cycle the whole object must be read, deserialised, updated, re-serialised and written back in full.
This means that any CRDT write operation increases in complexity (and hence response time) with the size of the object, not the size of the delta. This issue is exacerbated by the fact that the internal size of the object is bigger than the external representation, and it is not possible from the outside to discover how big the object is within Riak. For example, externally a counter is the size of an integer, but internally it is approximately 32 bytes in size for every vnode that has coordinated an update for that counter; and the number of coordinating vnodes will continuously increase over the lifecycle of a long-lived environment. In the current (3.2) version of Riak there is a non-linear increase in the cost of change with scale. For sets this may be addressable (reducing it to a linear increase) with a simple change to optimise the happy path (i.e. when no actual conflicting updates are discovered); but making latency predictable when conflict occurs is more challenging.

Specific technical debt in implementation

There is some specific technical debt in the implementation of data-types: …
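The full-rewrite cost described above can be illustrated with a toy state-based set (a sketch only, not Riak's implementation): the client submits a tiny delta, but the stored object must be decoded, updated and re-encoded in its entirety, so the work done tracks the object size rather than the delta size.

```python
import json

# Toy state-based G-set: the client sends a delta ("add one element"), but
# the server-side update still rewrites the entire serialised object.
def apply_add(stored_blob: bytes, element: str) -> bytes:
    state = set(json.loads(stored_blob))       # read + deserialise the WHOLE object
    state.add(element)                         # apply the (tiny) delta
    return json.dumps(sorted(state)).encode()  # re-serialise + rewrite the WHOLE object

# A large stored object: adding one element still touches all 10,000 entries.
blob = json.dumps([f"item_{i}" for i in range(10_000)]).encode()
new_blob = apply_add(blob, "item_new")

# The cost of the update is proportional to the object, not the delta.
assert len(new_blob) > len(blob)
```

A true delta implementation would transmit and apply only `"item_new"`; the sketch shows why response time grows with object size under the full-rewrite approach.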
Expensive update paths for inexpensive updates

It is reasonable to expect counter updates to be exceptionally fast, as an increment is a relatively simple change. However, even with the memory backend, a counter update takes 350 microseconds when the vnode update itself takes just 50 microseconds. All CRDT updates follow a common path with standard Riak object updates, but it is potentially the case that for the simplest of updates the common path has excessive overheads: e.g. bucket property checks, option checks, …

Object diversity

When using CRDTs there is a need to design for object diversity, i.e. to have activity split across multiple objects. This may be intuitive when all objects within a Riak database use data-types, but may be surprising when data-types are used to perform specific functions in addition to the normal usage of a database, e.g. adding a hit counter to an existing database. The object-diversity need is driven by the existence of a single queue for each vnode, meaning that if workload is not distributed over vnodes, queues will vary in size between vnodes, creating unpredictable latency across all operations. There are also some subtle consequences of non-diversity with anti-entropy operation, which can have a particular impact on inter-cluster reconciliation costs.

Inefficient queryability

There is no way to efficiently query a bucket of CRDTs, e.g. produce a leaderboard of counters, find counters outside of an expected range, or sum all counters in the bucket. All queries require every object in the bucket to be read (once), deserialised and processed into a result stream (mapped), before the result stream is reduced into an actual result. As well as being inefficient (as everything must be read and deserialised for every query), this Map/Reduce mechanism is scheduled to be deprecated in Riak 4.0 as: …
Adding secondary indexes to data-type objects is not supported at present, so there is no ability to use the Query API as with other Riak objects.

Lack of data-type-specific functions

There are some functions that might be considered natural requirements for specific types, which aren't provided: …
Is there a minimal necessary set of functions required to provide a useful counter or set service?

Action-at-a-distance and idempotency

None of the Riak data-type operations (except for gset updates) are idempotent, and this would be unexpected to a developer with experience of using CRDTs. This is a result of the action-at-a-distance approach to embedding CRDTs within Riak. There is therefore no documented advice for the application developer on handling failure, and no obvious clarity on what that advice would be. In many systems there may be obvious Actor IDs to use within an application, but being able to track changes based on those IDs requires additional overheads to be included in the data-type (for example, by adding a set of recent change history). For implementations that require the functionality of CRDTs to track whether changes have been applied, it is often preferable to implement their own CRDTs client-side rather than be constrained by server-side CRDTs.

Data types and schemas

Riak data types do not provide for schema management, and so they are an incomplete answer to the problem of modelling and managing data. There are also a number of potential conflicts when merging schema changes, and schema changes with data changes, where eventual consistency guarantees cannot be made.

API definition

APIs are hard, and the challenge is increased by it not being clear what the high-level target is: to expose data-types directly to those with experience of them, or to hide the existence of conflict resolution behind an API that is familiar to developers with no exposure to data types. In some cases the API appears unnecessarily convoluted, e.g. to increment a counter by 1:

```erlang
riakc_pb_socket:update_type(
    Client,
    {<<"SampleBucketType">>, <<"SampleBucketName">>},
    <<"SampleKey">>,
    riakc_counter:to_op(riakc_counter:increment(1, riakc_counter:new()))
).
```

There are also known bugs in the API.
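The idempotency gap above is usually worked around client-side. One common pattern (sketched below with hypothetical helper names; this is not part of any Riak API) is to tag each logical operation with a unique ID and embed an applied-ID set in the object, so a retried increment is applied at most once. It also shows the overhead the notes mention: the change-history set grows with every operation.

```python
# Client-side idempotency for a non-idempotent counter increment: each
# logical operation carries a unique ID, and the set of applied IDs is
# stored alongside the value so retries become no-ops.
def increment(state: dict, op_id: str, amount: int = 1) -> dict:
    if op_id in state["applied"]:
        return state                      # retry of an already-applied op: no-op
    return {
        "value": state["value"] + amount,
        "applied": state["applied"] | {op_id},
    }

state = {"value": 0, "applied": frozenset()}
state = increment(state, "op-1")
state = increment(state, "op-1")          # network retry: safely ignored
state = increment(state, "op-2")
assert state["value"] == 2                # not 3: the duplicate did not apply
```

This is precisely the "set of recent change history" overhead described in the notes, which is why some applications prefer to implement such CRDTs entirely client-side.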
-
Background
The availability of a variety of data-types in Riak, with automatic conflict resolution for eventual consistency, is rare within distributed databases. There are other examples (sets and counters in Redis clusters, AntidoteDB, and some support within the framework of Orleans), but it is still considered a differentiator in Riak. However, the data-type support has known limitations: …
The decision of whether to relaunch, pareto-replace or retire CRDTs in Riak 4.0 is complicated by the variety of options available. For the purpose of this discussion, the data-type support within Riak is split into three categories of use:
Rapidly mutating counters
Within Riak, counters (incrementing/decrementing integers), and maps of multiple counters, can be updated at very low cost to the client, as no context is required prior to updating the counter: i.e. the counter can be updated without reading it. As the objects are naturally small (the object requires two integers and the ID for every vnode that has coordinated a change to the counter, so O(100) bytes in size), this makes such data-types potentially suitable for requirements where very high volumes of mutations may be received in parallel (e.g. tracking page hits, character health status in a game, etc.).
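The per-vnode state described above can be sketched with a toy state-based PN-counter (an illustration of the general technique, not Riak's code): externally the value is a single integer, but internally the object holds an (increments, decrements) pair per actor, so its size grows with the number of actors (coordinating vnodes) rather than with the value.

```python
# Toy state-based PN-counter: one (increments, decrements) pair per actor.
# Here an "actor" stands in for a coordinating vnode.
def pn_new():
    return {}

def pn_increment(state, actor, n=1):
    p, m = state.get(actor, (0, 0))
    return {**state, actor: (p + n, m)}

def pn_value(state):
    return sum(p - m for p, m in state.values())

def pn_merge(a, b):
    # Per-actor pairwise max: merge is commutative, associative, idempotent,
    # which is why no context (no prior read) is needed to apply an update.
    out = dict(a)
    for actor, (p, m) in b.items():
        ap, am = out.get(actor, (0, 0))
        out[actor] = (max(ap, p), max(am, m))
    return out

c = pn_new()
for vnode in range(50):                   # 50 vnodes each coordinate one update
    c = pn_increment(c, f"vnode_{vnode}")
assert pn_value(c) == 50
assert len(c) == 50                       # internal entries track actors, not value
```

Note how the internal state (one entry per actor) keeps growing over the lifetime of a long-lived cluster even though the external value stays a single integer.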
There are a number of caveats to the use of counters in Riak:

… the memory_backend (which has significant other side effects, and is subject to potential deprecation).

There are potential improvements to address the caveats; a subset of them could be addressed by moving counters to a dedicated in-memory vnode type, which is discussed elsewhere.
Sets as indexes
Index support in Riak is based around secondary indexes; however, secondary indexes are eventually consistent r=1 operations, have additional costs as they require RingSize/N vnode operations rather than N operations, and are only available when a sorted backend is used. It is therefore part of the Riak design cookbook to store index entries for exact-term queries in specific objects, and set data-types are suited to this use case.
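The "sets as indexes" pattern can be sketched as follows (hypothetical helper names; not a Riak client API): each indexed term maps to one set object holding the keys of matching objects, so an exact-term query is a single object read, and the set type's union merge is what makes concurrent updates to the same index entry safe.

```python
# Sketch of the "set as index" pattern: term -> set of object keys.
# In Riak, each per-term set would be a set data-type object.
def index_add(index: dict, term: str, key: str) -> dict:
    entry = index.get(term, frozenset()) | {key}
    return {**index, term: entry}

def index_merge(a: dict, b: dict) -> dict:
    # Union per term: two replicas that indexed different objects under the
    # same term converge deterministically, with no lost entries.
    return {t: a.get(t, frozenset()) | b.get(t, frozenset())
            for t in a.keys() | b.keys()}

def query_exact(index: dict, term: str) -> list:
    # An exact-term query is a single object read, not a coverage query.
    return sorted(index.get(term, frozenset()))

replica_1 = index_add({}, "status:active", "user_1")
replica_2 = index_add({}, "status:active", "user_2")
merged = index_merge(replica_1, replica_2)
assert query_exact(merged, "status:active") == ["user_1", "user_2"]
```

The hard part, as the caveats note, is keeping the index object and the indexed objects coordinated, which is what libraries like Babel attempt to manage.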
There are a number of caveats to the use of sets as indexes in Riak: …
There exists a third-party library to help with coordination between objects and indexes using CRDTs, Leapsight's Babel, but this is an Erlang-only solution.
Most current implementations using sets as indexes are in cases where fixed-term queries are required, entries are added but rarely removed, and false positives are tolerable. In this case GSETs can be used, and the updates are easier to manage, as GSET updates are context-free and idempotent. GSETs still suffer from the same inefficient scaling as SETs.
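The GSET properties relied on here can be shown in a few lines (a generic grow-only set sketch, not Riak code): an add needs no prior read (context-free), a retried add converges to the same state (idempotent), and since removal is simply not supported, stale entries are the "false positives" the reader must tolerate.

```python
# Toy G-set (grow-only set): add and merge are both plain set union.
def gset_add(state: frozenset, element) -> frozenset:
    return state | {element}

def gset_merge(a: frozenset, b: frozenset) -> frozenset:
    return a | b

s = frozenset()
s = gset_add(s, "key_1")
assert gset_add(s, "key_1") == s                  # idempotent: a retry is safe

# Two replicas that added different keys concurrently converge either way
# they are merged; there is no operation to ever remove "key_1" again.
r1 = gset_add(s, "key_2")
r2 = gset_add(s, "key_3")
assert gset_merge(r1, r2) == gset_merge(r2, r1)   # merge is commutative
```

Sets that support removal (e.g. OR-sets) give up exactly these properties, which is why their updates need context and careful failure handling.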
The `kv_index_tictactree` anti-entropy system was designed with index coordination as a consideration, so it could in theory be extended to support the job of reconciling between object metadata and indexes; but the work necessary to implement this is non-trivial.

There may be other approaches to improving queryability in Riak that have lower overheads and greater rewards, e.g. generalising on existing customer-specific work to replicate and reconcile Riak changes into OpenSearch.
Modelling complex data objects as maps
One of the original intentions of implementing data-types was to support the modelling of complex data objects using maps, with nested maps, sets, registers, counters and flags, so that the application developer would be able to send operations to mutate the complex data-types knowing that the results will be eventually consistent. For example, the JSON object definition used in DynamoDB could be translated to a CRDT for all cases other than the List type, and such objects could be updated across clusters concurrently with deterministic merging of results.
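The deterministic-merge property this relies on can be sketched with a toy map of last-writer-wins registers (an illustration of the general technique, not the riak_dt_map implementation): two replicas updated concurrently merge field by field to the same result regardless of merge order.

```python
# Toy map-of-LWW-registers: each field is (value, timestamp); merge keeps
# the entry with the higher timestamp per field. (A real implementation
# also needs a deterministic tie-break for equal timestamps.)
def lww_merge(a: dict, b: dict) -> dict:
    merged = dict(a)
    for field, (value, ts) in b.items():
        if field not in merged or ts > merged[field][1]:
            merged[field] = (value, ts)
    return merged

# Two replicas of the same "user" object, updated concurrently:
replica_1 = {"name": ("Ada", 1), "email": ("ada@old.example", 1)}
replica_2 = {"name": ("Ada", 1), "email": ("ada@new.example", 2)}

# Merge order does not matter: both replicas converge on the same state.
assert lww_merge(replica_1, replica_2) == lww_merge(replica_2, replica_1)
assert lww_merge(replica_1, replica_2)["email"][0] == "ada@new.example"
```

A Riak map generalises this per-field merging to nested maps, sets, counters and flags, each with its own type-specific merge rule.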
The basic hypothesis is that this resolves a problem that is relatively hard (sibling resolution), by accepting a problem that is relatively easy (mapping application data to Riak data types). In reality though, this trade-off has not generally been perceived as positive by application developers. This might be because of a lack of documented examples, issues with the need to convert object components to/from binaries client-side, or specific problems with the unfamiliarity with or complexity of the API.
Further, when taking this approach, there is no direct in-built support for secondary indexes. So there is an additional hard problem on top of data modelling: integrating bespoke indexing and queryability through the application.
In Riak 3.4 the problem of sibling management is mitigated by the addition of strengthened conditional PUT logic. It is possible that simply switching to conditional PUTs is generally more convenient for developers, and also potentially more performant, although it leaves an unresolved problem with parallel multi-cluster updates.
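The conditional-PUT alternative amounts to a compare-and-swap loop: read the object with its version, modify it, and write back only if the version is unchanged, retrying on conflict. The sketch below is a generic in-memory illustration of that loop (the class and function names are hypothetical; this is not the Riak client API).

```python
import threading

# In-memory stand-in for a store supporting conditional PUTs.
class VersionedStore:
    def __init__(self):
        self._lock = threading.Lock()
        self._value, self._version = None, 0

    def get(self):
        with self._lock:
            return self._value, self._version

    def put_if_unmodified(self, value, expected_version) -> bool:
        with self._lock:
            if self._version != expected_version:
                return False              # concurrent write won: caller retries
            self._value, self._version = value, self._version + 1
            return True

def conditional_update(store, fn, max_retries=10):
    # Read-modify-write loop: the write succeeds only against the version
    # that was read, so concurrent writers cannot silently clobber each other.
    for _ in range(max_retries):
        value, version = store.get()
        if store.put_if_unmodified(fn(value), version):
            return True
    return False

store = VersionedStore()
assert conditional_update(store, lambda v: (v or 0) + 1)
assert store.get()[0] == 1
```

Within one cluster this resolves conflicts at write time instead of merge time; the unresolved case noted above is two clusters accepting conditional PUTs in parallel, where neither sees the other's version.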
Proposal
The proposal is to implement Merge Strategy Behaviour as a bucket property. The proposal is described on a prototype branch, and is also discussed in the riak_kv discussions.
The proposal is subject to the success of the prototype. The primary goals of the proposal are: …
Design
See proposal
Alternative Design Ideas
Optimising CRDTs through a dedicated vnode is not preferred, as: …

Simply continuing with existing CRDTs is not preferred, as: …
Testing
Caveats
The proposal is subject to the success of the prototype. The prototype scope is described within the proposal.
Pull Requests
Planned Release for Inclusion
Riak 4.0. It is assumed that as part of the 4.0 release much of the existing CRDT framework would be removed, but the `riak_dt` library would remain, to make it easier to write merge strategy behaviours that are compatible with pre-4.0 data within existing Riak stores.