@martinsumner martinsumner commented Feb 4, 2025

There are four main parts to this refactoring:

  • Allow for handoff-type specific fold options to be passed back to the riak_core_handoff_sender from Mod:handoff_started/2 (riak_kv then uses this to turn fold_objects queries into fold_heads with deferred fetching).
  • Expand the width of vnodes to be involved in repairs from a pair (i.e. one either side), to a double-pair (two either side).
  • Add negative filters to handoff, so that riak_core_repair filters can be used to check not just if a Key is in the target, but if the Key is not in the range of another source that has already commenced handoff.
  • Update riak_core_repair to use small integers of log2(RingSize) bits rather than 160-bit integers.

The net effect is that multiple repairs will be triggered for each vnode, but the losers of the race to start will have less work to do (and potentially no work). By using fold_heads with deferred fetch, there is minimal work when losing the race. This allows busier nodes to avoid repair work, and less busy nodes to pick up the slack.
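
As a rough illustration of why a fold over heads with deferred fetching keeps the cost of losing the race low, here is a minimal Python sketch. It is not the riak_kv implementation: `fold_heads`, `handoff`, and the in-memory `store` are hypothetical names, and the only point shown is that the filter runs before any value is fetched.

```python
from typing import Callable, Dict, Iterator, Tuple

Store = Dict[str, bytes]


def fold_heads(store: Store) -> Iterator[Tuple[str, Callable[[], bytes]]]:
    """Yield each key with a zero-argument fetch function, not the value itself."""
    for key in store:
        # Bind key as a default argument so each closure fetches its own key.
        yield key, (lambda k=key: store[k])


def handoff(store: Store,
            wanted: Callable[[str], bool],
            send: Callable[[str, bytes], None]) -> int:
    """Send only the wanted keys; return how many values were actually fetched."""
    fetched = 0
    for key, fetch in fold_heads(store):
        if not wanted(key):
            continue  # filtered out before the (expensive) fetch happens
        send(key, fetch())
        fetched += 1
    return fetched
```

If the filter rejects every key, as it would for a source that lost the race entirely, the fold still visits each head but performs no fetches.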

Allows for a type-specific list of options to be added by Module:handoff_started/2. Also, if the Module exports a handoff_encoding_fun/2, an alternative encoder can be provided rather than Module:encode_handoff_item/2.
This allows for a riak_core_vnode_manager that is managing a repair to return negative filters when a second handoff is requested for the same target partition - so that data already sent in the first handoff is not resent.  If using a fold_heads, this will avoid the deferred fetch.
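
The combination of a positive and a negative filter can be sketched as follows. This is a Python illustration under assumptions, not the riak_core_repair code: `make_filter` is a hypothetical helper, and ranges are modelled as plain sets of small integers. The filter always returns a strict boolean.

```python
from typing import Callable, Iterable, Set


def make_filter(target: Set[int],
                negatives: Iterable[Set[int]]) -> Callable[[int], bool]:
    """Accept a segment if it is in the target range and not already
    covered by a source whose handoff has commenced."""
    negative_list = list(negatives)

    def accept(segment: int) -> bool:
        if segment not in target:
            return False
        # Reject anything an earlier handoff is already sending.
        return not any(segment in covered for covered in negative_list)

    return accept
```

For example, with a target of `{1, 2, 3, 4}` and one earlier handoff covering `{1, 2}`, only segments 3 and 4 are accepted by the second handoff.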
Increase the width of repairs so that more nodes can contribute - but using the NegativeFilters to avoid duplication of work.

To use a broader width - double-pair should be set as the riak_core/repair_span, not pair.
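
To make the difference between the two spans concrete, the sketch below lists which ring positions would take part in a repair. It is a Python illustration with simplifying assumptions: partitions are numbered 0 to RingSize - 1 rather than by their real index, `repair_sources` is a hypothetical name, and the span strings are spelled as in this description rather than as the configuration value.

```python
from typing import List


def repair_sources(target: int, ring_size: int, span: str) -> List[int]:
    """Return the ring positions either side of target that join the repair."""
    width = {"pair": 1, "double-pair": 2}[span]
    offsets = [o for o in range(-width, width + 1) if o != 0]
    # Wrap around the ring at both ends.
    return [(target + o) % ring_size for o in offsets]
```

With a ring of 8 and a target at position 0, `pair` gives positions 7 and 1, while `double-pair` gives 6, 7, 1 and 2.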
Previously the riak_core_repair filter used a riak_core_bucket function to return an NValMap - but that doesn't include bucket types within the map.  There's no equivalent for typed buckets - so look up dynamically, and use the process dict to avoid repeated lookups.
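
The caching pattern described here (look up once per bucket, then reuse) can be sketched in Python as below. This is only an analogy: the Erlang process dictionary is replaced by a per-object dict, and `NValCache` and the `lookup` callable are hypothetical names standing in for whatever performs the dynamic bucket property lookup.

```python
from typing import Callable, Dict, Hashable


class NValCache:
    """Memoise n_val lookups so each bucket is resolved at most once."""

    def __init__(self, lookup: Callable[[Hashable], int]) -> None:
        self._lookup = lookup
        self._cache: Dict[Hashable, int] = {}

    def n_val(self, bucket: Hashable) -> int:
        # bucket may be a plain name or a (type, name) tuple for typed buckets.
        if bucket not in self._cache:
            self._cache[bucket] = self._lookup(bucket)
        return self._cache[bucket]
```

Because the cache is keyed on the whole bucket term, a typed bucket and an untyped bucket with the same name are cached separately.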

Also, as ring_sizes beyond 64K are not supported - make life easier by checking with 16-bit integers only.
Will otherwise produce hundreds of logs.

Also make the core worker pool logs metric logs
As it may be used multiple times with a negative filter
As we only need log2(RingSize) bits of the hash - just work with these small integers.
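
One way to picture this reduction is to keep only the top log2(RingSize) bits of the 160-bit hash. The Python sketch below assumes a SHA-1 hash over raw key bytes and a power-of-two ring size; `segment` is a hypothetical name, and the exact derivation used in riak_core_repair may differ.

```python
import hashlib


def segment(key: bytes, ring_size: int) -> int:
    """Reduce a 160-bit SHA-1 hash to an integer of log2(ring_size) bits."""
    bits = ring_size.bit_length() - 1
    if ring_size < 1 or ring_size != 1 << bits:
        raise ValueError("ring_size must be a power of two")
    full_hash = int.from_bytes(hashlib.sha1(key).digest(), "big")
    # Keep only the most significant bits, which select the ring segment.
    return full_hash >> (160 - bits)
```

For the largest supported ring size of 64K the result fits in 16 bits, which is why range checks can be done on small integers rather than 160-bit ones.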
@martinsumner
Contributor Author

OpenRiak/riak_test#7

@martinsumner
Contributor Author

OpenRiak/riak_kv#24

martinsumner and others added 6 commits April 1, 2025 10:16
Co-authored-by: Thomas Arts <thomas.arts@quviq.com>
FilterFun must only return boolean according to type - so NotTrue must be false
Snappy issues now resolved
@martinsumner martinsumner merged commit 8d8bf7f into openriak-3.4 Jul 29, 2025
2 checks passed
@martinsumner martinsumner deleted the nhse-o34-orkv.i61-repairoptions branch July 29, 2025 14:28