
Fix slice split #336

Closed

yi-xmu wants to merge 1 commit into main from fix_slice_split

Conversation


yi-xmu (Collaborator) commented Jan 5, 2026

  1. Only split a slice when the post-checkpoint size is greater than both the current size of the slice and the slice upper bound.
  2. For non-snapshot reads, there is no need to return deleted keys from the data store.
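The two rules above can be written as a standalone sketch (hypothetical helper names only; the PR's actual logic lives in template_cc_map.h, local_cc_shards.cpp, and the store-handler callback):

```cpp
#include <cassert>
#include <cstdint>

// Rule 1 (hypothetical helper): split a slice only when its
// post-checkpoint size exceeds both the slice upper bound and the
// slice's current size.
bool should_split_slice(uint64_t post_ckpt_size,
                        uint64_t slice_upper_bound,
                        uint64_t current_size)
{
    return post_ckpt_size > slice_upper_bound &&
           post_ckpt_size > current_size;
}

// Rule 2 (hypothetical helper): only a snapshot read (snapshot_ts > 0)
// may need a deleted key returned from the data store; a normal read
// never does.
bool may_return_deleted_key(uint64_t snapshot_ts)
{
    return snapshot_ts > 0;
}
```

Note that should_split_slice stays false whenever the slice shrank (post_ckpt_size <= current_size), even if it is still above the upper bound.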

Here are some reminders before you submit the pull request

  • Add tests for the change
  • Document changes
  • Reference the link of issue using fixes eloqdb/tx_service#issue_id
  • Reference the link of RFC if exists
  • Pass ./mtr --suite=mono_main,mono_multi,mono_basic

Summary by CodeRabbit

  • Performance
    • Refined snapshot read handling for improved data consistency and accuracy.
    • Optimized data partitioning logic to reduce unnecessary splitting operations and improve overall system efficiency.


yi-xmu self-assigned this Jan 5, 2026

coderabbitai bot commented Jan 5, 2026

Walkthrough

These changes modify snapshot-aware filtering and slice-splitting logic across store handler and transaction service components. Range slice splitting conditions are tightened in template maps and local shards, while snapshot-aware filtering is introduced to skip deleted or out-of-date entries during range reads.

Changes

  • Store Handler Snapshot Filtering (store_handler/data_store_service_client_closure.cpp): Introduces snapshot-aware filtering in LoadRangeSliceCallback to skip range entries that are deleted on non-snapshot reads or older than the snapshot timestamp. Moves kv_start_key_ advancement inside the loop during last-item processing.
  • Slice Split Logic Enhancements (tx_service/include/cc/template_cc_map.h, tx_service/src/cc/local_cc_shards.cpp): Tightens range slice splitting conditions: slices are now pinned for splitting only when PostCkptSize exceeds both slice_upper_bound and the current slice Size. Extends the in-memory heuristic to treat slices as un-split when the post-checkpoint size fits within the current slice, enabling earlier bypass of split logic.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

Possibly related PRs

  • Fix slice index out of range #333: Modifies slice-pinned and end-of-batch decision logic in TemplateCcMap, directly related to slice handling changes.
  • Fix assert bug #306: Modifies slice-splitting logic and post-checkpoint size handling in local_cc_shards.cpp, overlapping scope with current changes.

Suggested reviewers

  • liunyl

Poem

🐰 Snapshots filter swift and true,
Deleted entries bid adieu—
Slices split with stricter care,
Logic trimmed with wisdom fair! ✨

Pre-merge checks and finishing touches

❌ Failed checks (1 warning, 1 inconclusive)

  • Docstring Coverage: ⚠️ Warning. Docstring coverage is 0.00%, below the required threshold of 80.00%. Run @coderabbitai generate docstrings to improve docstring coverage.
  • Title check: ❓ Inconclusive. The title 'Fix slice split' is vague and generic, and doesn't convey what is being fixed. Provide a more specific title, such as 'Fix slice split condition to check post-ckpt size against both current size and upper bound' or 'Improve slice splitting logic and snapshot-aware filtering'.

✅ Passed checks (1 passed)

  • Description check: ✅ Passed. The PR description addresses the two main changes but does not complete the required checklist items (issue reference, tests, documentation, or test suite confirmation).


yi-xmu requested a review from liunyl on January 5, 2026 at 09:40

coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (1)
tx_service/include/cc/template_cc_map.h (1)

5424-5428: Slice‑split predicate matches intent; consider syncing comments and minor cleanup

The new condition

return (slice->PostCkptSize() > StoreSlice::slice_upper_bound &&
        slice->PostCkptSize() > slice->Size());

correctly implements “split only when post‑ckpt size exceeds both the slice upper bound and the current slice size” as described in the PR.

Two small follow‑ups:

  • The RangePartitionDataSyncScanCc block comment above (around Line 5196) still describes splitting only in terms of PostCkptSize vs upper_bound and can now be misleading. It would be good to update it to mention the > current size requirement (e.g., “PostCkptSize > max(slice_upper_bound, current_size)”).
  • For readability and to avoid repeated calls, you could cache PostCkptSize() in a local auto post_ckpt_size = slice->PostCkptSize(); and use that in the comparisons and the assert.
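A minimal sketch of that cleanup, using a hypothetical mock of the StoreSlice interface (the 4 MB slice_upper_bound here is an assumed placeholder, not a value taken from the repo):

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical mock of the StoreSlice interface; only what the
// predicate needs. The upper bound is an assumed placeholder value.
struct StoreSlice
{
    static constexpr uint64_t slice_upper_bound = 4ULL * 1024 * 1024;
    uint64_t post_ckpt_size;
    uint64_t size;
    uint64_t PostCkptSize() const { return post_ckpt_size; }
    uint64_t Size() const { return size; }
};

// The suggested cleanup: cache PostCkptSize() once instead of calling
// it in each comparison.
bool NeedSplit(const StoreSlice *slice)
{
    const uint64_t post_ckpt_size = slice->PostCkptSize();
    return post_ckpt_size > StoreSlice::slice_upper_bound &&
           post_ckpt_size > slice->Size();
}
```

The cached local also makes it cheap to reuse the same value in an assert, as the comment suggests.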
📜 Review details


📥 Commits

Reviewing files that changed from the base of the PR and between bc6852e and 7956bd5.

📒 Files selected for processing (3)
  • store_handler/data_store_service_client_closure.cpp
  • tx_service/include/cc/template_cc_map.h
  • tx_service/src/cc/local_cc_shards.cpp
🧰 Additional context used
🧠 Learnings (4)
📚 Learning: 2025-10-20T04:30:07.884Z
Learnt from: liunyl
Repo: eloqdata/tx_service PR: 149
File: include/cc/cc_request.h:1876-1927
Timestamp: 2025-10-20T04:30:07.884Z
Learning: ScanNextBatchCc in include/cc/cc_request.h is used only for hash-partition scans; range-partition scans are handled by ScanSliceCc.

Applied to files:

  • tx_service/include/cc/template_cc_map.h
📚 Learning: 2025-11-11T07:10:40.346Z
Learnt from: lzxddz
Repo: eloqdata/tx_service PR: 199
File: include/cc/local_cc_shards.h:233-234
Timestamp: 2025-11-11T07:10:40.346Z
Learning: In the LocalCcShards class in include/cc/local_cc_shards.h, the EnqueueCcRequest methods use `shard_code & 0x3FF` followed by `% cc_shards_.size()` to distribute work across processor cores for load balancing. This is intentional and separate from partition ID calculation. The 0x3FF mask creates a consistent distribution range (0-1023) before modulo by actual core count.

Applied to files:

  • tx_service/include/cc/template_cc_map.h
  • tx_service/src/cc/local_cc_shards.cpp
📚 Learning: 2025-12-02T10:43:27.431Z
Learnt from: lokax
Repo: eloqdata/tx_service PR: 254
File: tx_service/src/cc/local_cc_shards.cpp:2949-3188
Timestamp: 2025-12-02T10:43:27.431Z
Learning: In tx_service/src/cc/local_cc_shards.cpp, whenever TryPinNodeGroupData is used, only call Sharder::Instance().UnpinNodeGroupData(node_group) if the recorded term is >= 0 (i.e., pin succeeded). Example: LocalCcShards::PostProcessFlushTaskEntries guards the unpin with `if (term >= 0)`.

Applied to files:

  • tx_service/src/cc/local_cc_shards.cpp
📚 Learning: 2025-10-09T03:56:58.811Z
Learnt from: thweetkomputer
Repo: eloqdata/tx_service PR: 150
File: include/cc/local_cc_shards.h:626-631
Timestamp: 2025-10-09T03:56:58.811Z
Learning: For the LocalCcShards class in include/cc/local_cc_shards.h: Writer locks (unique_lock) should continue using the original meta_data_mux_ (std::shared_mutex) rather than fast_meta_data_mux_ (FastMetaDataMutex) at this stage. Only reader locks may use the FastMetaDataMutex wrapper.

Applied to files:

  • tx_service/src/cc/local_cc_shards.cpp
🧬 Code graph analysis (1)
store_handler/data_store_service_client_closure.cpp (1)
tx_service/src/cc/cc_shard.cpp (1)
  • key (1327-1327)
🔇 Additional comments (3)
store_handler/data_store_service_client_closure.cpp (2)

1279-1279: LGTM: Snapshot timestamp extraction.

Extracting the snapshot timestamp for use in filtering logic is appropriate.


1285-1289: LGTM: Advancing start key for next batch.

Updating kv_start_key_ when processing the last item correctly prepares the next batch's start position, even if that item is subsequently skipped by the filtering logic.

tx_service/src/cc/local_cc_shards.cpp (1)

5465-5469: Slice-split gating now correctly requires growth beyond both thresholds

The updated guard (slice_post_ckpt_size <= StoreSlice::slice_upper_bound || slice_post_ckpt_size <= curr_slice->Size()) means we only enter the split path when PostCkptSize is finite, above slice_upper_bound, and strictly larger than the pre-ckpt slice size. This matches the PR’s stated heuristic and is consistent with the paused-slice handling earlier in UpdateSlices (any slice that previously entered the split path already had PostCkptSize > Size()). I don’t see correctness or concurrency issues here; just ensure tests cover cases where slices shrink but remain above slice_upper_bound so this behavior is intentional.

Comment on lines +1317 to +1328

    if ((snapshot_ts == 0 && is_deleted) ||
        (snapshot_ts > 0 && snapshot_ts > ts))
    {
        // If it is not a snapshot read, there is no need to return
        // deleted keys.
        // If it is a snapshot read and the latest version is newer than
        // the snapshot read ts, we need to backfill the deleted key
        // since there might be a visible archive version of this key
        // for snapshot ts. The caller will decide if reading the
        // archive table is necessary based on the deleted key version.
        continue;
    }

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Scripts executed:

# Get file size to understand how to read it safely
wc -l store_handler/data_store_service_client_closure.cpp

# View the disputed lines 1317-1328 with context
sed -n '1300,1350p' store_handler/data_store_service_client_closure.cpp | cat -n

# Also view the similar pattern at lines 219-220 mentioned in the review
sed -n '210,230p' store_handler/data_store_service_client_closure.cpp | cat -n

# Search for function names containing the target lines to understand context
rg -nP 'LoadRangeSliceCallback|FetchRecordCallback' store_handler/data_store_service_client_closure.cpp | head -20


Critical: Snapshot filtering condition is inverted.

The condition snapshot_ts > ts at line 1319 contradicts both the comment and MVCC semantics:

  • Comment states: "the latest version is newer than snapshot read ts" (implies ts > snapshot_ts)
  • Code checks: snapshot_ts > ts (opposite condition)
  • MVCC rule: A snapshot at time S should skip records with ts > S (records newer than snapshot)
  • Confirming pattern (lines 219-220): Uses snapshot_read_ts_ < rec_ts_ to detect when a record is too new and requires archive fetch

The condition should be ts > snapshot_ts instead of snapshot_ts > ts to correctly skip records newer than the snapshot timestamp.
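As a standalone sketch, the reviewer-proposed condition (hypothetical helper name; variable roles taken from the snippet under review) would be:

```cpp
#include <cassert>
#include <cstdint>

// Reviewer-proposed skip rule: drop the entry when it is a deleted key
// on a non-snapshot read, or when, on a snapshot read, its latest
// version is newer than the snapshot timestamp (ts > snapshot_ts), so
// only an archive version could be visible at snapshot_ts.
bool skip_range_entry(uint64_t snapshot_ts, uint64_t ts, bool is_deleted)
{
    return (snapshot_ts == 0 && is_deleted) ||
           (snapshot_ts > 0 && ts > snapshot_ts);
}
```

This matches the pattern at lines 219-220 cited in the review, where `snapshot_read_ts_ < rec_ts_` detects a record that is too new for the snapshot.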

🤖 Prompt for AI Agents
In @store_handler/data_store_service_client_closure.cpp around lines 1317-1328,
the snapshot filtering condition is inverted: inside the if-block that checks
(snapshot_ts == 0 && is_deleted) || (snapshot_ts > 0 && snapshot_ts > ts) you
should flip the comparison to (ts > snapshot_ts) so it reads (snapshot_ts == 0
&& is_deleted) || (snapshot_ts > 0 && ts > snapshot_ts); update that condition
in the same if used for skipping deleted/newer records (variables snapshot_ts,
ts, is_deleted) so the code matches the comment and MVCC semantics (skip records
newer than snapshot) and retains the existing continue/backfill behavior.


yi-xmu (Collaborator, Author) commented Jan 9, 2026

Replaced by PR: #352

yi-xmu closed this Jan 9, 2026
