Conversation
WalkthroughThis change adds two unconditional reset statements to the Changes
Estimated code review effort🎯 1 (Trivial) | ⏱️ ~2 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ❌ 3❌ Failed checks (2 warnings, 1 inconclusive)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
tx_service/include/cc/cc_entry.h (1)
367-379: Fix the return value: the function should returnretinstead of alwaystrue.The fix correctly adds
RecycleKeyLock(*ccs)when the abort succeeds, addressing the PR objective. However, the function now always returnstrueregardless of whether the innerAbortBlockCmdRequestactually found and processed the blocked request.The inner
NonBlockingLock::AbortBlockCmdRequestreturnsfalsewhen the transaction is not found in either queue. The caller inobject_cc_map.h(line 408) explicitly checks for this failure case and logs a warning:if (!succeed) { DLOG(WARNING) << "AbortBlockCmdRequest fail to find tx in queue_block_cmds_ and blocking_queue_..."; }With the current change, this warning can never trigger. The function should return
retto preserve the original semantic meaning—thatfalseindicates the abort did not find a matching request.tx_service/src/cc/local_cc_shards.cpp (1)
3490-3493: Change unpin condition from> 0to>= 0to avoid pin leak when term equals 0.Line 3490-3493 uses
if (ng_term > 0)to guardUnpinNodeGroupData(), butTryPinNodeGroupData()returns a term value where>= 0indicates success. This creates a resource leak whenterm == 0, a valid pinned state. Other calls in the same file guard with>= 0(lines 3303, 3352, 3613), making this inconsistent.Fix
- if (ng_term > 0) + if (ng_term >= 0) { Sharder::Instance().UnpinNodeGroupData(task->node_group_id_); }
🧹 Nitpick comments (1)
tx_service/src/cc/local_cc_shards.cpp (1)
4314-4324: Potential quota-accounting bug if “empty after erase” still retains carry-over recordsDeallocating
flush_data_sizehere is only correct if no records are retained for the next loop viadata_sync_vecs[cc_shards_.size()]/archive_vecs[...]/mv_base_vecs[...]. If this branch can happen with non-empty “remaining” vectors, quota would be returned while memory is still held (and subsequent iterations also won’t re-account that carry-over memory).Proposed guard (assert or conditional)
if (data_sync_vec->empty()) { LOG(WARNING) << "data_sync_vec becomes empty after erase, old " "size of data_sync_vec_size: " << data_sync_vec_size; // Reset scan_cc.Reset(); + // Ensure we are not retaining carry-over records while returning quota. + assert(data_sync_vecs[cc_shards_.size()].empty()); + assert(archive_vecs[cc_shards_.size()].empty()); + assert(mv_base_vecs[cc_shards_.size()].empty()); // Return the quota to flush data memory usage pool since the // flush data task is not put into flush worker. data_sync_mem_controller_.DeallocateFlushMemQuota( flush_data_size); continue; }
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
tx_service/include/cc/cc_entry.htx_service/src/cc/local_cc_shards.cpp
🧰 Additional context used
🧠 Learnings (3)
📚 Learning: 2025-12-02T10:43:27.431Z
Learnt from: lokax
Repo: eloqdata/tx_service PR: 254
File: tx_service/src/cc/local_cc_shards.cpp:2949-3188
Timestamp: 2025-12-02T10:43:27.431Z
Learning: In tx_service/src/cc/local_cc_shards.cpp, whenever TryPinNodeGroupData is used, only call Sharder::Instance().UnpinNodeGroupData(node_group) if the recorded term is >= 0 (i.e., pin succeeded). Example: LocalCcShards::PostProcessFlushTaskEntries guards the unpin with `if (term >= 0)`.
Applied to files:
tx_service/src/cc/local_cc_shards.cpp
📚 Learning: 2025-10-09T03:56:58.811Z
Learnt from: thweetkomputer
Repo: eloqdata/tx_service PR: 150
File: include/cc/local_cc_shards.h:626-631
Timestamp: 2025-10-09T03:56:58.811Z
Learning: For the LocalCcShards class in include/cc/local_cc_shards.h: Writer locks (unique_lock) should continue using the original meta_data_mux_ (std::shared_mutex) rather than fast_meta_data_mux_ (FastMetaDataMutex) at this stage. Only reader locks may use the FastMetaDataMutex wrapper.
Applied to files:
tx_service/src/cc/local_cc_shards.cpp
📚 Learning: 2025-11-11T07:10:40.346Z
Learnt from: lzxddz
Repo: eloqdata/tx_service PR: 199
File: include/cc/local_cc_shards.h:233-234
Timestamp: 2025-11-11T07:10:40.346Z
Learning: In the LocalCcShards class in include/cc/local_cc_shards.h, the EnqueueCcRequest methods use `shard_code & 0x3FF` followed by `% cc_shards_.size()` to distribute work across processor cores for load balancing. This is intentional and separate from partition ID calculation. The 0x3FF mask creates a consistent distribution range (0-1023) before modulo by actual core count.
Applied to files:
tx_service/src/cc/local_cc_shards.cpp
🔇 Additional comments (1)
tx_service/src/cc/local_cc_shards.cpp (1)
4364-4376: Good: return flush quota when breaking before enqueueing flush workThis closes the leak when a previous flush failure aborts the scan loop before the batch is handed off to flush workers. One nuance:
DeallocateFlushMemQuota()is called while holdingflight_task_mux_; ifDeallocateFlushMemQuota()can block/contend, consider moving it outside the lock.
✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.
There was a problem hiding this comment.
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
tx_service/include/cc/cc_request.h (1)
4421-4435: InitializeSliceCoordinator::min_paused_slice_index_(and clarify Reset semantics).With the union removed,
min_paused_slice_index_is now always a livesize_t(Line 4528) but is only assigned in the non-continuous branch of the ctor (Line 4431-4433). In continuous mode it’s left uninitialized, which is an unnecessary UB footgun if it’s ever read (even accidentally / via future refactor / logging / sanitizer).Also worth double-checking whether
SliceCoordinator::Reset()(Line 4461+) is intended to preserve or reset the “min paused” state; right now it preserves both.Proposed fix (deterministic initialization)
struct SliceCoordinator { static constexpr uint16_t MaxBatchSliceCount = 128; SliceCoordinator(bool is_continuous, std::vector<std::pair<TxKey, bool>> *slices_keys) : is_continuous_(is_continuous), slices_keys_ptr_(slices_keys) { pinned_slices_.reserve(MaxBatchSliceCount); - if (is_continuous_) - { - min_paused_key_ = TxKey(); - } - else - { - min_paused_slice_index_ = 0; - } + // Always initialize both fields since both are now always present. + min_paused_key_ = TxKey(); + min_paused_slice_index_ = 0; batch_end_slice_index_ = 0; } @@ - TxKey min_paused_key_; - size_t min_paused_slice_index_; + TxKey min_paused_key_{}; + size_t min_paused_slice_index_{0}; std::vector<std::pair<TxKey, bool>> *slices_keys_ptr_{nullptr};Also applies to: 4461-4469, 4527-4529
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
tx_service/include/cc/cc_request.h
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-10-20T04:30:07.884Z
Learnt from: liunyl
Repo: eloqdata/tx_service PR: 149
File: include/cc/cc_request.h:1876-1927
Timestamp: 2025-10-20T04:30:07.884Z
Learning: ScanNextBatchCc in include/cc/cc_request.h is used only for hash-partition scans; range-partition scans are handled by ScanSliceCc.
Applied to files:
tx_service/include/cc/cc_request.h
✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.
There was a problem hiding this comment.
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
tx_service/include/cc/cc_request.h (1)
4385-4402: GuardTxKeycomparisons against a default-initializedmin_paused_key_to avoid UB.When
min_paused_key_is default-constructed,GetConstPtr()returnsnullptr. The comparison*key < slice_coordinator_.min_paused_key_callsinterface_->LessThan(key_ptr, nullptr), which attempts to cast and dereference a null pointer in the lambda, causing undefined behavior. A guard prevents this by treating the default (negative infinity) case specially.Proposed fix
void UpdateMinPausedSlice(const TxKey *key) { assert(key->KeyPtr() != nullptr); assert(export_base_table_item_); - if (*key < slice_coordinator_.min_paused_key_) + if (slice_coordinator_.min_paused_key_.KeyPtr() == nullptr || + *key < slice_coordinator_.min_paused_key_) { slice_coordinator_.min_paused_key_ = std::move(key->GetShallowCopy()); } }
🧹 Nitpick comments (1)
tx_service/include/cc/cc_request.h (1)
4417-4465: SliceCoordinator cleanup is good; consider clarifying/strengthening init invariants for min_paused_ (post-union).*
Removing the anonymous union + destructor reduces lifetime footguns, but now correctness depends on what “empty”TxKey()andmin_paused_slice_index_ = 0mean (esp. across Reset/reuse). If0/ empty-key are sentinels, please document that (or encode it explicitly).
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
store_handler/eloq_data_store_service/eloqstoretx_service/include/cc/cc_request.h
✅ Files skipped from review due to trivial changes (1)
- store_handler/eloq_data_store_service/eloqstore
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: liunyl
Repo: eloqdata/tx_service PR: 357
File: tx_service/src/tx_operation.cpp:4174-4195
Timestamp: 2026-01-12T13:27:46.584Z
Learning: In tx_service/src/tx_operation.cpp, SplitFlushRangeOp::Forward must only directly recycle (Finish(false), ClearInfos(), pop, return the op to the pool) when op_ == nullptr; otherwise let the state machine unwind to avoid recycling while sub-ops (including AsyncOp workers) may still be running.
Learnt from: thweetkomputer
Repo: eloqdata/tx_service PR: 150
File: include/cc/local_cc_shards.h:626-631
Timestamp: 2025-10-09T03:56:58.811Z
Learning: For the LocalCcShards class in include/cc/local_cc_shards.h: Writer locks (unique_lock) should continue using the original meta_data_mux_ (std::shared_mutex) rather than fast_meta_data_mux_ (FastMetaDataMutex) at this stage. Only reader locks may use the FastMetaDataMutex wrapper.
Learnt from: lokax
Repo: eloqdata/tx_service PR: 254
File: tx_service/src/cc/local_cc_shards.cpp:2949-3188
Timestamp: 2025-12-02T10:43:27.431Z
Learning: In tx_service/src/cc/local_cc_shards.cpp, whenever TryPinNodeGroupData is used, only call Sharder::Instance().UnpinNodeGroupData(node_group) if the recorded term is >= 0 (i.e., pin succeeded). Example: LocalCcShards::PostProcessFlushTaskEntries guards the unpin with `if (term >= 0)`.
📚 Learning: 2025-10-20T04:30:07.884Z
Learnt from: liunyl
Repo: eloqdata/tx_service PR: 149
File: include/cc/cc_request.h:1876-1927
Timestamp: 2025-10-20T04:30:07.884Z
Learning: ScanNextBatchCc in include/cc/cc_request.h is used only for hash-partition scans; range-partition scans are handled by ScanSliceCc.
Applied to files:
tx_service/include/cc/cc_request.h
🔇 Additional comments (1)
tx_service/include/cc/cc_request.h (1)
4455-4504:SliceCoordinator::Reset()clears only preparation-phase state; no behavioral change to resume logic.
Reset()clearsmin_paused_key_andmin_paused_slice_index_only as part of initializing a new scan. These fields are used exclusively during the one-time slice preparation phase (Phase 1). Actual batch-to-batch resumption during parallel scanning (Phase 2) relies on the separatepause_pos_[core_id]tracking, which is never cleared bySliceCoordinator::Reset(). No impact on resume semantics.
✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.
Summary by CodeRabbit
✏️ Tip: You can customize this high-level summary in your review settings.