
Range size tracking #420

Closed
yi-xmu wants to merge 15 commits into main from range_size_tracking

Conversation

@yi-xmu
Collaborator

@yi-xmu yi-xmu commented Feb 26, 2026

Here are some reminders before you submit the pull request

  • Add tests for the change
  • Document changes
  • Reference the related issue using fixes eloqdb/tx_service#issue_id
  • Reference the RFC link if one exists
  • Pass ./mtr --suite=mono_main,mono_multi,mono_basic

Summary by CodeRabbit

  • New Features

    • Added ability to fetch table range size information from the data store.
    • Enhanced range partitioning support with improved partition-aware operations and metadata tracking.
    • Extended range metadata to include size information for better resource management.
  • Improvements

    • Improved range batching logic to track and propagate range size metadata throughout write and batch operations.

@coderabbitai

coderabbitai bot commented Feb 26, 2026

Walkthrough

This PR introduces range-size tracking infrastructure across the storage and transactional system. It adds the FetchTableRangeSize API (with stubs in handler implementations), threads range_size metadata through data store encoding/batching, introduces FetchTableRangeSizeCc request types, integrates range_size_flags into write entries and CC request classes, and extends CC map structures to track per-partition range sizes.

Changes

Cohort / File(s) Summary
Handler Stub Implementations
store_handler/bigtable_handler.*, store_handler/dynamo_handler.*, store_handler/rocksdb_handler.*
Added FetchTableRangeSize virtual method overrides as stubs logging "not implemented" and asserting, plus one additional RocksDBHandler stub for DeleteOutOfRangeDataInternal.
DataStoreServiceClient Range-Size Integration
store_handler/data_store_service_client.h, store_handler/data_store_service_client.cpp
Extended EncodeRangeValue to accept range_size parameter, added FetchTableRangeSize method, updated RangeSliceBatchPlan with range_size field, and threaded range_size through InitTableRanges, EnqueueRangeMetadataRecord, and batch preparation/dispatch paths.
Range-Size Callback Infrastructure
store_handler/data_store_service_client_closure.h, store_handler/data_store_service_client_closure.cpp
Added new FetchRangeSizeCallback function to handle range-size fetch results, adjusted value payload size assertions for encoded range metadata.
CC Request Types and CcMap
tx_service/include/cc/cc_req_misc.h, tx_service/src/cc/cc_req_misc.cpp, tx_service/include/cc/cc_map.h, tx_service/src/cc/cc_map.cpp, tx_service/include/cc/cc_shard.h, tx_service/src/cc/cc_shard.cpp
Introduced FetchTableRangeSizeCc struct with Reset/ValidTermCheck/Execute/SetFinish methods, added InitRangeSize to CcMap for per-partition range size tracking, extended CcShard with FetchTableRangeSize method and fetch_range_size_cc_pool_.
CC Request Class Extensions
tx_service/include/cc/cc_request.h
Added range_size_flags_ field to PostWriteCc with accessors (PartitionId, OnDirtyRange, NeedUpdateRangeSize), extended UploadBatchCc with partition_id_ member and revised WriteEntryTuple/paused_pos_ structures to accommodate range-related metadata.
CC Handler Signatures
tx_service/include/cc/cc_handler.h, tx_service/include/cc/local_cc_handler.h, tx_service/include/remote/remote_cc_handler.h, tx_service/src/cc/local_cc_handler.cpp, tx_service/src/remote/remote_cc_handler.cpp
Added range_size_flags (uint8_t) parameter with default 0x10 to PostWrite virtual method across CcHandler, LocalCcHandler, and RemoteCcHandler; propagated parameter through implementation call chains.
TemplateCcMap and RangeCcMap
tx_service/include/cc/template_cc_map.h, tx_service/include/cc/range_cc_map.h, tx_service/include/cc/object_cc_map.h
Added UpdateRangeSize method to TemplateCcMap with per-partition delta tracking, extended range size initialization in RangeCcMap during range split/commit, updated ObjectCcMap tuple destructuring to handle flags field.
WriteSetEntry and Type Definitions
tx_service/include/read_write_entry.h, tx_service/include/type.h
Added range_size_flags_ field to WriteSetEntry with move semantics, introduced RangeSizeStatus enum (kNotInitialized, kLoading) for tracking range size load state.
Protobuf Message Updates
tx_service/include/proto/cc_request.proto
Added partition_id (int32) and range_size_flags (bytes) fields to UploadBatchRequest, added range_size_flags (uint32) field to PostCommitRequest.
Network and Remote Layer
tx_service/src/remote/cc_node_service.cpp, tx_service/src/remote/remote_cc_request.cpp
Updated UploadBatch processing to extract and use partition_id for targeted core routing, extended WriteEntryTuple serialization to include range_size_flags, propagated range_size_flags through PostWriteCc Reset invocations.
Index Generation and Routing
tx_service/include/sk_generator.h, tx_service/src/sk_generator.cpp
Refactored NGIndexSet type to map node groups to per-range_id nested maps with (uint8_t, WriteEntry*) pairs, updated SendIndexes signature to accept partition_id and new write_entry_vec type, introduced per-range batching and routing logic with range_size_flags handling.
Transaction Execution
tx_service/src/tx_execution.cpp, tx_service/src/tx_operation.cpp, tx_service/src/cc/local_cc_shards.cpp
Modified PostProcess to compute and pass range_size_flags as (0x10 | write_entry.range_size_flags_) to PostWrite, initialized range_size_flags_ in LockWriteRangeBucketsOp with dirty state encoding, masked flags (0x0F) when forwarding to new range owners.
DataStoreHandler Interface
tx_service/include/store/data_store_handler.h
Added pure virtual FetchTableRangeSize method to DataStoreHandler base class.
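The flag convention that recurs throughout these changes (a 0x10 default on PostWrite, `0x10 | write_entry.range_size_flags_` in PostProcess, and a 0x0F mask when forwarding to new range owners) can be sketched as follows. The constant names are hypothetical; the PR itself uses the raw literals:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical names for the bit contract described in the walkthrough:
// the high nibble marks "need update range size", the low nibble carries
// dirty-range state.
constexpr uint8_t kNeedUpdateRangeSize = 0x10;
constexpr uint8_t kDirtyStateMask = 0x0F;

// PostProcess-style composition: 0x10 | write_entry.range_size_flags_
inline uint8_t ComposeFlags(uint8_t entry_flags)
{
    return kNeedUpdateRangeSize | entry_flags;
}

// Forwarding to a new range owner keeps only the low nibble (mask 0x0F),
// so the new owner does not double-apply the size update.
inline uint8_t MaskForForward(uint8_t flags)
{
    return flags & kDirtyStateMask;
}
```

Under this reading, the 0x10 default on PostWrite simply means "update the range size, no dirty state".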

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related issues

  • [Feature]: Range size tracking #408: Directly implements FetchTableRangeSize stubs and threads range_size through DataStoreServiceClient, CC types, and ccmap structures to track per-partition range sizing state.

Suggested reviewers

  • liunyl
  • xiexiaoy
  • lokax

Poem

🐰 A rabbit hops through ranges wide,
Tracking sizes near and far,
Range_size_flags that help decide,
Which partition plays its part. 🌟
From store to CC, the data flows,
Now knowing what each partition knows!

🚥 Pre-merge checks: ✅ 1 passed | ❌ 2 warnings

❌ Failed checks (2 warnings)

  • Description check ⚠️ Warning — The PR description consists only of the template checklist with unchecked boxes; it lacks any substantive explanation of the changes, motivation, or implementation details beyond what the template requests. Resolution: add a detailed description explaining the range size tracking feature (what was implemented, why it was needed, and how it works), and check off completed items from the template.
  • Docstring coverage ⚠️ Warning — Docstring coverage is 6.76%, below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (1 passed)

  • Title check ✅ Passed — The title "Range size tracking" directly describes the primary feature added across the codebase: tracking range sizes through new methods, data structures, and logic.




@yi-xmu yi-xmu self-assigned this Feb 26, 2026
@yi-xmu yi-xmu linked an issue Feb 26, 2026 that may be closed by this pull request
@yi-xmu yi-xmu force-pushed the range_size_tracking branch from 11b0654 to 4694818 Compare March 2, 2026 10:57
@yi-xmu yi-xmu marked this pull request as ready for review March 4, 2026 07:08
@coderabbitai coderabbitai bot left a comment
Actionable comments posted: 16

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (4)
tx_service/src/remote/remote_cc_request.cpp (1)

577-625: ⚠️ Potential issue | 🟠 Major

Add range_size_flags field to ForwardPostCommitRequest proto and pass it to PostWriteCc::Reset in the forwarded path.

range_size_flags is present in PostCommitRequest (field 8) and explicitly passed to PostWriteCc::Reset in the direct path, but ForwardPostCommitRequest lacks this field entirely. This causes the forwarded path to use the default value (0x10) rather than the actual range mutation state, creating tracking drift if forwarded commits can modify range size.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tx_service/src/remote/remote_cc_request.cpp` around lines 577 - 625,
Forwarded commits don't carry range_size_flags because ForwardPostCommitRequest
lacks the proto field and the forwarded path's call to PostWriteCc::Reset omits
it, causing default 0x10 to be used; add a range_size_flags field to
ForwardPostCommitRequest (matching PostCommitRequest), update the code that
reads ForwardPostCommitRequest (symbol: post_commit) to call
post_commit.range_size_flags(), and pass
static_cast<uint8_t>(post_commit.range_size_flags()) into the forwarded
PostWriteCc::Reset invocation (symbol: PostWriteCc::Reset) so both direct
(PostCommitRequest) and forwarded paths use the same range_size_flags value.
tx_service/include/proto/cc_request.proto (1)

917-926: ⚠️ Potential issue | 🔴 Critical

Proto3 scalar default creates silent wire-compatibility issue for range_size_flags.

PostCommitRequest adds uint32 range_size_flags at line 925. In proto3, absent uint32 fields deserialize to 0, not the expected C++ default of 0x10. When older senders omit this field, receivers silently get 0x00 instead of the legacy default, breaking behavior consistency across mixed-version deployments.

The code directly sets and reads this field without normalization:

  • Set: post_commit->set_range_size_flags(range_size_flags) (line 198)
  • Read: static_cast<uint8_t>(post_commit.range_size_flags()) (line 598)

No decode-path fallback exists to normalize zero values to 0x10 for backward compatibility.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tx_service/include/proto/cc_request.proto` around lines 917 - 926,
PostCommitRequest's proto-added uint32 range_size_flags will default to 0 on
wire (proto3), breaking legacy behavior that expects 0x10; fix by normalizing
both write and read paths: when constructing messages, ensure callers calling
set_range_size_flags(...) set 0x10 if the provided range_size_flags is 0
(replace direct set_range_size_flags(range_size_flags) with a conditional that
uses 0x10 when range_size_flags==0), and when reading, replace direct uses of
static_cast<uint8_t>(post_commit.range_size_flags()) with a small helper (e.g.,
get_normalized_range_size_flags(const PostCommitRequest&)) that returns
post_commit.range_size_flags() ?: 0x10 (or equivalent) so receivers treat
absent/zero values as 0x10 for backward compatibility.
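The suggested decode-side fallback can be sketched as a standalone helper. The function name is the reviewer's hypothetical; the key point is that proto3 deserializes an absent uint32 as 0, so a zero wire value is normalized back to the legacy C++ default 0x10:

```cpp
#include <cassert>
#include <cstdint>

// Sketch of the normalization helper the review proposes (name is
// illustrative). Treats the proto3 "absent" value 0 as the legacy
// default 0x10 so mixed-version deployments stay consistent.
inline uint8_t NormalizeRangeSizeFlags(uint32_t wire_value)
{
    return wire_value == 0 ? 0x10 : static_cast<uint8_t>(wire_value);
}
```

Callers would read `NormalizeRangeSizeFlags(post_commit.range_size_flags())` instead of casting the raw field.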
tx_service/src/remote/cc_node_service.cpp (1)

1175-1241: ⚠️ Potential issue | 🔴 Critical

Require explicit presence check for partition_id to prevent silent single-partition routing.

UploadBatchRequest.partition_id is not marked optional in the proto definition (cc_request.proto:180), so unset values deserialize as 0 in proto3. The handler at line 1203 and 1230 checks if (partition_id >= 0), which is true for 0, causing unset requests to route to partition_id % cc_shards->Count() (core 0) instead of broadcasting across all shards. The intended semantics—where -1 signifies broadcast and >= 0 targets a specific partition—are silently broken if a caller forgets to explicitly set partition_id.

All current callers in sk_generator.cpp and local_cc_shards.cpp explicitly set partition_id, but this is not enforced at the proto or closure level. Require optional int32 partition_id in the proto and check has_partition_id() alongside the value comparison to preserve legacy broadcast behavior:

Proposed fix
-    int32_t partition_id = request->partition_id();
+    // Requires `optional int32 partition_id` in proto for presence check.
+    const bool has_partition_id = request->has_partition_id();
+    int32_t partition_id = has_partition_id ? request->partition_id() : -1;

-    size_t core_cnt = (partition_id >= 0) ? 1 : cc_shards->Count();
+    size_t core_cnt =
+        (has_partition_id && partition_id >= 0) ? 1 : cc_shards->Count();

-    if (partition_id >= 0)
+    if (has_partition_id && partition_id >= 0)
     {
         uint16_t dest_core =
             static_cast<uint16_t>(partition_id % cc_shards->Count());
         cc_shards->EnqueueToCcShard(dest_core, &req);
     }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tx_service/src/remote/cc_node_service.cpp` around lines 1175 - 1241, The
handler currently reads request->partition_id() into partition_id and uses
(partition_id >= 0) to decide single-partition routing; because proto3 defaults
unset ints to 0 this silently routes unset requests to core 0. Update the proto
cc_request.proto to make partition_id optional (optional int32 partition_id) and
then change the CcNodeService UploadBatch handler in
tx_service/src/remote/cc_node_service.cpp to test presence using
request->has_partition_id() before using the value (e.g., if
(request->has_partition_id() && partition_id >= 0) { /* route to single core
using partition_id % cc_shards->Count() */ } else { /* broadcast to all cores */
}), ensuring the broadcast semantics when partition_id is absent or explicitly
-1 are preserved; reference symbols: UploadBatchRequest.partition_id,
cc_request.proto, variable partition_id, request->has_partition_id(),
Sharder::Instance(), cc_shards->EnqueueToCcShard.
store_handler/data_store_service_client.cpp (1)

1287-1304: ⚠️ Potential issue | 🟠 Major

Range-size accumulation can overflow int32_t and corrupt metadata.

range_size is tracked and encoded as int32_t, but it is incremented from per-slice uint32_t sizes. Large ranges can overflow at Line 1421 and then persist an invalid size via EncodeRangeValue(...).

🛠️ Minimal safe fix (saturating add while keeping wire format)
+#include <limits>
...
-        uint32_t slice_size = static_cast<uint32_t>(slices[i]->Size());
-        plan.range_size += static_cast<int32_t>(slice_size);
+        uint32_t slice_size = static_cast<uint32_t>(slices[i]->Size());
+        int64_t next_range_size =
+            static_cast<int64_t>(plan.range_size) + static_cast<int64_t>(slice_size);
+        plan.range_size =
+            next_range_size > std::numeric_limits<int32_t>::max()
+                ? std::numeric_limits<int32_t>::max()
+                : static_cast<int32_t>(next_range_size);

Also applies to: 1370-1373, 1420-1422, 1766-1768, 2085-2087

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@store_handler/data_store_service_client.cpp` around lines 1287 - 1304, The
code currently accumulates per-slice uint32_t sizes into the int32_t variable
range_size which can overflow and corrupt the serialized metadata; update the
logic in EncodeRangeValue (and other places where range_size is built from slice
sizes) to perform accumulation in a wider signed type (e.g., int64_t or int64_t
temp_sum), use saturating-add semantics (clamp to INT32_MAX if the sum exceeds
INT32_MAX), and only cast/truncate to int32_t when appending/encoding (e.g.,
before kv_range_record.append(reinterpret_cast<const char*>(&range_size),
sizeof(int32_t)) use int32_t safe_size = (temp_sum > INT32_MAX ? INT32_MAX :
static_cast<int32_t>(temp_sum))). Ensure the same saturating behavior is applied
wherever range_size is computed (the other instances you noted) so the wire
format stays int32_t but never contains a negative/overflowed value.
🧹 Nitpick comments (6)
tx_service/include/remote/remote_cc_handler.h (1)

87-88: Document the meaning of range_size_flags and its default value.

The default value 0x10 is a magic number without documentation. Consider:

  1. Adding a brief comment explaining what this parameter represents and what the default value means.
  2. Defining named constants for the flag values (e.g., kDefaultRangeSizeFlags) to improve readability and maintainability.

This helps future maintainers understand the semantics without needing to trace through the codebase.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tx_service/include/remote/remote_cc_handler.h` around lines 87 - 88, Document
the meaning of the uint8_t parameter range_size_flags in the signature of the
relevant handler (the function taking CcHandlerResult<PostProcessResult> &hres)
by adding a brief comment that explains what each bit/flag represents and what
the default 0x10 value denotes; also replace the magic literal with a named
constant (e.g., kDefaultRangeSizeFlags) declared near the header (or an
appropriate enum/constexpr) and use that constant in the function declaration to
improve readability and maintainability.
tx_service/include/read_write_entry.h (1)

75-77: Replace magic default 0x10 with a named constant.

This helps avoid semantic drift across call sites that also rely on the same bit contract.

♻️ Suggested cleanup
 struct WriteSetEntry
 {
+    static constexpr uint8_t kRangeSizeFlagsDefault = 0x10;
@@
-    uint8_t range_size_flags_{0x10};
+    uint8_t range_size_flags_{kRangeSizeFlagsDefault};
 };
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tx_service/include/read_write_entry.h` around lines 75 - 77, The field
range_size_flags_ is initialized with a magic literal 0x10; replace that literal
with a descriptive named constant (e.g., a constexpr or enum value like
DEFAULT_RANGE_SIZE_FLAG or RangeSizeFlags::kNeedUpdateRangeSize) that documents
the bit meaning (high 4 bits = need update range size, low 4 bits = on-dirty
splitting), initialize range_size_flags_ using that constant, and update any
other places that rely on the same bit contract to use the named constant so the
intent is clear and consistent across call sites.
tx_service/src/remote/cc_node_service.cpp (1)

1206-1211: Add a fail-fast size check for range_size_flags.

This method forwards range_size_flags without validating that its cardinality matches batch_size (or is empty by design). A malformed payload can desynchronize per-entry metadata from keys/records downstream.

💡 Proposed refactor
     uint32_t batch_size = request->batch_size();
+    if (request->range_size_flags_size() != 0 &&
+        request->range_size_flags_size() != static_cast<int>(batch_size))
+    {
+        response->set_error_code(
+            ToRemoteType::ConvertCcErrorCode(CcErrorCode::INVALID_REQ));
+        response->set_ng_term(ng_term);
+        return;
+    }

     auto write_entry_tuple =
         UploadBatchCc::WriteEntryTuple(request->keys(),
                                        request->records(),
                                        request->commit_ts(),
                                        request->rec_status(),
                                        request->range_size_flags());
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tx_service/src/remote/cc_node_service.cpp` around lines 1206 - 1211, The code
forwards request->range_size_flags into UploadBatchCc::WriteEntryTuple without
validating its cardinality; add a fail-fast check before building
write_entry_tuple that computes batch_size (from request->keys().size() or
request->records().size()), and if range_size_flags is non-empty and its size()
!= batch_size, return an error (e.g., set an appropriate grpc/Status error or
early return) and log the mismatch; ensure this validation sits just before the
UploadBatchCc::WriteEntryTuple(...) call so malformed payloads are rejected
rather than desynchronizing downstream processing.
tx_service/src/cc/cc_req_misc.cpp (1)

1551-1555: Consider candidate-term-aware validation for election transitions

ValidTermCheck() only compares against LeaderTerm. Using the same candidate/leader strategy used elsewhere in this file would reduce transient false negatives during leadership transitions.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tx_service/src/cc/cc_req_misc.cpp` around lines 1551 - 1555, ValidTermCheck
currently only compares node_group_term_ against
Sharder::Instance().LeaderTerm(node_group_id_), which can spuriously fail during
election transitions; change the check in
FetchTableRangeSizeCc::ValidTermCheck() to consider the candidate term as well
(e.g., fetch the candidate term from
Sharder::Instance().CandidateTerm(node_group_id_) or otherwise accept equality
to either the leader term or the candidate term) so the method returns true if
node_group_term_ matches either term; update any references to node_group_id_
and node_group_term_ inside that function accordingly.
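Separated from the Sharder singleton, the proposed candidate-aware check reduces to a pure comparison. The term accessors in the real code are `Sharder::Instance().LeaderTerm(...)` and whichever candidate-term accessor the rest of cc_req_misc.cpp uses; this sketch just takes both as inputs:

```cpp
#include <cassert>
#include <cstdint>

// Standalone sketch of the candidate-aware term check. In the real code,
// leader_term and candidate_term would come from the Sharder singleton.
inline bool ValidTerm(int64_t node_group_term,
                      int64_t leader_term,
                      int64_t candidate_term)
{
    // Accept either the established leader term or an in-flight candidate
    // term, so requests issued during an election are not spuriously
    // rejected before the new leader is confirmed.
    return node_group_term == leader_term ||
           node_group_term == candidate_term;
}
```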
tx_service/include/cc/template_cc_map.h (1)

622-633: Extract duplicated range-size delta math into one helper.

The delta formula is duplicated in two paths (PostWrite and UploadBatch). Keeping one implementation reduces drift risk in future changes.

♻️ Refactor sketch
+    static int32_t ComputeRangeSizeDelta(const KeyT *write_key,
+                                         size_t old_payload_size,
+                                         size_t new_payload_size,
+                                         RecordStatus old_status,
+                                         RecordStatus new_status)
+    {
+        if (new_status == RecordStatus::Deleted)
+        {
+            return -static_cast<int32_t>(write_key->Size() + old_payload_size);
+        }
+        if (old_status != RecordStatus::Normal)
+        {
+            return static_cast<int32_t>(write_key->Size() + new_payload_size);
+        }
+        return static_cast<int32_t>(new_payload_size - old_payload_size);
+    }

-                        const int64_t key_delta_size =
-                            (new_status == RecordStatus::Deleted)
-                                ? (-static_cast<int64_t>(write_key->Size() +
-                                                         old_payload_size))
-                                : (cce_old_status != RecordStatus::Normal
-                                       ? static_cast<int64_t>(
-                                             write_key->Size() +
-                                             cce->PayloadSize())
-                                       : static_cast<int64_t>(
-                                             cce->PayloadSize() -
-                                             old_payload_size));
+                        const int32_t key_delta_size = ComputeRangeSizeDelta(
+                            write_key,
+                            old_payload_size,
+                            cce->PayloadSize(),
+                            cce_old_status,
+                            new_status);

-                        UpdateRangeSize(range_id,
-                                        static_cast<int32_t>(key_delta_size),
-                                        req.OnDirtyRange());
+                        UpdateRangeSize(range_id, key_delta_size, req.OnDirtyRange());

Also applies to: 7850-7858

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tx_service/include/cc/template_cc_map.h` around lines 622 - 633, Extract the
duplicated delta calculation into a single helper (e.g. ComputeRangeSizeDelta)
so both PostWrite and UploadBatch call it instead of duplicating logic:
implement a function that takes the inputs used in the duplicated expression
(new_status, cce_old_status, write_key->Size(), old_payload_size, and
cce->PayloadSize()) and returns the int64_t key_delta_size using the same
conditional logic for RecordStatus::Deleted and the other branches; replace the
existing inline expressions in both places (where key_delta_size is computed)
with a call to this helper and keep using req.PartitionId() as before.
tx_service/src/sk_generator.cpp (1)

1005-1007: Replace magic flag masks with named constants.

0x10 and 0x0F in the flag mutation path are hard to audit and easy to misuse during future changes.

♻️ Suggested readability refactor
+    constexpr uint8_t kUpdateRangeSizeFlag = 0x10;
+    constexpr uint8_t kPreserveLowerBitsMask = 0x0F;
...
-    const uint8_t default_flags =
-        0x10 | static_cast<uint8_t>(range_info->IsDirty());
+    const uint8_t default_flags =
+        kUpdateRangeSizeFlag | static_cast<uint8_t>(range_info->IsDirty());
...
-            *old_range_flags_ptr &= 0x0F;
+            *old_range_flags_ptr &= kPreserveLowerBitsMask;
...
-                    *new_bucket_flags_ptr &= 0x0F;
+                    *new_bucket_flags_ptr &= kPreserveLowerBitsMask;

Also applies to: 1045-1057

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tx_service/src/sk_generator.cpp` around lines 1005 - 1007, Replace the magic
literals used for flags with well-named constants: define e.g. constexpr uint8_t
kDirtyFlag = 0x10; and constexpr uint8_t kFlagMask = 0x0F (or better semantic
names matching their meaning) and use those instead of the literals in the range
flag code (references: range_info->PartitionId(), default_flags, and the flag
mutation path around the 1045-1057 block). Update the default_flags
initialization and any bitwise operations that currently use 0x10 or 0x0F to use
the new constants and add a short comment explaining each constant's purpose.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@store_handler/bigtable_handler.cpp`:
- Around line 713-718: The FetchTableRangeSize method currently aborts via
assert(false) and never completes the txservice::FetchTableRangeSizeCc request;
instead, remove the assert and implement a proper error completion path: log the
unimplemented path, set an appropriate error status on fetch_cc (e.g., an
internal/NOT_IMPLEMENTED error or populate any available response/error fields
on txservice::FetchTableRangeSizeCc), then call fetch_cc->Complete() (or the
correct completion method) so the caller is always notified; reference the
EloqDS::BigTableHandler::FetchTableRangeSize function and the fetch_cc parameter
when making these changes.

In `@store_handler/bigtable_handler.h`:
- Around line 85-87: The stub FetchTableRangeSize implementation must stop using
assert(false) and instead set an explicit error on the completion callback
parameter txservice::FetchTableRangeSizeCc *fetch_cc to signal an unsupported
operation; locate the FetchTableRangeSize definition in bigtable_handler.cpp and
replace the assert with a call that sets an appropriate error/status on fetch_cc
(for example using fetch_cc->set_error(...) or the project's equivalent
set_status/failure method with a clear "not implemented/unsupported" message)
before returning so callers always receive a structured failure response.

In `@store_handler/data_store_service_client_closure.cpp`:
- Around line 956-962: The assert on read_val.size() must be replaced with a
runtime size validation before doing buf arithmetic and the reinterpret_cast:
check that read_val.size() ==
(sizeof(int32_t)+sizeof(uint64_t)+sizeof(uint64_t)+sizeof(uint32_t)+sizeof(int32_t))
and if not, handle the error path (e.g., set an error status/return an error
code, log and abort parsing) instead of proceeding; then compute const char *buf
= read_val.data() + read_val.size() - sizeof(int32_t) and safely decode
fetch_range_size_cc->store_range_size_ = *reinterpret_cast<const int32_t
*>(buf). Apply the same runtime validation/failure handling to the similar
patterns around the uses at the other locations (the checks near the code that
sets fetch_range_size_cc->store_range_size_ and the spots mentioned at the other
offsets).

In `@store_handler/dynamo_handler.cpp`:
- Around line 2537-2541: The function EloqDS::DynamoHandler::FetchTableRangeSize
currently aborts via assert and leaves fetch_cc uncompleted; instead, set an
appropriate datastore error on fetch_cc and invoke its completion path so the
caller is notified (do not assert). Locate
EloqDS::DynamoHandler::FetchTableRangeSize and replace the assert with code that
populates fetch_cc with an error status/message indicating "unsupported on this
path" (matching the error pattern used by other handlers) and then call the same
completion method used elsewhere for these CC objects (the fetch_cc
completion/send/done method used by other handlers) so fetch_cc is always
completed. Ensure you reference fetch_cc and the function name
EloqDS::DynamoHandler::FetchTableRangeSize when making the change.

In `@store_handler/rocksdb_handler.cpp`:
- Around line 1113-1118: Replace the assert-based abort in
RocksDBHandler::FetchTableRangeSize so it returns a proper datastore error
instead of crashing: detect that the operation is unsupported on RocksDB,
populate an appropriate error/status on the provided
txservice::FetchTableRangeSizeCc* fetch_cc (e.g., set a
datastore-not-implemented or internal error with a clear message), and then
complete the request by invoking the fetch_cc completion/reply method so the
caller receives an error response rather than triggering assert(false).

In `@store_handler/rocksdb_handler.h`:
- Around line 349-351: The FetchTableRangeSize override currently asserts(false)
on unsupported ops; replace that crash with proper error handling by setting an
error/status on the txservice::FetchTableRangeSizeCc* fetch_cc (e.g., set an
error code and descriptive error message indicating "operation not supported" or
similar) and return gracefully; do the same fix for the BigTableHandler
equivalent to ensure callers can check fetch_cc for failures instead of relying
on assertions. Ensure you use the fetch_cc API (status/error fields or setter
methods present on txservice::FetchTableRangeSizeCc) to communicate the failure
and do not call assert or abort.

In `@tx_service/include/cc/cc_req_misc.h`:
- Around line 1169-1171: FetchTableRangeSizeCc currently leaves table_name_
uninitialized and uses 0 for partition_id_, which can cause UB or silent logic
errors; change the member initializers in the FetchTableRangeSizeCc declaration
so table_name_ is explicitly initialized to nullptr (or equivalent sentinel) and
partition_id_ is initialized to an invalid sentinel like -1, keeping start_key_
initialized as-is; update any constructor/Reset() assumptions to handle
nullptr/-1 as the uninitialized state (references: FetchTableRangeSizeCc,
table_name_, partition_id_, start_key_, Reset).

In `@tx_service/include/cc/cc_request.h`:
- Around line 888-893: PartitionId() is currently extracting the low 10
core-shard/load-balancing bits (key_shard_code_ & 0x3FF) instead of the
range-partition id; change the extraction to drop the low 10 bits and return the
upper bits of key_shard_code_ (i.e., shift right by 10) so the method returns
the actual partition id used for range partitioning (update PartitionId() to use
key_shard_code_ >> 10 or the equivalent masked-shift if a width is required);
note LocalCcShards intentionally uses shard_code & 0x3FF for core load
balancing, so keep that behavior separate.

In `@tx_service/include/cc/object_cc_map.h`:
- Around line 1561-1562: The UploadBatchCc handler unpacks flags_str from
entry_tuples but never tracks or advances a flags_offset, and it passes a
hardcoded 0 into SetPausedPosition; fix by adding a flags_offset variable
(initialize to 0 like the template_cc_map.h implementation), increment it
whenever you consume a flags byte for each entry (mirror the logic used for
key_offset/rec_offset/ts_offset/status_offset), and pass the current
flags_offset to SetPausedPosition instead of 0; update the block around
UploadBatchCc, the entry_tuples unpack usage, and the call to SetPausedPosition
so flags_offset is maintained and used for resume alignment.

In `@tx_service/include/cc/range_cc_map.h`:
- Around line 731-736: The conversion from size_t to int32_t for range_size
(obtained via new_range_entries.at(idx)->TypedStoreRange()->PostCkptSize()) can
overflow; update the code before calling ccm->InitRangeSize to check that
range_size <= INT32_MAX and handle the overflow (e.g., cap to INT32_MAX or
return/log an error/abort the operation) instead of unconditionally
static_casting to int32_t; ensure the same checked conversion is applied to the
other occurrence (the second block around ccm->InitRangeSize) so both places
avoid truncation/corruption.

In `@tx_service/include/cc/template_cc_map.h`:
- Around line 12007-12020: When transitioning a partition's range size from
kNotInitialized to kLoading in template_cc_map (the block that sets
it->second.first = RangeSizeStatus::kLoading and calls
shard_->FetchTableRangeSize(...)), you must also preserve the write delta that
triggered initialization so it isn't lost; before returning after the
FetchTableRangeSize call, add the current write's delta to the partition's
pending delta accumulator (e.g., it->second.second or the structure holding
accumulated deltas for table_name_, partition_id, cc_ng_id_) or mark it as
pending so the value will be applied when the real range size arrives, ensuring
the first write is counted while the async load completes.

In `@tx_service/src/cc/cc_map.cpp`:
- Around line 481-483: The addition persisted_size + it->second.second can
overflow int32_t; change the computation to use int64_t (e.g., compute int64_t
final_size = (int64_t)persisted_size + (int64_t)it->second.second), then clamp
final_size to the int32_t range (lower bound 0 per current behavior and upper
bound INT32_MAX) before assigning back to it->second.first and set
it->second.second = 0; ensure assignments cast back to int32_t after clamping to
avoid signed overflow.

In `@tx_service/src/cc/cc_req_misc.cpp`:
- Around line 1565-1568: The code currently assumes ccs.GetCcm(*table_name_,
node_group_id_) always returns non-null and dereferences ccm directly; add a
runtime null-check after calling ccs.GetCcm (checking if ccm == nullptr) before
calling ccm->InitRangeSize, and handle the null case by returning or logging an
error/context (include table_name_ and node_group_id_ in the log) so
InitRangeSize is only invoked when ccm is valid; reference the call site using
ccs.GetCcm, the ccm pointer variable, and the InitRangeSize invocation along
with partition_id_, store_range_size_, and succ for context.
- Line 1568: The code casts partition_id_ to uint32_t without checking for
negative values, which can wrap and update the wrong partition; before the
static_cast<uint32_t>(partition_id_) (in cc_req_misc.cpp where partition_id_,
store_range_size_, succ are used), add a validation that partition_id_ >= 0 and
handle the error path (log an error via the existing logger and return/abort the
update) if it's negative so you never pass a wrapped uint32_t onward.

In `@tx_service/src/cc/cc_shard.cpp`:
- Around line 390-393: The code currently asserts range_entry != nullptr after
calling GetTableRangeEntry and then dereferences it (TxKey start_key =
range_entry->GetRangeInfo()->StartTxKey()), which can crash in release builds;
replace the assert with an explicit null check for the TableRangeEntry* returned
by GetTableRangeEntry(cc_ng_id, partition_id) and return a deterministic error
path instead of dereferencing — e.g., return an error
Status/optional/null-equivalent or propagate a clear error code from the
enclosing function, and add a descriptive log message naming the
range_table_name, cc_ng_id and partition_id; update any callers to handle the
new error return if needed so the code no longer relies on assert for
correctness.

In `@tx_service/src/tx_execution.cpp`:
- Around line 5300-5317: The primary PostWrite call is omitting the base
range-size flag while the forward-write loop ORs 0x10 into
write_entry.range_size_flags_; update the primary cc_handler_->PostWrite
invocation (the one before the forward_addr_ loop) to include the same base flag
logic (i.e., pass (0x10 | write_entry.range_size_flags_)) so range-size tracking
is preserved for primary writes; ensure this change is applied where PostWrite
is called and that update_post_cnt/related handling remains unchanged.

---

Outside diff comments:
In `@store_handler/data_store_service_client.cpp`:
- Around line 1287-1304: The code currently accumulates per-slice uint32_t sizes
into the int32_t variable range_size which can overflow and corrupt the
serialized metadata; update the logic in EncodeRangeValue (and other places
where range_size is built from slice sizes) to perform accumulation in a wider
signed type (e.g., int64_t or int64_t temp_sum), use saturating-add semantics
(clamp to INT32_MAX if the sum exceeds INT32_MAX), and only cast/truncate to
int32_t when appending/encoding (e.g., before
kv_range_record.append(reinterpret_cast<const char*>(&range_size),
sizeof(int32_t)) use int32_t safe_size = (temp_sum > INT32_MAX ? INT32_MAX :
static_cast<int32_t>(temp_sum))). Ensure the same saturating behavior is applied
wherever range_size is computed (the other instances you noted) so the wire
format stays int32_t but never contains a negative/overflowed value.

In `@tx_service/include/proto/cc_request.proto`:
- Around line 917-926: PostCommitRequest's proto-added uint32 range_size_flags
will default to 0 on wire (proto3), breaking legacy behavior that expects 0x10;
fix by normalizing both write and read paths: when constructing messages, ensure
callers calling set_range_size_flags(...) set 0x10 if the provided
range_size_flags is 0 (replace direct set_range_size_flags(range_size_flags)
with a conditional that uses 0x10 when range_size_flags==0), and when reading,
replace direct uses of static_cast<uint8_t>(post_commit.range_size_flags()) with
a small helper (e.g., get_normalized_range_size_flags(const PostCommitRequest&))
that returns post_commit.range_size_flags() ?: 0x10 (or equivalent) so receivers
treat absent/zero values as 0x10 for backward compatibility.

In `@tx_service/src/remote/cc_node_service.cpp`:
- Around line 1175-1241: The handler currently reads request->partition_id()
into partition_id and uses (partition_id >= 0) to decide single-partition
routing; because proto3 defaults unset ints to 0 this silently routes unset
requests to core 0. Update the proto cc_request.proto to make partition_id
optional (optional int32 partition_id) and then change the CcNodeService
UploadBatch handler in tx_service/src/remote/cc_node_service.cpp to test
presence using request->has_partition_id() before using the value (e.g., if
(request->has_partition_id() && partition_id >= 0) { /* route to single core
using partition_id % cc_shards->Count() */ } else { /* broadcast to all cores */
}), ensuring the broadcast semantics when partition_id is absent or explicitly
-1 are preserved; reference symbols: UploadBatchRequest.partition_id,
cc_request.proto, variable partition_id, request->has_partition_id(),
Sharder::Instance(), cc_shards->EnqueueToCcShard.

In `@tx_service/src/remote/remote_cc_request.cpp`:
- Around line 577-625: Forwarded commits don't carry range_size_flags because
ForwardPostCommitRequest lacks the proto field and the forwarded path's call to
PostWriteCc::Reset omits it, causing default 0x10 to be used; add a
range_size_flags field to ForwardPostCommitRequest (matching PostCommitRequest),
update the code that reads ForwardPostCommitRequest (symbol: post_commit) to
call post_commit.range_size_flags(), and pass
static_cast<uint8_t>(post_commit.range_size_flags()) into the forwarded
PostWriteCc::Reset invocation (symbol: PostWriteCc::Reset) so both direct
(PostCommitRequest) and forwarded paths use the same range_size_flags value.

---

Nitpick comments:
In `@tx_service/include/cc/template_cc_map.h`:
- Around line 622-633: Extract the duplicated delta calculation into a single
helper (e.g. ComputeRangeSizeDelta) so both PostWrite and UploadBatch call it
instead of duplicating logic: implement a function that takes the inputs used in
the duplicated expression (new_status, cce_old_status, write_key->Size(),
old_payload_size, and cce->PayloadSize()) and returns the int64_t key_delta_size
using the same conditional logic for RecordStatus::Deleted and the other
branches; replace the existing inline expressions in both places (where
key_delta_size is computed) with a call to this helper and keep using
req.PartitionId() as before.

In `@tx_service/include/read_write_entry.h`:
- Around line 75-77: The field range_size_flags_ is initialized with a magic
literal 0x10; replace that literal with a descriptive named constant (e.g., a
constexpr or enum value like DEFAULT_RANGE_SIZE_FLAG or
RangeSizeFlags::kNeedUpdateRangeSize) that documents the bit meaning (high 4
bits = need update range size, low 4 bits = on-dirty splitting), initialize
range_size_flags_ using that constant, and update any other places that rely on
the same bit contract to use the named constant so the intent is clear and
consistent across call sites.

In `@tx_service/include/remote/remote_cc_handler.h`:
- Around line 87-88: Document the meaning of the uint8_t parameter
range_size_flags in the signature of the relevant handler (the function taking
CcHandlerResult<PostProcessResult> &hres) by adding a brief comment that
explains what each bit/flag represents and what the default 0x10 value denotes;
also replace the magic literal with a named constant (e.g.,
kDefaultRangeSizeFlags) declared near the header (or an appropriate
enum/constexpr) and use that constant in the function declaration to improve
readability and maintainability.

In `@tx_service/src/cc/cc_req_misc.cpp`:
- Around line 1551-1555: ValidTermCheck currently only compares node_group_term_
against Sharder::Instance().LeaderTerm(node_group_id_), which can spuriously
fail during election transitions; change the check in
FetchTableRangeSizeCc::ValidTermCheck() to consider the candidate term as well
(e.g., fetch the candidate term from
Sharder::Instance().CandidateTerm(node_group_id_) or otherwise accept equality
to either the leader term or the candidate term) so the method returns true if
node_group_term_ matches either term; update any references to node_group_id_
and node_group_term_ inside that function accordingly.

In `@tx_service/src/remote/cc_node_service.cpp`:
- Around line 1206-1211: The code forwards request->range_size_flags into
UploadBatchCc::WriteEntryTuple without validating its cardinality; add a
fail-fast check before building write_entry_tuple that computes batch_size (from
request->keys().size() or request->records().size()), and if range_size_flags is
non-empty and its size() != batch_size, return an error (e.g., set an
appropriate grpc/Status error or early return) and log the mismatch; ensure this
validation sits just before the UploadBatchCc::WriteEntryTuple(...) call so
malformed payloads are rejected rather than desynchronizing downstream
processing.

In `@tx_service/src/sk_generator.cpp`:
- Around line 1005-1007: Replace the magic literals used for flags with
well-named constants: define e.g. constexpr uint8_t kDirtyFlag = 0x10; and
constexpr uint8_t kFlagMask = 0x0F (or better semantic names matching their
meaning) and use those instead of the literals in the range flag code
(references: range_info->PartitionId(), default_flags, and the flag mutation
path around the 1045-1057 block). Update the default_flags initialization and
any bitwise operations that currently use 0x10 or 0x0F to use the new constants
and add a short comment explaining each constant's purpose.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: a4d91a30-392f-4c8b-9ed4-b59f1c34cfbe

📥 Commits

Reviewing files that changed from the base of the PR and between 225ad67 and 6a2d770.

📒 Files selected for processing (36)
  • store_handler/bigtable_handler.cpp
  • store_handler/bigtable_handler.h
  • store_handler/data_store_service_client.cpp
  • store_handler/data_store_service_client.h
  • store_handler/data_store_service_client_closure.cpp
  • store_handler/data_store_service_client_closure.h
  • store_handler/dynamo_handler.cpp
  • store_handler/dynamo_handler.h
  • store_handler/rocksdb_handler.cpp
  • store_handler/rocksdb_handler.h
  • tx_service/include/cc/cc_handler.h
  • tx_service/include/cc/cc_map.h
  • tx_service/include/cc/cc_req_misc.h
  • tx_service/include/cc/cc_request.h
  • tx_service/include/cc/cc_shard.h
  • tx_service/include/cc/local_cc_handler.h
  • tx_service/include/cc/object_cc_map.h
  • tx_service/include/cc/range_cc_map.h
  • tx_service/include/cc/template_cc_map.h
  • tx_service/include/proto/cc_request.proto
  • tx_service/include/read_write_entry.h
  • tx_service/include/remote/remote_cc_handler.h
  • tx_service/include/sk_generator.h
  • tx_service/include/store/data_store_handler.h
  • tx_service/include/type.h
  • tx_service/src/cc/cc_map.cpp
  • tx_service/src/cc/cc_req_misc.cpp
  • tx_service/src/cc/cc_shard.cpp
  • tx_service/src/cc/local_cc_handler.cpp
  • tx_service/src/cc/local_cc_shards.cpp
  • tx_service/src/remote/cc_node_service.cpp
  • tx_service/src/remote/remote_cc_handler.cpp
  • tx_service/src/remote/remote_cc_request.cpp
  • tx_service/src/sk_generator.cpp
  • tx_service/src/tx_execution.cpp
  • tx_service/src/tx_operation.cpp

Comment on lines +713 to +718
void EloqDS::BigTableHandler::FetchTableRangeSize(
    txservice::FetchTableRangeSizeCc *fetch_cc)
{
    LOG(ERROR) << "BigTableHandler::FetchTableRangeSize not implemented";
    assert(false);
}

⚠️ Potential issue | 🔴 Critical

Complete the CC request instead of aborting the process.

Line 717 uses assert(false). That crashes debug builds, and in release builds the assert is removed so fetch_cc is never finished, which can leave the caller hanging indefinitely.

Proposed fix
 void EloqDS::BigTableHandler::FetchTableRangeSize(
     txservice::FetchTableRangeSizeCc *fetch_cc)
 {
     LOG(ERROR) << "BigTableHandler::FetchTableRangeSize not implemented";
-    assert(false);
+    fetch_cc->SetFinish(static_cast<int>(txservice::CcErrorCode::DATA_STORE_ERR));
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@store_handler/bigtable_handler.cpp` around lines 713 - 718, The
FetchTableRangeSize method currently aborts via assert(false) and never
completes the txservice::FetchTableRangeSizeCc request; instead, remove the
assert and implement a proper error completion path: log the unimplemented path,
set an appropriate error status on fetch_cc (e.g., an internal/NOT_IMPLEMENTED
error or populate any available response/error fields on
txservice::FetchTableRangeSizeCc), then call fetch_cc->Complete() (or the
correct completion method) so the caller is always notified; reference the
EloqDS::BigTableHandler::FetchTableRangeSize function and the fetch_cc parameter
when making these changes.

Comment on lines +85 to +87
void FetchTableRangeSize(
    txservice::FetchTableRangeSizeCc *fetch_cc) override;


⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

rg -n -A 15 'BigTableHandler::FetchTableRangeSize' --type=cpp

Repository: eloqdata/tx_service

Length of output: 1492


Fix stub implementation to properly signal unsupported operation.

The stub implementation at lines 713-718 in bigtable_handler.cpp currently uses assert(false) and does not interact with the fetch_cc parameter. This approach has two problems:

  1. Silent failure in release builds: With assertions disabled, the function returns without setting any error status on fetch_cc, leaving the caller with an uninitialized/unmodified object.
  2. No proper error signaling: Unlike proper error handling patterns in the codebase, this stub doesn't set an appropriate error code or failure status on the completion callback.

Set an appropriate error status on fetch_cc (e.g., fetch_cc->set_error(...)) before returning, or use a structured failure response, to ensure callers are properly notified that this operation is not yet supported.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@store_handler/bigtable_handler.h` around lines 85 - 87, The stub
FetchTableRangeSize implementation must stop using assert(false) and instead set
an explicit error on the completion callback parameter
txservice::FetchTableRangeSizeCc *fetch_cc to signal an unsupported operation;
locate the FetchTableRangeSize definition in bigtable_handler.cpp and replace
the assert with a call that sets an appropriate error/status on fetch_cc (for
example using fetch_cc->set_error(...) or the project's equivalent
set_status/failure method with a clear "not implemented/unsupported" message)
before returning so callers always receive a structured failure response.

Comment on lines +956 to +962
assert(read_val.size() ==
       (sizeof(int32_t) + sizeof(uint64_t) + sizeof(uint64_t) +
        sizeof(uint32_t) + sizeof(int32_t)));
const char *buf = read_val.data();
buf += read_val.size() - sizeof(int32_t);
fetch_range_size_cc->store_range_size_ =
    *reinterpret_cast<const int32_t *>(buf);

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

find . -name "data_store_service_client_closure.cpp" -type f

Repository: eloqdata/tx_service

Length of output: 116


🏁 Script executed:

wc -l ./store_handler/data_store_service_client_closure.cpp

Repository: eloqdata/tx_service

Length of output: 121


🏁 Script executed:

sed -n '950,970p' ./store_handler/data_store_service_client_closure.cpp

Repository: eloqdata/tx_service

Length of output: 866


🏁 Script executed:

sed -n '930,975p' ./store_handler/data_store_service_client_closure.cpp

Repository: eloqdata/tx_service

Length of output: 2006


🏁 Script executed:

# Check if there's any build configuration that defines assert behavior
find . -name "CMakeLists.txt" -o -name "*.cmake" | head -5

Repository: eloqdata/tx_service

Length of output: 188


🏁 Script executed:

# Check the CMakeLists.txt for build configuration, especially Debug vs Release handling
head -100 ./CMakeLists.txt | grep -A 5 -B 5 "NDEBUG\|assert\|Debug\|Release"

Repository: eloqdata/tx_service

Length of output: 45


🏁 Script executed:

grep -i "ndebug\|cmake_build_type" ./CMakeLists.txt | head -20

Repository: eloqdata/tx_service

Length of output: 45


🏁 Script executed:

# Search for similar patterns in the file - other uses of assert before reinterpret_cast
grep -n "assert.*size()" ./store_handler/data_store_service_client_closure.cpp

Repository: eloqdata/tx_service

Length of output: 333


🏁 Script executed:

# Check the other similar patterns to see if they have the same issue
sed -n '790,810p' ./store_handler/data_store_service_client_closure.cpp

Repository: eloqdata/tx_service

Length of output: 1020


🏁 Script executed:

# Check the third similar pattern at line 1009
sed -n '1005,1025p' ./store_handler/data_store_service_client_closure.cpp

Repository: eloqdata/tx_service

Length of output: 1048


🏁 Script executed:

# Let's understand where read_closure->Value() comes from - check for ReadClosure definition
grep -n "class ReadClosure\|struct ReadClosure" ./store_handler/*.cpp ./store_handler/*.h

Repository: eloqdata/tx_service

Length of output: 334


🏁 Script executed:

sed -n '600,700p' ./store_handler/data_store_service_client_closure.h

Repository: eloqdata/tx_service

Length of output: 3387


🏁 Script executed:

sed -n '700,800p' ./store_handler/data_store_service_client_closure.h

Repository: eloqdata/tx_service

Length of output: 2563


🏁 Script executed:

sed -n '800,850p' ./store_handler/data_store_service_client_closure.h

Repository: eloqdata/tx_service

Length of output: 911


🏁 Script executed:

# Check if there are any other safety checks or validations before the problematic code
sed -n '930,965p' ./store_handler/data_store_service_client_closure.cpp

Repository: eloqdata/tx_service

Length of output: 1648


Replace assert() with runtime validation before decoding store_range_size_.

The size check at line 956 relies solely on assert(), which is disabled in Release builds. Malformed payloads will reach the pointer arithmetic and reinterpret_cast operations unchecked, causing a buffer overread from untrusted data. Add a runtime check with proper error handling:

Suggested fix
-        assert(read_val.size() ==
-               (sizeof(int32_t) + sizeof(uint64_t) + sizeof(uint64_t) +
-                sizeof(uint32_t) + sizeof(int32_t)));
-        const char *buf = read_val.data();
-        buf += read_val.size() - sizeof(int32_t);
-        fetch_range_size_cc->store_range_size_ =
-            *reinterpret_cast<const int32_t *>(buf);
+        constexpr size_t kExpectedRangeMetaSize =
+            sizeof(int32_t) + sizeof(uint64_t) + sizeof(uint64_t) +
+            sizeof(uint32_t) + sizeof(int32_t);
+        if (read_val.size() != kExpectedRangeMetaSize)
+        {
+            LOG(ERROR) << "Fetch range size decode failed: unexpected value size="
+                       << read_val.size();
+            fetch_range_size_cc->SetFinish(
+                static_cast<uint32_t>(txservice::CcErrorCode::DATA_STORE_ERR));
+            return;
+        }
+        const char *buf = read_val.data() + read_val.size() - sizeof(int32_t);
+        fetch_range_size_cc->store_range_size_ =
+            *reinterpret_cast<const int32_t *>(buf);

Note: Similar patterns exist at lines 796 and 1009 that may need the same treatment.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@store_handler/data_store_service_client_closure.cpp` around lines 956 - 962,
The assert on read_val.size() must be replaced with a runtime size validation
before doing buf arithmetic and the reinterpret_cast: check that read_val.size()
==
(sizeof(int32_t)+sizeof(uint64_t)+sizeof(uint64_t)+sizeof(uint32_t)+sizeof(int32_t))
and if not, handle the error path (e.g., set an error status/return an error
code, log and abort parsing) instead of proceeding; then compute const char *buf
= read_val.data() + read_val.size() - sizeof(int32_t) and safely decode
fetch_range_size_cc->store_range_size_ = *reinterpret_cast<const int32_t
*>(buf). Apply the same runtime validation/failure handling to the similar
patterns around the uses at the other locations (the checks near the code that
sets fetch_range_size_cc->store_range_size_ and the spots mentioned at the other
offsets).

Comment on lines +2537 to +2541
void EloqDS::DynamoHandler::FetchTableRangeSize(FetchTableRangeSizeCc *fetch_cc)
{
    LOG(ERROR) << "DynamoHandler::FetchTableRangeSize not implemented";
    assert(false);
}

⚠️ Potential issue | 🔴 Critical

Do not leave FetchTableRangeSizeCc unfinished on the unsupported path.

Line 2540 aborts in debug builds and becomes a silent early return in release builds (the assert is compiled out), so fetch_cc may never be completed. Return a datastore error via fetch_cc instead of asserting.

Proposed fix
 void EloqDS::DynamoHandler::FetchTableRangeSize(FetchTableRangeSizeCc *fetch_cc)
 {
     LOG(ERROR) << "DynamoHandler::FetchTableRangeSize not implemented";
-    assert(false);
+    fetch_cc->SetFinish(static_cast<int>(txservice::CcErrorCode::DATA_STORE_ERR));
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@store_handler/dynamo_handler.cpp` around lines 2537 - 2541, The function
EloqDS::DynamoHandler::FetchTableRangeSize currently aborts via assert and
leaves fetch_cc uncompleted; instead, set an appropriate datastore error on
fetch_cc and invoke its completion path so the caller is notified (do not
assert). Locate EloqDS::DynamoHandler::FetchTableRangeSize and replace the
assert with code that populates fetch_cc with an error status/message indicating
"unsupported on this path" (matching the error pattern used by other handlers)
and then call the same completion method used elsewhere for these CC objects
(the fetch_cc completion/send/done method used by other handlers) so fetch_cc is
always completed. Ensure you reference fetch_cc and the function name
EloqDS::DynamoHandler::FetchTableRangeSize when making the change.

Comment on lines +1113 to +1118
void RocksDBHandler::FetchTableRangeSize(
    txservice::FetchTableRangeSizeCc *fetch_cc)
{
    LOG(ERROR) << "RocksDBHandler::FetchTableRangeSize not implemented";
    assert(false);
}

⚠️ Potential issue | 🔴 Critical

Avoid a process abort in the FetchTableRangeSize runtime path.

Line 1117 hard-aborts via assert(false). Since this API is now wired into range-size tracking, invoking it on RocksDB will crash the node. Return a datastore error through fetch_cc and complete the request instead of asserting.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@store_handler/rocksdb_handler.cpp` around lines 1113 - 1118, Replace the
assert-based abort in RocksDBHandler::FetchTableRangeSize so it returns a proper
datastore error instead of crashing: detect that the operation is unsupported on
RocksDB, populate an appropriate error/status on the provided
txservice::FetchTableRangeSizeCc* fetch_cc (e.g., set a
datastore-not-implemented or internal error with a clear message), and then
complete the request by invoking the fetch_cc completion/reply method so the
caller receives an error response rather than triggering assert(false).

Comment on lines +481 to +483
int32_t final_size = persisted_size + it->second.second;
it->second.first = final_size < 0 ? 0 : final_size;
it->second.second = 0;

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, let's locate and examine the file around lines 481-483
head -490 tx_service/src/cc/cc_map.cpp | tail -20

Repository: eloqdata/tx_service

Length of output: 619


🏁 Script executed:

# Get broader context around those lines
sed -n '460,500p' tx_service/src/cc/cc_map.cpp

Repository: eloqdata/tx_service

Length of output: 983


🏁 Script executed:

# Find the function containing this code
sed -n '450,490p' tx_service/src/cc/cc_map.cpp

Repository: eloqdata/tx_service

Length of output: 1224


🏁 Script executed:

# Search for range_sizes_ declaration in header file
fd -e h tx_service/src/cc/cc_map.h | head -5

Repository: eloqdata/tx_service

Length of output: 45


🏁 Script executed:

# Look for range_sizes_ declaration
rg -n 'range_sizes_' tx_service/src/cc/cc_map.h

Repository: eloqdata/tx_service

Length of output: 129


🏁 Script executed:

# Try to find the class definition
rg -n 'class CcMap' tx_service/src/cc/ -A 30

Repository: eloqdata/tx_service

Length of output: 45


🏁 Script executed:

# Find the header file for CcMap
find . -name "*cc_map*" -type f 2>/dev/null

Repository: eloqdata/tx_service

Length of output: 382


🏁 Script executed:

# Search for range_sizes_ across the codebase
rg -n 'range_sizes_' tx_service/src/cc/cc_map.cpp

Repository: eloqdata/tx_service

Length of output: 232


🏁 Script executed:

# Look for the CcMap class definition in the cpp file itself
rg -n 'class CcMap\|struct range_sizes' tx_service/src/cc/cc_map.cpp -B 5 -A 10

Repository: eloqdata/tx_service

Length of output: 45


🏁 Script executed:

# Check the header file for CcMap class definition and range_sizes_ member
cat tx_service/include/cc/cc_map.h | head -200

Repository: eloqdata/tx_service

Length of output: 6670


🏁 Script executed:

# Search for range_sizes_ in the header
rg -n 'range_sizes_' tx_service/include/cc/cc_map.h -B 2 -A 2

Repository: eloqdata/tx_service

Length of output: 317


🏁 Script executed:

# Verify the actual ranges and semantics of these values
rg -n 'RangeSizeStatus' tx_service/include/cc/cc_map.h -B 2 -A 5

Repository: eloqdata/tx_service

Length of output: 45


🏁 Script executed:

# Check if there are any comments explaining the size semantics
sed -n '310,325p' tx_service/include/cc/cc_map.h

Repository: eloqdata/tx_service

Length of output: 780


🏁 Script executed:

# Check if <limits> is already included in cc_map.cpp
rg -n '#include' tx_service/src/cc/cc_map.cpp | head -20

Repository: eloqdata/tx_service

Length of output: 230


🏁 Script executed:

# Verify the InitRangeSize function one more time with full signature
sed -n '465,490p' tx_service/src/cc/cc_map.cpp

Repository: eloqdata/tx_service

Length of output: 847


Prevent signed overflow in range-size merge math.

Line 481 adds two int32_t values directly; signed overflow is undefined behavior in C++. The current code only clamps negative results to 0 but does not guard against positive overflow. Please widen the addition to int64_t and clamp both bounds before writing back.

🛠️ Proposed fix
+#include <limits>
@@
-        int32_t final_size = persisted_size + it->second.second;
-        it->second.first = final_size < 0 ? 0 : final_size;
+        int64_t final_size = static_cast<int64_t>(persisted_size) +
+                             static_cast<int64_t>(it->second.second);
+        if (final_size < 0)
+        {
+            final_size = 0;
+        }
+        else if (final_size > std::numeric_limits<int32_t>::max())
+        {
+            final_size = std::numeric_limits<int32_t>::max();
+        }
+        it->second.first = static_cast<int32_t>(final_size);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tx_service/src/cc/cc_map.cpp` around lines 481 - 483, The addition
persisted_size + it->second.second can overflow int32_t; change the computation
to use int64_t (e.g., compute int64_t final_size = (int64_t)persisted_size +
(int64_t)it->second.second), then clamp final_size to the int32_t range (lower
bound 0 per current behavior and upper bound INT32_MAX) before assigning back to
it->second.first and set it->second.second = 0; ensure assignments cast back to
int32_t after clamping to avoid signed overflow.

Comment on lines +1565 to +1568
CcMap *ccm = ccs.GetCcm(*table_name_, node_group_id_);
assert(ccm != nullptr);
ccm->InitRangeSize(
static_cast<uint32_t>(partition_id_), store_range_size_, succ);
⚠️ Potential issue | 🔴 Critical

Add a runtime null-check before using ccm

The GetCcm call on Line 1565 can return nullptr (e.g., during map teardown/re-init races), and Line 1567 dereferences it unconditionally in release (NDEBUG) builds, where the assert is compiled out.

Proposed fix
 bool FetchTableRangeSizeCc::Execute(CcShard &ccs)
 {
     if (!ValidTermCheck())
     {
         error_code_ = static_cast<uint32_t>(CcErrorCode::NG_TERM_CHANGED);
     }

     bool succ = (error_code_ == 0);
     CcMap *ccm = ccs.GetCcm(*table_name_, node_group_id_);
-    assert(ccm != nullptr);
+    if (ccm == nullptr)
+    {
+        error_code_ = static_cast<uint32_t>(CcErrorCode::NG_TERM_CHANGED);
+        return true;
+    }
     ccm->InitRangeSize(
         static_cast<uint32_t>(partition_id_), store_range_size_, succ);

     return true;
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tx_service/src/cc/cc_req_misc.cpp` around lines 1565 - 1568, The code
currently assumes ccs.GetCcm(*table_name_, node_group_id_) always returns
non-null and dereferences ccm directly; add a runtime null-check after calling
ccs.GetCcm (checking if ccm == nullptr) before calling ccm->InitRangeSize, and
handle the null case by returning or logging an error/context (include
table_name_ and node_group_id_ in the log) so InitRangeSize is only invoked when
ccm is valid; reference the call site using ccs.GetCcm, the ccm pointer
variable, and the InitRangeSize invocation along with partition_id_,
store_range_size_, and succ for context.

CcMap *ccm = ccs.GetCcm(*table_name_, node_group_id_);
assert(ccm != nullptr);
ccm->InitRangeSize(
static_cast<uint32_t>(partition_id_), store_range_size_, succ);
⚠️ Potential issue | 🟠 Major

Validate partition_id_ before casting to uint32_t

Line 1568 casts int32_t to uint32_t without checking the sign. A negative partition_id_ wraps to a large unsigned value and can update the wrong partition's state.

Proposed fix
-    ccm->InitRangeSize(
-        static_cast<uint32_t>(partition_id_), store_range_size_, succ);
+    if (partition_id_ < 0)
+    {
+        error_code_ = static_cast<uint32_t>(CcErrorCode::DATA_STORE_ERR);
+        succ = false;
+        return true;
+    }
+    ccm->InitRangeSize(
+        static_cast<uint32_t>(partition_id_), store_range_size_, succ);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tx_service/src/cc/cc_req_misc.cpp` at line 1568, The code casts partition_id_
to uint32_t without checking for negative values, which can wrap and update the
wrong partition; before the static_cast<uint32_t>(partition_id_) (in
cc_req_misc.cpp where partition_id_, store_range_size_, succ are used), add a
validation that partition_id_ >= 0 and handle the error path (log an error via
the existing logger and return/abort the update) if it's negative so you never
pass a wrapped uint32_t onward.

Comment on lines +390 to +393
const TableRangeEntry *range_entry =
GetTableRangeEntry(range_table_name, cc_ng_id, partition_id);
assert(range_entry != nullptr);
TxKey start_key = range_entry->GetRangeInfo()->StartTxKey();
⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

cat -n tx_service/src/cc/cc_shard.cpp | sed -n '375,405p'

Repository: eloqdata/tx_service

Length of output: 1399


🏁 Script executed:

# Find GetTableRangeEntry method definition
rg -n "GetTableRangeEntry" tx_service/src/cc/cc_shard.cpp | head -20

Repository: eloqdata/tx_service

Length of output: 602


🏁 Script executed:

cat -n tx_service/src/cc/cc_shard.cpp | sed -n '1485,1495p'

Repository: eloqdata/tx_service

Length of output: 660


🏁 Script executed:

rg -n "local_shards_" tx_service/src/cc/cc_shard.cpp | grep -E "(LocalCcShards|member|private:)" | head -5

Repository: eloqdata/tx_service

Length of output: 45


🏁 Script executed:

rg -n "local_shards_" tx_service/src/cc/cc_shard.h | head -10

Repository: eloqdata/tx_service

Length of output: 131


🏁 Script executed:

fd "cc_shard.h" tx_service/

Repository: eloqdata/tx_service

Length of output: 95


🏁 Script executed:

rg -n "local_shards_" tx_service/include/cc/cc_shard.h

Repository: eloqdata/tx_service

Length of output: 100


🏁 Script executed:

fd "local_cc_shards.h" tx_service/include/

Repository: eloqdata/tx_service

Length of output: 102


🏁 Script executed:

rg -n "GetTableRangeEntry" tx_service/include/cc/local_cc_shards.h | head -20

Repository: eloqdata/tx_service

Length of output: 969


🏁 Script executed:

cat -n tx_service/include/cc/local_cc_shards.h | sed -n '1122,1135p'

Repository: eloqdata/tx_service

Length of output: 855


🏁 Script executed:

fd "local_cc_shards.cpp" tx_service/src/

Repository: eloqdata/tx_service

Length of output: 100


🏁 Script executed:

rg -n "const TableRangeEntry \*LocalCcShards::GetTableRangeEntry" tx_service/src/cc/local_cc_shards.cpp

Repository: eloqdata/tx_service

Length of output: 197


🏁 Script executed:

cat -n tx_service/src/cc/local_cc_shards.cpp | sed -n '1479,1495p'

Repository: eloqdata/tx_service

Length of output: 1002


🏁 Script executed:

rg -n "GetTableRangeEntryInternal.*LocalCcShards" tx_service/src/cc/local_cc_shards.cpp | head -5

Repository: eloqdata/tx_service

Length of output: 45


🏁 Script executed:

rg -n "TableRangeEntry \*LocalCcShards::GetTableRangeEntryInternal" tx_service/src/cc/local_cc_shards.cpp

Repository: eloqdata/tx_service

Length of output: 192


🏁 Script executed:

cat -n tx_service/src/cc/local_cc_shards.cpp | sed -n '1699,1720p'

Repository: eloqdata/tx_service

Length of output: 927


Handle missing TableRangeEntry explicitly instead of relying on assert.

At line 392, assert(range_entry != nullptr) can be compiled out in release builds; line 393 then dereferences a possibly null pointer. GetTableRangeEntry() explicitly returns nullptr when the range table or the range mapping is not found. Please return a deterministic error instead of crashing.

Suggested fix
     const TableRangeEntry *range_entry =
         GetTableRangeEntry(range_table_name, cc_ng_id, partition_id);
-    assert(range_entry != nullptr);
-    TxKey start_key = range_entry->GetRangeInfo()->StartTxKey();
+    if (range_entry == nullptr || range_entry->GetRangeInfo() == nullptr)
+    {
+        LOG(ERROR) << "FetchTableRangeSize: missing range entry. table="
+                   << table_name.Trace() << ", partition_id=" << partition_id
+                   << ", node_group=" << cc_ng_id;
+        fetch_cc->SetFinish(
+            static_cast<uint32_t>(CcErrorCode::DATA_STORE_ERR));
+        return;
+    }
+    TxKey start_key = range_entry->GetRangeInfo()->StartTxKey();
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
const TableRangeEntry *range_entry =
GetTableRangeEntry(range_table_name, cc_ng_id, partition_id);
assert(range_entry != nullptr);
TxKey start_key = range_entry->GetRangeInfo()->StartTxKey();
const TableRangeEntry *range_entry =
GetTableRangeEntry(range_table_name, cc_ng_id, partition_id);
if (range_entry == nullptr || range_entry->GetRangeInfo() == nullptr)
{
LOG(ERROR) << "FetchTableRangeSize: missing range entry. table="
<< table_name.Trace() << ", partition_id=" << partition_id
<< ", node_group=" << cc_ng_id;
fetch_cc->SetFinish(
static_cast<uint32_t>(CcErrorCode::DATA_STORE_ERR));
return;
}
TxKey start_key = range_entry->GetRangeInfo()->StartTxKey();
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tx_service/src/cc/cc_shard.cpp` around lines 390 - 393, The code currently
asserts range_entry != nullptr after calling GetTableRangeEntry and then
dereferences it (TxKey start_key = range_entry->GetRangeInfo()->StartTxKey()),
which can crash in release builds; replace the assert with an explicit null
check for the TableRangeEntry* returned by GetTableRangeEntry(cc_ng_id,
partition_id) and return a deterministic error path instead of dereferencing —
e.g., return an error Status/optional/null-equivalent or propagate a clear error
code from the enclosing function, and add a descriptive log message naming the
range_table_name, cc_ng_id and partition_id; update any callers to handle the
new error return if needed so the code no longer relies on assert for
correctness.

Comment on lines +5300 to +5317
post_process.hd_result_,
write_entry.range_size_flags_);
update_post_cnt(ret);

for (auto &[forward_shard_code, cce_addr] :
write_entry.forward_addr_)
{
CcReqStatus ret =
cc_handler_->PostWrite(tx_number,
tx_term_,
command_id,
commit_ts_,
cce_addr,
write_entry.rec_.get(),
write_entry.op_,
forward_shard_code,
post_process.hd_result_);
CcReqStatus ret = cc_handler_->PostWrite(
tx_number,
tx_term_,
command_id,
commit_ts_,
cce_addr,
write_entry.rec_.get(),
write_entry.op_,
forward_shard_code,
post_process.hd_result_,
(0x10 | write_entry.range_size_flags_));
⚠️ Potential issue | 🟠 Major

Primary PostWrite is missing the base range-size flag compared to forward writes.

Line 5301 passes write_entry.range_size_flags_ directly, while Line 5317 uses (0x10 | write_entry.range_size_flags_). This inconsistency can drop range-size tracking on the primary write path when flags are zero.

💡 Proposed fix
                 CcReqStatus ret =
                     cc_handler_->PostWrite(tx_number,
                                            tx_term_,
                                            command_id,
                                            commit_ts_,
                                            write_entry.cce_addr_,
                                            write_entry.rec_.get(),
                                            write_entry.op_,
                                            write_entry.key_shard_code_,
                                            post_process.hd_result_,
-                                           write_entry.range_size_flags_);
+                                           (0x10 | write_entry.range_size_flags_));
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
post_process.hd_result_,
write_entry.range_size_flags_);
update_post_cnt(ret);
for (auto &[forward_shard_code, cce_addr] :
write_entry.forward_addr_)
{
CcReqStatus ret =
cc_handler_->PostWrite(tx_number,
tx_term_,
command_id,
commit_ts_,
cce_addr,
write_entry.rec_.get(),
write_entry.op_,
forward_shard_code,
post_process.hd_result_);
CcReqStatus ret = cc_handler_->PostWrite(
tx_number,
tx_term_,
command_id,
commit_ts_,
cce_addr,
write_entry.rec_.get(),
write_entry.op_,
forward_shard_code,
post_process.hd_result_,
(0x10 | write_entry.range_size_flags_));
post_process.hd_result_,
write_entry.range_size_flags_);
update_post_cnt(ret);
for (auto &[forward_shard_code, cce_addr] :
write_entry.forward_addr_)
{
CcReqStatus ret = cc_handler_->PostWrite(
tx_number,
tx_term_,
command_id,
commit_ts_,
cce_addr,
write_entry.rec_.get(),
write_entry.op_,
forward_shard_code,
post_process.hd_result_,
(0x10 | write_entry.range_size_flags_));
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tx_service/src/tx_execution.cpp` around lines 5300 - 5317, The primary
PostWrite call is omitting the base range-size flag while the forward-write loop
ORs 0x10 into write_entry.range_size_flags_; update the primary
cc_handler_->PostWrite invocation (the one before the forward_addr_ loop) to
include the same base flag logic (i.e., pass (0x10 |
write_entry.range_size_flags_)) so range-size tracking is preserved for primary
writes; ensure this change is applied where PostWrite is called and that
update_post_cnt/related handling remains unchanged.

@yi-xmu
Collaborator Author

yi-xmu commented Mar 12, 2026

Replaced by #438 #439 #440 #441 #442

@yi-xmu yi-xmu closed this Mar 12, 2026

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Feature]: Range size tracking

1 participant