Conversation

@zhuxinjie-nz (Collaborator)

Support end-to-end read capability for local SSD objects:

  1. The real client provides an RPC interface for other clients to send read requests.
  2. The client creates a read request via the RPC protocol, and the Transfer Engine returns the data content.

@gemini-code-assist (Contributor)

Summary of Changes

Hello @zhuxinjie-nz, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces comprehensive support for reading objects stored on local SSDs. It establishes an end-to-end read capability by providing a new RPC interface on the RealClient for other clients to request data. A dedicated ClientRequester class handles the creation and processing of these read requests, leveraging a Transfer Engine to return the data content. Additionally, the PR refines the internal handling of local disk replicas within the buffer management system and implements the corresponding write mechanism for offloaded objects.

Highlights

  • New RPC Client for Local SSD Reads: Introduced the ClientRequester class to facilitate RPC-based retrieval of objects from remote Transfer Engines, specifically designed for local SSD objects.
  • End-to-End Read Capability: Implemented batch_get_offload_object and batch_get_into_offload_object_internal on RealClient to handle remote read requests for local SSD objects, integrating with the new ClientRequester.
  • Local SSD Replica Handling: Updated buffer allocation and slicing logic (client_buffer.cpp) to correctly identify and manage is_local_disk_replica types, ensuring proper data handling for local SSDs.
  • Offloaded Object Write Mechanism: Implemented submit_batch_put_offload_object in TransferSubmitter and integrated its usage in Client::BatchPutOffloadObject, enabling the writing of offloaded objects via the transfer engine.

@gemini-code-assist (bot) left a comment

Code Review

This pull request introduces a new ClientRequester class and related functionality to support batch offload operations, specifically for batch_get_offload_object and batch_put_offload_object. Key changes include adding ClientRequester to pyclient.h and PyClient, implementing batch_get_offload_object and batch_get_into_offload_object_internal in RealClient, and registering the RPC handler for batch_get_offload_object. The TransferSubmitter now includes submit_batch_put_offload_object to handle the actual data transfer. Additionally, the client_buffer.cpp file was updated to correctly handle local_disk_replica types when calculating total size and allocating slices.

Review comments highlighted several issues: a critical bug in client_buffer.cpp where get_disk_descriptor() was incorrectly called for local_disk_replica instead of get_local_disk_descriptor(), an unsafe map access in transfer_task.cpp that could lead to crashes, and inaccurate documentation for batch_get_offload_object in pyclient.h and batch_get_into_offload_object_internal in real_client.h. Furthermore, a naming inconsistency was noted, suggesting batched_slices be renamed to batch_slices for consistency across the codebase.

Comment on lines +100 to 102
} else if (replica.is_local_disk_replica()) {
total_length = replica.get_disk_descriptor().object_size;
} else {

critical

There is a critical bug here. When replica.is_local_disk_replica() is true, you are calling replica.get_disk_descriptor(). This will throw a std::runtime_error because the underlying variant holds a LocalDiskDescriptor, not a DiskDescriptor, leading to a crash. You should call replica.get_local_disk_descriptor() instead.

Suggested change:

Before:
    } else if (replica.is_local_disk_replica()) {
        total_length = replica.get_disk_descriptor().object_size;
    } else {

After:
    } else if (replica.is_local_disk_replica()) {
        total_length = replica.get_local_disk_descriptor().object_size;
    } else {

<< transfer_engine_addr;
return std::nullopt;
}
const auto& slice = batched_slices.find(key)->second;

high

This is an unsafe map access. batched_slices.find(key) can return end(), and dereferencing it with ->second will cause a crash. You should check if the key exists in the map before accessing it.

Suggested change:

Before:
    const auto& slice = batched_slices.find(key)->second;

After:
    auto it = batched_slices.find(key);
    if (it == batched_slices.end()) {
        LOG(ERROR) << "Key '" << key << "' not found in batched_slices map.";
        return std::nullopt;
    }
    const auto& slice = it->second;

Comment on lines +37 to +40
* @param keys Map from object key to size (bytes);
* @param complete_handler Callback function executed when the batch get
* operation completes.
*/

medium

The documentation for batch_get_offload_object is inaccurate. The parameter is named objects, but it's documented as keys. Additionally, complete_handler is mentioned in the documentation but is not a parameter of the function. The description for the objects parameter could also be clarified to indicate it's an in-out parameter.

Suggested change:

Before:
     * @param keys Map from object key to size (bytes);
     * @param complete_handler Callback function executed when the batch get
     * operation completes.
     */

After:
     * @param objects Map from object key to a `Slice`. The `size` field of the
     * slice should be pre-filled with the object size, and the
     * `ptr` field should point to the destination buffer.
     */

Comment on lines +433 to +436
* @param target_rpc_service_addr Address of the remote RPC service (e.g.,
"ip:port").
*/

medium

The documentation for batch_get_into_offload_object_internal is incomplete as it's missing the description for the objects parameter. Please add it for clarity.

Suggested change:

     * @param target_rpc_service_addr Address of the remote RPC service (e.g.,
     "ip:port").
     * @param objects Map from object key to a `Slice` where the `size` field
     * indicates the object size and the `ptr` field is the destination buffer.
     */

const std::string& transfer_engine_addr,
const std::vector<std::string>& keys,
const std::vector<uintptr_t>& pointers,
const std::unordered_map<std::string, Slice>& batched_slices);

medium

The parameter name batched_slices is inconsistent with other parts of the codebase where it has been renamed to batch_slices (e.g., in client_service.h). For consistency, please rename it here and in the corresponding implementation file.

Suggested change:

Before:
    const std::unordered_map<std::string, Slice>& batched_slices);

After:
    const std::unordered_map<std::string, Slice>& batch_slices);

const std::string& transfer_engine_addr,
const std::vector<std::string>& keys,
const std::vector<uintptr_t>& pointers,
const std::unordered_map<std::string, Slice>& batched_slices) {

medium

The parameter name batched_slices is inconsistent with other parts of the code where it has been renamed to batch_slices (e.g., in client_service.h). For consistency, please rename it here and in the function body.

Suggested change:

Before:
    const std::unordered_map<std::string, Slice>& batched_slices) {

After:
    const std::unordered_map<std::string, Slice>& batch_slices) {

@XucSh (Collaborator) commented Dec 12, 2025

@YiXR, could you take a look at this PR?

@ykwd requested a review from @maheshrbapatu on December 12, 2025 07:10
@maheshrbapatu (Collaborator) left a comment

Just had a couple of minor comments; will take a look at this once again.

std::vector<std::string> keys;
std::vector<int64_t> sizes;

for (const auto &object_it : objects) {

std::vector<std::string> keys;
std::vector<int64_t> sizes;
keys.reserve(objects.size());
sizes.reserve(objects.size());

for (const auto& [key, slice] : objects) {
    keys.emplace_back(key);
    sizes.emplace_back(slice.size);
}

We can reserve the capacity up front so that we don't grow the vectors dynamically.

std::chrono::duration_cast<std::chrono::microseconds>(
end_time - start_read_store_time)
.count();
LOG(INFO) << "Time taken for batch_get_into: " << elapsed_time

Can we use DEBUG or TRACE to log this if this method is called frequently? Otherwise this log might pollute the log files, which could roll over quickly and make debugging difficult.
Alternatively, we can print a WARNING log only when time_taken > certain_threshold.

@ykwd (Collaborator) commented Dec 16, 2025

Thanks for the work!

Regarding the current approach for reading data from remote SSD: the remote client reads data from storage into its memory and then writes it into the local client’s memory. However, the correctness of RDMA writes is hard to guarantee. For example, if the RPC from the local client to the remote client fails or is interrupted, the remote client may still continue issuing RDMA writes. If the local memory is freed and reused for other tasks at that point, this can lead to data races and memory corruption.

Given this risk, would it be possible to change this to a pull-based model, where the local client reads from remote memory instead? This aligns with what we discussed in this RFC: #1054

@zhangzuo21 (Collaborator)

Thanks for your work; the code looks OK to me. Would you like to change this to a pull-based model, as @ykwd described?
