feat(Store): Support get local ssd object #1203
base: main
Conversation
Code Review
This pull request introduces a new ClientRequester class and related functionality to support batch offload operations, specifically batch_get_offload_object and batch_put_offload_object. Key changes include:

- Adding ClientRequester to pyclient.h and PyClient.
- Implementing batch_get_offload_object and batch_get_into_offload_object_internal in RealClient, and registering the RPC handler for batch_get_offload_object.
- Adding submit_batch_put_offload_object to TransferSubmitter to handle the actual data transfer.
- Updating client_buffer.cpp to correctly handle local_disk_replica types when calculating the total size and allocating slices.

Review comments highlighted several issues: a critical bug in client_buffer.cpp where get_disk_descriptor() was incorrectly called for local_disk_replica instead of get_local_disk_descriptor(); an unsafe map access in transfer_task.cpp that could lead to crashes; and inaccurate documentation for batch_get_offload_object in pyclient.h and batch_get_into_offload_object_internal in real_client.h. A naming inconsistency was also noted, suggesting batched_slices be renamed to batch_slices for consistency across the codebase.
```cpp
} else if (replica.is_local_disk_replica()) {
    total_length = replica.get_disk_descriptor().object_size;
} else {
```
There is a critical bug here. When replica.is_local_disk_replica() is true, you are calling replica.get_disk_descriptor(). This will throw a std::runtime_error because the underlying variant holds a LocalDiskDescriptor, not a DiskDescriptor, leading to a crash. You should call replica.get_local_disk_descriptor() instead.
Suggested change:

```diff
 } else if (replica.is_local_disk_replica()) {
-    total_length = replica.get_disk_descriptor().object_size;
+    total_length = replica.get_local_disk_descriptor().object_size;
 } else {
```
```cpp
        << transfer_engine_addr;
    return std::nullopt;
}
const auto& slice = batched_slices.find(key)->second;
```
This is an unsafe map access. batched_slices.find(key) can return end(), and dereferencing it with ->second will cause a crash. You should check if the key exists in the map before accessing it.
Suggested change:

```diff
-const auto& slice = batched_slices.find(key)->second;
+auto it = batched_slices.find(key);
+if (it == batched_slices.end()) {
+    LOG(ERROR) << "Key '" << key << "' not found in batched_slices map.";
+    return std::nullopt;
+}
+const auto& slice = it->second;
```
```cpp
 * @param keys Map from object key to size (bytes);
 * @param complete_handler Callback function executed when the batch get
 *        operation completes.
 */
```
The documentation for batch_get_offload_object is inaccurate. The parameter is named objects, but it's documented as keys. Additionally, complete_handler is mentioned in the documentation but is not a parameter of the function. The description for the objects parameter could also be clarified to indicate it's an in-out parameter.
Suggested change:

```diff
- * @param keys Map from object key to size (bytes);
- * @param complete_handler Callback function executed when the batch get
- *        operation completes.
+ * @param objects Map from object key to a `Slice`. The `size` field of the
+ *        slice should be pre-filled with the object size, and the
+ *        `ptr` field should point to the destination buffer.
  */
```
```cpp
 * @param target_rpc_service_addr Address of the remote RPC service (e.g.,
 *        "ip:port").
 */
```
The documentation for batch_get_into_offload_object_internal is incomplete as it's missing the description for the objects parameter. Please add it for clarity.
Suggested change:

```cpp
 * @param target_rpc_service_addr Address of the remote RPC service (e.g.,
 *        "ip:port").
 * @param objects Map from object key to a `Slice` where the `size` field
 *        indicates the object size and the `ptr` field is the destination
 *        buffer.
 */
```

```cpp
const std::string& transfer_engine_addr,
const std::vector<std::string>& keys,
const std::vector<uintptr_t>& pointers,
const std::unordered_map<std::string, Slice>& batched_slices);
```
The parameter name batched_slices is inconsistent with other parts of the codebase where it has been renamed to batch_slices (e.g., in client_service.h). For consistency, please rename it here and in the corresponding implementation file.
Suggested change:

```diff
-const std::unordered_map<std::string, Slice>& batched_slices);
+const std::unordered_map<std::string, Slice>& batch_slices);
```
```cpp
const std::string& transfer_engine_addr,
const std::vector<std::string>& keys,
const std::vector<uintptr_t>& pointers,
const std::unordered_map<std::string, Slice>& batched_slices) {
```
The parameter name batched_slices is inconsistent with other parts of the code where it has been renamed to batch_slices (e.g., in client_service.h). For consistency, please rename it here and in the function body.
Suggested change:

```diff
-const std::unordered_map<std::string, Slice>& batched_slices) {
+const std::unordered_map<std::string, Slice>& batch_slices) {
```
@YiXR could you take a look at this PR?
maheshrbapatu left a comment:
Just had a couple of minor comments; I will take a look at this once again.
```cpp
std::vector<std::string> keys;
std::vector<int64_t> sizes;

for (const auto &object_it : objects) {
```
```cpp
std::vector<std::string> keys;
std::vector<int64_t> sizes;
keys.reserve(objects.size());
sizes.reserve(objects.size());
for (const auto& [key, slice] : objects) {
    keys.emplace_back(key);
    sizes.emplace_back(slice.size);
}
```

We can reserve capacity for `keys` and `sizes` up front so that we don't grow the vectors dynamically.
```cpp
    std::chrono::duration_cast<std::chrono::microseconds>(
        end_time - start_read_store_time)
        .count();
LOG(INFO) << "Time taken for batch_get_into: " << elapsed_time
```
Can we use DEBUG or TRACE to log this if the method is called frequently? Otherwise this log might pollute the log files, which could roll over quickly and make debugging difficult.
Alternatively, we could print a WARNING log only when the time taken exceeds a certain threshold.
Thanks for the work! Regarding the current approach for reading data from remote SSD: the remote client reads data from storage into its memory and then writes it into the local client's memory. However, the correctness of RDMA writes is hard to guarantee. For example, if the RPC from the local client to the remote client fails or is interrupted, the remote client may still continue issuing RDMA writes. If the local memory is freed and reused for other tasks at that point, this can lead to data races and memory corruption. Given this risk, would it be possible to change this to a pull-based model, where the local client reads from remote memory instead? This aligns with what we discussed in this RFC: #1054
Thanks for your work; the code looks OK to me. Would you like to change this to a pull-based model, as @ykwd described?
Support end-to-end read capability for local SSD objects: