-
Notifications
You must be signed in to change notification settings - Fork 0
Add lsm tree support #43
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
✅ Unit tests passed. 📊 Coverage Summary: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR introduces LSM Tree support to the key-value store system. The implementation provides a block-based LSM Tree structure with automatic memtable flushing to persistent SST files on disk when size thresholds are reached.
- Adds BlockLSMTree class with put/get/delete operations and automatic memtable flushing
- Implements SST file persistence using WAL entry serialization format
- Integrates LSM configuration into the runtime configuration system
Reviewed Changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
| src/storage/block_lsm_tree.h | Defines the BlockLSMTree class interface with memtable and SST file management |
| src/storage/block_lsm_tree.cpp | Implements LSM tree operations, memtable flushing, and SST file search functionality |
| tests/unit/block_lsm_tree_test.cpp | Adds unit test for flush-on-threshold behavior and file verification |
| src/config/runtime_config.json | Adds LSM configuration section with memtable size and directory settings |
| CMakeLists.txt | Integrates new source and test files into build system |
Comments suppressed due to low confidence (1)
tests/unit/block_lsm_tree_test.cpp:21
- Test does not verify cleanup failure scenarios. If rmdir or std::remove fail, subsequent test runs may be affected by leftover files.
ASSERT_EQ(stat("lsm_test/sst_0.sst", &st), 0);
| if (fd < 0) | ||
| return; |
Copilot
AI
Jul 20, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
File descriptor is not checked for error before use. If open() fails, fd will be negative but the code continues to use it in subsequent operations.
| if (fd < 0) | |
| return; | |
| if (fd < 0) { | |
| std::cerr << "Error opening file '" << path << "': " << std::strerror(errno) << std::endl; | |
| return; | |
| } |
| return; | ||
| std::string path = dir_ + "/sst_" + std::to_string(next_file_id_++) + ".sst"; | ||
| int fd = ::open(path.c_str(), O_CREAT | O_WRONLY | O_TRUNC, 0666); | ||
| if (fd < 0) |
Copilot
AI
Jul 20, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Early return on file open failure leaves the SST file path untracked, but memtable data is lost. This could cause data loss as the memtable should only be cleared after successful flush.
| entry = {WalOpType::DELETE, kv.first, ""}; | ||
| } | ||
| auto buf = WalEntrySerializer::serialize(entry); | ||
| if (!writeAll(fd, buf.data(), buf.size())) { |
Copilot
AI
Jul 20, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When writeAll fails, the function breaks out of the loop but still adds the partially written file to sst_files_ and clears the memtable, potentially causing data loss.
| } | ||
| uint32_t klen, vlen; | ||
| std::memcpy(&klen, header+1, sizeof(uint32_t)); | ||
| std::memcpy(&vlen, header+1+sizeof(uint32_t), sizeof(uint32_t)); |
Copilot
AI
Jul 20, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No validation of klen and vlen values read from file. Malformed or malicious SST files could cause excessive memory allocation or integer overflow.
| std::memcpy(&vlen, header+1+sizeof(uint32_t), sizeof(uint32_t)); | |
| std::memcpy(&vlen, header+1+sizeof(uint32_t), sizeof(uint32_t)); | |
| if (klen > MAX_KEY_LENGTH || vlen > MAX_VALUE_LENGTH) { | |
| ::close(fd); | |
| return false; | |
| } |
| uint8_t header[1 + sizeof(uint32_t)*2]; | ||
| ssize_t r = ::read(fd, header, sizeof(header)); |
Copilot
AI
Jul 20, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nitpick] Magic number for header size calculation. Consider defining a constant like HEADER_SIZE to make the code more readable and maintainable.
| uint8_t header[1 + sizeof(uint32_t)*2]; | |
| ssize_t r = ::read(fd, header, sizeof(header)); | |
| uint8_t header[HEADER_SIZE]; | |
| ssize_t r = ::read(fd, header, HEADER_SIZE); |
No description provided.