-
Notifications
You must be signed in to change notification settings - Fork 14
feat(slot_persister): add periodic missed slot fetch #465
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: develop
Are you sure you want to change the base?
Conversation
Summary by CodeRabbit
WalkthroughThis pull request introduces new environment variables for configuring slot persistence and enhances the backfilling functionality to support both missed and regular slots. It adds asynchronous functions for processing missed slots in the ingester module, along with a cancellation mechanism. The slot persister module is updated with new constants and command-line arguments to manage missed slots through periodic checks and database updates. Additionally, the RocksDB storage layer is expanded with new columns, enum variants, and key encoders to facilitate tracking and processing of missed slots. Changes
Sequence Diagram(s)sequenceDiagram
participant Main
participant Executor
participant Backfiller
participant DB
Main->>Executor: Spawn run_backfill task (RegularSlots)
Main->>Executor: Spawn run_backfill task (MissedSlots)
Executor->>Backfiller: run_backfill(BackfillTarget::MissedSlots)
Backfiller->>DB: Retrieve last processed missed slot
Backfiller->>DB: Process missed slots in a loop
Note over Backfiller: Loop until cancellation signal is received
sequenceDiagram
participant Persister as SlotPersisterMain
participant BackfillSource
participant DB
participant CancellationToken
Persister->>Persister: Spawn periodic get_missed_slots task
loop Check for missed slots
Persister->>DB: Retrieve last consistency checkpoint
DB-->>Persister: Return checkpoint
Persister->>BackfillSource: Fetch missed slots within range
BackfillSource-->>Persister: Return missed slots data
Persister->>DB: Update consistency checkpoint after processing
CancellationToken-->>Persister: Check cancellation signal
end
📜 Recent review detailsConfiguration used: CodeRabbit UI 📒 Files selected for processing (1)
🧰 Additional context used🧬 Code Graph Analysis (1)nft_ingester/src/bin/slot_persister/main.rs (5)
🔇 Additional comments (2)
🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
86286c9 to
4db4e8f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 7
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (8)
.env.example(2 hunks)nft_ingester/src/backfiller.rs(4 hunks)nft_ingester/src/bin/ingester/main.rs(2 hunks)nft_ingester/src/bin/slot_persister/main.rs(6 hunks)rocks-db/src/columns/parameters.rs(1 hunks)rocks-db/src/columns/raw_block.rs(2 hunks)rocks-db/src/key_encoders.rs(1 hunks)rocks-db/src/lib.rs(4 hunks)
🧰 Additional context used
🧬 Code Definitions (3)
nft_ingester/src/backfiller.rs (2)
rocks-db/src/columns/raw_block.rs (8)
encode_key(18-20)encode_key(41-43)encode_key(59-61)encode_key(85-87)decode_key(22-24)decode_key(45-47)decode_key(63-65)decode_key(89-91)rocks-db/src/columns/parameters.rs (2)
encode_key(26-28)decode_key(30-33)
rocks-db/src/columns/raw_block.rs (2)
rocks-db/src/lib.rs (3)
Self(441-441)new(93-105)new(227-320)rocks-db/src/key_encoders.rs (4)
encode_u64x2(157-162)decode_u64x2(164-166)encode_u64(88-90)decode_u64(92-95)
nft_ingester/src/bin/slot_persister/main.rs (1)
rocks-db/src/columns/raw_block.rs (4)
decode_key(22-24)decode_key(45-47)decode_key(63-65)decode_key(89-91)
⏰ Context from checks skipped due to timeout of 90000ms (3)
- GitHub Check: build-base-image
- GitHub Check: test
- GitHub Check: lint
🔇 Additional comments (16)
rocks-db/src/columns/parameters.rs (1)
11-11: Adding new Parameter enum variant for missed slot trackingThis addition of
LastProcessedMissedSlotto the Parameter enum looks good. It effectively extends the parameter tracking capabilities to support the new missed slot processing functionality..env.example (1)
42-46: Configuration parameters for slot persister functionalityThe new environment variables for slot persistence are well-structured and provide the necessary configuration options for the missed slot fetch functionality:
- Path to slots data
- Chunk size for processing
- Backfill interval (21600 seconds = 6 hours)
- Backfill offset (1,000,000 slots)
These values look reasonable as defaults for an example configuration.
nft_ingester/src/bin/ingester/main.rs (1)
569-590:Details
❓ Verification inconclusive
Parallel task for missed slot backfilling looks good
The addition of a separate task for backfilling missed slots follows the same pattern as the existing backfill task, which is good for consistency. This implementation correctly reuses the same components (consumer, database, metrics) for both regular and missed slot backfilling.
One thing to note: this task runs only when both
run_backfillerandrun_bubblegum_backfillerare true. Is this dependency on the Bubblegum backfiller intentional?
🏁 Script executed:
#!/bin/bash # Check if the run_backfill_missed function has any additional flags or options compared to run_backfill_slots # to understand if the dependency on Bubblegum backfiller is intentional or if it should have its own flag rpc_source_files=$(fd -e rs -e toml backfill nft_ingester/) echo "Searching for run_backfill_missed implementation to check for flags..." grep -A 10 "run_backfill_missed" $rpc_source_files echo "Checking if there are other places where run_backfill_missed is called..." grep -l "run_backfill_missed" --include="*.rs" -r . | grep -v "$(basename "$0")"Length of output: 2880
Missed Slot Backfiller Task: Structure Looks Good – Please Verify Flag Dependency
The new asynchronous task for missed slot backfilling is well aligned with the established backfiller pattern by reusing the same components (consumer, database, and metrics). However, note that this task is executed only when both
run_backfillerandrun_bubblegum_backfillerare enabled. Our investigation confirms that therun_backfill_missedfunction itself does not introduce any additional flags or options compared to its standard backfiller counterparts, which suggests that the dependency on the Bubblegum backfiller flag is imposed externally.
- Location:
nft_ingester/src/bin/ingester/main.rs(lines 569–590)Please verify that coupling the missed slot backfiller’s execution to both flags is intentional and that no separate flag should control this behavior.
rocks-db/src/columns/raw_block.rs (2)
8-8: Good addition of serialization/deserialization imports.Importing
SerializeandDeserializeis consistent with usage in the new typed columns.
50-74: Struct-based key-only storage looks fine.By returning an empty vector in
encode, this approach relies solely on the key. This is acceptable if there's no need to store associated data forMissedSlotsIdx. Everything else looks correct.nft_ingester/src/backfiller.rs (3)
14-16: Appropriate import additions.These newly introduced columns and parameters are aligned with the missed slots backfilling logic.
23-23: Expanded logging.Adding
warnto the logging import set is aligned with the new missed slots warnings below.
252-308: Verify the decision to break on missing raw block.When
raw_blockis missing, the code logs a warning and breaks from iteration, preventing subsequent missed slots from processing. This may be intentional to avoid marking progress, but could also stall future missed slots indefinitely.Would you like to skip or remove the slot instead, allowing the code to continue with the next missed slot?
rocks-db/src/lib.rs (5)
23-26: Concurrency imports look suitable.Atomic operations are necessary to coordinate the missed slot sequence safely across threads.
86-88: New columns and sequence tracker introduced.Declaring
slot_consistency_checkpoint,missed_slots_idx, andmissed_slots_last_seqextends slot-related tracking capabilities.
93-105: Constructor initialization is correct.All new columns are properly instantiated via
Storage::column, and the atomic sequence is set to zero for a fresh start.
107-115: Column families updated.Including
SlotConsistencyCheckpoint::NAMEandMissedSlotsIdx::NAMEincf_names()ensures database column families are recognized.
156-172: Double-check partial key decoding.Currently, only the first 8 bytes of the last key are decoded to obtain
seq. If the second 64-bit portion of(u64, u64)is relevant, consider decoding it as well or clarifying why it's ignored.Would you like a script to search for all calls referencing
next_missed_slots_seqto confirm the usage is strictly tied to the sequence portion?nft_ingester/src/bin/slot_persister/main.rs (3)
16-20: Imports look appropriate.
These imports correctly bring in the required types and column families for slot indexing and checkpointing.
37-43: Constants for slot backfill configuration look good.
Defining these default values and accompanying documentation is a clear way to control slot backfill scheduling and offset.
237-242: Concurrent task spawn is appropriate.
Spawningget_missed_slotsin a separate task ensures the main loop isn’t blocked by periodic scanning, maintaining responsiveness.
4db4e8f to
44939d4
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 4
🔭 Outside diff range comments (2)
nft_ingester/src/bin/ingester/main.rs (1)
548-592: 🧹 Nitpick (assertive)Implementation of dual backfill processes for regular and missed slots
The code effectively sets up two parallel backfill processes - one for regular slots and another for missed slots. Both use identical infrastructure components but target different types of slots to be processed.
However, there's significant code duplication between the two spawned tasks. The initialization of components (consumer, db, metrics, slot_db) is identical in both blocks.
Consider extracting the common setup into a helper function to reduce duplication:
- usecase::executor::spawn({ - let cancellation_token = cancellation_token.child_token(); - let consumer = Arc::new(DirectBlockParser::new( - tx_ingester.clone(), - primary_rocks_storage.clone(), - metrics_state.backfiller_metrics.clone(), - )); - let db = primary_rocks_storage.clone(); - let metrics: Arc<BackfillerMetricsConfig> = - metrics_state.backfiller_metrics.clone(); - let slot_db = slot_db.clone(); - async move { - nft_ingester::backfiller::run_backfill( - cancellation_token, - db, - slot_db, - consumer, - metrics, - nft_ingester::backfiller::BackfillTarget::RegularSlots, - ) - .await; - } - }); - usecase::executor::spawn({ - let cancellation_token = cancellation_token.child_token(); - let consumer = Arc::new(DirectBlockParser::new( - tx_ingester.clone(), - primary_rocks_storage.clone(), - metrics_state.backfiller_metrics.clone(), - )); - let db = primary_rocks_storage.clone(); - let metrics: Arc<BackfillerMetricsConfig> = - metrics_state.backfiller_metrics.clone(); - let slot_db = slot_db.clone(); + // Helper function to spawn a backfill task with the specified target + let spawn_backfill_task = |target: nft_ingester::backfiller::BackfillTarget| { + let cancellation_token = cancellation_token.child_token(); + let consumer = Arc::new(DirectBlockParser::new( + tx_ingester.clone(), + primary_rocks_storage.clone(), + metrics_state.backfiller_metrics.clone(), + )); + let db = primary_rocks_storage.clone(); + let metrics: Arc<BackfillerMetricsConfig> = + metrics_state.backfiller_metrics.clone(); + let slot_db = slot_db.clone(); + + usecase::executor::spawn(async move { + nft_ingester::backfiller::run_backfill( + cancellation_token, + db, + slot_db, + consumer, + metrics, + target, + ) + .await; + }); + }; + + // Spawn tasks for both types of backfilling + spawn_backfill_task(nft_ingester::backfiller::BackfillTarget::RegularSlots); + spawn_backfill_task(nft_ingester::backfiller::BackfillTarget::MissedSlots); - async move { - nft_ingester::backfiller::run_backfill( - cancellation_token, - db, - slot_db, - consumer, - metrics, - nft_ingester::backfiller::BackfillTarget::MissedSlots, - ) - .await; - } - });.env.example (1)
89-91: 🧹 Nitpick (assertive)Consider standardizing file ending format
The file ends with a vim mode line but has an empty line before it. Consider standardizing the format of all configuration files to follow the same pattern for EOF handling.
- # vim: set ft=bash
♻️ Duplicate comments (1)
nft_ingester/src/bin/slot_persister/main.rs (1)
551-654: 🛠️ Refactor suggestionAddress nested function naming and improve error handling.
- The nested function
fn get_missed_slotsshares the same name as the enclosingasync fn get_missed_slots, which can be confusing. A different helper name (e.g.,retrieve_missed_slots_range) clarifies intent.- Multiple
.expect(...)calls for database writes could lead to panics under failure. Consider returning errors or using a graceful fallback strategy to avoid abruptly terminating the process.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Lite
📒 Files selected for processing (8)
.env.example(2 hunks)nft_ingester/src/backfiller.rs(5 hunks)nft_ingester/src/bin/ingester/main.rs(1 hunks)nft_ingester/src/bin/slot_persister/main.rs(6 hunks)rocks-db/src/columns/parameters.rs(1 hunks)rocks-db/src/columns/raw_block.rs(2 hunks)rocks-db/src/key_encoders.rs(1 hunks)rocks-db/src/lib.rs(4 hunks)
🧰 Additional context used
🧠 Learnings (1)
rocks-db/src/key_encoders.rs (2)
Learnt from: armyhaylenko
PR: metaplex-foundation/aura#465
File: rocks-db/src/key_encoders.rs:157-162
Timestamp: 2025-03-31T14:24:35.687Z
Learning: The `encode_u64x2` function in the key_encoders module intentionally returns a fixed-size array `[u8; 16]` instead of `Vec<u8>` to avoid heap allocations (syscalls) for performance reasons, despite the API inconsistency with other encoder functions.
Learnt from: armyhaylenko
PR: metaplex-foundation/aura#465
File: rocks-db/src/key_encoders.rs:164-166
Timestamp: 2025-03-31T14:25:35.651Z
Learning: The `decode_u64x2` function in the key_encoders module doesn't need explicit size validation because the `try_into()` method will implicitly validate the input size when converting byte slices to fixed-size arrays.
🧬 Code Definitions (5)
nft_ingester/src/bin/ingester/main.rs (1)
nft_ingester/src/backfiller.rs (3)
db(192-192)db(245-245)run_backfill(135-180)
rocks-db/src/columns/raw_block.rs (2)
rocks-db/src/lib.rs (3)
Self(441-441)new(93-105)new(227-320)rocks-db/src/key_encoders.rs (4)
encode_u64x2(157-162)decode_u64x2(164-166)encode_u64(88-90)decode_u64(92-95)
rocks-db/src/lib.rs (1)
rocks-db/src/columns/asset.rs (1)
new(3345-3347)
nft_ingester/src/backfiller.rs (2)
rocks-db/src/columns/raw_block.rs (8)
encode_key(18-20)encode_key(41-43)encode_key(59-61)encode_key(85-87)decode_key(22-24)decode_key(45-47)decode_key(63-65)decode_key(89-91)rocks-db/src/columns/parameters.rs (2)
encode_key(26-28)decode_key(30-33)
nft_ingester/src/bin/slot_persister/main.rs (1)
rocks-db/src/columns/raw_block.rs (4)
decode_key(22-24)decode_key(45-47)decode_key(63-65)decode_key(89-91)
🔇 Additional comments (19)
rocks-db/src/columns/parameters.rs (1)
10-11: New parameter for tracking missed slots processingThe addition of
LastProcessedMissedSlotto the Parameter enum properly supports the missed slots backfilling functionality. This parameter will be used to track the last processed missed slot, which is essential for the missed slots backfill process..env.example (1)
42-46: Environment variables added for slot persister configurationThe new environment variables for slot persister are well-documented and provide sensible default values for:
- Path to slot data
- Chunk size for processing
- Interval for backfilling
- Offset for backfilling
These additions properly support the new missed slots backfill functionality.
rocks-db/src/key_encoders.rs (2)
157-162: Optimized encode_u64x2 implementation using fixed-size arrayThe implementation efficiently encodes a tuple of two u64s using a fixed-size array instead of Vec. This avoids heap allocations (syscalls) for better performance.
I note this intentionally differs from other encoder return types for performance reasons.
164-166: Concise decode_u64x2 implementationThe decoder function is correctly implemented without explicit size validation as the
try_into()method will implicitly validate input size when converting to fixed-size arrays.rocks-db/src/columns/raw_block.rs (2)
8-8: Serde usage is appropriate.
Importingserde::{Deserialize, Serialize}is correct for implementing data serialization in newly added structs.
50-74: Verify empty vector encoding forMissedSlotsIdx.
This struct encodes and decodes to empty byte slices, which may be intentional if the key alone is meaningful. Confirm that no actual body data needs to be persisted for each entry.rocks-db/src/lib.rs (4)
86-88: Check concurrency implications formissed_slots_last_seq.
Storing the sequence inAtomicU64withRelaxedordering may cause stale reads if used concurrently across threads. Verify if stronger ordering is required.
93-105: Fields' initialization matches their usage.
The constructor properly wires up the new columns andmissed_slots_last_seq. No issues spotted.
108-115: Column families updated correctly.
AddingSlotConsistencyCheckpoint::NAMEandMissedSlotsIdx::NAMEtocf_names()ensures the DB columns are properly recognized.
155-172: Validate logic innext_missed_slots_seq.
- The function attempts to load the highest sequence from the last key if
missed_slots_last_seqis zero.- The iteration accessor
iter_endreturns the last item, but confirm there's no edge case if the database is empty.- The function uses
Relaxedordering, which might cause concurrency concerns with partial updates.nft_ingester/src/backfiller.rs (3)
6-6: Imports and logging additions look fine.
The additions ofEither, new column references, and thewarnlog level are consistent with the code.Also applies to: 15-17, 24-24
135-180: Clean approach to branching backfill logic.
UsingEitherfor concurrent calls to specializedbackfill_missedorbackfill_slotsis concise. Code duplication is minimized.
235-291: Evaluate the early break condition whenraw_blockis unavailable.
Currently, if araw_blockis missing at a certain(seq, slot), the loop breaks, potentially skipping subsequent missed slots. Confirm this is desired behavior or consider continuing with later entries.nft_ingester/src/bin/slot_persister/main.rs (6)
16-20: Imports look good.
No issues found with referencingMissedSlotsIdxandSlotConsistencyCheckpointfrom the new columns in RocksDB.
45-45: Annotation is fine.
No concerns with the#[derive(Parser, Debug, Clone)]attribute.
86-87: Command-line argument interval looks good.
Properly references the newly introducedDEFAULT_SLOT_BACKFILL_INTERVAL_SECS.
89-90: Backfill offset parameter is acceptable.
This aligns with the constant defined above, though see prior note regarding large offsets.
102-127: Potential risk in unchecked usage ofunwrap().
While the design might be sufficient for a strictly controlled environment, errors with missing column families or key decoding will panic. Consider robust error handling to prevent unexpected panics.
237-242: Spawning the missed slots routine is appropriate.
No immediate concurrency issues spotted, as RocksDB can handle multiple writers. Continue monitoring for potential concurrency interactions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 7
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Lite
📒 Files selected for processing (7)
backfill_rpc/src/rpc.rs(1 hunks)nft_ingester/src/backfiller.rs(6 hunks)nft_ingester/src/bin/slot_checker/main.rs(1 hunks)nft_ingester/src/bin/slot_persister/main.rs(6 hunks)nft_ingester/src/inmemory_slots_dumper.rs(2 hunks)usecase/src/bigtable.rs(1 hunks)usecase/src/slots_collector.rs(1 hunks)
🧰 Additional context used
🧬 Code Definitions (2)
nft_ingester/src/bin/slot_persister/main.rs (4)
rocks-db/src/columns/raw_block.rs (4)
decode_key(22-24)decode_key(45-47)decode_key(63-65)decode_key(89-91)nft_ingester/src/inmemory_slots_dumper.rs (1)
new(18-20)usecase/src/slots_collector.rs (1)
new(81-87)metrics_utils/src/lib.rs (8)
new(34-41)new(68-86)new(138-144)new(174-183)new(249-256)new(302-318)new(417-431)new(668-675)
nft_ingester/src/inmemory_slots_dumper.rs (3)
nft_ingester/src/backfiller.rs (3)
new(49-66)new(121-127)new(372-377)usecase/src/slots_collector.rs (1)
new(81-87)usecase/src/bigtable.rs (1)
new(27-32)
🔇 Additional comments (13)
usecase/src/bigtable.rs (1)
20-20: Adding Clone derive is a good improvement for concurrent usage.This change enables
BigTableClientinstances to be cloned efficiently, which is useful for concurrent processing scenarios. Since the struct fields are already wrapped inArc, cloning is efficient as it only increases reference counts rather than duplicating the underlying data.usecase/src/slots_collector.rs (1)
65-65: Implementing Clone for SlotsCollector aligns with concurrent processing improvements.Adding the
Clonetrait toSlotsCollectorenables more flexible ownership management in asynchronous contexts, which aligns with the PR objectives for enhanced slot processing. Since the struct's fields are already usingArc, cloning is efficient.backfill_rpc/src/rpc.rs (1)
40-40: Adding Clone derive provides consistent behavior with other components.This change makes
BackfillRPCcloneable, which is consistent with the changes to other structs in the codebase. This helps support the periodic missed slot fetching functionality by allowing the RPC client to be easily shared across tasks.nft_ingester/src/inmemory_slots_dumper.rs (3)
1-1: Added Arc import for shared ownership.The addition of
Arcto the imports supports the change in the struct's field type. This is necessary for the new implementation that enables shared ownership of the slots collection.
7-9: Enhanced for thread-safety with Clone and Arc.Making
InMemorySlotsDumpercloneable and changingslotsto useArc<Mutex<BTreeSet<u64>>>improves thread safety and enables shared state between cloned instances. This is important for the periodic missed slot fetch functionality where multiple tasks might access the same slot collection.
19-19: Updated constructor to initialize with Arc-wrapped Mutex.The constructor now correctly initializes the
slotsfield with anArc-wrappedMutex, ensuring that cloned instances ofInMemorySlotsDumpershare the same underlying collection. This aligns with the thread-safety improvements in the struct definition.nft_ingester/src/bin/slot_checker/main.rs (3)
3-3: HashSet import usage looks fine.No immediate issues identified. The
HashSetusage aligns well with deduplication logic below.
15-15: Inconsistent summary regardingInMemorySlotsDumper.The AI-generated summary mentions removing
InMemorySlotsDumper, but this import and subsequent uses remain. This discrepancy indicates that the summary may be out of date or incomplete.Likely an incorrect or invalid review comment.
20-20: Graceful shutdown approach is valid.Leveraging
tokio::signalandCancellationTokenfor Ctrl+C handling is a robust solution.nft_ingester/src/backfiller.rs (2)
136-181: Flexible control flow for slot backfilling.Using
Eitherto branch betweenMissedSlotsandRegularSlotsis clean and extensible. The periodic loop logic also covers cancellation gracefully. No major issues found.
193-234:Details
✅ Verification successful
Leveraging database parameter for last backfilled slot.
Retrieving and updating the last backfilled slot via
Parameter::LastBackfilledSlotensures continuity between backfill iterations. Ensure that calling code or external logic properly sets/reads this parameter if external changes occur.Run the following script to confirm the presence of references to
Parameter::LastBackfilledSlotin the codebase:
🏁 Script executed:
#!/bin/bash # Searching for references to Parameter::LastBackfilledSlot beyond this file ast-grep --pattern $'Parameter::LastBackfilledSlot'Length of output: 3543
Parameter
LastBackfilledSlotUsage ConfirmedThe verification confirms that references to
Parameter::LastBackfilledSlotare consistently present across the codebase (including tests and client implementations). The usage innft_ingester/src/backfiller.rsis in line with its application elsewhere. As a reminder, ensure that any external logic or calling code properly sets and updates this parameter to maintain continuity between backfill iterations.nft_ingester/src/bin/slot_persister/main.rs (2)
1-7: New imports align with additional slot checking logic.The newly introduced items (e.g.,
HashMap,Duration,Mutex) facilitate missed slot handling and concurrency. No immediate issues.
44-50: Useful defaults for backfill settings.
DEFAULT_SLOT_BACKFILL_INTERVAL_SECSandDEFAULT_SLOT_BACKFILL_OFFSETappear to be reasonable defaults. Ensure that large offsets don’t cause excessive memory usage or scanning overhead.Please confirm in the documentation or tests that using the default offset of 1,000,000 slots won’t exceed memory constraints under production workloads.
ec53185 to
cd61aad
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
♻️ Duplicate comments (2)
nft_ingester/src/backfiller.rs (1)
129-134: 🧹 Nitpick (assertive)Add doc comments to clarify usage of
BackfillTarget.Providing a short doc comment explaining when to use
MissedSlotsvs.RegularSlotswould help readers quickly understand the backfill variants.#[derive(Debug)] +/// Defines which type of slots to backfill: `RegularSlots` or `MissedSlots`. pub enum BackfillTarget { RegularSlots, MissedSlots, }nft_ingester/src/bin/slot_persister/main.rs (1)
92-97: 🛠️ Refactor suggestionConsider validating user-supplied intervals.
If a user sets
slot_backfill_intervalorslot_backfill_offsetto zero or an unexpectedly large value, it could lead to performance or logic issues. A sanity check can help avoid misconfiguration.#[arg(long, env, help = "How often to check for missed slots (in seconds)", default_value_t = DEFAULT_SLOT_BACKFILL_INTERVAL_SECS)] pub slot_backfill_interval: u64, #[arg(long, env, help = "The number of slots by which to check & advance the next sequential slots", default_value_t = DEFAULT_SLOT_BACKFILL_OFFSET)] pub slot_backfill_offset: u64, + // Example validation at startup: + if args.slot_backfill_interval == 0 { + panic!("slot_backfill_interval must not be zero!"); + } + if args.slot_backfill_offset == 0 { + panic!("slot_backfill_offset must not be zero!"); + }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Lite
📒 Files selected for processing (7)
backfill_rpc/src/rpc.rs(1 hunks)nft_ingester/src/backfiller.rs(6 hunks)nft_ingester/src/bin/slot_checker/main.rs(1 hunks)nft_ingester/src/bin/slot_persister/main.rs(6 hunks)nft_ingester/src/inmemory_slots_dumper.rs(2 hunks)usecase/src/bigtable.rs(1 hunks)usecase/src/slots_collector.rs(1 hunks)
🧰 Additional context used
🧬 Code Definitions (3)
nft_ingester/src/inmemory_slots_dumper.rs (4)
nft_ingester/src/backfiller.rs (3)
new(49-66)new(121-127)new(372-377)usecase/src/bigtable.rs (1)
new(27-32)usecase/src/slots_collector.rs (1)
new(81-87)metrics_utils/src/lib.rs (11)
new(34-41)new(68-86)new(138-144)new(174-183)new(249-256)new(302-318)new(417-431)new(668-675)new(732-749)new(844-853)new(939-958)
nft_ingester/src/backfiller.rs (1)
rocks-db/src/columns/parameters.rs (2)
encode_key(26-28)decode_key(30-33)
nft_ingester/src/bin/slot_persister/main.rs (5)
rocks-db/src/lib.rs (3)
column(444-452)new(93-105)new(227-320)rocks-db/src/columns/raw_block.rs (4)
decode_key(22-24)decode_key(45-47)decode_key(63-65)decode_key(89-91)nft_ingester/src/backfiller.rs (5)
new(49-66)new(121-127)new(372-377)db(193-193)db(246-246)nft_ingester/src/inmemory_slots_dumper.rs (1)
new(18-20)metrics_utils/src/lib.rs (8)
new(34-41)new(68-86)new(138-144)new(174-183)new(249-256)new(302-318)new(417-431)new(668-675)
🔇 Additional comments (18)
usecase/src/bigtable.rs (1)
20-20: Appropriate addition of Clone derivation for BigTableClientAdding
#[derive(Clone)]to theBigTableClientstruct is a good change since it enables cloning of client instances, which is likely needed for the periodic missed slot fetch functionality. Since the struct fields are already wrapped inArc, cloning will be efficient and only copy the reference-counted pointers.backfill_rpc/src/rpc.rs (1)
40-40: Appropriate addition of Clone derivation for BackfillRPCAdding
#[derive(Clone)]to theBackfillRPCstruct is sensible because it allows the creation of duplicate instances when needed for async operations. The containedBackoffRpcClientalready implementsClone(as seen on line 206), so this is a consistent change.usecase/src/slots_collector.rs (1)
65-65: Appropriate addition of Clone derivation for SlotsCollectorAdding
#[derive(Clone)]to theSlotsCollectorstruct is appropriate. All internal fields already useArcfor shared ownership, making the Clone implementation efficient by only copying the Arc pointers rather than the underlying data. This change aligns with the other modifications in this PR and supports the new missed slot fetching functionality.nft_ingester/src/backfiller.rs (3)
6-6: Imports and Clone derivation look good.The newly introduced imports (
Either, expanded columns, and additional tracing macros) and the#[derive(Clone)]annotation forBackfillSourceare appropriate and pose no issues.Also applies to: 15-17, 24-24, 42-42
136-181: Good approach to unify backfill entry points.
run_backfillcleanly delegates to eitherbackfill_missedorbackfill_slotsbased on theBackfillTargetenum. This is straightforward to maintain, and the logging is suitably descriptive. If further logic overlaps, consider factoring out shared steps.
236-292: Consider refining the break condition.When a raw block is missing, you break the entire loop to avoid advancing
LastProcessedMissedSlot. This ensures no missed slots are skipped, but it also permanently halts backfilling. A single missing block can stall future missed slots.nft_ingester/src/inmemory_slots_dumper.rs (1)
1-1: Switch toArc<Mutex<_>>and#[derive(Clone)]is appropriate.Wrapping
BTreeSet<u64>inArc<Mutex<...>>and derivingCloneallows multiple components to share and synchronize writes. This change is consistent with typical concurrency patterns.Also applies to: 7-7, 9-9, 19-19
nft_ingester/src/bin/slot_checker/main.rs (1)
3-3: New imports are well-aligned with the refactor.Adding
HashSetensures straightforward slot deduplication, and usingtokio::signalplusInMemorySlotsDumperis consistent with the revised slot-handling logic.Also applies to: 15-15, 20-20
nft_ingester/src/bin/slot_persister/main.rs (10)
1-7: Imports look good.No issues found with these newly introduced imports.
10-10: No concerns for BUBBLEGUM_PROGRAM_ID import.This change appears correct and consistent with the rest of the codebase.
17-17: Metrics import looks fine.No issues found with the newly introduced metrics imports.
23-27: RocksDB columns usage is consistent.No concerns regarding these references to typed columns.
44-50: Large default offset can be memory-intensive.This offset (1,000,000) may cause significant memory usage if the range of missing slots is massive. Consider chunking or streaming.
52-52: Deriving Parser, Debug, and Clone looks standard.No additional feedback needed.
109-134: Consider robust error handling for missing/corrupted column families.Reusing a comment from a previous review. The current approach relies on
unwrap()andunwrap_or_default(), which may not gracefully handle unexpected DB corruption or missing column families.
559-559: No additional feedback.This blank line change requires no further remarks.
560-770: Naming conflict with nestedget_missed_slots.A local async function named
get_missed_slotsexists within another function of the same name. This can be confusing; consider renaming the inner function (e.g.,retrieve_missed_slots_range) for clarity.
560-770: Potential memory overhead with large offset.Collecting up to
slot_backfill_offsetworth of slots in memory may be costly if extremely large. This concern was previously noted. A chunked or streaming approach might be more robust.
cd61aad to
7d944de
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
♻️ Duplicate comments (1)
nft_ingester/src/bin/slot_persister/main.rs (1)
560-606: 🧹 Nitpick (assertive)Inner function shadowing the parent function name
Declaringasync fn get_missed_slots(...)insideasync fn get_missed_slots(...)may confuse future maintainers. Consider renaming the inner helper (e.g.,retrieve_missed_slots_range) to differentiate them clearly.- async fn get_missed_slots( + async fn retrieve_missed_slots_range( in_mem_dumper: Arc<InMemorySlotsDumper>, ... ) -> Option<(u64, u64, Vec<u64>)> { ... }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Lite
📒 Files selected for processing (7)
backfill_rpc/src/rpc.rs(1 hunks)nft_ingester/src/backfiller.rs(6 hunks)nft_ingester/src/bin/slot_checker/main.rs(1 hunks)nft_ingester/src/bin/slot_persister/main.rs(6 hunks)nft_ingester/src/inmemory_slots_dumper.rs(2 hunks)usecase/src/bigtable.rs(1 hunks)usecase/src/slots_collector.rs(1 hunks)
🧰 Additional context used
🧬 Code Definitions (3)
nft_ingester/src/inmemory_slots_dumper.rs (3)
nft_ingester/src/backfiller.rs (3)
new(49-66)new(121-127)new(372-377)usecase/src/slots_collector.rs (1)
new(81-87)usecase/src/bigtable.rs (1)
new(27-32)
nft_ingester/src/backfiller.rs (2)
rocks-db/src/columns/raw_block.rs (8)
encode_key(18-20)encode_key(41-43)encode_key(59-61)encode_key(85-87)decode_key(22-24)decode_key(45-47)decode_key(63-65)decode_key(89-91)rocks-db/src/columns/parameters.rs (2)
encode_key(26-28)decode_key(30-33)
nft_ingester/src/bin/slot_persister/main.rs (3)
rocks-db/src/columns/raw_block.rs (4)
decode_key(22-24)decode_key(45-47)decode_key(63-65)decode_key(89-91)nft_ingester/src/inmemory_slots_dumper.rs (1)
new(18-20)usecase/src/slots_collector.rs (1)
new(81-87)
🔇 Additional comments (14)
usecase/src/bigtable.rs (1)
20-20: Approved addingClonetraitAdding the
Clonetrait to theBigTableClientstruct is appropriate since both fields are already wrapped inArc, making cloning cheap and safe.backfill_rpc/src/rpc.rs (1)
40-40: Appropriate addition ofClonetraitAdding the
Clonetrait toBackfillRPCis appropriate since its only field (BackoffRpcClient) already implementsClone(as seen on line 206). This change is consistent with the PR objective to enhance slot processing capabilities.usecase/src/slots_collector.rs (1)
65-65: Approved addingClonetraitThe
Clonetrait addition is appropriate forSlotsCollectoras all fields are already wrapped inArc, making the clone operation efficient by only cloning the reference-counted pointers rather than the underlying data.nft_ingester/src/inmemory_slots_dumper.rs (1)
1-1: Appropriate modification for shared state between clonesWrapping the
Mutexin anArcand adding theClonetrait ensures that all clones ofInMemorySlotsDumpershare the same underlying state. This is a good approach for maintaining consistency when the struct needs to be cloned.Also applies to: 7-9, 19-19
nft_ingester/src/bin/slot_checker/main.rs (3)
3-3: Proper use of HashSet for duplicate filtering
The addition ofHashSetis a straightforward approach to eliminate duplicate slots. This looks correct and is used effectively below.
15-15: ImportingInMemorySlotsDumper
The import is used when creating thein_mem_dumperinstance. This aligns with the logic for collecting slots and seems valid.
20-20: Graceful shutdown signal import
Usingtokio::signalfor handling Ctrl+C helps ensure a graceful shutdown. No concerns here.nft_ingester/src/bin/slot_persister/main.rs (4)
1-7: New imports for HashMap, Arc, Mutex, etc.
These imports (e.g.,HashMap,Arc,Mutex) are consistent with the concurrency and data structures used later in the file.
93-98: Added CLI arguments for slot backfill interval & offset
Introducingslot_backfill_intervalandslot_backfill_offsetprovides flexible, configurable backfill scheduling. Be mindful of large default offsets (1,000,000) if the data range is extremely large.
109-134: New functionget_checkpointmerges checkpoint logic
This function retrieves the last consistency checkpoint or falls back to the firstRawBlock. The approach is clean and idiomatic. One minor caution: if theSlotConsistencyCheckpointCF is missing or corrupted, the default fallback is0. Ensure this behavior is intentional.
708-770: Periodic loop for missed slots backfill
The main loop repeatedly attempts to find and process missed slots, then sleeps based onslot_backfill_interval. The logic is sound. Confirm that partial progress is persisted before shutdown to avoid redoing work.nft_ingester/src/backfiller.rs (3)
6-6: IntroducingEitherfor branching futures
UsingEitherfromfutures::futureis an elegant way to conditionally runbackfill_missedorbackfill_slots. This is a clean abstraction.
136-168:run_backfillorchestrating different backfill targets
The loop pattern withEitherfor missed or regular slots is straightforward. The retry interval and cancellation token usage are appropriate. This structure is well-organized for extension.
236-292: Newbackfill_missedfunction
Backfilling missed slots by iterating overMissedSlotsIdxand stopping on unreadable slots ensures we don’t lose track of unprocessed data. Logging warnings for missing blocks is helpful. Possibly queue them for later re-try if feasible.
7d944de to
702f0ed
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 5
♻️ Duplicate comments (1)
nft_ingester/src/bin/slot_persister/main.rs (1)
244-251: 🧹 Nitpick (assertive)Spawned task lacks join handle management.
Spawningget_missed_slotsand not storing theJoinHandlecan obscure potential runtime errors or panics. Consider capturing and handling the task result for better observability.+ let missed_slots_handle = usecase::executor::spawn(get_missed_slots( + backfill_source.clone(), + target_db.clone(), + args.clone(), + metrics_state.backfiller_metrics.clone(), + cancellation_token.child_token(), + )); + // Optionally: missed_slots_handle.await?;
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Lite
📒 Files selected for processing (7)
backfill_rpc/src/rpc.rs(1 hunks)nft_ingester/src/backfiller.rs(6 hunks)nft_ingester/src/bin/slot_checker/main.rs(1 hunks)nft_ingester/src/bin/slot_persister/main.rs(6 hunks)nft_ingester/src/inmemory_slots_dumper.rs(2 hunks)usecase/src/bigtable.rs(1 hunks)usecase/src/slots_collector.rs(1 hunks)
🧰 Additional context used
🧬 Code Definitions (3)
nft_ingester/src/inmemory_slots_dumper.rs (3)
nft_ingester/src/backfiller.rs (3)
new(49-66)new(121-127)new(372-377)usecase/src/slots_collector.rs (1)
new(81-87)usecase/src/bigtable.rs (1)
new(27-32)
nft_ingester/src/backfiller.rs (2)
rocks-db/src/columns/raw_block.rs (8)
encode_key(18-20)encode_key(41-43)encode_key(59-61)encode_key(85-87)decode_key(22-24)decode_key(45-47)decode_key(63-65)decode_key(89-91)rocks-db/src/columns/parameters.rs (2)
encode_key(26-28)decode_key(30-33)
nft_ingester/src/bin/slot_persister/main.rs (2)
rocks-db/src/columns/raw_block.rs (4)
decode_key(22-24)decode_key(45-47)decode_key(63-65)decode_key(89-91)nft_ingester/src/inmemory_slots_dumper.rs (1)
new(18-20)
🔇 Additional comments (17)
usecase/src/bigtable.rs (1)
20-20: Added Clone derive: Good for shared instance needs.This addition of
#[derive(Clone)]toBigTableClientenables instances to be cloned, which is useful when sharing the client across different components or tasks. This is a safe change because the internal fields are already wrapped inArc, ensuring that cloning will be shallow and efficient.usecase/src/slots_collector.rs (1)
65-65: Added Clone derive: Appropriate for this Arc-containing struct.Adding
#[derive(Clone)]toSlotsCollectoris a good enhancement that enables cloning of collector instances. Since all fields are already wrapped inArc, this implementation of Clone will perform shallow cloning, which is efficient and appropriate for this use case.backfill_rpc/src/rpc.rs (1)
40-40: Added Clone derive: Good addition for RPC client sharing.Adding
#[derive(Clone)]toBackfillRPCaligns with the struct's design and allows for easier sharing of RPC client instances. The fieldclientis already of typeBackoffRpcClientwhich itself derives Clone (line 206), so this change is consistent with the existing architecture.nft_ingester/src/inmemory_slots_dumper.rs (2)
1-1: Modified struct design for shared state across clones.The changes to add
#[derive(Clone)]and wrap theMutexin anArcare good architectural improvements. This ensures that whenInMemorySlotsDumperinstances are cloned, they will share the same underlying slot collection, allowing multiple components to access and modify the same set of slots.Also applies to: 7-7, 9-9
19-19: Constructor updated to match the field type change.The constructor has been properly updated to initialize the
slotsfield with anArc-wrapped Mutex, consistent with the field's new type definition. This ensures proper initialization when new instances are created.nft_ingester/src/bin/slot_checker/main.rs (3)
3-3: Confirm intended usage ofHashSet.
Importingcollections::HashSetfor deduplicating or membership checks is perfectly valid. If you ever need sorted ordering when removing duplicates, considerBTreeSet.
15-15: New import forInMemorySlotsDumperis consistent.
TheInMemorySlotsDumperaligns well with your slot-collection logic. No issues found.
20-20: Graceful shutdown approach is appropriate.
Usingtokio::signalfor Ctrl+C handling is a good practice for clean cancellation.nft_ingester/src/bin/slot_persister/main.rs (5)
1-7: No issues with the newly introduced imports.
Imports forHashMap,PathBuf,Arc<Mutex>, andDurationappear standard and correctly utilized for concurrency, path handling, and timing.
10-10: Bubblegum program ID import is straightforward.
Referencing theBUBBLEGUM_PROGRAM_IDhere is consistent with the ingester's usage.
15-15: Metrics utilities import looks clean.
No functionality issues spotted; metrics usage seems well-structured elsewhere in the file.
23-24: New columns for missed slots and slot consistency.
AddingMissedSlotsIdxandSlotConsistencyCheckpointcolumns aligns with the new backfill logic. Ensure the DB migration and usage patterns are well-tested to avoid potential key collisions.
559-769: Nestedget_missed_slotsnaming and potential memory overhead.
- Nested function name clash: Declaring a nested function with the same name (
get_missed_slots) can be confusing. Consider renaming the inner helper to avoid ambiguity.- Memory usage: Large slot offset ranges may collect many slots in memory. For extremely large ranges, implement chunked processing to avoid excessive memory usage.
These concerns mirror previous suggestions in older commits regarding naming and chunked or streaming approaches, so tagging as repeated feedback.
nft_ingester/src/backfiller.rs (4)
6-6: Conditional concurrency utility is well-chosen.
Importingfutures::future::Eitherprovides a concise way to branch based on conditions, particularly inrun_backfill.
15-17: Adding new columns to handle missed slots and parameters.
Your usage ofparameters::Parameterandraw_block::MissedSlotsIdxis consistent with indexing missed slots and saving config. Just ensure that your DB schema or migrations handle the new columns properly.
24-24: Tracing macros for error/info/warn are standard.
No issues found.
136-181:run_backfilllogic is well-structured with branching.
UsingEither::LeftorRightis a neat pattern for picking the correct backfill function. The small sleep approach is simple, though you might want to ensure no busy waiting in advanced scenarios.
702f0ed to
4d29349
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 5
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Lite
📒 Files selected for processing (7)
backfill_rpc/src/rpc.rs(1 hunks)nft_ingester/src/backfiller.rs(6 hunks)nft_ingester/src/bin/slot_checker/main.rs(1 hunks)nft_ingester/src/bin/slot_persister/main.rs(6 hunks)nft_ingester/src/inmemory_slots_dumper.rs(2 hunks)usecase/src/bigtable.rs(1 hunks)usecase/src/slots_collector.rs(1 hunks)
🧰 Additional context used
🧬 Code Definitions (2)
nft_ingester/src/inmemory_slots_dumper.rs (3)
nft_ingester/src/backfiller.rs (3)
new(49-66)new(121-127)new(372-377)usecase/src/slots_collector.rs (1)
new(81-87)usecase/src/bigtable.rs (1)
new(27-32)
nft_ingester/src/bin/slot_persister/main.rs (3)
rocks-db/src/columns/raw_block.rs (4)
decode_key(22-24)decode_key(45-47)decode_key(63-65)decode_key(89-91)nft_ingester/src/inmemory_slots_dumper.rs (1)
new(18-20)usecase/src/slots_collector.rs (1)
new(81-87)
🔇 Additional comments (17)
usecase/src/slots_collector.rs (1)
65-65: LGTM: Adding Clone derive is appropriate.Adding
#[derive(Clone)]to theSlotsCollectorstruct is a good addition since all fields are already wrapped inArc, which makes them safely shareable across multiple clones. This change aligns with the PR objective of supporting periodic missed slot fetch functionality.usecase/src/bigtable.rs (1)
20-20: LGTM: Adding Clone derive is appropriate.Adding
#[derive(Clone)]to theBigTableClientstruct is correct since both fields are already wrapped inArc, enabling safe sharing across threads. This facilitates using this struct in the new missed slot fetch functionality.backfill_rpc/src/rpc.rs (1)
40-40: LGTM: Adding Clone derive is appropriate.Adding
#[derive(Clone)]to theBackfillRPCstruct is a good addition. The struct has a single fieldclientof typeBackoffRpcClient, which is alreadyClone, making this change straightforward and safe. This enhancement supports the periodic missed slot fetch functionality.nft_ingester/src/inmemory_slots_dumper.rs (4)
1-1: LGTM: Adding Arc import is correct.The import of
Arcfromstd::syncis necessary to support the changes made to the internal field type in this struct.
7-7: LGTM: Adding Clone derive is appropriate.The
#[derive(Clone)]addition is consistent with the other structs modified in this PR and enables sharing this struct across multiple consumers in the slot persistence system.
9-9: LGTM: Wrapping Mutex in Arc enables shared ownership.Changing the
slotsfield fromMutex<BTreeSet<u64>>toArc<Mutex<BTreeSet<u64>>>is a good enhancement. This allows multiple clones of theInMemorySlotsDumperto share ownership of the same underlying collection, which is important for the periodic missed slot fetch functionality.
19-19: LGTM: Constructor updated to use Arc.The constructor has been correctly updated to initialize the
slotsfield with anArcwrapper around theMutex.nft_ingester/src/bin/slot_checker/main.rs (3)
3-3: UsingHashSetfor duplicate handling looks good.
This import is appropriate to ensure we can efficiently track distinct slots.
15-15: AddingInMemorySlotsDumperusage.
Ensures consistency with in-memory slot-logging logic. No further issues noted.
20-20: Graceful shutdown mechanism viatokio::signal.
Usingsignal::ctrl_cis standard practice for clean teardown.nft_ingester/src/bin/slot_persister/main.rs (5)
1-7: Consolidated imports fromstd.
These additions (e.g.,Ordering,HashMap, etc.) appear necessary and well-organized.
10-10: Bubblegum program import.
No immediate concerns; usage looks aligned with broader references toBUBBLEGUM_PROGRAM_ID.
17-27: Including metrics and RocksDB columns (MissedSlotsIdx, SlotConsistencyCheckpoint).
These imports align with the missed slot backfill feature. They appear correctly referenced elsewhere.
52-52: Struct derivingClonefor command-line parser.
This is helpful, especially if the struct is cloned across threads. Looks good.
560-789: Nestedget_missed_slotsnaming and large range handling.
- The inner function shadowing the outer function name can confuse maintainers; consider a distinct name (e.g.,
retrieve_missed_slots_range).- When handling large offsets, be mindful of potential resource usage.
nft_ingester/src/backfiller.rs (2)
6-6: Introducingfutures::future::Either.
No issues noted; it simplifies branching logic for asynchronous tasks.
136-181: Refinedrun_backfillwith aBackfillTargetparameter.
The branching withEither::LeftandEither::Rightis a neat approach to unify missed vs. regular slot handling.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 5
♻️ Duplicate comments (3)
rocks-db/src/columns/raw_block.rs (1)
79-103: 🧹 Nitpick (assertive)Consider documenting the
SlotConsistencyCheckpointsemantics.While the implementation is correct, adding a brief docstring describing the purpose and usage patterns of this checkpoint would improve maintainability for future contributors.
+/// Represents a checkpoint for slot consistency verification. +/// Used to track which slots have been verified for consistency +/// in the missed slots processing pipeline. #[derive(Serialize, Deserialize, Clone)] pub struct SlotConsistencyCheckpoint;nft_ingester/src/bin/slot_persister/main.rs (2)
248-255: 🛠️ Refactor suggestionSpawned missed slot processing without join handle
Consider capturing the join handle to detect panics or failures in
get_missed_slots. Unnoticed panic may suppress error visibility.-usecase::executor::spawn(get_missed_slots( +let missed_slots_handle = usecase::executor::spawn(get_missed_slots( backfill_source.clone(), target_db.clone(), args.clone(), metrics_state.backfiller_metrics.clone(), cancellation_token.child_token(), )); +// Optionally: missed_slots_handle.await?;
97-101: 🛠️ Refactor suggestionNew CLI fields for missed slot backfill
slot_backfill_intervalandslot_backfill_offsetcould benefit from minimal validation (e.g., ensuring nonzero values) to prevent misconfiguration.#[arg(long, env, help = "How often to check for missed slots (in seconds)", default_value_t = DEFAULT_SLOT_BACKFILL_INTERVAL_SECS)] pub slot_backfill_interval: u64, #[arg(long, env, help = "The number of slots by which to check & advance the next sequential slots", default_value_t = DEFAULT_SLOT_BACKFILL_OFFSET)] pub slot_backfill_offset: u64, + // Example validation approach (pseudocode): + fn validate_backfill_args(&self) -> Result<(), String> { + if self.slot_backfill_interval == 0 || self.slot_backfill_offset == 0 { + return Err("Values must be nonzero".into()); + } + Ok(()) + }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Lite
📒 Files selected for processing (6)
.env.example(2 hunks)nft_ingester/src/backfiller.rs(6 hunks)nft_ingester/src/bin/ingester/main.rs(1 hunks)nft_ingester/src/bin/slot_persister/main.rs(7 hunks)rocks-db/src/columns/parameters.rs(1 hunks)rocks-db/src/columns/raw_block.rs(2 hunks)
🧰 Additional context used
🧬 Code Definitions (3)
rocks-db/src/columns/raw_block.rs (2)
rocks-db/src/columns/parameters.rs (4)
encode_key(26-28)decode_key(30-33)encode(35-38)decode(40-43)rocks-db/src/key_encoders.rs (4)
encode_u64x2(157-162)decode_u64x2(164-166)encode_u64(88-90)decode_u64(92-95)
nft_ingester/src/backfiller.rs (2)
rocks-db/src/columns/raw_block.rs (8)
encode_key(18-20)encode_key(41-43)encode_key(62-64)encode_key(88-90)decode_key(22-24)decode_key(45-47)decode_key(66-68)decode_key(92-94)rocks-db/src/columns/parameters.rs (2)
encode_key(26-28)decode_key(30-33)
nft_ingester/src/bin/slot_persister/main.rs (7)
nft_ingester/src/config.rs (1)
parse_json(554-556)nft_ingester/src/api/dapi/change_logs.rs (1)
rocks_db(146-166)rocks-db/src/columns/raw_block.rs (4)
decode_key(22-24)decode_key(45-47)decode_key(66-68)decode_key(92-94)nft_ingester/src/backfiller.rs (5)
new(49-66)new(121-127)new(362-367)db(192-192)db(245-245)nft_ingester/src/inmemory_slots_dumper.rs (1)
new(18-20)usecase/src/slots_collector.rs (1)
new(81-87)metrics_utils/src/lib.rs (8)
new(34-41)new(68-86)new(138-144)new(174-183)new(249-256)new(302-318)new(417-431)new(668-675)
🔇 Additional comments (18)
rocks-db/src/columns/raw_block.rs (1)
50-77: Good implementation ofMissedSlotsIdxwith clear documentation.The implementation correctly handles the encoding and decoding of the composite key type
(u64, u64)using the appropriate key encoders. The comment explaining the purpose of the sequence number for decoupling missed slots order is helpful..env.example (1)
42-44: Well-structured environment variables for slot persistence.The added environment variables provide good defaults for configuring the slot backfill process:
SLOT_BACKFILL_INTERVAL=120- Sets a reasonable default interval (in seconds)SLOT_BACKFILL_OFFSET=1000000- Sets a large enough offset to handle historical missed slotsThese values will help users configure the slot persister's operation without code changes.
nft_ingester/src/bin/ingester/main.rs (2)
553-571: Good refactoring to support both regular and missed slot processing.The implementation properly initializes a separate task for regular slot backfilling, reusing the existing consumer, database, metrics and slot storage. This maintains the original functionality while setting up for the new missed slots feature.
572-590: Effective implementation of missed slots processing.You've correctly updated the task to use
BackfillTarget::MissedSlotswhile sharing the consumer with the regular slots processing. This approach avoids code duplication and maintains consistent behavior across both backfill types.rocks-db/src/columns/parameters.rs (1)
11-11: Good addition of theLastProcessedMissedSlotKeyparameter.This new parameter provides the necessary state tracking for the missed slots processing feature, following the same pattern as the other slot-tracking parameters in the enum.
nft_ingester/src/bin/slot_persister/main.rs (7)
1-7: Added imports for ordering, collections, and concurrencyNo immediate concerns. These imports align with the new missed slots logic for comparisons and data handling.
10-10: Alias bubblegum program IDNo issues with importing and aliasing the bubblegum program ID for reference.
17-17: Ensure all newly imported traits are used
MetricsTraitis imported. Confirm that it’s consistently utilized in the code to avoid dead imports.
23-27: New columns for missed slots and checkpointNo issues.
MissedSlotsIdxandSlotConsistencyCheckpointhelp track missed slots and store consistency checkpoints.
48-54: Default backfill settings
DEFAULT_SLOT_BACKFILL_INTERVAL_SECS= 120 seconds andDEFAULT_SLOT_BACKFILL_OFFSET= 1,000,000. Verify they suit the expected load and memory constraints, especially if large slot ranges are processed.
113-138: Checkpoint fallback logicThis function checks
SlotConsistencyCheckpointfirst, then falls back to scanningRawBlock. The approach is sound. For large databases, consider indexing or early exit if no checkpoints.
738-803: Periodic missed slot backfill loopThe scheduling loop updates the checkpoint, processes missed slots, and then sleeps. This is straightforward. Ensure logs remain manageable if missed slot checks happen frequently without results.
nft_ingester/src/backfiller.rs (6)
6-6: IntroducedEitherfrom futuresThis is used to branch logic in
run_backfill. No issues identified.
15-17: Adding columns and parameters references
MissedSlotsIdxandParameterare integrated here. No immediate concerns.
24-24: Expanded tracing levelsUsing
error,info, andwarn. No issues; consistent practice for diagnostic visibility.
136-142: Refined backfill dispatcher
run_backfilldelegates to eitherbackfill_slotsorbackfill_missedbased ontarget. Clean approach. Confirm test coverage.
155-161: Handling missed slots withEither::LeftCombining
Eitherwithtokio::select!is an elegant way to handle concurrent tasks. No immediate issues.
235-282: Partial processing inbackfill_missedWhen a missed slot has no raw block, the loop breaks and subsequent missed slots remain unprocessed. If partial missed-slot handling is undesirable, consider skipping only that missing slot or logging it differently.
Would you like a script to scan for any large sequences of missed slots that might remain unprocessed?
97da515 to
40df4ee
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 10
♻️ Duplicate comments (1)
rocks-db/src/key_encoders.rs (1)
157-166: 🧹 Nitpick (assertive)Missing unit tests for new encode/decode functions.
While the implementation is correct, there are no unit tests for the new
encode_u64x2anddecode_u64x2functions, unlike other encoder/decoder pairs in this file.Consider adding tests to maintain consistent test coverage:
#[test] fn test_encode_decode_u64x2() { let val1 = 12345u64; let val2 = 67890u64; let encoded = encode_u64x2((val1, val2)); let decoded = decode_u64x2(&encoded).unwrap(); assert_eq!(decoded.0, val1); assert_eq!(decoded.1, val2); } #[test] fn test_decode_u64x2_invalid_data() { let invalid_data = [1u8, 2, 3]; // Too short assert!(decode_u64x2(&invalid_data).is_err()); }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Lite
📒 Files selected for processing (13)
.env.example(2 hunks)backfill_rpc/src/rpc.rs(1 hunks)nft_ingester/src/backfiller.rs(6 hunks)nft_ingester/src/bin/ingester/main.rs(1 hunks)nft_ingester/src/bin/slot_checker/main.rs(1 hunks)nft_ingester/src/bin/slot_persister/main.rs(7 hunks)nft_ingester/src/inmemory_slots_dumper.rs(2 hunks)rocks-db/src/columns/parameters.rs(1 hunks)rocks-db/src/columns/raw_block.rs(2 hunks)rocks-db/src/key_encoders.rs(1 hunks)rocks-db/src/lib.rs(4 hunks)usecase/src/bigtable.rs(1 hunks)usecase/src/slots_collector.rs(1 hunks)
🧰 Additional context used
🧠 Learnings (1)
rocks-db/src/key_encoders.rs (2)
Learnt from: armyhaylenko
PR: metaplex-foundation/aura#465
File: rocks-db/src/key_encoders.rs:157-162
Timestamp: 2025-04-03T14:15:26.033Z
Learning: The `encode_u64x2` function in the key_encoders module intentionally returns a fixed-size array `[u8; 16]` instead of `Vec<u8>` to avoid heap allocations (syscalls) for performance reasons, despite the API inconsistency with other encoder functions.
Learnt from: armyhaylenko
PR: metaplex-foundation/aura#465
File: rocks-db/src/key_encoders.rs:164-166
Timestamp: 2025-04-03T14:15:26.033Z
Learning: The `decode_u64x2` function in the key_encoders module doesn't need explicit size validation because the `try_into()` method will implicitly validate the input size when converting byte slices to fixed-size arrays.
🧬 Code Definitions (6)
nft_ingester/src/bin/ingester/main.rs (1)
nft_ingester/src/backfiller.rs (3)
db(192-192)db(245-245)run_backfill(136-180)
nft_ingester/src/inmemory_slots_dumper.rs (3)
nft_ingester/src/backfiller.rs (3)
new(49-66)new(121-127)new(362-367)usecase/src/slots_collector.rs (1)
new(81-87)usecase/src/bigtable.rs (1)
new(27-32)
rocks-db/src/columns/raw_block.rs (2)
rocks-db/src/lib.rs (3)
Self(441-441)new(93-105)new(227-320)rocks-db/src/key_encoders.rs (4)
encode_u64x2(157-162)decode_u64x2(164-166)encode_u64(88-90)decode_u64(92-95)
rocks-db/src/lib.rs (2)
rocks-db/src/columns/asset.rs (1)
new(3345-3347)rocks-db/src/column.rs (1)
iter(301-308)
nft_ingester/src/bin/slot_persister/main.rs (6)
nft_ingester/src/config.rs (1)
parse_json(554-556)rocks-db/src/columns/raw_block.rs (4)
decode_key(22-24)decode_key(45-47)decode_key(66-68)decode_key(92-94)nft_ingester/src/backfiller.rs (5)
new(49-66)new(121-127)new(362-367)db(192-192)db(245-245)nft_ingester/src/inmemory_slots_dumper.rs (1)
new(18-20)usecase/src/slots_collector.rs (1)
new(81-87)metrics_utils/src/lib.rs (8)
new(34-41)new(68-86)new(138-144)new(174-183)new(249-256)new(302-318)new(417-431)new(668-675)
nft_ingester/src/backfiller.rs (2)
rocks-db/src/columns/raw_block.rs (8)
encode_key(18-20)encode_key(41-43)encode_key(62-64)encode_key(88-90)decode_key(22-24)decode_key(45-47)decode_key(66-68)decode_key(92-94)rocks-db/src/columns/parameters.rs (2)
encode_key(26-28)decode_key(30-33)
🔇 Additional comments (38)
.env.example (1)
42-44: LGTM: Clear and well-documented environment variable additions.The two new environment variables for slot persister configuration are properly documented and have sensible default values. This aligns well with the PR objectives of adding periodic missed slot fetch functionality.
usecase/src/bigtable.rs (1)
20-20: AppropriateClonederivation added.Adding
#[derive(Clone)]toBigTableClientis appropriate as it enables multiple references to the same client instance without transferring ownership, which is particularly useful in asynchronous contexts for the slot persistence functionality.backfill_rpc/src/rpc.rs (1)
40-40: AppropriateClonederivation added.Adding
#[derive(Clone)]toBackfillRPCis appropriate as it enables multiple references to the same RPC client, which is necessary for the new periodic missed slot fetch functionality.usecase/src/slots_collector.rs (1)
65-65: AppropriateClonederivation added.Adding
#[derive(Clone)]toSlotsCollectorallows the collector to be cloned when needed, which aligns with the other similar changes inBigTableClientandBackfillRPC. This modification enables better handling in concurrent contexts for the new missed slot processing functionality.nft_ingester/src/bin/ingester/main.rs (2)
552-571: Added separate task for regular slot backfilling.This addition properly implements a dedicated task for processing regular slots using the backfill functionality with the appropriate target enum value. Using a separate asynchronous task allows for parallel processing of different slot types.
572-590: Added parallel task for missed slot backfilling.The implementation correctly reuses the same consumer instance for both backfill tasks, which is efficient. This missed slots backfill task runs concurrently with the regular slots task, allowing the system to process both types of slots simultaneously.
rocks-db/src/columns/parameters.rs (1)
10-11: Added parameter enum variant for tracking missed slots.The new
LastProcessedMissedSlotKeyenum variant provides storage capability for tracking the last processed missed slot, which is essential for the new missed slots backfilling functionality.nft_ingester/src/inmemory_slots_dumper.rs (3)
1-1: Updated imports to include Arc.The import of
Arcfromstd::syncis needed for the modification to theInMemorySlotsDumperstruct.
7-9: Made InMemorySlotsDumper cloneable with thread-safe state.The addition of
#[derive(Clone)]and changing theslotsfield to useArc<Mutex<...>>enables safe sharing of theInMemorySlotsDumperinstance across multiple concurrent tasks.This change is necessary to support the parallel backfill tasks for both regular and missed slots.
19-19: Updated constructor to initialize with Arc-wrapped Mutex.The constructor now correctly initializes the
slotsfield with anArcwrapper around theMutex, which ensures consistent behavior with the updated struct definition.rocks-db/src/key_encoders.rs (2)
157-162: Added efficient encoder for u64 pairs.The implementation of
encode_u64x2returns a fixed-size array instead of a Vec to avoid heap allocations, which is a deliberate performance optimization.
164-166: Added decoder for u64 pairs with implicit size validation.The implementation of
decode_u64x2correctly handles conversion from byte slices to u64 values. Thetry_into()method implicitly validates the input size when converting byte slices to fixed-size arrays.nft_ingester/src/bin/slot_checker/main.rs (3)
3-3: Use of HashSet for duplicate filtering
ImportingHashSetsimplifies deduplication of slots. This is a clean and efficient approach.
15-15: Potential inconsistency with InMemorySlotsDumper usage
The AI-generated summary mentioned removingInMemorySlotsDumper, yet this addition reintroduces its usage. Please ensure consistency with the rest of the slot-management logic to avoid confusion or orphaned references.Likely an incorrect or invalid review comment.
20-20: Graceful shutdown handling
Usingtokio::signalis a standard and effective way to handle Ctrl+C and other termination signals. No issues found.rocks-db/src/lib.rs (5)
23-26: Importing concurrency primitives
These imports (AtomicU64,Arc, etc.) are necessary for thread-safe slot tracking. No issues found here.
64-64: Importing typed columns for missed slot tracking
BringingMissedSlotsIdxandSlotConsistencyCheckpointinto scope aligns with the new missed slots and consistency checkpoint logic. This is consistent with their usage below.
86-88: Extending SlotStorage fields
Addingslot_consistency_checkpoint,missed_slots_idx, andmissed_slots_last_seqsuitably extends the storage capabilities for missed slot indexing and slot consistency tracking.
95-104: Consistent initialization of new columns
Initializing the new SlotStorage fields ensures they are ready for use upon creation. The approach of returning aSelfwith all references wired up is standard.
108-114: Adding CF names for the new tables
Explicitly includingSlotConsistencyCheckpoint::NAMEandMissedSlotsIdx::NAMEincf_names()ensures these column families are recognized on database creation or opening.nft_ingester/src/bin/slot_persister/main.rs (14)
1-7: Approving new standard library imports
These imports (Ordering, HashMap, PathBuf, Arc/Mutex, Duration) appear correct and necessary for the subsequent logic.
10-10: New import for Bubblegum program ID
This aligns well with the usage incollect_slots. No issues found.
17-17: Confirmed correct usage of metrics utilities
Importingstart_metrics,BackfillerMetricsConfig, andMetricStateis consistent with how these are used below.
23-27: Imports for missed slots tracking
Bringing inMissedSlotsIdxandSlotConsistencyCheckpointis appropriate for managing missed slot data in RocksDB.
47-55: Well-documented constants
DefiningDEFAULT_SLOT_BACKFILL_INTERVAL_SECSandDEFAULT_SLOT_BACKFILL_OFFSETwith inline comments is clear.
56-57: Derived CLI parser for new arguments
Marking the struct withParseris correct to expose new fields as command-line flags.
65-65: Environment-based DB path
Linkingtarget_db_pathto theROCKS_SLOTS_DB_PATHenvironment variable is a flexible approach.
69-69: Enabling environment variable for RPC host
No issues. This helps configure the RPC endpoint dynamically.
86-86: Chunk size default
A default of 200 slots per batch is generally reasonable.
90-90: Max concurrency
Using a default concurrency of 20 can balance load and performance.
93-96: Additional CLI argument configurations
Your new or modified#[arg]attributes are consistent with the rest of the CLI fields. No immediate concerns.
113-138: Retrieving slot consistency checkpoint
Good fallback logic: triesSlotConsistencyCheckpoint, thenRawBlock.
Optionally, log when fallback is used for easier debugging.
563-563: No functional changes on this blank line
738-803: Periodic missed slot loop
Sleeping and re-checking is straightforward. Confirm that partial progress is handled if the process is interrupted mid-backfill.nft_ingester/src/backfiller.rs (4)
6-6: New import for branching futures
Pulling inEitherfromfutures::futureis a valid approach for the backfill logic.
25-25: New tracing import
Addingwarnis consistent with the usage for backfill operations.
130-134: New enum for backfill target
DistinguishingRegularSlotsfromMissedSlotsis a clean approach. Adding doc strings for each variant could further improve clarity.
136-169: Refactoredrun_backfill
Switching execution paths based onBackfillTargetis straightforward. Verify that resource usage is handled uniformly in both branches.
StanChe
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great job, thank you!
Some minor concerns for the potential infinite loop and some optional refactor.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 7
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Lite
📒 Files selected for processing (2)
docker-compose.yaml(0 hunks)nft_ingester/src/bin/slot_persister/main.rs(7 hunks)
💤 Files with no reviewable changes (1)
- docker-compose.yaml
🧰 Additional context used
🧬 Code Definitions (1)
nft_ingester/src/bin/slot_persister/main.rs (6)
nft_ingester/src/sequence_consistent.rs (1)
None(72-72)rocks-db/src/columns/raw_block.rs (4)
decode_key(22-24)decode_key(45-47)decode_key(66-68)decode_key(92-94)nft_ingester/src/backfiller.rs (5)
new(49-66)new(121-127)new(362-367)db(192-192)db(245-245)nft_ingester/src/inmemory_slots_dumper.rs (1)
new(18-20)usecase/src/slots_collector.rs (1)
new(81-87)metrics_utils/src/lib.rs (8)
new(34-41)new(68-86)new(138-144)new(174-183)new(249-256)new(302-318)new(417-431)new(668-675)
🔇 Additional comments (3)
nft_ingester/src/bin/slot_persister/main.rs (3)
48-55: Validate or cap default values to avoid extreme scenarios.A default backfill offset of 1,000,000 slots could trigger high memory usage for edge cases or enormous backlogs. Although your code uses
min(slot_backfill_offset), consider adding explicit validation or bounding logic to assure the offset is never excessively large and to avoid memory or performance bottlenecks.
113-138: Handle the empty database case gracefully inget_checkpoint.Falling back to zero when neither
SlotConsistencyCheckpointnorRawBlockhas entries might lead to processing from slot 0 if the database is actually empty or corrupted. Ensure that this scenario aligns with requirements and won't re-ingest large ancient slot ranges unintentionally.
711-718: Prevent repeated scanning when no slots exist.If the RocksDB iterator yields no new valid keys and the collected slots are empty (or all missing), the checkpoint may not advance properly, so you might repeatedly scan the same range. Review the logic to ensure the next checkpoint moves forward or an alternate exit condition is used to avoid indefinite looping on empty ranges.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 6
♻️ Duplicate comments (1)
nft_ingester/src/backfiller.rs (1)
130-134: Consider adding doc comments forBackfillTarget.
This mirrors a previous suggestion to document the usage ofRegularSlotsvs.MissedSlots.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Lite
📒 Files selected for processing (1)
nft_ingester/src/backfiller.rs(6 hunks)
🧰 Additional context used
🧬 Code Definitions (1)
nft_ingester/src/backfiller.rs (2)
rocks-db/src/columns/raw_block.rs (8)
encode_key(18-20)encode_key(41-43)encode_key(62-64)encode_key(88-90)decode_key(22-24)decode_key(45-47)decode_key(66-68)decode_key(92-94)rocks-db/src/columns/parameters.rs (2)
encode_key(26-28)decode_key(30-33)
🔇 Additional comments (8)
nft_ingester/src/backfiller.rs (8)
6-6: Use ofEitherimport is clear and fine.
This import helps handle the branching between missed and regular slots in a single code path.
24-24: Consistent use ofwarnlog level.
Includingwarnalongsideinfoanderroris consistent with best practices for nuanced log reporting.
42-42:#[derive(Clone)]is appropriate.
Cloning is only duplicating references withinArc, so it remains efficient.
146-146: Logging target type is helpful.
Using structured logging with?targetprovides clarity when checking logs for each backfill mode.
154-168: Split futures withEither.
Branching intobackfill_missedorbackfill_slotswithEitherneatly avoids a complex match. This is clean and maintainable.
175-175: Graceful shutdown logic is consistent.
Repeating the same message for a shutdown signal keeps logs consistent across backfill flows.
235-248: InitializeLastProcessedMissedSlotKeywith care.
Storing(0u64, 0u64)uponNoneensures a starting point, but ensure concurrency or re-entrancy doesn't overwrite any in-progress updates from another process or thread.
276-278: Ignored consumer failure.
Unlike the missing block scenario, consumer failures do not halt processing. If these errors represent unrecoverable states, a consistent break or retry might be warranted.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 8
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Lite
📒 Files selected for processing (1)
nft_ingester/src/bin/slot_persister/main.rs(8 hunks)
🧰 Additional context used
🧬 Code Definitions (1)
nft_ingester/src/bin/slot_persister/main.rs (5)
rocks-db/src/columns/raw_block.rs (4)
decode_key(22-24)decode_key(45-47)decode_key(66-68)decode_key(92-94)nft_ingester/src/sequence_consistent.rs (1)
None(72-72)nft_ingester/src/backfiller.rs (5)
new(49-66)new(121-127)new(364-369)db(192-192)db(245-245)nft_ingester/src/inmemory_slots_dumper.rs (1)
new(18-20)usecase/src/slots_collector.rs (1)
new(81-87)
🔇 Additional comments (7)
nft_ingester/src/bin/slot_persister/main.rs (7)
47-55: Defaults look reasonable, but watch for extreme scenarios.The defaults for
DEFAULT_SLOT_BACKFILL_INTERVAL_SECS(2 minutes) andDEFAULT_SLOT_BACKFILL_OFFSET(500k slots) make sense for moderate workloads. If usage grows, consider verifying that these defaults won't cause delays or excessive memory usage.
56-66: Good use of environment variables and CLI arguments.The newly introduced CLI fields (
slot_backfill_intervalandslot_backfill_offset) facilitate configuration flexibility. This approach is a solid pattern. Ensure to handle invalid or zero values gracefully in practice, if there's a risk of misconfiguration.
69-70: RPC host CLI argument is clear.Using
#[arg(short, long, env)]forrpc_hostis straightforward. No immediate concerns noticed.
228-228: Straightforward usage for reading last persisted slot.Calling
get_last_persisted_slothere is appropriate. No concerns.
572-572: No functional changes observed.Line 572 seems to be a closing brace or whitespace. No further action needed.
573-579: Newget_missed_slotsfunction signature.This high-level async function integrates backfill logic. Be mindful of potential large internal state if the range is huge. Consider short-circuiting or chunking for extreme scenarios.
Would you like a script to check if usage patterns in your codebase always stay within safe bounds?
580-603: Informative example doc comments.The doc comments illustrate boundary calculations well, helping maintainers understand the arrangement of offsets. The examples are well-structured, though ensure any future changes to offset logic are reflected here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 7
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Lite
📒 Files selected for processing (1)
nft_ingester/src/bin/slot_persister/main.rs(8 hunks)
🧰 Additional context used
🧬 Code Definitions (1)
nft_ingester/src/bin/slot_persister/main.rs (4)
nft_ingester/src/config.rs (1)
parse_json(554-556)rocks-db/src/columns/raw_block.rs (4)
decode_key(22-24)decode_key(45-47)decode_key(66-68)decode_key(92-94)nft_ingester/src/backfiller.rs (5)
new(49-66)new(121-127)new(364-369)db(192-192)db(245-245)nft_ingester/src/inmemory_slots_dumper.rs (1)
new(18-20)
🔇 Additional comments (8)
nft_ingester/src/bin/slot_persister/main.rs (8)
48-48: Documentation note.
This doc comment helps new contributors quickly see the default interval. No issues here.
49-55: Descriptive commentary is clear.
These lines accurately describe how the checkpoint gap is computed and used.
56-56: Clap usage looks good.
Deriving multiple traits for Args is a common practice and poses no concerns.
65-65: Consistent env usage.
ReferencingROCKS_SLOTS_DB_PATHmatches the pattern used elsewhere.
69-69: Environment variable reference.
No issues, this is a typical Clap attribute for environment variables.
228-228: Trivial usage of helper function.
Callingget_last_persisted_slothere is straightforward and consistent with the rest of the code.
572-572: No revision needed.
This blank line doesn’t introduce problems or changes worthy of comment.
573-808: Naming conflict and potential memory usage concerns.
- The outer
async fn get_missed_slotscontains another function also namedget_missed_slots. This can lead to confusion. Consider renaming the inner helper (e.g.,retrieve_missed_slots_rangeor similar).- While this implementation is robust, it accumulates missed slots in memory. For extremely large ranges, consider chunked or streaming approaches to limit peak memory usage.
[duplicate_comment, suggest_optional_refactor]
| // Offset to start collecting missed slots from. This exists to prevent concurrent collection of | ||
| // later slots and marking of those as missed. | ||
| const MISSED_SLOT_COLLECTION_OFFSET: u64 = 10 * SLOT_COLLECTION_OFFSET; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧹 Nitpick (assertive)
Watch for large offset memory usage.
A multiplier of 10× can create substantial in-memory ranges if real-world offsets become large. Consider verifying that this is acceptable in your deployment environment or adding chunking logic to avoid potential high memory usage.
| #[arg(long, env, help = "How often to check for missed slots (in seconds)", default_value_t = DEFAULT_SLOT_BACKFILL_INTERVAL_SECS)] | ||
| slot_backfill_interval: u64, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧹 Nitpick (assertive)
Validate interval.
It may be helpful to ensure this value is above zero or within a practical range to avoid extremes.
| #[arg(long, env, help = "The number of slots by which to check & advance the next sequential slots", default_value_t = DEFAULT_SLOT_BACKFILL_OFFSET)] | ||
| slot_backfill_offset: u64, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧹 Nitpick (assertive)
Ensure a reasonable backfill offset.
Similarly, consider validating that slot_backfill_offset is non-zero and does not exceed safe bounds to avoid extended loops or excessive memory usage.
| } | ||
|
|
||
| pub fn get_last_persisted_slot(rocks_db: Arc<SlotStorage>) -> u64 { | ||
| pub fn get_last_persisted_slot(rocks_db: &SlotStorage) -> u64 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧹 Nitpick (assertive)
Log potential decode errors.
Using unwrap_or_default() may silently hide data corruption. Consider logging an error before falling back to zero.
| pub fn get_first_persisted_slot(rocks_db: &SlotStorage) -> u64 { | ||
| let mut it = rocks_db.db.raw_iterator_cf(&rocks_db.db.cf_handle(RawBlock::NAME).unwrap()); | ||
| it.seek_to_first(); | ||
| if !it.valid() { | ||
| return 0; | ||
| } | ||
| it.key().map(|b| RawBlock::decode_key(b.to_vec()).unwrap_or_default()).unwrap_or_default() | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧹 Nitpick (assertive)
Check for abnormal earliest slot conditions.
This logic is consistent with get_last_persisted_slot, but a warning log on decode failure might be helpful for diagnostic purposes.
| pub fn get_checkpoint(rocks_db: &SlotStorage) -> u64 { | ||
| let mut checkpoint = None; | ||
| let mut it = rocks_db | ||
| .db | ||
| .raw_iterator_cf(&rocks_db.db.cf_handle(SlotConsistencyCheckpoint::NAME).unwrap()); | ||
| it.seek_to_first(); | ||
| if it.valid() { | ||
| checkpoint = Some( | ||
| it.key() | ||
| .map(|b| SlotConsistencyCheckpoint::decode_key(b.to_vec()).unwrap_or_default()) | ||
| .unwrap_or_default(), | ||
| ) | ||
| } else { | ||
| let mut it = rocks_db.db.raw_iterator_cf(&rocks_db.db.cf_handle(RawBlock::NAME).unwrap()); | ||
| it.seek_to_first(); | ||
| if it.valid() { | ||
| checkpoint = Some( | ||
| it.key() | ||
| .map(|b| RawBlock::decode_key(b.to_vec()).unwrap_or_default()) | ||
| .unwrap_or_default(), | ||
| ); | ||
| } | ||
| } | ||
|
|
||
| checkpoint.unwrap_or_default() | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧹 Nitpick (assertive)
Silent fallback for missing checkpoint.
Currently, missing or corrupted data sets the checkpoint to zero without notification. Consider logging a warning or returning an error if that’s not the intended design.
| usecase::executor::spawn(get_missed_slots( | ||
| backfill_source.clone(), | ||
| target_db.clone(), | ||
| args.clone(), | ||
| metrics_state.backfiller_metrics.clone(), | ||
| cancellation_token.child_token(), | ||
| )); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧹 Nitpick (assertive)
Consider capturing the spawned task handle.
Without storing the JoinHandle, the main thread cannot detect critical failures in get_missed_slots. If that’s important, returning or awaiting the handle might help.
6c87eac to
19d207b
Compare
No description provided.