fix(source): propagate I/O errors instead of silently stopping #26
Conversation
1 issue found across 3 files

Confidence score: 3/5

- There is a concrete user-impacting risk in `src/source/stdin.rs`: stdin is fully buffered up front, which can defeat the batched build loop behavior and increase memory usage significantly.
- Given the issue's medium severity (6/10) and high confidence (9/10), this looks like a meaningful regression risk rather than a minor housekeeping concern.
- This is likely to show up with large piped wordlists, where memory pressure could cause slowdowns or failures during normal usage.
- Pay close attention to `src/source/stdin.rs`: eager stdin buffering may break batching assumptions and cause high memory consumption on large inputs.
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="src/source/stdin.rs">
<violation number="1" location="src/source/stdin.rs:28">
P2: This buffers all of stdin up front, which defeats the batched build loop and can blow up memory on large piped wordlists.</violation>
</file>
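If the reviewer's concern holds, one way to keep the error propagation this PR adds without buffering all of stdin is to yield `io::Result<String>` per line and let the caller decide when to collect or stop. A minimal sketch, assuming a helper named `non_empty_lines` (hypothetical, not part of the PR):

```rust
use std::io::{self, BufRead};

// Hypothetical sketch: read lines lazily, skip blanks, but never swallow
// errors -- each item is an io::Result<String> the caller can `?` on.
fn non_empty_lines<R: BufRead>(reader: R) -> impl Iterator<Item = io::Result<String>> {
    reader.lines().filter(|line| match line {
        Ok(l) => !l.is_empty(), // drop blank lines
        Err(_) => true,         // keep errors so they propagate
    })
}

fn main() -> io::Result<()> {
    // Works with stdin's lock or any in-memory reader:
    let data = io::Cursor::new("alpha\n\nbeta\n");
    let words: Vec<String> = non_empty_lines(data).collect::<io::Result<_>>()?;
    assert_eq!(words, ["alpha", "beta"]);
    Ok(())
}
```

Collecting at the call site still surfaces the first I/O error, but a batched consumer could instead pull lines incrementally without holding the whole input in memory.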
Architecture diagram

```mermaid
sequenceDiagram
    participant Builder as Build Engine
    participant Source as Source Implementation<br/>(File, SecLists, Stdin)
    participant FS as Operating System<br/>(FS / Stdin)
    Note over Builder,FS: Word Collection Flow
    Builder->>Source: words()
    Source->>FS: Open handle & initialize BufReader
    loop For each line in stream
        FS-->>Source: line data OR io::Error
        alt Success
            Source->>Source: Buffer line in Vec
        else NEW: I/O Error (e.g. Connection reset, Disk error)
            Source-->>Builder: CHANGED: Return Result::Err (Propagation)
            Note right of Source: Build process aborts immediately
        end
    end
    opt All lines read successfully
        Source->>Source: filter(!empty)
        Source-->>Builder: Return Ok(Iterator)
        loop Build Loop
            Builder->>Builder: Insert words into HashSet
        end
    end
```
```diff
-            .map_while(Result::ok)
-            .filter(|line| !line.is_empty()),
-        ))
+        let lines: Vec<String> = reader
```
P2: This buffers all of stdin up front, which defeats the batched build loop and can blow up memory on large piped wordlists.
Prompt for AI agents

Check if this issue is valid — if so, understand the root cause and fix it. At src/source/stdin.rs, line 28:

<comment>This buffers all of stdin up front, which defeats the batched build loop and can blow up memory on large piped wordlists.</comment>

<file context>

```diff
@@ -25,12 +25,13 @@ impl Source for StdinSource {
-            .map_while(Result::ok)
-            .filter(|line| !line.is_empty()),
-        ))
+        let lines: Vec<String> = reader
+            .lines()
+            .collect::<io::Result<Vec<_>>>()?
```

</file context>
All file-based sources (FileSource, SecListsSource, StdinSource) used `.map_while(Result::ok)` on line iterators. If a read failed mid-file, the iterator just stopped and the build continued with partial data: no error, no warning.

Switched to eager `.collect::<io::Result<Vec<_>>>()`, which surfaces the first I/O error through the `?` chain. The words are buffered in memory anyway (the build loop puts everything into a `HashSet`), so there's no streaming penalty.

Closes #22