Conversation
No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID:
📒 Files selected for processing (4)
✅ Files skipped from review due to trivial changes (1)
🚧 Files skipped from review as they are similar to previous changes (1)
📝 Walkthrough

Rule-start and timestamp detection were made whitespace-tolerant (matching on left-stripped lines). Indented/grouped log lines are routed to a new indented-line handler that recognizes indented timestamps and rule starts and flushes pending ERROR state. Completed-job attribution now uses a JOBID→rule mapping. Tests for group/pipe log parsing were added.
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks: ✅ Passed checks (5 passed)
Codecov Report
❌ Patch coverage is
Additional details and impacted files

@@ Coverage Diff @@
## main #52 +/- ##
==========================================
+ Coverage 88.59% 88.92% +0.32%
==========================================
Files 52 52
Lines 4928 4955 +27
==========================================
+ Hits 4366 4406 +40
+ Misses 562 549 -13
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
snakesee/parser/core.py (1)
266-270: ⚠️ Potential issue | 🟠 Major

`parse_rules_from_log()` still miscounts grouped completions. This function still increments the last seen rule for every finish line. In grouped output, multiple rules can be opened before any of them finish, so the last rule block wins. With the grouped fixture added in this PR, both `consumer` completions would still be counted as `producer`, leaving rule-level historical stats wrong.

🔧 Suggested direction

```diff
 def parse_rules_from_log(log_path: Path) -> dict[str, int]:
     rule_counts: dict[str, int] = {}
     current_rule: str | None = None
+    job_rules: dict[str, str] = {}
     try:
         for line in log_path.read_text().splitlines():
             # Track current rule being executed
             if match := RULE_START_PATTERN.match(line.lstrip()):
                 current_rule = match.group(1)
+            elif match := JOBID_PATTERN.match(line):
+                if current_rule is not None:
+                    job_rules[match.group(1)] = current_rule
             # Count "Finished job" as rule completion
-            elif "Finished job" in line and current_rule is not None:
-                rule_counts[current_rule] = rule_counts.get(current_rule, 0) + 1
+            elif match := FINISHED_JOB_PATTERN.search(line):
+                if rule := job_rules.get(match.group(1), current_rule):
+                    rule_counts[rule] = rule_counts.get(rule, 0) + 1
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@snakesee/parser/core.py` around lines 266 - 270, parse_rules_from_log() misattributes "Finished job" lines to the last seen rule by using current_rule; instead maintain a stack of open rules (e.g., rules_stack) where on RULE_START_PATTERN.match(line.lstrip()) you push match.group(1), and on seeing "Finished job" you pop from rules_stack (if non-empty) and increment rule_counts for the popped rule; update references to current_rule (remove or keep only for convenience) and ensure you guard against popping an empty stack.
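The jobid-based attribution can be demonstrated on a small grouped log in isolation. This is a self-contained sketch: the three regexes and the log text below are rough approximations of Snakemake's output format, not snakesee's actual constants.

```python
import re

# Assumed approximations of Snakemake log lines, not snakesee's real patterns.
RULE_START_PATTERN = re.compile(r"(?:local)?rule (\w+):")
JOBID_PATTERN = re.compile(r"\s*jobid: (\d+)")
FINISHED_JOB_PATTERN = re.compile(r"Finished job (\d+)")

# In grouped output, both rules open before either finishes.
grouped_log = """\
rule producer:
    jobid: 1
rule consumer:
    jobid: 2
Finished job 2.
Finished job 1.
"""

rule_counts: dict[str, int] = {}
current_rule = None
job_rules: dict[str, str] = {}
for line in grouped_log.splitlines():
    if m := RULE_START_PATTERN.match(line.lstrip()):
        current_rule = m.group(1)
    elif m := JOBID_PATTERN.match(line):
        # Remember which rule each jobid belongs to.
        if current_rule is not None:
            job_rules[m.group(1)] = current_rule
    elif m := FINISHED_JOB_PATTERN.search(line):
        # Attribute the completion via the jobid, not the last rule seen.
        if rule := job_rules.get(m.group(1), current_rule):
            rule_counts[rule] = rule_counts.get(rule, 0) + 1

assert rule_counts == {"producer": 1, "consumer": 1}
```

With last-rule-wins counting, both finishes would have landed on `consumer`; the mapping attributes each one correctly.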
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@snakesee/parser/core.py`:
- Around line 332-333: TIMESTAMP_PATTERN is currently used with .search(), which
finds timestamps anywhere in a line and produces false positives; update every
check to use TIMESTAMP_PATTERN.match(line.lstrip()) instead (i.e., strip leading
whitespace and anchor to the start) in the functions
parse_running_jobs_from_log(), parse_failed_jobs_from_log(), and
_get_first_log_timestamp(), and where record_pending_error() is gated by a
TIMESTAMP_PATTERN check—replace the .search(...) calls at those sites so
timestamp detection only matches at the start of the trimmed line.
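The anchoring difference the comment describes can be sketched directly. The regex below is an illustrative stand-in for the project's `TIMESTAMP_PATTERN` (its real definition is not shown in this thread):

```python
import re

# Illustrative stand-in for TIMESTAMP_PATTERN; the real pattern may differ.
TIMESTAMP_PATTERN = re.compile(r"\[\w{3} \w{3} +\d+ \d{2}:\d{2}:\d{2} \d{4}\]")

indented = "    [Mon Jan  6 10:00:01 2026]"
midline = "some message [Mon Jan  6 10:00:01 2026] continued text"

# .search() finds the timestamp anywhere, so a mid-line fragment is a
# false positive:
assert TIMESTAMP_PATTERN.search(midline) is not None

# Anchoring with .match() on the left-stripped line accepts indented
# timestamps but rejects mid-line fragments:
assert TIMESTAMP_PATTERN.match(indented.lstrip()) is not None
assert TIMESTAMP_PATTERN.match(midline.lstrip()) is None
```

This is why `.match(line.lstrip())` gets both properties at once: tolerance for group-block indentation without matching fragments embedded later in the line.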
---
Outside diff comments:
In `@snakesee/parser/core.py`:
- Around line 266-270: parse_rules_from_log() misattributes "Finished job" lines
to the last seen rule by using current_rule; instead maintain a stack of open
rules (e.g., rules_stack) where on RULE_START_PATTERN.match(line.lstrip()) you
push match.group(1), and on seeing "Finished job" you pop from rules_stack (if
non-empty) and increment rule_counts for the popped rule; update references to
current_rule (remove or keep only for convenience) and ensure you guard against
popping an empty stack.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: b74cfa2c-513d-4f1b-9708-1d0353eb8458
📒 Files selected for processing (3)
snakesee/parser/core.py, snakesee/parser/line_parser.py, tests/test_parser.py
227ea32 to 3f2475d
🧹 Nitpick comments (1)
snakesee/parser/line_parser.py (1)
276-285: Missing `checkpoint`/`localcheckpoint` handling for indented lines.

The non-indented rule start check (line 188-189) handles `checkpoint` and `localcheckpoint` prefixes, but this indented handler only checks for `rule` and `localrule`. If a checkpoint rule ever appears inside a group block, it won't be recognized as a rule start.

♻️ Proposed fix to add checkpoint support

```diff
 # Indented rule start: "    rule X:" or "    localrule X:"
-if (first_stripped == "r" and stripped.startswith("rule ")) or (
-    first_stripped == "l" and stripped.startswith("localrule ")
+if (
+    (first_stripped == "r" and stripped.startswith("rule "))
+    or (first_stripped == "l" and stripped.startswith("localrule "))
+    or (first_stripped == "c" and stripped.startswith("checkpoint "))
+    or (first_stripped == "l" and stripped.startswith("localcheckpoint "))
 ):
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@snakesee/parser/line_parser.py` around lines 276 - 285, The indented-line rule-start branch only checks for "r"/"rule" and "l"/"localrule" and thus misses "checkpoint"/"localcheckpoint"; update the conditional that examines first_stripped and stripped (the block using RULE_START_PATTERN.match(stripped)) to also accept the checkpoint prefixes the same way the non-indented handler does, so that when RULE_START_PATTERN matches you still call self.context.get_pending_error(), self.context.reset_for_new_rule(rule) and append ParseEvent(ParseEventType.RULE_START, {"rule": rule}) for checkpoint/localcheckpoint names as well (keep the existing use of RULE_START_PATTERN, match.group(1), pending handling, reset_for_new_rule, and ParseEvent creation).
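The extended conditional can be checked on its own. The helper below is hypothetical (the real code lives inline in `LogLineParser`); `first_stripped` and `stripped` mirror the names used in the review:

```python
def is_indented_rule_start(line: str) -> bool:
    """Hypothetical helper mirroring the reviewed conditional."""
    stripped = line.lstrip()
    if not stripped:
        return False
    first_stripped = stripped[0]
    # First-character check is a cheap pre-filter before startswith().
    return (
        (first_stripped == "r" and stripped.startswith("rule "))
        or (first_stripped == "l" and stripped.startswith("localrule "))
        or (first_stripped == "c" and stripped.startswith("checkpoint "))
        or (first_stripped == "l" and stripped.startswith("localcheckpoint "))
    )

assert is_indented_rule_start("    rule map_reads:")
assert is_indented_rule_start("    checkpoint cluster:")
assert is_indented_rule_start("    localcheckpoint split:")
assert not is_indented_rule_start("    input: reads.fq")
```

Note that both `localrule` and `localcheckpoint` share the `"l"` pre-filter, which is why the proposed diff keeps two separate `first_stripped == "l"` clauses.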
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@snakesee/parser/line_parser.py`:
- Around line 276-285: The indented-line rule-start branch only checks for
"r"/"rule" and "l"/"localrule" and thus misses "checkpoint"/"localcheckpoint";
update the conditional that examines first_stripped and stripped (the block
using RULE_START_PATTERN.match(stripped)) to also accept the checkpoint prefixes
the same way the non-indented handler does, so that when RULE_START_PATTERN
matches you still call self.context.get_pending_error(),
self.context.reset_for_new_rule(rule) and append
ParseEvent(ParseEventType.RULE_START, {"rule": rule}) for
checkpoint/localcheckpoint names as well (keep the existing use of
RULE_START_PATTERN, match.group(1), pending handling, reset_for_new_rule, and
ParseEvent creation).
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: a4f515dc-65cb-4dc2-9fc7-d4075c8b4df5
📒 Files selected for processing (3)
snakesee/parser/core.py, snakesee/parser/line_parser.py, tests/test_parser.py
46b52ac to 41b0ffa
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@tests/integration/test_workflows.py`:
- Around line 52-56: The assertion in tests/integration/test_workflows.py is too
strict given EventState.add_event (in snakesee/state_comparison.py) and the
optional Event.total_jobs (in snakesee/types.py) which can leave total_jobs as
0,1,2,3,4; update the assertion that checks result.total_jobs so it allows any
integer in the inclusive range 0–4 (rather than only 0 or 4), i.e. validate
result.total_jobs is >= 0 and <= 4 (or use membership in range(0,5)) to cover
partial PROGRESS snapshots.
In `@tests/test_parser.py`:
- Around line 2822-2855: The tests never exercise a mid-line timestamp fragment,
so they don't catch regressions that use TIMESTAMP_PATTERN.search(); update the
fixtures so a non-timestamp line contains an embedded "[Mon ...]" fragment and
ensure the parser is driven into a pending-error state so the mid-line fragment
would matter — e.g., in test_midline_timestamp_does_not_corrupt_running_jobs (or
add a new test) change the log_content to start an error block (use
parse_failed_jobs_from_log or construct a pending error with the same shape as
other tests) then include a line like "some message [Mon Jan 6 10:00:01 2026]
continued text" to verify parse_running_jobs_from_log /
parse_failed_jobs_from_log does not prematurely close the block; this will
ensure the code path that would be broken by TIMESTAMP_PATTERN.search() is
actually exercised.
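A fixture along these lines can verify that an embedded `[Mon ...]` fragment is not treated as a timestamp line. The pattern and log text below are invented for illustration and stand in for the project's real `TIMESTAMP_PATTERN` and fixtures:

```python
import re

# Stand-in for the project's TIMESTAMP_PATTERN (assumed shape).
TIMESTAMP_PATTERN = re.compile(r"\[\w{3} \w{3} +\d+ \d{2}:\d{2}:\d{2} \d{4}\]")

# Line 3 carries a mid-line timestamp fragment inside a message.
log_content = """\
[Mon Jan  6 10:00:00 2026]
Error in rule broken:
some message [Mon Jan  6 10:00:01 2026] continued text
[Mon Jan  6 10:00:02 2026]
"""

# The anchored check sees exactly the two real timestamp lines...
timestamp_lines = [
    line for line in log_content.splitlines()
    if TIMESTAMP_PATTERN.match(line.lstrip())
]
assert len(timestamp_lines) == 2

# ...while a .search()-based check would also count the embedded fragment,
# which is what could prematurely close a pending error block.
search_hits = sum(
    bool(TIMESTAMP_PATTERN.search(line)) for line in log_content.splitlines()
)
assert search_hits == 3
```

Driving `parse_failed_jobs_from_log` over such content, as the comment suggests, would then exercise the regression path directly.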
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 74b448f7-c1a7-4328-8474-9a19b6dd14b2
📒 Files selected for processing (4)
snakesee/parser/core.py, snakesee/parser/line_parser.py, tests/integration/test_workflows.py, tests/test_parser.py
Snakemake indents log output for jobs within group/pipe blocks by
4 spaces. The parser used RULE_START_PATTERN.match() anchored at
position 0 and line.startswith("[") checks that both fail on indented
lines, causing group jobs to be invisible or assigned wrong rule names.
Fix by using RULE_START_PATTERN.match(line.lstrip()) for rule detection
and TIMESTAMP_PATTERN.search() for timestamp detection across all
parser functions. Add _parse_indented_or_group_line() to LogLineParser
for the same handling in the streaming path.
Closes #42
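The failure mode described in the commit message can be illustrated in a few lines. The pattern here approximates Snakemake's log format and is not snakesee's actual constant:

```python
import re

# Approximation of the rule-start pattern described above.
RULE_START_PATTERN = re.compile(r"(?:local)?rule (\w+):")

# Snakemake indents job lines inside group/pipe blocks by 4 spaces.
group_log_line = "    rule align:"

# Anchored at position 0, the pattern misses the indented line entirely:
assert RULE_START_PATTERN.match(group_log_line) is None

# Matching on the left-stripped line recovers the rule name:
m = RULE_START_PATTERN.match(group_log_line.lstrip())
assert m is not None and m.group(1) == "align"
```

The same lstrip-then-match idea is what the fix applies across the parser functions and the streaming `LogLineParser` path.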
41b0ffa to 3c42206
Summary
- Snakemake indents log output for jobs within `pipe()`/`group:` blocks by 4 spaces, causing the parser to miss rule names and timestamps for grouped jobs
- `RULE_START_PATTERN.match()` → `.match(line.lstrip())` across all 8 call sites in `core.py`, and add indented-line handling to `LogLineParser` in `line_parser.py`
- `TIMESTAMP_PATTERN.match()` → `.search()`, and remove `line.startswith("[")` guards that blocked indented timestamps

Test plan

- Tests cover `parse_running_jobs_from_log`, `parse_failed_jobs_from_log`, `parse_completed_jobs_from_log`, `parse_all_jobs_from_log`, and `LogLineParser`

Closes #42
Summary by CodeRabbit
Bug Fixes
Tests