refactor: unify node-kind tables & fix marker budget#26
Conversation
Issue #24: Consolidate duplicated node-kind mapping tables into a single `node_kind_info()` function in utils.rs. Both `to_static_node_kind()` and `score_node_kind()` are now thin wrappers, eliminating sync drift. Fixes `class_definition` being Priority 2 in structure.rs but Priority 5 in scoring (Python classes ARE the type system). Adds missing kinds (struct_specifier, enum_specifier, declaration, etc.) to the mapping. Issue #25: Replace static MARKER_RESERVE=2 with select-then-trim algorithm that counts actual markers from position-sorted selection. Greedy loop now selects against full budget, then trims lowest-priority spans until content + markers fit. Prevents mid-span clipping that occurred when marker count exceeded the static reserve.
- Destructure dropped tuple for clearer field access - Remove redundant re-sort of already position-sorted spans - Replace unreachable let-else guard with direct indexing
| result | ||
| ); | ||
| // If trimming happened, import should be dropped before function | ||
| if !result.contains("import B") && result.contains("fn foo()") { |
There was a problem hiding this comment.
Weak test assertions (MEDIUM - Blocking)
The test uses if/else if conditional branches to check behavior, but the else case (wrong priority behavior) is silently accepted. The test would pass even if the algorithm drops higher-priority spans before lower-priority ones.
Fix: Add an explicit failure on wrong behavior with an else branch that panics on incorrect priority ordering.
| // If one type was dropped, it should be type B (higher position) | ||
| if result.contains("type A") && !result.contains("type B") { | ||
| // Correct tie-break: dropped higher position | ||
| } else if result.contains("type A") && result.contains("type B") { |
There was a problem hiding this comment.
Weak test assertions (MEDIUM - Blocking)
Same pattern as above: the else case (wrong tiebreak behavior) is silently accepted. A regression in position-based tiebreaking would not be caught.
Fix: Add an explicit failure case that panics when tiebreak behavior is reversed.
| if is_signature_node(kind, node_types) { | ||
| if let Some(sig) = extract_signature(node, source, node_types)? { | ||
| let static_kind = crate::transform::structure::to_static_node_kind(kind); | ||
| let static_kind = crate::transform::utils::to_static_node_kind(kind); |
There was a problem hiding this comment.
Inconsistent import style (MEDIUM - Should Fix)
to_static_node_kind uses a fully-qualified inline path (crate::transform::utils::to_static_node_kind) while other cross-module references like NodeSpan are imported via use statements. Since this PR changed the import path, it's a good opportunity to normalize to the shorter style.
Fix: Add use crate::transform::utils::to_static_node_kind; at the top and use the short name instead.
| if is_type_node(kind, node_types) { | ||
| if let Some(type_def) = extract_type_definition(node, source, node_types)? { | ||
| let static_kind = crate::transform::structure::to_static_node_kind(kind); | ||
| let static_kind = crate::transform::utils::to_static_node_kind(kind); |
There was a problem hiding this comment.
Inconsistent import style (MEDIUM - Should Fix)
to_static_node_kind uses a fully-qualified inline path while other references are imported via use statements. Since this PR changed the import path, normalize to match the style in structure.rs.
Fix: Add use crate::transform::utils::to_static_node_kind; at the top and use the short name instead.
Code Review Summary: PR #26Status: 4 inline comments created for blocking/should-fix issues Issues Addressed
Pre-existing Issues (Not Blocking This PR)Complexity (Pre-existing)
Performance (Pre-existing)
Architecture (Pre-existing)
Import Style (Pre-existing)
Review ConsensusOverall Assessment: APPROVED (with conditions) Scores by Category:
Key Strengths
Conditions for Merge
Reviewed by: Claude Code |
- Add performance note to trim loop explaining why O(n^2) is
acceptable (bounded by top-level AST node count)
- Add else { panic!() } to test_trim_prefers_dropping_low_priority
and test_trim_tiebreak_drops_last_position so unexpected outcomes
fail loudly instead of passing silently
Co-Authored-By: Claude <noreply@anthropic.com>
Replace fully-qualified paths with use imports in signatures.rs and types.rs to match the existing pattern in structure.rs: - to_static_node_kind from crate::transform::utils - extract_markdown_headers_with_spans from crate::transform::structure Co-Authored-By: Claude <noreply@anthropic.com>
…#24, #25) Add CLI integration tests that exercise --max-lines on real fixture files (mixed_priority.ts, mixed_priority.rs) which produce non-contiguous selected spans during truncation. These tests validate the marker budget fix end-to-end through the full pipeline (parse -> transform -> truncate) rather than only through synthetic NodeSpan vectors. Tests verify: - Output never exceeds --max-lines across multiple budget values - Omission markers appear between non-contiguous gaps - High-priority spans (types, interfaces) are preserved over lower-priority - Cross-language coverage (TypeScript and Rust fixtures) Co-Authored-By: Claude <noreply@anthropic.com>
The test_trim_prefers_dropping_low_priority test had an assertion gap revealed by adding panic branches: dropping a middle span creates gap markers that can cascade further drops. Updated assertions to validate the priority invariant (import never survives when function is dropped) rather than expecting a specific outcome.
The enumerate() index in the scored (priority, idx, span) tuple was never referenced — sort used transformed_range.start for position tie-breaking and the loop destructured it as (priority, _, span). Simplified to (priority, span) 2-tuples.
Summary
to_static_node_kindinstructure.rsandscore_node_kindinutils.rs) into a singlenode_kind_info()function inutils.rs, eliminating table drift risktruncate.rswhere non-contiguous spans could cause output to exceedmax_lines— replaces fixedMARKER_RESERVE = 2with dynamic select-then-trim algorithmTest plan
test_node_kind_info_consistencyvalidates all 41 known kinds map correctlytest_noncontiguous_spans_marker_accountingcovers the fixed bugtest_count_markers_*(4 tests) cover empty, contiguous, gap, and edge casestest_trim_prefers_dropping_low_priorityandtest_trim_tiebreak_drops_last_positionverify trim algorithm