refactor: unify node-kind tables & fix marker budget by dean0x · Pull Request #26 · dean0x/skim

dean0x · 2026-03-13T23:05:40Z

Summary

Unifies the two separate node-kind tables (to_static_node_kind in structure.rs and score_node_kind in utils.rs) into a single node_kind_info() function in utils.rs, eliminating table drift risk
Fixes marker budget bug in truncate.rs where non-contiguous spans could cause output to exceed max_lines — replaces fixed MARKER_RESERVE = 2 with dynamic select-then-trim algorithm
Self-review cleanup: destructured tuples, removed redundant re-sort, simplified unreachable guard
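
As a rough illustration of the unification described above, the sketch below keeps one table and derives both lookups from it. The `NodeKindInfo` struct, the `"unknown"` fallback, and every priority value other than `class_definition` being 5 are assumptions made for this example, not the contents of utils.rs.

```rust
/// Hypothetical row type for the unified table (not taken from the PR).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct NodeKindInfo {
    /// Static name for callers that need a `&'static str`.
    pub static_kind: &'static str,
    /// Truncation priority; higher is assumed to survive longer.
    pub priority: u8,
}

/// Single source of truth: each known kind is listed exactly once,
/// so the name and the priority cannot drift apart.
pub fn node_kind_info(kind: &str) -> NodeKindInfo {
    // Priorities below are illustrative placeholders.
    let (static_kind, priority) = match kind {
        "class_definition" => ("class_definition", 5),
        "struct_specifier" => ("struct_specifier", 5),
        "enum_specifier" => ("enum_specifier", 5),
        "function_declaration" => ("function_declaration", 4),
        "import_statement" => ("import_statement", 3),
        _ => ("unknown", 1),
    };
    NodeKindInfo { static_kind, priority }
}

/// Thin wrapper: only the static name.
pub fn to_static_node_kind(kind: &str) -> &'static str {
    node_kind_info(kind).static_kind
}

/// Thin wrapper: only the priority.
pub fn score_node_kind(kind: &str) -> u8 {
    node_kind_info(kind).priority
}
```

Because both wrappers read the same match arm, adding a kind in one place updates both views at once.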

Test plan

All 254 tests pass (103 core + integration/CLI)
test_node_kind_info_consistency validates all 41 known kinds map correctly
test_noncontiguous_spans_marker_accounting covers the fixed bug
test_count_markers_* (4 tests) cover empty, contiguous, gap, and edge cases
test_trim_prefers_dropping_low_priority and test_trim_tiebreak_drops_last_position verify trim algorithm
Clippy zero warnings

Issue #24: Consolidate duplicated node-kind mapping tables into a single `node_kind_info()` function in utils.rs. Both `to_static_node_kind()` and `score_node_kind()` are now thin wrappers, eliminating sync drift. Fixes `class_definition` being Priority 2 in structure.rs but Priority 5 in scoring (Python classes ARE the type system). Adds missing kinds (struct_specifier, enum_specifier, declaration, etc.) to the mapping.

Issue #25: Replace static MARKER_RESERVE=2 with select-then-trim algorithm that counts actual markers from position-sorted selection. Greedy loop now selects against full budget, then trims lowest-priority spans until content + markers fit. Prevents mid-span clipping that occurred when marker count exceeded the static reserve.
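
The select-then-trim idea from Issue #25 can be sketched roughly as follows. This is a minimal model under stated assumptions, not the code in truncate.rs: the `Span` type, the rule of one marker per gap (leading, interior, trailing), and the function names `count_markers` and `select_then_trim` as written here are all hypothetical.

```rust
/// Hypothetical span: a half-open line range plus a priority.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Span {
    pub start: usize,
    pub end: usize,
    pub priority: u8,
}

impl Span {
    fn line_count(&self) -> usize {
        self.end - self.start
    }
}

/// Count omission markers for position-sorted, non-overlapping spans.
/// Assumption: every omitted gap costs exactly one marker line.
pub fn count_markers(selected: &[Span], total_lines: usize) -> usize {
    if selected.is_empty() {
        return usize::from(total_lines > 0);
    }
    let mut markers = 0;
    let mut cursor = 0;
    for span in selected {
        if span.start > cursor {
            markers += 1; // gap before this span
        }
        cursor = span.end;
    }
    if cursor < total_lines {
        markers += 1; // trailing gap
    }
    markers
}

pub fn select_then_trim(spans: &[Span], total_lines: usize, max_lines: usize) -> Vec<Span> {
    // Step 1: greedy selection against the full budget,
    // highest priority first, earlier position winning ties.
    let mut by_priority: Vec<&Span> = spans.iter().collect();
    by_priority.sort_by(|a, b| b.priority.cmp(&a.priority).then(a.start.cmp(&b.start)));
    let mut selected: Vec<Span> = Vec::new();
    let mut used = 0;
    for span in by_priority {
        if used + span.line_count() <= max_lines {
            used += span.line_count();
            selected.push(span.clone());
        }
    }

    // Step 2: restore source order so gaps can be counted.
    selected.sort_by_key(|s| s.start);

    // Step 3: trim until content plus actual markers fit the budget.
    while !selected.is_empty() {
        let content: usize = selected.iter().map(Span::line_count).sum();
        if content + count_markers(&selected, total_lines) <= max_lines {
            break;
        }
        // Victim: lowest priority; among equals, the last position.
        let victim = selected
            .iter()
            .enumerate()
            .min_by(|(_, a), (_, b)| a.priority.cmp(&b.priority).then(b.start.cmp(&a.start)))
            .map(|(i, _)| i)
            .expect("selected is non-empty");
        selected.remove(victim);
    }
    selected
}
```

With three two-line spans at lines 0..2, 4..6, and 8..10 in a 12-line file and a budget of 8, all three pass the greedy step (6 lines of content), but three gaps would add three markers. Dropping the lowest-priority middle span merges two gaps into one, leaving 4 content lines plus 2 markers. Whole spans are dropped rather than clipped, which is the behavior the commit message describes.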

- Destructure dropped tuple for clearer field access
- Remove redundant re-sort of already position-sorted spans
- Replace unreachable let-else guard with direct indexing

dean0x · 2026-03-13T23:11:03Z

crates/rskim-core/src/transform/truncate.rs

+            result
+        );
+        // If trimming happened, import should be dropped before function
+        if !result.contains("import B") && result.contains("fn foo()") {


Weak test assertions (MEDIUM - Blocking)

The test uses if/else if conditional branches to check behavior, but the else case (wrong priority behavior) is silently accepted. The test would pass even if the algorithm drops higher-priority spans before lower-priority ones.

Fix: Add an explicit failure on wrong behavior with an else branch that panics on incorrect priority ordering.
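
A minimal sketch of the requested shape, assuming the test inspects the rendered output as a string. The helper name `check_trim_outcome` and the choice of which outcomes count as acceptable are assumptions for illustration; the real test may accept a different set.

```rust
/// Hypothetical helper mirroring the reviewed test's branching.
/// Any outcome not explicitly listed as acceptable now fails loudly.
pub fn check_trim_outcome(result: &str) {
    let has_import = result.contains("import B");
    let has_fn = result.contains("fn foo()");
    if !has_import && has_fn {
        // Expected: the lower-priority import was trimmed first.
    } else if has_import && has_fn {
        // Also acceptable here: everything fit, nothing was trimmed.
    } else {
        // Previously this case fell through silently.
        panic!(
            "unexpected trim outcome: import kept = {}, fn kept = {}\n{}",
            has_import, has_fn, result
        );
    }
}
```

The final `else` is the point of the fix: without it, a regression that keeps the import while dropping the function would leave the test green.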

dean0x · 2026-03-13T23:11:05Z

crates/rskim-core/src/transform/truncate.rs

+        // If one type was dropped, it should be type B (higher position)
+        if result.contains("type A") && !result.contains("type B") {
+            // Correct tie-break: dropped higher position
+        } else if result.contains("type A") && result.contains("type B") {


Weak test assertions (MEDIUM - Blocking)

Same pattern as above: the else case (wrong tiebreak behavior) is silently accepted. A regression in position-based tiebreaking would not be caught.

Fix: Add an explicit failure case that panics when tiebreak behavior is reversed.

dean0x · 2026-03-13T23:11:06Z

crates/rskim-core/src/transform/signatures.rs

    if is_signature_node(kind, node_types) {
        if let Some(sig) = extract_signature(node, source, node_types)? {
-            let static_kind = crate::transform::structure::to_static_node_kind(kind);
+            let static_kind = crate::transform::utils::to_static_node_kind(kind);


Inconsistent import style (MEDIUM - Should Fix)

to_static_node_kind uses a fully-qualified inline path (crate::transform::utils::to_static_node_kind) while other cross-module references like NodeSpan are imported via use statements. Since this PR changed the import path, it's a good opportunity to normalize to the shorter style.

Fix: Add use crate::transform::utils::to_static_node_kind; at the top and use the short name instead.

dean0x · 2026-03-13T23:11:08Z

crates/rskim-core/src/transform/types.rs

    if is_type_node(kind, node_types) {
        if let Some(type_def) = extract_type_definition(node, source, node_types)? {
-            let static_kind = crate::transform::structure::to_static_node_kind(kind);
+            let static_kind = crate::transform::utils::to_static_node_kind(kind);


Inconsistent import style (MEDIUM - Should Fix)

to_static_node_kind uses a fully-qualified inline path while other references are imported via use statements. Since this PR changed the import path, normalize to match the style in structure.rs.

Fix: Add use crate::transform::utils::to_static_node_kind; at the top and use the short name instead.

dean0x · 2026-03-13T23:11:21Z

Code Review Summary: PR #26

Status: 4 inline comments created for blocking/should-fix issues

Issues Addressed

2 BLOCKING (MEDIUM): Weak test assertions in test_trim_prefers_dropping_low_priority and test_trim_tiebreak_drops_last_position need explicit failure branches
2 SHOULD FIX (MEDIUM): Inconsistent import styles in signatures.rs and types.rs

Pre-existing Issues (Not Blocking This PR)

Complexity (Pre-existing)

truncate_to_lines function exceeds the cyclomatic complexity threshold and is long (~162 lines)
Trim loop uses Vec::remove(), which is O(n) per call (O(n^2) worst case over the whole loop) - acceptable given bounded AST sizes but worth documenting

Performance (Pre-existing)

text.lines().collect() allocates Vec unnecessarily when output fits in budget
node_kind_info() match arms repetitive (45 arms) but necessary for static strings
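
One possible way to avoid the allocation noted above is to count lines first and only collect on the slow path. The function below is a simplified stand-in that truncates from the head. `truncate_simple` is a hypothetical name and does not reflect how truncate_to_lines selects spans.

```rust
use std::borrow::Cow;

/// Hypothetical head-truncation helper showing the fast path only.
pub fn truncate_simple(text: &str, max_lines: usize) -> Cow<'_, str> {
    // Fast path: counting walks the iterator without building a Vec,
    // and the input is returned borrowed when it already fits.
    if text.lines().count() <= max_lines {
        return Cow::Borrowed(text);
    }
    // Slow path: only now are the line slices collected.
    let kept: Vec<&str> = text.lines().take(max_lines).collect();
    Cow::Owned(kept.join("\n"))
}
```

The trade-off is a second pass over the text on the slow path. Whether that is cheaper than always collecting depends on how often inputs already fit, which this PR does not measure.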

Architecture (Pre-existing)

Parallel node-kind tables for language-specific NodeTypes remain across modules (intentional by SRP)
NodeSpan::line_count() defined but unused - manual calculations used instead

Import Style (Pre-existing)

extract_markdown_headers_with_spans uses fully-qualified inline path in signatures.rs and types.rs

Review Consensus

Overall Assessment: APPROVED (with conditions)

Scores by Category:

Architecture: 9/10
Rust Quality: 9/10
Performance: 8/10
Tests: 7/10 (pending assertion fixes)
Regression Risk: 9/10 (excellent test coverage, 297 tests pass)
Security: 9/10
Complexity: 7/10
Consistency: 8/10 (pending import style fixes)

Key Strengths

✅ Eliminates real DRY violation - Single source of truth for node-kind mapping
✅ Fixes correctness bug - Select-then-trim algorithm properly accounts for markers
✅ Comprehensive test coverage - 9 new tests, all 297 existing tests pass
✅ Clean dependency graph - No circular dependencies, proper module layering
✅ Excellent documentation - Step-labeled algorithm with clear intent

Conditions for Merge

Tighten the two weak test assertions with explicit else branches that panic on wrong behavior
Normalize import styles in signatures.rs and types.rs to match structure.rs

Reviewed by: Claude Code
Branch: refactor/unify-node-kind-tables → main
Files changed: 5
Commits analyzed: 2 (8e4d42b, 1ce4560)
Date: 2026-03-14

- Add performance note to trim loop explaining why O(n^2) is acceptable (bounded by top-level AST node count)
- Add else { panic!() } to test_trim_prefers_dropping_low_priority and test_trim_tiebreak_drops_last_position so unexpected outcomes fail loudly instead of passing silently

Co-Authored-By: Claude <noreply@anthropic.com>

Replace fully-qualified paths with use imports in signatures.rs and types.rs to match the existing pattern in structure.rs:
- to_static_node_kind from crate::transform::utils
- extract_markdown_headers_with_spans from crate::transform::structure

Co-Authored-By: Claude <noreply@anthropic.com>

…#24, #25)

Add CLI integration tests that exercise --max-lines on real fixture files (mixed_priority.ts, mixed_priority.rs) which produce non-contiguous selected spans during truncation. These tests validate the marker budget fix end-to-end through the full pipeline (parse -> transform -> truncate) rather than only through synthetic NodeSpan vectors.

Tests verify:
- Output never exceeds --max-lines across multiple budget values
- Omission markers appear between non-contiguous gaps
- High-priority spans (types, interfaces) are preserved over lower-priority
- Cross-language coverage (TypeScript and Rust fixtures)

Co-Authored-By: Claude <noreply@anthropic.com>

The test_trim_prefers_dropping_low_priority test had an assertion gap revealed by adding panic branches: dropping a middle span creates gap markers that can cascade further drops. Updated assertions to validate the priority invariant (import never survives when function is dropped) rather than expecting a specific outcome.
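
The invariant form of the assertion might look like the sketch below. It checks only the forbidden combination (import present, function absent) and accepts every other outcome, including cascaded drops. The helper name and the literal strings are assumptions based on the snippets quoted in the review.

```rust
/// Hypothetical predicate: true unless the low-priority import
/// survives while the higher-priority function has been dropped.
pub fn priority_invariant_holds(result: &str) -> bool {
    let has_import = result.contains("import B");
    let has_fn = result.contains("fn foo()");
    !(has_import && !has_fn)
}
```

A test would then assert `priority_invariant_holds(&result)` once, instead of enumerating the exact set of surviving spans.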

The enumerate() index in the scored (priority, idx, span) tuple was never referenced — sort used transformed_range.start for position tie-breaking and the loop destructured it as (priority, _, span). Simplified to (priority, span) 2-tuples.
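
A small sketch of the simplified 2-tuple shape, assuming a cut-down `NodeSpan` with only the fields needed here. The `score` function and its values are placeholders, and the real struct in the crate has more to it.

```rust
use std::ops::Range;

/// Hypothetical, reduced version of the span type.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct NodeSpan {
    pub kind: &'static str,
    pub transformed_range: Range<usize>,
}

/// Placeholder scoring; the values are illustrative only.
fn score(kind: &str) -> u8 {
    match kind {
        "type" => 5,
        "function" => 4,
        "import" => 3,
        _ => 1,
    }
}

/// Pair each span with its priority and sort: highest priority first,
/// earlier position breaking ties. No enumerate() index is needed
/// because the span already carries its own position.
pub fn rank_spans(spans: &[NodeSpan]) -> Vec<(u8, &NodeSpan)> {
    let mut scored: Vec<(u8, &NodeSpan)> =
        spans.iter().map(|span| (score(span.kind), span)).collect();
    scored.sort_by(|(pa, sa), (pb, sb)| {
        pb.cmp(pa)
            .then(sa.transformed_range.start.cmp(&sb.transformed_range.start))
    });
    scored
}
```

Consumers can then destructure each element as `(priority, span)` with no ignored field.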

Dean Sharon added 2 commits March 14, 2026 00:52

refactor: simplify truncation code in self-review

1ce4560

- Destructure dropped tuple for clearer field access
- Remove redundant re-sort of already position-sorted spans
- Replace unreachable let-else guard with direct indexing


Dean Sharon and others added 5 commits March 14, 2026 01:17

dean0x merged commit 7b2530a into main Mar 13, 2026
5 checks passed

dean0x deleted the refactor/unify-node-kind-tables branch March 13, 2026 23:47

This was referenced Mar 14, 2026

refactor: Duplicated node-kind mapping tables (to_static_node_kind / score_node_kind) #24

Closed

refactor: Greedy span selection does not account for inter-span marker overhead #25

Closed

dean0x merged 7 commits into main from refactor/unify-node-kind-tables
