Skip to content

refactor: unify node-kind tables & fix marker budget#26

Merged
dean0x merged 7 commits intomainfrom
refactor/unify-node-kind-tables
Mar 13, 2026
Merged

refactor: unify node-kind tables & fix marker budget#26
dean0x merged 7 commits intomainfrom
refactor/unify-node-kind-tables

Conversation

@dean0x
Copy link
Owner

@dean0x dean0x commented Mar 13, 2026

Summary

  • Unifies the two separate node-kind tables (to_static_node_kind in structure.rs and score_node_kind in utils.rs) into a single node_kind_info() function in utils.rs, eliminating table drift risk
  • Fixes marker budget bug in truncate.rs where non-contiguous spans could cause output to exceed max_lines — replaces fixed MARKER_RESERVE = 2 with dynamic select-then-trim algorithm
  • Self-review cleanup: destructured tuples, removed redundant re-sort, simplified unreachable guard

Test plan

  • All 254 tests pass (103 core + integration/CLI)
  • test_node_kind_info_consistency validates all 41 known kinds map correctly
  • test_noncontiguous_spans_marker_accounting covers the fixed bug
  • test_count_markers_* (4 tests) cover empty, contiguous, gap, and edge cases
  • test_trim_prefers_dropping_low_priority and test_trim_tiebreak_drops_last_position verify trim algorithm
  • Clippy zero warnings

Dean Sharon added 2 commits March 14, 2026 00:52
Issue #24: Consolidate duplicated node-kind mapping tables into a single
`node_kind_info()` function in utils.rs. Both `to_static_node_kind()`
and `score_node_kind()` are now thin wrappers, eliminating sync drift.
Fixes `class_definition` being Priority 2 in structure.rs but Priority 5
in scoring (Python classes ARE the type system). Adds missing kinds
(struct_specifier, enum_specifier, declaration, etc.) to the mapping.

Issue #25: Replace static MARKER_RESERVE=2 with select-then-trim
algorithm that counts actual markers from position-sorted selection.
Greedy loop now selects against full budget, then trims lowest-priority
spans until content + markers fit. Prevents mid-span clipping that
occurred when marker count exceeded the static reserve.
- Destructure dropped tuple for clearer field access
- Remove redundant re-sort of already position-sorted spans
- Replace unreachable let-else guard with direct indexing
result
);
// If trimming happened, import should be dropped before function
if !result.contains("import B") && result.contains("fn foo()") {
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Weak test assertions (MEDIUM - Blocking)

The test uses if/else if conditional branches to check behavior, but the else case (wrong priority behavior) is silently accepted. The test would pass even if the algorithm drops higher-priority spans before lower-priority ones.

Fix: Add an explicit failure on wrong behavior with an else branch that panics on incorrect priority ordering.

// If one type was dropped, it should be type B (higher position)
if result.contains("type A") && !result.contains("type B") {
// Correct tie-break: dropped higher position
} else if result.contains("type A") && result.contains("type B") {
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Weak test assertions (MEDIUM - Blocking)

Same pattern as above: the else case (wrong tiebreak behavior) is silently accepted. A regression in position-based tiebreaking would not be caught.

Fix: Add an explicit failure case that panics when tiebreak behavior is reversed.

if is_signature_node(kind, node_types) {
if let Some(sig) = extract_signature(node, source, node_types)? {
let static_kind = crate::transform::structure::to_static_node_kind(kind);
let static_kind = crate::transform::utils::to_static_node_kind(kind);
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Inconsistent import style (MEDIUM - Should Fix)

to_static_node_kind uses a fully-qualified inline path (crate::transform::utils::to_static_node_kind) while other cross-module references like NodeSpan are imported via use statements. Since this PR changed the import path, it's a good opportunity to normalize to the shorter style.

Fix: Add use crate::transform::utils::to_static_node_kind; at the top and use the short name instead.

if is_type_node(kind, node_types) {
if let Some(type_def) = extract_type_definition(node, source, node_types)? {
let static_kind = crate::transform::structure::to_static_node_kind(kind);
let static_kind = crate::transform::utils::to_static_node_kind(kind);
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Inconsistent import style (MEDIUM - Should Fix)

to_static_node_kind uses a fully-qualified inline path while other references are imported via use statements. Since this PR changed the import path, normalize to match the style in structure.rs.

Fix: Add use crate::transform::utils::to_static_node_kind; at the top and use the short name instead.

@dean0x
Copy link
Owner Author

dean0x commented Mar 13, 2026

Code Review Summary: PR #26

Status: 4 inline comments created for blocking/should-fix issues

Issues Addressed

  • 2 BLOCKING (MEDIUM): Weak test assertions in test_trim_prefers_dropping_low_priority and test_trim_tiebreak_drops_last_position need explicit failure branches
  • 2 SHOULD FIX (MEDIUM): Inconsistent import styles in signatures.rs and types.rs

Pre-existing Issues (Not Blocking This PR)

Complexity (Pre-existing)

  • truncate_to_lines function exceeds cyclomatic complexity threshold (~162 lines)
  • Trim loop uses Vec::remove() which is O(n) - acceptable given bounded AST sizes but worth documenting

Performance (Pre-existing)

  • text.lines().collect() allocates Vec unnecessarily when output fits in budget
  • node_kind_info() match arms repetitive (45 arms) but necessary for static strings

Architecture (Pre-existing)

  • Parallel node-kind tables for language-specific NodeTypes remain across modules (intentional by SRP)
  • NodeSpan::line_count() defined but unused - manual calculations used instead

Import Style (Pre-existing)

  • extract_markdown_headers_with_spans uses fully-qualified inline path in signatures.rs and types.rs

Review Consensus

Overall Assessment: APPROVED (with conditions)

Scores by Category:

  • Architecture: 9/10
  • Rust Quality: 9/10
  • Performance: 8/10
  • Tests: 7/10 (pending assertion fixes)
  • Regression Risk: 9/10 (excellent test coverage, 297 tests pass)
  • Security: 9/10
  • Complexity: 7/10
  • Consistency: 8/10 (pending import style fixes)

Key Strengths

  1. Eliminates real DRY violation - Single source of truth for node-kind mapping
  2. Fixes correctness bug - Select-then-trim algorithm properly accounts for markers
  3. Comprehensive test coverage - 9 new tests, all 297 existing tests pass
  4. Clean dependency graph - No circular dependencies, proper module layering
  5. Excellent documentation - Step-labeled algorithm with clear intent

Conditions for Merge

  1. Tighten the two weak test assertions with explicit else branches that panic on wrong behavior
  2. Normalize import styles in signatures.rs and types.rs to match structure.rs

Reviewed by: Claude Code
Branch: refactor/unify-node-kind-tables → main
Files changed: 5
Commits analyzed: 2 (8e4d42b, 1ce4560)
Date: 2026-03-14

Dean Sharon and others added 5 commits March 14, 2026 01:17
- Add performance note to trim loop explaining why O(n^2) is
  acceptable (bounded by top-level AST node count)
- Add else { panic!() } to test_trim_prefers_dropping_low_priority
  and test_trim_tiebreak_drops_last_position so unexpected outcomes
  fail loudly instead of passing silently

Co-Authored-By: Claude <noreply@anthropic.com>
Replace fully-qualified paths with use imports in signatures.rs and
types.rs to match the existing pattern in structure.rs:
- to_static_node_kind from crate::transform::utils
- extract_markdown_headers_with_spans from crate::transform::structure

Co-Authored-By: Claude <noreply@anthropic.com>
…#24, #25)

Add CLI integration tests that exercise --max-lines on real fixture files
(mixed_priority.ts, mixed_priority.rs) which produce non-contiguous
selected spans during truncation. These tests validate the marker budget
fix end-to-end through the full pipeline (parse -> transform -> truncate)
rather than only through synthetic NodeSpan vectors.

Tests verify:
- Output never exceeds --max-lines across multiple budget values
- Omission markers appear between non-contiguous gaps
- High-priority spans (types, interfaces) are preserved over lower-priority
- Cross-language coverage (TypeScript and Rust fixtures)

Co-Authored-By: Claude <noreply@anthropic.com>
The test_trim_prefers_dropping_low_priority test had an assertion gap
revealed by adding panic branches: dropping a middle span creates gap
markers that can cascade further drops. Updated assertions to validate
the priority invariant (import never survives when function is dropped)
rather than expecting a specific outcome.
The enumerate() index in the scored (priority, idx, span) tuple was
never referenced — sort used transformed_range.start for position
tie-breaking and the loop destructured it as (priority, _, span).
Simplified to (priority, span) 2-tuples.
@dean0x dean0x merged commit 7b2530a into main Mar 13, 2026
5 checks passed
@dean0x dean0x deleted the refactor/unify-node-kind-tables branch March 13, 2026 23:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant