diff --git a/docs/adr/ADR-0004-query-ir-binary-format.md b/docs/adr/ADR-0004-query-ir-binary-format.md index b21cfcd8..ebcd2a3f 100644 --- a/docs/adr/ADR-0004-query-ir-binary-format.md +++ b/docs/adr/ADR-0004-query-ir-binary-format.md @@ -1,4 +1,4 @@ -# ADR-0004: Query IR Binary Format +# ADR-0004: Compiled Query Binary Format - **Status**: Accepted - **Date**: 2024-12-12 @@ -6,15 +6,15 @@ ## Context -The Query IR lives in a single contiguous allocation—cache-friendly, zero fragmentation, portable to WASM. This ADR defines the binary layout. Graph structures are in [ADR-0005](ADR-0005-transition-graph-format.md). Type metadata is in [ADR-0007](ADR-0007-type-metadata-format.md). +The compiled query lives in a single contiguous allocation—cache-friendly, zero fragmentation, portable to WASM. This ADR defines the binary layout. Graph structures are in [ADR-0005](ADR-0005-transition-graph-format.md). Type metadata is in [ADR-0007](ADR-0007-type-metadata-format.md). ## Decision ### Container ```rust -struct QueryIR { - ir_buffer: QueryIRBuffer, +struct CompiledQuery { + buffer: CompiledQueryBuffer, successors_offset: u32, effects_offset: u32, negated_fields_offset: u32, @@ -23,18 +23,18 @@ struct QueryIR { type_defs_offset: u32, type_members_offset: u32, entrypoints_offset: u32, - ignored_kinds_offset: u32, // 0 = no ignored kinds + trivia_kinds_offset: u32, // 0 = no trivia kinds } ``` -Transitions start at offset 0. Default entrypoint is always at offset 0. +Transitions start at buffer offset 0. The default entrypoint is **Transition 0** (the root of the graph). The `entrypoints` table provides named exports for multi-definition queries; it does not affect the default entrypoint. 
-### QueryIRBuffer +### CompiledQueryBuffer ```rust const BUFFER_ALIGN: usize = 64; // cache-line alignment for transitions -struct QueryIRBuffer { +struct CompiledQueryBuffer { ptr: *mut u8, len: usize, owned: bool, // true if allocated, false if mmap'd @@ -50,7 +50,7 @@ Allocated via `Layout::from_size_align(len, BUFFER_ALIGN)`. Standard `Box<[u8]>` | `true` | `std::alloc::alloc` | Reconstruct `Layout`, call `std::alloc::dealloc` | | `false` | `mmap` / external | No-op (caller manages lifetime) | -For mmap'd queries, the OS maps file pages directly into address space. The 64-byte header ensures buffer data starts aligned. `QueryIRBuffer` with `owned: false` provides a view without taking ownership—the backing file mapping must outlive the `QueryIR`. +For mmap'd queries, the OS maps file pages directly into address space. The 64-byte header ensures buffer data starts aligned. `CompiledQueryBuffer` with `owned: false` provides a view without taking ownership—the backing file mapping must outlive the `CompiledQuery`. **Deallocation**: When `owned: true`, `Drop` must reconstruct the exact `Layout` (size + 64-byte alignment) and call `std::alloc::dealloc`. Using `Box::from_raw` or similar would assume align=1 and cause undefined behavior. @@ -67,7 +67,7 @@ For mmap'd queries, the OS maps file pages directly into address space. The 64-b | Type Defs | `[TypeDef; T]` | `type_defs_offset` | 4 | | Type Members | `[TypeMember; U]` | `type_members_offset` | 2 | | Entrypoints | `[Entrypoint; V]` | `entrypoints_offset` | 4 | -| Ignored Kinds | `[NodeTypeId; W]` | `ignored_kinds_offset` | 2 | +| Trivia Kinds | `[NodeTypeId; W]` | `trivia_kinds_offset` | 2 | Each offset is aligned: `(offset + align - 1) & !(align - 1)`. 
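The alignment rule above can be sketched as a small helper. This is a minimal illustration, not the project's code: `align_up` and `layout_segments` are hypothetical names, and the mask trick assumes `align` is a nonzero power of two.

```rust
/// Rounds `offset` up to the next multiple of `align`.
/// Assumption: `align` is a nonzero power of two (the mask trick is wrong otherwise)
/// and `offset + align - 1` does not overflow `u32`.
fn align_up(offset: u32, align: u32) -> u32 {
    debug_assert!(align.is_power_of_two());
    (offset + align - 1) & !(align - 1)
}

/// Hypothetical helper: packs segments back to back, given `(byte_len, align)`
/// pairs, and returns the aligned start offset of each segment.
fn layout_segments(segments: &[(u32, u32)]) -> Vec<u32> {
    let mut cursor = 0u32;
    let mut starts = Vec::with_capacity(segments.len());
    for &(byte_len, align) in segments {
        cursor = align_up(cursor, align); // pad up to this segment's alignment
        starts.push(cursor);
        cursor += byte_len; // next segment begins after this one's bytes
    }
    starts
}
```

An offset that is already aligned is returned unchanged, so padding is inserted only where a preceding segment ends on an odd boundary.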
@@ -82,7 +82,7 @@ type StringId = u16; #[repr(C)] struct StringRef { - offset: u32, // into string_bytes + offset: u32, // byte offset into string_bytes (NOT element index) len: u16, _pad: u16, } @@ -128,7 +128,7 @@ Header (64 bytes): type_defs_offset: u32 type_members_offset: u32 entrypoints_offset: u32 - ignored_kinds_offset: u32 + trivia_kinds_offset: u32 _pad: [u8; 12] reserved, zero-filled Buffer Data (buffer_len bytes) @@ -169,6 +169,7 @@ Buffer layout: 0x0300 Type Defs [Record{...}, Enum{...}, ...] 0x0340 Type Members [{name,Str}, {Ident,Ty5}, ...] 0x0380 Entrypoints [{name=Func, target=Tr0, type=Ty3}, ...] +0x03A0 Trivia Kinds [comment, ...] ``` `"name"` stored once, used by both `@name` captures. diff --git a/docs/adr/ADR-0005-transition-graph-format.md b/docs/adr/ADR-0005-transition-graph-format.md index da062d3a..b3cee6ec 100644 --- a/docs/adr/ADR-0005-transition-graph-format.md +++ b/docs/adr/ADR-0005-transition-graph-format.md @@ -27,32 +27,34 @@ Relative range within a segment: ```rust #[repr(C)] struct Slice<T> { - start: u32, - len: u32, + start_index: u32, // element index into segment array (NOT byte offset) + len: u16, // 65k elements per slice is sufficient _phantom: PhantomData<T>, } +// 6 bytes, align 4 ``` +`start_index` is an **element index**, not a byte offset. This naming distinguishes it from byte offsets like `StringRef.offset` and `CompiledQuery.*_offset`. The distinction matters for typed array access.
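To make the index-versus-offset distinction concrete, here is a minimal sketch of the two resolution paths. Both function names are hypothetical and the bounds handling is an assumption. The ADR only says which fields are element indices and which are byte offsets.

```rust
/// Resolves an element-indexed range (as with `Slice.start_index` / `len`)
/// against an already-typed segment. The index counts elements of `T`, not bytes.
fn resolve_by_index<T>(segment: &[T], start_index: u32, len: u16) -> Option<&[T]> {
    let start = start_index as usize;
    let end = start.checked_add(len as usize)?;
    segment.get(start..end) // None if the range runs past the segment
}

/// Resolves a byte-offset reference (as with `StringRef.offset` / `len`)
/// against the raw string bytes. The offset counts bytes.
fn resolve_str(string_bytes: &[u8], offset: u32, len: u16) -> Option<&str> {
    let start = offset as usize;
    let end = start.checked_add(len as usize)?;
    std::str::from_utf8(string_bytes.get(start..end)?).ok()
}
```

Mixing the two up fails silently on a `u32` segment: element index 1 is byte offset 4, so treating one as the other reads the wrong element.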
+ ### Transition ```rust #[repr(C, align(64))] struct Transition { // --- 32 bytes metadata --- - matcher: Matcher, // 16 - pre_nav: PreNav, // 2 (see ADR-0008) - _pad1: [u8; 2], // 2 - effects: Slice, // 8 - ref_marker: RefTransition, // 4 + matcher: Matcher, // 16 (offset 0) + ref_marker: RefTransition, // 4 (offset 16) + successor_count: u32, // 4 (offset 20) + effects: Slice, // 6 (offset 24, when no effects: start and len are zero) + nav: Nav, // 2 (offset 30, see ADR-0008) // --- 32 bytes control flow --- - successor_count: u32, // 4 - successor_data: [u32; 7], // 28 + successor_data: [u32; 8], // 32 (offset 32) } // 64 bytes, align 64 (cache-line aligned) ``` -Navigation is fully determined by `pre_nav`—no runtime dispatch based on previous matcher. See [ADR-0008](ADR-0008-tree-navigation.md) for `PreNav` definition and semantics. +Navigation is fully determined by `nav`—no runtime dispatch based on previous matcher. See [ADR-0008](ADR-0008-tree-navigation.md) for `Nav` definition and semantics. Single `ref_marker` slot—sequences like `Enter(A) → Enter(B)` remain as epsilon chains. @@ -62,18 +64,18 @@ Successors use a small-size optimization to avoid indirection for the common cas | `successor_count` | Layout | | ----------------- | ----------------------------------------------------------------------------------- | -| 0–7 | `successor_data[0..count]` contains `TransitionId` values directly | -| > 7 | `successor_data[0]` is index into `successors` segment, `successor_count` is length | +| 0–8 | `successor_data[0..count]` contains `TransitionId` values directly | +| > 8 | `successor_data[0]` is index into `successors` segment, `successor_count` is length | -Why 7 slots: 32 available bytes / 4 bytes per `TransitionId` = 8 slots, minus 1 for the count field leaves 7. +Why 8 slots: Moving `successor_count` into the metadata block frees 32 bytes for `successor_data`, giving 32 / 4 = 8 inline slots. 
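The inline-versus-spilled lookup described in the table can be sketched as follows. This is an illustration under stated assumptions: `TransitionId` is taken to be a `u32` alias (the text gives 4 bytes per id), and `successors` here is a free function with a hypothetical signature, not the `TransitionView::successors()` method.

```rust
/// Assumption: `TransitionId` is a plain `u32` (4 bytes per id, per the text).
type TransitionId = u32;

/// Number of ids stored directly in `successor_data`.
const INLINE_SLOTS: usize = 8;

/// Returns the successor list for a transition.
/// - `count <= 8`: ids are read directly from `successor_data`.
/// - `count > 8`: `successor_data[0]` is an element index into the shared
///   `successors` segment (`spill`), and `count` is the length there.
fn successors<'a>(
    count: u32,
    successor_data: &'a [TransitionId; INLINE_SLOTS],
    spill: &'a [TransitionId],
) -> Option<&'a [TransitionId]> {
    let count = count as usize;
    if count <= INLINE_SLOTS {
        Some(&successor_data[..count])
    } else {
        let start = successor_data[0] as usize;
        let end = start.checked_add(count)?;
        spill.get(start..end) // None if the spilled range is out of bounds
    }
}
```

The inline path touches only the transition's own 64 bytes. The spilled path costs one extra read into the `successors` segment.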
Coverage: - Linear sequences: 1 successor - Simple branches, quantifiers: 2 successors -- Most alternations: 2–7 branches +- Most alternations: 2–8 branches -Only massive alternations (8+ branches) spill to the external buffer. +Only massive alternations (9+ branches) spill to the external buffer. Cache benefits: @@ -104,7 +106,7 @@ enum Matcher { `Option` uses 0 for `None` (niche optimization). -Navigation (descend/ascend) is handled by `PreNav`, not matchers. Matchers are purely for node matching. +Navigation (descend/ascend) is handled by `Nav`, not matchers. Matchers are purely for node matching. ### RefTransition @@ -167,20 +169,18 @@ enum EffectOp { // 4 bytes, align 2 ``` -`CaptureNode` is explicit—graph construction places it at the correct position relative to container effects. - -**Invariant**: The interpreter clears `matched_node` slot on `Enter` and backtrack restore. This prevents stale captures if a graph construction bug produces `Epsilon → CaptureNode` without a preceding `Match`. With proper graphs, `CaptureNode` always follows a successful match that populates the slot. +**Graph construction invariant**: `CaptureNode` may only appear in the effects list of a transition where `matcher` is `Node`, `Anonymous`, or `Wildcard`. Placing `CaptureNode` on an `Epsilon` transition is illegal—graph construction must enforce this at build time. ### View Types ```rust struct TransitionView<'a> { - query_ir: &'a QueryIR, + query: &'a CompiledQuery, raw: &'a Transition, } struct MatcherView<'a> { - query_ir: &'a QueryIR, + query: &'a CompiledQuery, raw: &'a Matcher, } @@ -191,7 +191,7 @@ Views resolve `Slice` to `&[T]`. `TransitionView::successors()` returns `&[Tr ### Quantifiers -Examples in this section show graph structure and effects. Navigation (`pre_nav`) is omitted for brevity—see [ADR-0008](ADR-0008-tree-navigation.md) for full transition examples with navigation. +Examples in this section show graph structure and effects. 
Navigation (`nav`) is omitted for brevity—see [ADR-0008](ADR-0008-tree-navigation.md) for full transition examples with navigation. **Greedy `*`**: @@ -228,8 +228,8 @@ Before elimination: ``` T0: ε [StartArray] → [T1] T1: ε (branch) → [T2, T4] -T2: Match(identifier) → [T3] -T3: ε [CaptureNode, PushElement] → [T1] +T2: Match(identifier) [CaptureNode] → [T3] +T3: ε [PushElement] → [T1] T4: ε [EndArray] → [T5] T5: ε [Field("params")] → [...] ``` @@ -277,7 +277,7 @@ Partial—full elimination impossible due to single `ref_marker` and effect orde **Execution order** (all transitions, including epsilon): -1. Execute `pre_nav` and matcher +1. Execute `nav` and matcher 2. On success: emit `effects` in order With explicit `CaptureNode`, effect order is unambiguous. When eliminating epsilon chains, concatenate effect lists in traversal order. diff --git a/docs/adr/ADR-0006-dynamic-query-execution.md b/docs/adr/ADR-0006-dynamic-query-execution.md index f14c15d3..81be2c82 100644 --- a/docs/adr/ADR-0006-dynamic-query-execution.md +++ b/docs/adr/ADR-0006-dynamic-query-execution.md @@ -1,4 +1,4 @@ -# ADR-0006: Dynamic Query Execution +# ADR-0006: Query Execution - **Status**: Accepted - **Date**: 2024-12-12 @@ -6,7 +6,7 @@ ## Context -Runtime interpretation of the transition graph ([ADR-0005](ADR-0005-transition-graph-format.md)). Proc-macro compilation is a future ADR. +Runtime execution of the transition graph ([ADR-0005](ADR-0005-transition-graph-format.md)). Proc-macro compilation is a future ADR. ## Decision @@ -14,18 +14,18 @@ Runtime interpretation of the transition graph ([ADR-0005](ADR-0005-transition-g For each transition: -1. Execute `pre_nav` initial movement (e.g., goto_first_child, goto_next_sibling) +1. Execute `nav` initial movement (e.g., goto_first_child, goto_next_sibling) 2. Search loop: try matcher, on fail apply skip policy (advance or fail) 3. On match success: store matched node, execute `effects` sequentially 4. 
Process successors with backtracking For `Up*` variants, step 2 becomes: validate exit constraint, ascend N levels (no search loop). -Navigation is fully determined by `pre_nav`—no runtime dispatch based on previous matcher. See [ADR-0008](ADR-0008-tree-navigation.md) for detailed semantics. +Navigation is fully determined by `nav`—no runtime dispatch based on previous matcher. See [ADR-0008](ADR-0008-tree-navigation.md) for detailed semantics. The matched node is stored in a temporary slot (`matched_node`) accessible to `CaptureNode` effect. Effects execute in order—`CaptureNode` reads from this slot and sets `executor.current`. -**Slot invariant**: The `matched_node` slot is cleared (set to `None`) at the start of each transition execution, before `pre_nav`. This prevents stale captures if a transition path has `Epsilon → CaptureNode` without a preceding match—such a path indicates a graph construction bug, and the clear-on-entry invariant ensures it manifests as a predictable panic rather than silently capturing a wrong node. +**Slot invariant**: The `matched_node` slot is cleared (set to `None`) at the start of each transition execution, before `nav`. This prevents stale captures if a transition path has `Epsilon → CaptureNode` without a preceding match—such a path indicates a graph construction bug, and the clear-on-entry invariant ensures it manifests as a predictable panic rather than silently capturing a wrong node. ### Effect Stream @@ -40,12 +40,12 @@ Effects are **recorded**, not eagerly executed. On match success, the transition On backtrack, both vectors truncate to their watermarks. On full match success, the executor replays `ops` sequentially, consuming from `nodes` for each `CaptureNode`. -### Executor +### Materializer -Converts effect stream to output value. +Materializes effect stream into output value. 
```rust -struct Executor<'a> { +struct Materializer<'a> { current: Option<Value<'a>>, stack: Vec<Container<'a>>, } @@ -80,13 +80,13 @@ enum Container<'a> { Invalid state = IR bug → panic. -### Interpreter +### QueryInterpreter ```rust -struct Interpreter<'a> { - query_ir: &'a QueryIR, - backtrack_stack: BacktrackStack, - frame_arena: CallFrameArena, +struct QueryInterpreter<'a> { + query: &'a CompiledQuery, + checkpoints: CheckpointStack, + frames: FrameArena, cursor: TreeCursor<'a>, // created at tree root, never reset effects: EffectStream<'a>, } @@ -94,24 +94,25 @@ struct Interpreter<'a> { **Cursor constraint**: The cursor must be created once at the tree root and never call `reset()`. This preserves `descendant_index` validity for backtracking checkpoints. -No `prev_matcher` tracking needed—each transition's `pre_nav` encodes the exact navigation to perform. +No `prev_matcher` tracking needed—each transition's `nav` encodes the exact navigation to perform. -Two stacks interact: backtracking can restore to a point inside a previously-exited call, so the frame arena must preserve frames. +Two structures interact: backtracking can restore to a point inside a previously-exited call, so the frame arena must preserve frames.
-### Backtracking +### Checkpoints ```rust -struct BacktrackStack { - points: Vec, +struct CheckpointStack { + points: Vec, max_frame_watermark: Option, // highest frame index referenced by any point } -struct BacktrackPoint { +struct Checkpoint { cursor_checkpoint: u32, // tree-sitter descendant_index effect_watermark: u32, recursion_frame: Option, // saved frame index + prev_max_watermark: Option, // restore on pop for O(1) maintenance transition_id: TransitionId, // source transition for alternatives - next_alt: u32, // index of next alternative to try + next_alt: u32, // index of next alternative to try } ``` @@ -131,12 +132,12 @@ Restore also truncates `effects` to `effect_watermark` and sets `frame_arena.cur **Solution**: Store returns in call frame at `Enter`, retrieve at `Exit`. O(1), no filtering. ```rust -struct CallFrameArena { - frames: Vec, // append-only, pruned by watermark +struct FrameArena { + frames: Vec, // append-only, pruned by watermark current: Option, // index into frames (the "stack pointer") } -struct CallFrame { +struct Frame { parent: Option, // index of caller's frame ref_id: RefId, // verify Exit matches Enter enter_transition: TransitionId, // to retrieve returns via successors()[1..] @@ -147,18 +148,19 @@ Returns are retrieved via `TransitionView::successors()[1..]` on the `enter_tran **Append-only invariant**: Frames persist for backtracking correctness. On `Exit`, set `current` to parent index. Backtracking restores `current`; the original frame is still accessible via its index. -**Frame pruning**: After `Exit`, frames at the stack top may be reclaimed if: +**Frame pruning**: After `Exit`, frames at the arena top may be reclaimed if: 1. Not the current frame (already exited) 2. Not referenced by any live backtrack point This bounds memory by `max(recursion_depth, backtrack_depth)` rather than total call count. Without pruning, `(Rule)*` over N items allocates N frames; with pruning, it remains O(1) for non-backtracking iteration. 
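The pruning bound can be illustrated with a reduced frame arena. This is a sketch under assumptions, not the ADR's implementation: frame indices are taken to be `u32`, `Frame` keeps only its `parent` link, and `prune` receives the highest frame index still referenced by a live backtrack point as a plain argument.

```rust
/// Reduced frame: only the parent link matters for this illustration.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Frame {
    parent: Option<u32>,
}

#[derive(Default)]
struct FrameArena {
    frames: Vec<Frame>,
    current: Option<u32>, // the "stack pointer"
}

impl FrameArena {
    /// `Enter`: append a frame whose parent is the current frame.
    fn enter(&mut self) -> u32 {
        let idx = self.frames.len() as u32;
        self.frames.push(Frame { parent: self.current });
        self.current = Some(idx);
        idx
    }

    /// `Exit`: move the stack pointer back to the caller's frame.
    fn exit(&mut self) {
        if let Some(cur) = self.current {
            self.current = self.frames[cur as usize].parent;
        }
    }

    /// Drops frames above max(current, checkpoint_watermark).
    /// `Option` orders `None` below any `Some`, so `max` covers all four cases.
    fn prune(&mut self, checkpoint_watermark: Option<u32>) {
        let high = self.current.max(checkpoint_watermark);
        let keep = high.map_or(0, |h| h as usize + 1);
        self.frames.truncate(keep);
    }
}
```

With no live backtrack points, a loop of enter, exit, prune leaves the arena empty after every iteration. Passing a frame index as the watermark keeps that frame, and everything below it, alive.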
-**O(1) watermark tracking**: The `max_frame_watermark` is maintained incrementally: +**O(1) watermark tracking**: Each checkpoint stores the previous `max_frame_watermark`, enabling O(1) restore on pop: ```rust -impl BacktrackStack { - fn push(&mut self, point: BacktrackPoint) { +impl CheckpointStack { + fn push(&mut self, mut point: Checkpoint) { + point.prev_max_watermark = self.max_frame_watermark; if let Some(frame) = point.recursion_frame { self.max_frame_watermark = Some(match self.max_frame_watermark { Some(max) => max.max(frame), @@ -168,23 +170,18 @@ impl BacktrackStack { self.points.push(point); } - fn pop(&mut self) -> Option { + fn pop(&mut self) -> Option { let point = self.points.pop()?; - // Recompute watermark only if popped point held the max - if point.recursion_frame == self.max_frame_watermark { - self.max_frame_watermark = self.points.iter() - .filter_map(|p| p.recursion_frame) - .max(); - } + self.max_frame_watermark = point.prev_max_watermark; Some(point) - }WS + } } fn prune_high_water_mark( current: Option, - backtrack_stack: &BacktrackStack, + checkpoints: &CheckpointStack, ) -> Option { - match (current, backtrack_stack.max_frame_watermark) { + match (current, checkpoints.max_frame_watermark) { (None, None) => None, (Some(c), None) => Some(c), (None, Some(m)) => Some(m), @@ -207,14 +204,14 @@ Frames with index > high-water mark can be truncated. # Checking only last point would incorrectly allow pruning F0 ``` -The `max_frame_watermark` tracks the true maximum across all live points. Push is O(1). Pop is amortized O(1)—the O(n) rescan only triggers when popping the point that held the maximum, which can happen at most once per frame +The `max_frame_watermark` tracks the true maximum across all live points. Both push and pop are O(1)—each checkpoint stores the previous max, so pop simply restores it without scanning. 
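For reference, the post-change push/pop logic can be collected into one compilable sketch. Several details are assumptions: frame indices are `u32`, `Checkpoint` is reduced to the two fields the watermark needs, `checkpoint` is a hypothetical constructor, and the final `match` arm of `prune_high_water_mark` is a guess at the part the hunk does not show.

```rust
/// Reduced checkpoint: only the fields used by watermark maintenance.
#[derive(Debug, Clone, PartialEq)]
struct Checkpoint {
    recursion_frame: Option<u32>,    // saved frame index (assumed u32)
    prev_max_watermark: Option<u32>, // filled in by `push`
}

/// Hypothetical constructor for the examples.
fn checkpoint(recursion_frame: Option<u32>) -> Checkpoint {
    Checkpoint { recursion_frame, prev_max_watermark: None }
}

#[derive(Default)]
struct CheckpointStack {
    points: Vec<Checkpoint>,
    max_frame_watermark: Option<u32>,
}

impl CheckpointStack {
    /// O(1): remember the old maximum, then raise it if this point is higher.
    fn push(&mut self, mut point: Checkpoint) {
        point.prev_max_watermark = self.max_frame_watermark;
        if let Some(frame) = point.recursion_frame {
            self.max_frame_watermark = Some(match self.max_frame_watermark {
                Some(max) => max.max(frame),
                None => frame,
            });
        }
        self.points.push(point);
    }

    /// O(1): restore the maximum that held before this point was pushed.
    fn pop(&mut self) -> Option<Checkpoint> {
        let point = self.points.pop()?;
        self.max_frame_watermark = point.prev_max_watermark;
        Some(point)
    }
}

fn prune_high_water_mark(current: Option<u32>, checkpoints: &CheckpointStack) -> Option<u32> {
    match (current, checkpoints.max_frame_watermark) {
        (None, None) => None,
        (Some(c), None) => Some(c),
        (None, Some(m)) => Some(m),
        (Some(c), Some(m)) => Some(c.max(m)), // assumed: arm not shown in the hunk
    }
}
```

This works because checkpoints are strictly last-in, first-out: the maximum over the remaining points is exactly the value that held before the popped point was pushed.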
-| Operation | Action | -| ----------------- | ------------------------------------------------------------------------------ | -| `Enter(ref_id)` | Push frame (parent = `current`), set `current = len-1`, follow `successors[0]` | -| `Exit(ref_id)` | Verify ref_id, set `current = frame.parent`, continue with `frame.returns` | -| Save backtrack | Store `current` | -| Restore backtrack | Set `current` to saved value | +| Operation | Action | +| ------------------ | ------------------------------------------------------------------------------ | +| `Enter(ref_id)` | Push frame (parent = `current`), set `current = len-1`, follow `successors[0]` | +| `Exit(ref_id)` | Verify ref_id, set `current = frame.parent`, continue with `frame.returns` | +| Save checkpoint | Store `current` | +| Restore checkpoint | Set `current` to saved value | **Why index instead of depth?** Using logical depth breaks on Enter-Exit-Enter sequences: @@ -243,11 +240,11 @@ Input: boolean 7. frames[current] = FB ✓ ``` -Frames form a forest of call chains. Each backtrack point references an exact frame, not a depth. +Frames form a forest of call chains. Each checkpoint references an exact frame, not a depth. ### Atomic Groups (Future) -Cut/commit (discard backtrack points) works correctly: unreachable frames become garbage but cause no issues. +Cut/commit (discard checkpoints) works correctly: unreachable frames become garbage but cause no issues. ### Variant Serialization @@ -266,7 +263,7 @@ Details deferred. ## Consequences -**Positive**: Append-only stacks make backtracking trivial. O(1) exit via stored returns. Navigation fully determined by `pre_nav`—no state tracking between transitions. +**Positive**: Append-only stacks make backtracking trivial. O(1) exit via stored returns. Navigation fully determined by `nav`—no state tracking between transitions. **Negative**: Interpretation overhead. Recursion stack memory grows monotonically (bounded by `recursion_fuel`). 
diff --git a/docs/adr/ADR-0007-type-metadata-format.md b/docs/adr/ADR-0007-type-metadata-format.md index 9bc56b8a..17fb57fb 100644 --- a/docs/adr/ADR-0007-type-metadata-format.md +++ b/docs/adr/ADR-0007-type-metadata-format.md @@ -50,28 +50,34 @@ The handle provides access to node metadata (kind, span, text) without copying t ```rust #[repr(C)] struct TypeDef { - kind: TypeKind, // 1 - _pad: u8, // 1 - name: StringId, // 2 - synthetic or explicit, 0xFFFF for wrappers - data: u32, // 4 - TypeId for wrappers, slice offset for composites - data_len: u16, // 2 - 0 for wrappers, member count for composites - _pad2: u16, // 2 + kind: TypeKind, // 1 + _pad: u8, // 1 + name: StringId, // 2 - synthetic or explicit, 0xFFFF for wrappers + members: Slice, // 6 - see interpretation below + _pad2: u16, // 2 } // 12 bytes, align 4 ``` -Uses `u16` for `data_len` instead of `Slice`'s `u32` — no type has 65k members. Saves 2 bytes per TypeDef. +The `members` field has dual semantics based on `kind`: + +| Kind | `members.start_index` | `members.len` | +| ---------------------------------- | ----------------------- | ------------- | +| Wrappers (Optional/Array\*/Array+) | Inner `TypeId` (as u32) | 0 | +| Composites (Record/Enum) | Index into type_members | Member count | + +This reuses `Slice` for consistency with [ADR-0005](ADR-0005-transition-graph-format.md), while keeping TypeDef compact. ### TypeKind ```rust #[repr(C, u8)] enum TypeKind { - Optional = 0, // T? — data: inner TypeId - ArrayStar = 1, // T* — data: element TypeId - ArrayPlus = 2, // T+ — data: element TypeId - Record = 3, // struct — data/data_len: slice into type_members - Enum = 4, // tagged union — data/data_len: slice into type_members + Optional = 0, // T? 
— members.start = inner TypeId + ArrayStar = 1, // T* — members.start = element TypeId + ArrayPlus = 2, // T+ — members.start = element TypeId + Record = 3, // struct — members = slice into type_members + Enum = 4, // tagged union — members = slice into type_members } ``` @@ -164,7 +170,7 @@ enum FuncBody { Optional runtime check for debugging: ```rust -fn validate(value: &Value, expected: TypeId, ir: &QueryIR) -> Result<(), TypeError>; +fn validate(value: &Value, expected: TypeId, query: &CompiledQuery) -> Result<(), TypeError>; ``` Walk the `Value` tree, verify shape matches `TypeId`. Mismatch indicates IR construction bug—panic in debug, skip in release. diff --git a/docs/adr/ADR-0008-tree-navigation.md b/docs/adr/ADR-0008-tree-navigation.md index e70c53db..e783ab92 100644 --- a/docs/adr/ADR-0008-tree-navigation.md +++ b/docs/adr/ADR-0008-tree-navigation.md @@ -8,10 +8,10 @@ Plotnik's query execution engine ([ADR-0006](ADR-0006-dynamic-query-execution.md)) navigates tree-sitter syntax trees. This ADR covers: 1. Which tree-sitter API to use (TreeCursor vs Node) -2. How `PreNav` encodes navigation and anchor constraints +2. How `Nav` encodes navigation and anchor constraints 3. How transitions execute navigation deterministically -Key insight: navigation decisions can be resolved at graph construction time, not runtime. Each transition carries its own `PreNav` instruction—no need to track previous matcher state. +Key insight: navigation decisions can be resolved at graph construction time, not runtime. Each transition carries its own `Nav` instruction—no need to track previous matcher state. ## Decision @@ -30,20 +30,20 @@ struct BacktrackCheckpoint { **Critical constraint**: The cursor must be created at the tree root and never call `reset()`. The `descendant_index` is relative to the cursor's root—`reset(node)` invalidates all checkpoints. 
-### PreNav +### Nav Navigation and anchor constraints unified into a single enum: ```rust #[repr(C)] -struct PreNav { - kind: PreNavKind, // 1 byte - level: u8, // 1 byte - ascent level count for Up*, ignored otherwise +struct Nav { + kind: NavKind, // 1 byte + level: u8, // 1 byte - ascent level count for Up*, ignored otherwise } // 2 bytes total #[repr(u8)] -enum PreNavKind { +enum NavKind { // No movement (first transition only, cursor at root) Stay = 0, @@ -70,15 +70,15 @@ For non-Up variants, `level` is ignored (conventionally 0). For Up variants, `le ### Trivia -**Trivia** = anonymous nodes + language-specific ignored named nodes (e.g., `comment`). +**Trivia** = anonymous nodes + language-specific trivia named nodes (e.g., `comment`). -The ignored kinds list is populated from the `Lang` binding during IR construction and stored in the `ignored_kinds` segment ([ADR-0004](ADR-0004-query-ir-binary-format.md)). Zero offset means no ignored kinds. +The trivia kinds list is populated from the `Lang` binding during IR construction and stored in the `trivia_kinds` segment ([ADR-0004](ADR-0004-query-ir-binary-format.md)). Zero offset means no trivia kinds. **Skip invariant**: A node is never skipped if its kind matches the current transition's matcher target. This ensures `(comment)` explicitly in a query still matches comment nodes, even though comments are typically ignored. ### Execution Semantics -Navigation and matching are intertwined in a search loop. The `PreNav` determines initial movement and skip policy for the loop. +Navigation and matching are intertwined in a search loop. The `Nav` determines initial movement and skip policy for the loop. **Stay**: No cursor movement. Used only for the first transition when cursor is already positioned at root. Then attempt match. 
@@ -119,17 +119,17 @@ Example: `(foo (bar))` matching `(foo (foo) (foo) (bar))`: ### Anchor Lowering -The anchor operator (`.`) in the query language compiles to `PreNav` variants: +The anchor operator (`.`) in the query language compiles to `Nav` variants: -| Query Pattern | PreNav on Following Transition | -| -------------------- | ------------------------------ | -| `(foo) (bar)` | `Next` | -| `(foo) . (bar)` | `NextSkipTrivia` | -| `"x" . (bar)` | `NextExact` | -| `(parent (child))` | `Down` on child's transition | -| `(parent . (child))` | `DownSkipTrivia` | -| `(parent (child) .)` | `UpSkipTrivia` on exit | -| `(parent "x" .)` | `UpExact` on exit | +| Query Pattern | Nav on Following Transition | +| -------------------- | ---------------------------- | +| `(foo) (bar)` | `Next` | +| `(foo) . (bar)` | `NextSkipTrivia` | +| `"x" . (bar)` | `NextExact` | +| `(parent (child))` | `Down` on child's transition | +| `(parent . (child))` | `DownSkipTrivia` | +| `(parent (child) .)` | `UpSkipTrivia` on exit | +| `(parent "x" .)` | `UpExact` on exit | Mode determined by what **precedes** the anchor: @@ -161,7 +161,7 @@ Cannot combine into `UpSkipTrivia(2)` because constraints apply at each level. ### Execution Flow ``` -1. MOVE pre_nav → initial cursor movement +1. MOVE nav → initial cursor movement 2. SEARCH loop: try matcher, on fail check skip policy, advance or fail 3. EFFECTS on match success: execute effects list (including explicit CaptureNode) ``` @@ -319,7 +319,7 @@ Previous design had `post_anchor` field validated after match. Rejected: - O(1) sibling traversal - 4-byte checkpoints -- No `prev_matcher` tracking—navigation fully determined by `PreNav` +- No `prev_matcher` tracking—navigation fully determined by `Nav` - Simpler execution loop: navigate → search → match (no post-validation) - Anchor constraints resolved at graph construction time