Merged
783 changes: 0 additions & 783 deletions crates/plotnik-lib/src/bytecode/emit/typescript.rs

This file was deleted.

7 changes: 5 additions & 2 deletions docs/README.md
@@ -6,6 +6,7 @@ Plotnik is a strongly-typed pattern matching language for tree-sitter syntax tre

### Users

+- [CLI Guide](cli.md) — Command-line tool usage
 - [Language Reference](lang-reference.md) — Complete syntax and semantics
 - [Type System](type-system.md) — How output types are inferred from queries

@@ -21,6 +22,7 @@ Plotnik is a strongly-typed pattern matching language for tree-sitter syntax tre
AGENTS.md # Project constitution (coding rules, testing, ADRs)
docs/
├── README.md # You are here
+├── cli.md # CLI tool usage guide
├── lang-reference.md # Query language syntax and semantics
├── type-system.md # Type inference rules and output shapes
├── runtime-engine.md # VM state, backtracking, effects
@@ -37,8 +39,9 @@ docs/

New to Plotnik:

-1. `lang-reference.md` — Learn the query syntax
-2. `type-system.md` — Understand output shapes
+1. `cli.md` — Get started with the CLI
+2. `lang-reference.md` — Learn the query syntax
+3. `type-system.md` — Understand output shapes

Building tooling:

6 changes: 3 additions & 3 deletions docs/binary-format/01-overview.md
@@ -90,9 +90,9 @@ struct Header {

### Flags Field

-| Bit | Name | Description |
-| --- | ------- | -------------------------------------------------------- |
-| 0 | LINKED | If set, bytecode contains grammar NodeTypeId/NodeFieldId |
+| Bit | Name | Description |
+| --- | ------ | -------------------------------------------------------- |
+| 0 | LINKED | If set, bytecode contains grammar NodeTypeId/NodeFieldId |

**Linked vs Unlinked Bytecode**:

2 changes: 1 addition & 1 deletion docs/binary-format/02-strings.md
@@ -8,7 +8,7 @@ Strings are stored in a centralized pool to eliminate redundancy and alignment p

### Reserved StringId(0)

-`StringId(0)` is reserved and contains an easter egg: `"Beauty will save the world"` (Dostoevsky, *The Idiot*).
+`StringId(0)` is reserved and contains an easter egg: `"Beauty will save the world"` (Dostoevsky, _The Idiot_).

This reservation has a practical purpose: since Match instructions use `0` to indicate "no constraint" (wildcard), `StringId(0)` can never appear in unlinked bytecode instructions. User strings start at index 1.
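
As a rough illustration of the reservation described above, the following sketch pre-fills slot 0 so that user strings are numbered from 1. The `StringPool` type, its `intern` and `get` methods, and the 16-bit id width are assumptions made for this example and are not taken from the Plotnik sources.

```rust
/// Hypothetical string pool. Only the reserved slot 0 and its content
/// come from the documentation; everything else is illustrative.
pub struct StringPool {
    strings: Vec<String>,
}

impl StringPool {
    pub fn new() -> Self {
        // Slot 0 is occupied up front, so user strings start at index 1.
        StringPool {
            strings: vec!["Beauty will save the world".to_string()],
        }
    }

    /// Returns the id of `s`, adding it if absent.
    /// User strings never receive id 0 (assumed u16 id width).
    pub fn intern(&mut self, s: &str) -> u16 {
        // Skip slot 0 while searching, so 0 stays free to mean "no constraint".
        if let Some(pos) = self.strings.iter().skip(1).position(|x| x == s) {
            return (pos + 1) as u16;
        }
        self.strings.push(s.to_string());
        (self.strings.len() - 1) as u16
    }

    /// Looks up a string by id, including the reserved entry.
    pub fn get(&self, id: u16) -> Option<&str> {
        self.strings.get(id as usize).map(String::as_str)
    }
}
```

Because no user string can map to 0, an instruction field holding 0 can be read as a wildcard without ambiguity.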

67 changes: 29 additions & 38 deletions docs/binary-format/06-transitions.md
@@ -87,20 +87,20 @@ EffectOp (u16)
- **Opcode**: 6 bits (0-63), currently 12 defined.
- **Payload**: 10 bits (0-1023), member/variant index.

-| Opcode | Name | Payload |
-| :----- | :------ | :--------------------- |
-| 0 | `Node` | - |
-| 1 | `A` | - |
-| 2 | `Push` | - |
-| 3 | `EndA` | - |
-| 4 | `S` | - |
-| 5 | `EndS` | - |
-| 6 | `Set` | Member index (0-1023) |
-| 7 | `E` | Variant index (0-1023) |
-| 8 | `EndE` | - |
-| 9 | `Text` | - |
-| 10 | `Clear` | - |
-| 11 | `Null` | - |
+| Opcode | Name | Payload |
+| :----- | :-------- | :--------------------- |
+| 0 | `Node` | - |
+| 1 | `Arr` | - |
+| 2 | `Push` | - |
+| 3 | `EndArr` | - |
+| 4 | `Obj` | - |
+| 5 | `EndObj` | - |
+| 6 | `Set` | Member index (0-1023) |
+| 7 | `Enum` | Variant index (0-1023) |
+| 8 | `EndEnum` | - |
+| 9 | `Text` | - |
+| 10 | `Clear` | - |
+| 11 | `Null` | - |
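
The 6-bit opcode and 10-bit payload split could be packed and unpacked roughly as follows. This is a sketch only: the placement of the opcode in the high bits of the `u16` is an assumption (the hunk does not show the bit diagram), and the `EffectOpcode`, `encode`, and `decode` names are invented for the example. The variant names and numeric values follow the renamed table.

```rust
/// Effect opcodes from the updated table (values 0-11).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(u8)]
pub enum EffectOpcode {
    Node = 0,
    Arr = 1,
    Push = 2,
    EndArr = 3,
    Obj = 4,
    EndObj = 5,
    Set = 6,
    Enum = 7,
    EndEnum = 8,
    Text = 9,
    Clear = 10,
    Null = 11,
}

const PAYLOAD_BITS: u32 = 10;
const PAYLOAD_MASK: u16 = (1 << PAYLOAD_BITS) - 1; // 0x03FF, i.e. 0-1023

/// Packs an opcode and payload into one u16.
/// ASSUMPTION: opcode occupies the high 6 bits, payload the low 10 bits.
pub fn encode(op: EffectOpcode, payload: u16) -> u16 {
    debug_assert!(payload <= PAYLOAD_MASK);
    ((op as u16) << PAYLOAD_BITS) | (payload & PAYLOAD_MASK)
}

/// Unpacks a u16; returns None for opcodes outside the 12 defined values.
pub fn decode(raw: u16) -> Option<(EffectOpcode, u16)> {
    let payload = raw & PAYLOAD_MASK;
    let op = match raw >> PAYLOAD_BITS {
        0 => EffectOpcode::Node,
        1 => EffectOpcode::Arr,
        2 => EffectOpcode::Push,
        3 => EffectOpcode::EndArr,
        4 => EffectOpcode::Obj,
        5 => EffectOpcode::EndObj,
        6 => EffectOpcode::Set,
        7 => EffectOpcode::Enum,
        8 => EffectOpcode::EndEnum,
        9 => EffectOpcode::Text,
        10 => EffectOpcode::Clear,
        11 => EffectOpcode::Null,
        _ => return None,
    };
    Some((op, payload))
}
```

Only `Set` and `Enum` carry a meaningful payload according to the table; for the other opcodes the payload bits would simply be zero in this scheme.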

**Opcode Ranges** (future extensibility):

@@ -134,9 +134,9 @@ struct Match8 {

Bytes 2-5 (`node_type` and `node_field`) have different meanings based on the header's `linked` flag:

-| Mode | `node_type` (bytes 2-3) | `node_field` (bytes 4-5) |
-| -------- | ------------------------------- | -------------------------------- |
-| Linked | `NodeTypeId` from tree-sitter | `NodeFieldId` from tree-sitter |
+| Mode | `node_type` (bytes 2-3) | `node_field` (bytes 4-5) |
+| -------- | -------------------------------- | --------------------------------- |
+| Linked | `NodeTypeId` from tree-sitter | `NodeFieldId` from tree-sitter |
| Unlinked | `StringId` pointing to type name | `StringId` pointing to field name |

In **linked mode**, the runtime can directly compare against tree-sitter node types/fields.
@@ -217,8 +217,10 @@ The compiler selects the smallest step size that fits the payload. If the total

**Pre vs Post Effects**:

-- `pre_effects`: Execute before match attempt. Used for scope openers (`S`, `A`, `E`) that must run regardless of which branch succeeds.
-- `post_effects`: Execute after successful match. Used for capture/assignment ops (`Node`, `Set`, `EndS`, etc.) that depend on `matched_node`.
+- `pre_effects`: Execute before match attempt (before nav, before node checks). Any effect can appear here.
+- `post_effects`: Execute after successful match (after `matched_node` is set). Any effect can appear here.
+
+The compiler places effects based on semantic requirements: scope openers often go in pre (to run regardless of which branch succeeds), captures often go in post (to access `matched_node`). But this is a compiler decision, not a bytecode-level restriction.
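
The ordering described above might look like this in a heavily simplified interpreter step. The `Step` struct, the string-based effect names, and the `run_step` function are hypothetical stand-ins for this example; the real Match instructions carry navigation, field checks, and successors, and a real VM would also undo effects when it backtracks, which this sketch ignores.

```rust
/// Hypothetical, simplified step; not the actual Match layout.
pub struct Step {
    pub pre_effects: Vec<&'static str>,
    pub expected_type: Option<u16>, // None means "any"
    pub post_effects: Vec<&'static str>,
}

/// Runs one step against `node_type`, appending executed effects to `log`.
/// Returns true if the match succeeded.
pub fn run_step(step: &Step, node_type: u16, log: &mut Vec<&'static str>) -> bool {
    // 1. Pre-effects run before navigation and node checks.
    log.extend(step.pre_effects.iter().copied());

    // 2. Navigation and node checks, reduced here to one type comparison.
    let matched = step.expected_type.map_or(true, |t| t == node_type);
    if !matched {
        return false;
    }

    // 3. Post-effects run only after a successful match,
    //    at which point `matched_node` would be set.
    log.extend(step.post_effects.iter().copied());
    true
}
```

Nothing in `run_step` restricts which effect may appear in which list, which mirrors the point that placement is a compiler decision rather than a bytecode-level rule.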

### 4.3. Epsilon Transitions

@@ -230,19 +232,20 @@ A Match8 or Match16–64 with `node_type: None`, `node_field: None`, and `nav: S

### 4.4. Call

-Invokes another definition (recursion). Pushes return address to the call stack and jumps to target.
+Invokes another definition (recursion). Executes navigation (with optional field constraint), pushes return address to the call stack, and jumps to target.

```rust
#[repr(C)]
struct Call {
-    type_id: u8,   // segment(4) | 0x6
-    reserved: u8,
-    next: u16,     // Return address (StepId, current segment)
-    target: u16,   // Callee StepId (segment from type_id)
-    ref_id: u16,   // Must match Return.ref_id
+    type_id: u8,                    // segment(4) | 0x6
+    nav: u8,                        // Nav
+    node_field: Option<NonZeroU16>, // None (0) means "any"
+    next: u16,                      // Return address (StepId, current segment)
+    target: u16,                    // Callee StepId (segment from type_id)
}
```

+- **Nav + Field**: Call handles navigation and field constraint. The callee's first Match checks node type. This allows `field: (Ref)` patterns to check field and type on the same node.
 - **Target Segment**: Defined by `type_id >> 4`.
 - **Return Segment**: Implicitly the current segment.
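
To make the updated `Call` layout concrete, here is a minimal sketch that mirrors the new field list and decodes `type_id`. The accessor names, the `OPCODE_CALL` constant, and the `accepts_field` helper are assumptions for illustration; treating `nav` as a plain `u8` also glosses over whatever the real `Nav` type is.

```rust
use std::num::NonZeroU16;

/// Mirror of the updated `Call` fields; 8 bytes with `#[repr(C)]`.
#[repr(C)]
pub struct Call {
    pub type_id: u8,                    // segment(4) | 0x6
    pub nav: u8,                        // Nav (kept opaque in this sketch)
    pub node_field: Option<NonZeroU16>, // None (0) means "any"
    pub next: u16,                      // Return address
    pub target: u16,                    // Callee StepId
}

/// Low nibble of `type_id` for Call, per the `| 0x6` comment.
pub const OPCODE_CALL: u8 = 0x6;

impl Call {
    /// Target segment, defined as `type_id >> 4`.
    pub fn segment(&self) -> u8 {
        self.type_id >> 4
    }

    /// Instruction kind stored in the low 4 bits.
    pub fn opcode(&self) -> u8 {
        self.type_id & 0x0F
    }

    /// ASSUMED semantics: no constraint accepts any field,
    /// otherwise the node's field id must be equal.
    pub fn accepts_field(&self, field: u16) -> bool {
        match self.node_field {
            None => true,
            Some(f) => f.get() == field,
        }
    }
}
```

`Option<NonZeroU16>` is guaranteed to have the same size as `u16`, with `None` represented as 0, so the struct stays at 8 bytes just like the layout it replaces.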

@@ -254,22 +257,10 @@ Returns from a definition. Pops the return address from the call stack.
#[repr(C)]
struct Return {
     type_id: u8,       // segment(4) | 0x7
-    reserved: u8,
-    ref_id: u16,       // Must match Call.ref_id
-    _pad: u32,
+    _pad: [u8; 7],
}
```

-### 4.6. The `ref_id` Invariant
-
-The `ref_id` field enforces stack discipline between `Call` and `Return`. Each definition gets a unique `ref_id` at compile time. At runtime:
-
-1. `Call` pushes a frame with its `ref_id` onto the call stack.
-2. `Return` verifies its `ref_id` matches the current frame's `ref_id`.
-3. Mismatch indicates a malformed query or VM bug—panic in debug builds.
-
-This catches errors like mismatched call/return pairs or corrupted stack state during backtracking. The check is O(1).
-
## 5. Execution Semantics

### 5.1. Match8 Execution