diff --git a/docs/adr/ADR-0003-query-intermediate-representation.md b/docs/adr/ADR-0003-query-intermediate-representation.md
index cd2d2f16..f16dfb74 100644
--- a/docs/adr/ADR-0003-query-intermediate-representation.md
+++ b/docs/adr/ADR-0003-query-intermediate-representation.md
@@ -51,14 +51,15 @@ The graph is immutable after construction. We use a single contiguous allocation
 
 ```rust
 struct TransitionGraph {
-    data: Box<[u8]>,
+    data: Arena,  // custom type, see Memory Layout & Alignment
     // segment offsets (aligned for each type)
-    successors_offset: usize,
-    effects_offset: usize,
-    negated_fields_offset: usize,
-    data_fields_offset: usize,
-    variant_tags_offset: usize,
-    entrypoints_offset: usize,
+    successors_offset: u32,
+    effects_offset: u32,
+    negated_fields_offset: u32,
+    data_fields_offset: u32,
+    variant_tags_offset: u32,
+    entrypoint_names_offset: u32,
+    entrypoints_offset: u32,
     default_entrypoint: TransitionId,
 }
 
@@ -69,30 +70,119 @@ impl TransitionGraph {
     fn default_entry(&self) -> TransitionView<'_>;
     fn field_name(&self, id: DataFieldId) -> &str;
     fn tag_name(&self, id: VariantTagId) -> &str;
+    fn entrypoint_name(&self, entry: &Entrypoint) -> &str;
 }
 ```
 
 ##### Memory Arena Design
 
-The single `Box<[u8]>` allocation is divided into typed segments. Each segment is properly aligned for its type, ensuring safe access across all architectures (x86, ARM, RISC-V).
+The single contiguous allocation is divided into typed segments. Each segment is properly aligned for its type, ensuring safe access across all architectures (x86, ARM, WASM).
 
 **Segment Layout**:
 
-- Transitions: `[Transition; N]` at offset 0
-- Successors: `[TransitionId; M]` at `successors_offset`
-- Effects: `[EffectOp; P]` at `effects_offset`
-- Negated Fields: `[NodeFieldId; Q]` at `negated_fields_offset`
-- Data Fields: `[u8; R]` (string data) at `data_fields_offset`
-- Variant Tags: `[u8; S]` (string data) at `variant_tags_offset`
-- Entrypoints: `[(name, TransitionId); T]` (length-prefixed strings) at `entrypoints_offset`
+| Segment          | Type                | Offset                    | Alignment |
+| ---------------- | ------------------- | ------------------------- | --------- |
+| Transitions      | `[Transition; N]`   | 0 (implicit)              | 4 bytes   |
+| Successors       | `[TransitionId; M]` | `successors_offset`       | 4 bytes   |
+| Effects          | `[EffectOp; P]`     | `effects_offset`          | 2 bytes   |
+| Negated Fields   | `[NodeFieldId; Q]`  | `negated_fields_offset`   | 2 bytes   |
+| Data Fields      | `[u8; R]`           | `data_fields_offset`      | 1 byte    |
+| Variant Tags     | `[u8; S]`           | `variant_tags_offset`     | 1 byte    |
+| Entrypoint Names | `[u8; U]`           | `entrypoint_names_offset` | 1 byte    |
+| Entrypoints      | `[Entrypoint; T]`   | `entrypoints_offset`      | 4 bytes   |
+
+Transitions always start at offset 0—no explicit offset stored. The arena base address is allocated with 4-byte alignment, satisfying `Transition`'s requirement.
 
 Note: `entry(&str)` performs linear scan — O(n) where n = definition count (typically <20).
 
-The offsets are computed during graph construction to ensure:
+##### Memory Layout & Alignment
+
+Casting `&u8` to `&T` when the address is not aligned to `T` causes traps on WASM and faults on strict ARM. We enforce alignment explicitly via a custom arena type.
+
+**A. Base Allocation Alignment**
+
+The arena must be allocated with alignment equal to the maximum of all segment types. We use a custom `Arena` wrapper that tracks the allocation layout for correct deallocation:
+
+```rust
+const ARENA_ALIGN: usize = 4; // align_of::<Transition>()
+
+struct Arena {
+    ptr: *mut u8,
+    len: usize,
+}
+
+impl Arena {
+    fn new(len: usize) -> Self {
+        let layout = std::alloc::Layout::from_size_align(len, ARENA_ALIGN).unwrap();
+        let ptr = unsafe { std::alloc::alloc(layout) };
+        if ptr.is_null() {
+            std::alloc::handle_alloc_error(layout);
+        }
+        Self { ptr, len }
+    }
+}
+
+impl Drop for Arena {
+    fn drop(&mut self) {
+        let layout = std::alloc::Layout::from_size_align(self.len, ARENA_ALIGN).unwrap();
+        unsafe { std::alloc::dealloc(self.ptr, layout) };
+    }
+}
+```
+
+Standard `Box<[u8]>` cannot be used here: it assumes 1-byte alignment and would pass incorrect layout to `dealloc`, causing undefined behavior.
+
+**B. Segment Offset Calculation**
+
+Each segment offset is rounded up to its type's alignment:
+
+```rust
+fn align_up(offset: usize, align: usize) -> usize {
+    (offset + align - 1) & !(align - 1)
+}
+
+// Example: if Transitions end at byte 103, Successors (align 4) start at 104
+let successors_offset = align_up(transitions_end, align_of::<TransitionId>());
+```
+
+**C. Entrypoints Structure**
+
+Entrypoints use fixed-size metadata with indirect string storage:
+
+```rust
+#[repr(C)]
+struct Entrypoint {
+    name_offset: u32,  // index into entrypoint_names segment
+    name_len: u32,
+    target: TransitionId, // Index into transitions segment (offset 0)
+}
+// Size: 12 bytes, Align: 4 bytes
+```
+
+The `name_offset` points into the `entrypoint_names` segment (u8 array), where alignment is irrelevant. This avoids the alignment hazards of inline variable-length strings. Note: entrypoint names are stored separately from data field names because they serve different purposes—entrypoint names identify subqueries for lookup, while data field names are used in output object construction.
+
+##### Construction Process
 
-1. Each segment starts at its type's natural alignment boundary
-2. No padding bytes are wasted between same-typed items
-3. String data is stored as length-prefixed UTF-8 bytes
+The graph is built in two passes to ensure a single contiguous allocation:
+
+1.  **Analysis Pass**: Traverse the Query AST to count all elements (transitions, effects, negated fields, strings).
+2.  **Layout & Allocation**: Compute aligned offsets for all segments and allocate the `Arena` once.
+3.  **Emission Pass**: Serialize data into the arena.
+    - **String Tables**: Written sequentially to `u8` segments.
+    - **Slices**: `Slice<T>` fields are populated with `start` indices relative to their segment base.
+    - **Structs**: `Transition`, `Entrypoint`, etc., are written using `std::ptr::write` to their calculated offsets.
+
+This approach eliminates dynamic resizing (`realloc`) and fragmentation.
+
+##### Slice Resolution
+
+`Slice<T>` handles are resolved to actual slices by combining:
+
+1. The segment's base offset (e.g., `effects_offset` for `Slice<EffectOp>`)
+2. The slice's `start` field (element index within segment)
+3. The slice's `len` field
+
+The `TransitionView` methods (`pre_effects()`, `post_effects()`, `next()`) perform this resolution internally, returning standard `&[T]` slices. Engine code never performs offset arithmetic directly.
 
 **Access Pattern**:
 
@@ -120,7 +210,7 @@ struct TransitionView<'a> {
 }
 
 impl<'a> TransitionView<'a> {
-    fn matcher(&self) -> Option<MatcherView<'a>>;
+    fn matcher(&self) -> MatcherView<'a>;
     fn next(&self) -> impl Iterator<Item = TransitionView<'a>>;
     fn pre_effects(&self) -> &[EffectOp];
     fn post_effects(&self) -> &[EffectOp];
@@ -139,10 +229,10 @@ impl<'a> MatcherView<'a> {
     fn node_kind(&self) -> Option<NodeTypeId>;
     fn field(&self) -> Option<NodeFieldId>;
     fn negated_fields(&self) -> &[NodeFieldId];  // resolved from Slice
-    fn matches(&self, cursor: &TreeCursor) -> bool;
+    fn matches(&self, cursor: &mut TreeCursor) -> bool;
 }
 
-enum MatcherKind { Node, Anonymous, Wildcard, Down, Up }
+enum MatcherKind { Epsilon, Node, Anonymous, Wildcard, Down, Up }
 ```
 
 **Execution Flow**:
@@ -170,9 +260,13 @@ struct Slice<T> {
     _phantom: PhantomData<T>,
 }
 
-impl<T> Slice<T> {
+impl Slice<T> {
     const EMPTY: Self = Self { start: 0, len: 0, _phantom: PhantomData };
 }
+
+// Note: PhantomData<T> does not prevent crossing segment boundaries (e.g. passing
+// Slice<EffectOp> where Slice<TransitionId> is expected). Implementers should
+// consider wrapping these in newtypes if stricter compile-time safety is required.
 ```
 
 Size: 8 bytes. Using `u32` for both fields fills the natural alignment with no padding waste, supporting up to 4B items per slice—well beyond any realistic query.
@@ -182,20 +276,23 @@ Size: 8 bytes. Using `u32` for both fields fills the natural alignment with no p
 Internal storage. Engine code uses `TransitionView` instead of accessing this directly.
 
 ```rust
+#[repr(C)]
 struct Transition {
-    matcher: Option<Matcher>,
-    pre_anchored: bool,
-    post_anchored: bool,
-    pre_effects: Slice<EffectOp>,
-    post_effects: Slice<EffectOp>,
-    ref_marker: Option<RefTransition>,
-    next: Slice<TransitionId>,
+    matcher: Matcher,            // 16 bytes (Epsilon variant for epsilon-transitions)
+    pre_anchored: bool,          // 1 byte
+    post_anchored: bool,         // 1 byte
+    _pad1: [u8; 2],              // 2 bytes padding
+    pre_effects: Slice<EffectOp>,  // 8 bytes
+    post_effects: Slice<EffectOp>, // 8 bytes
+    ref_marker: RefTransition,       // 4 bytes (explicit None variant)
+    next: Slice<TransitionId>,   // 8 bytes
 }
+// Size: 48 bytes, Align: 4 bytes
 ```
 
-The `TransitionView` resolves `Slice<T>` using the graph's `get_slice` method, hiding all offset calculations from the engine code.
+The `TransitionView` resolves `Slice<T>` by combining the graph's segment offset with the slice's start/len fields.
 
-**Design Note**: The `ref_marker` field is intentionally a single `Option<RefTransition>` rather than a `Slice`. This means a transition can carry at most one Enter or Exit marker. While this prevents full epsilon elimination for nested reference sequences (e.g., `Enter(A) → Enter(B)`), we accept this limitation for simplicity. Such sequences remain as chains of epsilon transitions in the final graph.
+**Design Note**: The `ref_marker` field is intentionally a single `RefTransition` rather than a `Slice`. This means a transition can carry at most one Enter or Exit marker. While this prevents full epsilon elimination for nested reference sequences (e.g., `Enter(A) → Enter(B)`), we accept this limitation for simplicity. Such sequences remain as chains of epsilon transitions in the final graph.
 
 ```rust
 type TransitionId = u32;
@@ -208,23 +305,26 @@ Each named definition has an entry point. The default entry is the last definiti
 
 #### Matcher
 
-Note: `NodeTypeId` and `NodeFieldId` are defined in `plotnik-core` (tree-sitter uses `u16` and `NonZeroU16` respectively).
+Note: `NodeTypeId` is `u16`. `NodeFieldId` is `NonZeroU16`, which guarantees `Option<NodeFieldId>` uses value `0` for `None` — a stable layout suitable for raw serialization. Both types are defined in `plotnik-core`.
 
 ```rust
+#[repr(C, u32)]
 enum Matcher {
+    Epsilon,                           // no payload
     Node {
-        kind: NodeTypeId,
-        field: Option<NodeFieldId>,
-        negated_fields: Slice<NodeFieldId>,
+        kind: NodeTypeId,              // 2 bytes
+        field: Option<NodeFieldId>,    // 2 bytes
+        negated_fields: Slice<NodeFieldId>, // 8 bytes
     },
     Anonymous {
-        kind: NodeTypeId,
-        field: Option<NodeFieldId>,
+        kind: NodeTypeId,              // 2 bytes
+        field: Option<NodeFieldId>,    // 2 bytes
     },
     Wildcard,
     Down,
     Up,
 }
+// Size: 16 bytes (4-byte discriminant + 12-byte largest variant). Align: 4 bytes
 ```
 
 Navigation variants `Down`/`Up` move the cursor without matching. They enable nested patterns like `(function_declaration (identifier) @name)` where we must descend into children.
@@ -232,13 +332,18 @@ Navigation variants `Down`/`Up` move the cursor without matching. They enable ne
 #### Reference Markers
 
 ```rust
+#[repr(C, u8)]  // Explicit 1-byte discriminant for stable serialization
 enum RefTransition {
-    Enter(RefId),  // push ref_id onto return stack
-    Exit(RefId),   // pop from return stack (must match ref_id)
+    None,          // no marker (discriminant 0)
+    Enter(RefId),  // push ref_id onto return stack (discriminant 1)
+    Exit(RefId),   // pop from return stack, must match ref_id (discriminant 2)
 }
+// Size: 4 bytes (1-byte discriminant + 1-byte padding + 2-byte RefId), Align: 2 bytes
 ```
 
-Thompson construction creates epsilon transitions with optional `Enter`/`Exit` markers. Epsilon elimination propagates these markers to surviving transitions. At runtime, the engine uses markers to filter which `next` transitions are valid based on return stack state. Multiple transitions can share the same `RefId` after epsilon elimination.
+The explicit `None` variant (rather than `Option<RefTransition>`) ensures stable binary layout for raw arena serialization. `Option<Enum>` relies on compiler-specific niche optimization whose bit-pattern is unspecified.
+
+Thompson construction creates epsilon transitions with `Enter`/`Exit` markers. Epsilon elimination propagates these markers to surviving transitions. At runtime, the engine uses markers to filter which `next` transitions are valid based on return stack state. Multiple transitions can share the same `RefId` after epsilon elimination.
 
 #### Effect Operations
 
@@ -246,6 +351,7 @@ Instructions stored in the transition graph. These are static, `Copy`, and conta
 
 ```rust
 #[derive(Clone, Copy)]
+#[repr(C)]
 enum EffectOp {
     StartArray,              // push new [] onto container stack
     PushElement,             // move current value into top array
@@ -257,14 +363,31 @@ enum EffectOp {
     EndVariant,              // pop variant from stack, wrap current, becomes current
     ToString,                // convert current Node value to String (source text)
 }
+// Size: 4 bytes (1-byte discriminant + 2-byte payload + 1-byte padding), Align: 2 bytes
 ```
 
-Size: 4 bytes (1-byte discriminant + 2-byte payload + 1-byte padding).
-
-Note: There is no `CaptureNode` instruction. Node capture is implicit—a successful match automatically emits `RuntimeEffect::CaptureNode` to the builder (see below).
+Note: There is no `CaptureNode` instruction. Node capture is implicit—a successful match automatically emits `RuntimeEffect::CaptureNode` to the effect stream (see below).
 
 Effects capture structure only—arrays, objects, variants. Type annotations (`:: str`, `:: Type`) are separate metadata applied during post-processing.
 
+##### Effect Placement Rules
+
+After epsilon elimination, effects are classified as pre or post based on when they must execute relative to the match:
+
+| Effect         | Placement | Reason                                     |
+| -------------- | --------- | ------------------------------------------ |
+| `StartArray`   | Pre       | Container must exist before elements added |
+| `StartObject`  | Pre       | Container must exist before fields added   |
+| `StartVariant` | Pre       | Tag must be set before payload captured    |
+| `PushElement`  | Post      | Consumes the just-matched node             |
+| `Field`        | Post      | Consumes the just-matched node             |
+| `EndArray`     | Post      | Finalizes after last element matched       |
+| `EndObject`    | Post      | Finalizes after last field matched         |
+| `EndVariant`   | Post      | Wraps payload after it's captured          |
+| `ToString`     | Post      | Converts the just-matched node to text     |
+
+Pre-effects from incoming epsilon paths accumulate in order. Post-effects from outgoing epsilon paths accumulate in order. This ordering is deterministic and essential for correct data construction.
+
 ### Data Construction (Dynamic Interpreter)
 
 This section describes data construction for the dynamic interpreter. Proc-macro codegen uses direct construction instead (see [Direct Construction](#direct-construction-no-effect-stream)).
@@ -273,7 +396,7 @@ The interpreter emits events to a linear stream during matching. After a success
 
 #### Runtime Effects
 
-Events emitted to the builder during interpretation. Unlike `EffectOp`, these carry runtime data.
+Events emitted to the effect stream during interpretation. Unlike `EffectOp`, these carry runtime data.
 
 ```rust
 enum RuntimeEffect<'a> {
@@ -284,16 +407,16 @@ enum RuntimeEffect<'a> {
 
 The `CaptureNode` variant is never stored in the graph—it's generated by the interpreter when a match succeeds. This separation keeps the graph static (no lifetimes) while allowing the runtime stream to carry actual node references.
 
-#### Builder
+#### Effect Stream
 
 ```rust
 /// Accumulates runtime effects during matching; supports rollback on backtrack
-struct Builder<'a> {
+struct EffectStream<'a> {
     effects: Vec<RuntimeEffect<'a>>,
 }
 ```
 
-The builder accumulates effects as a linear stream during matching. It provides:
+The effect stream accumulates effects linearly during matching. It provides:
 
 - **Effect emission**: Appends `EffectOp` instructions and `CaptureNode` events
 - **Watermarking**: Records position before attempting branches
@@ -340,11 +463,17 @@ Effect semantics on `current`:
 - `End*` → pops container from stack into `current`
 - `ToString` → replaces `current` Node with its source text as String
 
+**Error Handling**
+
+The interpreter assumes the effect stream is well-formed (guaranteed by the query compiler).
+
+- **Panic**: Any operation on invalid state (e.g., `Field` when `current` is `None`, `EndArray` with empty stack, `ToString` on non-Node). These indicate bugs in the IR construction.
+
 #### Execution Pipeline
 
 For any given transition, the execution order is strict to ensure data consistency during backtracking:
 
-1. **Enter**: Push `Frame` with current `builder.watermark()`.
+1. **Enter**: Push `Frame` with current `effect_stream.watermark()`.
 2. **Pre-Effects**: Emit `pre_effects` as `RuntimeEffect::Op(...)`.
 3. **Match**: Validate node kind/fields. If fail, rollback to watermark and abort.
 4. **Capture**: Emit `RuntimeEffect::CaptureNode(matched_node)` — implicit, not from graph.
@@ -420,14 +549,9 @@ Two mechanisms work together (same for both execution modes):
 
 1. **Cursor checkpoint**: `cursor.descendant_index()` returns a `usize` position; `cursor.goto_descendant(pos)` restores it. O(1) save, O(depth) restore, no allocation.
 
-2. **Effect watermark**: `builder.watermark()` before attempting a branch; `builder.rollback(watermark)` on failure.
-
-Both execution modes save state before attempting branches:
+2. **Effect watermark**: `effect_stream.watermark()` before attempting a branch; `effect_stream.rollback(watermark)` on failure.
 
-- **Cursor checkpoint**: Current position in the AST (cheap to save, O(depth) to restore)
-- **Builder watermark**: Current effect count (O(1) save and restore)
-
-The pattern is: attempt first branch, and on failure, restore both cursor and effects to their saved states before trying the next branch. This ensures each alternative starts from the same clean state.
+Both execution modes follow the same pattern: save state before attempting a branch; on failure, restore both cursor and effects before trying the next branch. This ensures each alternative starts from the same clean state.
 
 ```
 
@@ -562,7 +686,7 @@ In proc-macro mode, each definition becomes a Rust function. References become d
 
 **Dynamic**: The interpreter maintains an explicit return stack. On `Enter(ref_id)`:
 
-1. Push frame with `ref_id`, cursor checkpoint, builder watermark
+1. Push frame with `ref_id`, cursor checkpoint, effect stream watermark
 2. Follow `next` into the definition body
 
 On `Exit(ref_id)`:
@@ -571,7 +695,7 @@ On `Exit(ref_id)`:
 2. Pop frame
 3. Continue to `next` successors unconditionally
 
-**Entry filtering mechanism**: The filtering happens when _entering_ an `Exit` transition, not when leaving it. After epsilon elimination, multiple `Exit` transitions with different `RefId`s may be reachable from the same point (merged from different call sites). The interpreter only takes an `Exit(ref_id)` transition if `ref_id` matches the current stack top. This ensures returns go to the correct call site.
+**Entry filtering mechanism**: After epsilon elimination, multiple `Exit` transitions with different `RefId`s may be reachable from the same point (merged from different call sites). The interpreter only takes an `Exit(ref_id)` transition if `ref_id` matches the current stack top. This ensures returns go to the correct call site.
 
 After taking an `Exit` and popping the frame, successors are followed unconditionally—they represent the continuation after the call. If a successor has an `Enter` marker, that's a _new_ call (e.g., `(A) (B)` where returning from A continues to calling B), not a return path.
 
@@ -580,7 +704,7 @@ After taking an `Exit` and popping the frame, successors are followed unconditio
 struct Frame {
     ref_id: RefId,                  // which call site we're inside
     cursor_checkpoint: usize,       // cursor position before call
-    builder_watermark: usize,       // effect count before call
+    effect_stream_watermark: usize, // effect count before call
 }
 
 /// Runtime query executor
@@ -588,7 +712,7 @@ struct Interpreter<'a> {
     graph: &'a TransitionGraph,
     return_stack: Vec<Frame>,       // call stack for definition references
     cursor: TreeCursor<'a>,         // current position in AST
-    builder: Builder,               // effect accumulator
+    effect_stream: EffectStream<'a>, // effect accumulator
 }
 ```
 
@@ -693,17 +817,17 @@ Trade-off: More flexible (runtime query construction), but slower than generated
 
 ## Execution Mode Comparison
 
-| Aspect            | Proc Macro                 | Dynamic                      |
-| ----------------- | -------------------------- | ---------------------------- |
-| Query source      | Compile-time literal       | Runtime string               |
-| Graph lifetime    | Compile-time only          | Runtime                      |
-| Data construction | Direct (no effect stream)  | `RuntimeEffect` stream + exe |
-| Definition calls  | Rust function calls        | Explicit return stack        |
-| Return stack      | Rust call stack            | `Vec<Frame>`                 |
-| Backtracking      | Drop + re-alloc            | `truncate()` effects         |
-| Performance       | Zero dispatch, single pass | Interpretation + 2 pass      |
-| Type safety       | Compile-time checked       | Runtime types                |
-| Use case          | Known queries              | User-provided queries        |
+| Aspect            | Proc Macro                 | Dynamic                       |
+| ----------------- | -------------------------- | ----------------------------- |
+| Query source      | Compile-time literal       | Runtime string                |
+| Graph lifetime    | Compile-time only          | Runtime                       |
+| Data construction | Direct (no effect stream)  | `RuntimeEffect` stream + exec |
+| Definition calls  | Rust function calls        | Explicit return stack         |
+| Return stack      | Rust call stack            | `Vec<Frame>`                  |
+| Backtracking      | Drop + re-alloc            | `truncate()` effects          |
+| Performance       | Zero dispatch, single pass | Interpretation + 2 pass       |
+| Type safety       | Compile-time checked       | Runtime types                 |
+| Use case          | Known queries              | User-provided queries         |
 
 ## Consequences
 
@@ -740,11 +864,38 @@ These mechanisms deserve their own ADR (fuel budget design, configurable limits,
 
 The IR design is WASM-compatible:
 
-- **Single arena allocation**: No fragmentation concerns in linear memory
-- **`usize` offsets**: 32-bit on WASM32, limiting arena to 4GB (sufficient for any query)
-- **`BTreeMap` for objects**: Deterministic iteration order ensures reproducible output
-- **Per-type alignment**: Segment offsets computed at construction time, respecting target alignment requirements
-- **No platform-specific primitives**: All types are portable (`u16`, `u32`, `Box<[u8]>`)
+- **Single arena allocation**: No fragmentation concerns in linear memory. Note: WASM linear memory grows in 64KB pages; the arena coexists with other allocations (e.g., tree-sitter's memory) but this is standard for any WASM allocation.
+- **Explicit alignment**: Arena allocated with `std::alloc::Layout`, segment offsets computed with `align_up()`. Prevents misaligned access traps on WASM and strict ARM.
+- **`u32` offsets**: All segment offsets are `u32`, matching WASM32's pointer size. 4GB arena limit is sufficient for any query.
+- **`BTreeMap` for objects**: Deterministic iteration order ensures reproducible output across platforms.
+- **Fixed-size Entrypoints**: The `Entrypoint` struct (12 bytes, align 4) avoids variable-length inline strings that would cause alignment hazards.
+- **No platform-specific primitives**: All types are portable (`u16`, `u32`, byte arrays).
+- **Allocator Independence**: Uses `std::alloc::alloc` via `Layout`. On `wasm32-unknown-unknown`, this defaults to the system allocator. Implementers targeting other environments (e.g., Emscripten) must ensure a global allocator is configured.
+
+#### Serialization Format
+
+The arena uses a simple binary format for caching compiled queries to disk. The current scope is limited to same-machine, same-version usage (e.g., caching a compiled query between CLI invocations). Cross-architecture portability and version migration are explicitly out of scope for this ADR and will be addressed in future work if needed.
+
+- **Validation**: The `magic` bytes must be `b"PLNK"`. The `version` field must match the exact compiler version AND platform ABI hash (pointer width + endianness). Any mismatch invalidates the cache.
+- **Byte order**: Native (little-endian on x86/ARM/WASM). No byte-swapping is performed.
+- **String encoding**: UTF-8 for all string data (entrypoint names, data field names, variant tags).
+- **Layout**: Header followed by raw arena bytes:
+
+```
+Header (16 bytes):
+  magic: [u8; 4]           // "PLNK"
+  version: u32             // format version (must match exactly)
+  arena_len: u32           // byte length of arena data
+  segment_count: u32       // number of segment offset entries
+
+Segment Offsets (segment_count × 4 bytes):
+  [u32; segment_count]     // successors_offset, effects_offset, ...
+
+Arena Data (arena_len bytes):
+  [u8; arena_len]          // raw arena bytes, used directly without fixup
+```
+
+**Loading**: The loader verifies magic, version, and arena length. If any mismatch occurs, the cache is invalidated and the query is recompiled. No byte-swapping or layout fixup is performed—mismatched architectures simply trigger recompilation.
 
 ### Considered Alternatives
 
@@ -763,6 +914,9 @@ The IR design is WASM-compatible:
 5. **Vectorized Reference Markers (`Vec<RefTransition>`)**
    - Rejected: Optimized for alias chains (e.g. `A = B`, `B = C`) to allow full epsilon elimination. However, this bloats the `Transition` struct for all other cases. Standard epsilon elimination is sufficient; traversing a few remaining epsilon transitions for aliases is cheaper than increasing memory pressure on the whole graph.
 
+6. **Portable binary format**
+   - Deferred: Cross-architecture serialization would require byte-swapping and layout fixups. Current scope is same-machine caching only; portability can be added later if needed.
+
 ## References
 
 - Bazaco, D. (2022). "Building a Regex Engine" blog series. https://www.abstractsyntaxseed.com/blog/regex-engine/introduction — NFA construction and modern regex features