diff --git a/docs/dsl-examples-echo.md b/docs/dsl-examples-echo.md new file mode 100644 index 0000000..2a8f676 --- /dev/null +++ b/docs/dsl-examples-echo.md @@ -0,0 +1,114 @@ +# Maroon DSL — Echo and Migration Examples (Draft) + +These examples illustrate how the DSL in `docs/dsl-language.md` maps to fibers, queues, `select`, and schema migration using a minimal Echo fiber. Syntax here follows the conceptual spec (fibers, `state current/next`, `send`/`recv`, `after(ms)`, `await`, `select`, `external`) rather than the current parser status. + +## Example 1 — Minimal Echo Fiber + +Purpose: show `recv`/`send` and per-fiber state. Deterministic: no wall clock, pure counter. + +```dsl +// Outbound payload for echoes +struct EchoOut { + echo_of: String, + count: U64, + fiber: String, +} + +// Explicit constructor parameters: fiber identity + typed directional queues +fiber Echo(name: String, in_queue_first: RecvQueue, in_queue_second: RecvQueue, out_queue: SendQueue) { + state current { + seen: U64, + } + + // Pure constructor: runs once on fiber creation + fn init { + self.seen = 0; + } + + // Main loop: illustrate select across multiple inbound queues + fn main() { + loop { + select { + // Arm 1: receive from the first queue (RecvQueue) + let msg: String = self.in_queue_first.await => { + let n = self.seen + 1; + self.seen = n; + self.out_queue.send(EchoOut { echo_of: msg + " first", count: n, fiber: self.name }); + } + + // Arm 2: receive from the second queue (RecvQueue) + let msg: String = self.in_queue_second.await => { + let n = self.seen + 1; + self.seen = n; + self.out_queue.send(EchoOut { echo_of: msg + " second", count: n, fiber: self.name }); + } + } + } + } +} +``` + +Ingress/egress queues (illustrative bindings): +- In 1: `queue("echo.in.first.")` -> `in_queue_first: RecvQueue` +- In 2: `queue("echo.in.second.")` -> `in_queue_second: RecvQueue` +- Out: `queue("echo.out.")` -> `out_queue: SendQueue` + +Note: The runtime’s underlying channel is duplex, but 
the DSL encourages directional capability types for clarity and static safety. Use `DuplexQueue` only when both directions are truly needed (e.g., brokers/tests), or split a duplex handle into `(RecvQueue, SendQueue)` for explicit usage. + +## Example 2 — Two-Version Migration: current -> next + +Goal: demonstrate an explicit state migration that both renames a field and adds a new one. + +Current state (from the example above): + +```dsl +state current { + seen: U64, +} +``` + +Desired next state (phased rename): +- Add `last_input: Option<String>` with an explicit default +- Introduce `count: U64` alongside existing `seen` to allow code to switch safely + +```dsl +migrate current -> next { + // `from.` is the old snapshot; `self.` is the new state + self.last_input = None; // explicit default + self.count = from.seen; // first step of renaming field +} + +state next { + seen: U64, + count: U64, + last_input: Option<String>, +} +``` + +- Deploy with `state next` present: migration runs to completion in the background. Before migration finishes, code continues to use the `current` shape for reads/writes. +- After migration, update the main logic to use `self.count` instead of `self.seen`, and optionally track the last input when receiving a message: + +```dsl +// inside the first select arm +let n = self.count + 1; +self.count = n; +self.last_input = Some(msg); +self.out_queue.send(EchoOut { echo_of: msg + " first", count: n, fiber: self.name }); +``` + +Finalize/promotion: after rolling out the code that uses `self.count`, promote `next` -> `current` by removing the old `current` and the `migrate` block, leaving only the new `current`: + +```dsl +state current { + count: U64, + last_input: Option<String>, +} +``` + +After promotion, any reference to the removed `seen` field is a compile error. + +Notes on migration: +- Migrations are pure and must not perform I/O or waits; no `await`/`select`/`external` inside `migrate`. +- Initialize every newly introduced or type-changed field explicitly; unchanged fields such as `seen` carry forward. 
Uninitialized fields are a compile error. +- Reading old fields is done via `from.`; writing new fields via `self.`. +- If a key type changes inside `Map`/`Set`, ensure the transformation preserves canonical ordering. diff --git a/docs/dsl-language.md b/docs/dsl-language.md new file mode 100644 index 0000000..598bfe6 --- /dev/null +++ b/docs/dsl-language.md @@ -0,0 +1,183 @@ +# Maroon DSL — Goals, Non‑Goals, and Core Spec (Draft) + +This doc describes a small, purpose-built language for Maroon. Code in this DSL compiles to Maroon IR (our “assembler”) and runs on the runtime. The goal is simple: you write business logic; the platform guarantees durable compute through deterministic execution on several nodes. + +## Why a DSL +- Deterministic by default: no hidden time/RNG (random number generator)/float surprises, and a stable iteration order. +- Fits Maroon’s model: fibers, queues, futures, and timers are native concepts with deterministic scheduling. +- Safe to replay: re-running the same history yields the same state and outputs. +- Easy to check: we can warn/error on unbounded loops, impure code in `pure` functions, and risky waits. +- Smooth upgrades: state has versions and migrations, so rolling upgrades don’t corrupt data. +- Concurrency model: v1 runs under a single logical total order for simplicity and replayability; future versions may relax this with explicit primitives (compare-and-swap, CRDTs) where commutativity makes it safe, without sacrificing deterministic outcomes. + +## Out of Scope (on purpose) +- No arbitrary threads/syscalls/FFI (foreign function interface). +- No host-specific behavior (wall-clock reads, RNG, hash-map iteration order, IEEE float edge cases). +- No DIY persistence: only schema-defined types go to storage; the runtime handles format and migrations. +- No implicit I/O: external calls must be declared with types, timeouts, and retry policies. 
+- No relaxed consistency primitives (CAS/CRDTs) in v1: all effects are sequenced by the single total order. + +## How Code Runs +- Unit of work: a lightweight [fiber](./fiber.md). +- Communication: named FIFO queues with directional capability types (`RecvQueue`, `SendQueue`, optional `DuplexQueue` for both directions). +- Time: logical monotonic ms via timers (`after(ms)`), not wall-clock. + +### Source Organization (design note) +- Declaration order and file/dir layout are not part of semantics; the compiler builds a single module graph from all inputs. +- Interfaces (ingress/egress, queues, external gateways) are declared in the DSL and compiled together with the code that uses them. + +### Concurrency and Ordering (design note) +- v1 uses a single logical total order of events/effects, which makes execution, replay, and debugging straightforward. +- Over time, we may introduce opt-in, scoped primitives that allow concurrency without global coordination: + - Compare-and-swap (CAS) on targeted state fields. + - CRDTs (conflict-free replicated data types) for commutative/associative updates (e.g., counters, OR-sets) with deterministic merge semantics. +- Any relaxation will remain compatible with deterministic replay by using canonical encodings, explicit merge rules, and well-defined failure/retry behavior. + +## External Effects +- Three kinds: `pure` (no effects), `timer` (logical), `external(service)` (declared capability). +- External call must declare: request/response types, idempotency key, timeout, retry/backoff; optional compensation. +- Observability (logs/metrics/traces) does not change behavior/order. + +## Numbers and Data +- Integers: `I64`, `U64` (optionally `I128` later). Overflow behavior is explicit: checked (default), saturating, or wrapping. +- Decimals: fixed-point `Decimal{scale}` (no floats/NaNs/inf). Rounding mode is explicit and stable. +- Text/bytes: `String` (UTF‑8) and `Bytes`. 
+- Collections: `Vec` (stable order), `Map` and `Set` with keys ordered by their byte encoding. A hash‑map (`HashMap`) may be provided for performance, but all observable operations are deterministic (e.g., iteration defined as canonical key order); non‑deterministic iteration is not exposed. +- Types: `struct`, `enum`, and `type` aliases with explicit field/variant order. + +## Canonical Encoding (why ordering is stable) +- Every value has one byte representation (platform-independent, invertible). We sort map/set keys by these bytes. +- Examples: big-endian integers; `Decimal{2}(1)` and `1.00` encode the same; strings use UTF‑8 NFC normalization; no “-0”. + +## Fiber State and Persistence +- No global app state. Each fiber owns its own persistent state, defined inside that fiber. +- Define per‑fiber state in the DSL: within a `fiber` block, declare `state current { ... }`. Optionally, during upgrades, also declare `state next { ... }`. Only these schema types are persisted for that fiber. +- Creation initializes `current` deterministically. Upgrades use a two‑version migration: `migrate current -> next { ... }` with at most two states present at any time. +- All writes happen within the owning fiber, driven by messages/timers. Other fibers cannot mutate this state; they must send messages. +- Reads see the fiber’s deterministic view for the current step. The runtime snapshots each fiber’s state in canonical form and replays that fiber’s message stream to recover. + +### State access inside a fiber +- Access persistent fields with `self.` for both reads and writes (e.g., `let n = self.count + 1; self.count = n`). +- Local variables use bare identifiers (e.g., `let n = 0;`). Shadowing state field names is not allowed. +- Initialize state via field defaults, a pure initializer `fn init { ... }`, or during `migrate` steps; avoid relying on implicit, unspecified defaults. 
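
The byte-ordering rule from the Canonical Encoding section can be pictured with a small Rust sketch. It is only an illustration, not Maroon's actual codec: `encode_u64` and `canonical_key_order` are hypothetical names, and the sketch covers only `U64` keys. It shows that sorting by big-endian bytes agrees with numeric order, which is what makes map/set iteration stable across platforms.

```rust
use std::collections::BTreeMap;

/// Hypothetical helper: encode a U64 key as 8 big-endian bytes.
/// (Assumption for illustration; the real encoding is not finalized.)
fn encode_u64(key: u64) -> [u8; 8] {
    key.to_be_bytes()
}

/// Inserts keys in arbitrary order into a map sorted by encoded bytes and
/// returns them in the order that map iterates.
fn canonical_key_order(keys: &[u64]) -> Vec<u64> {
    let mut by_bytes: BTreeMap<[u8; 8], u64> = BTreeMap::new();
    for &k in keys {
        by_bytes.insert(encode_u64(k), k);
    }
    // BTreeMap iterates in ascending key order, i.e. lexicographic byte order.
    by_bytes.into_values().collect()
}
```

Calling `canonical_key_order(&[300, 2, 256, 1])` yields the keys in numeric order regardless of insertion order. A little-endian encoding would not have this property (the bytes of 256 sort before the bytes of 1), which is one reason a canonical encoding would favor big-endian integers.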
+ +### Initialization +- Purpose: set up newly created fibers deterministically before handling any messages. +- Syntax: place a `fn init { /* pure */ }` inside the `fiber` block. +- Semantics: + - Runs exactly once, only on creation of a new fiber instance, after constructor params are bound and before the first activation of `main`/handlers. + - Pure only: no `await`, `select`, `send`, or `external` inside `init`. + - Can read constructor params (e.g., `self.name`) and assign to state fields. + - Must leave the base state version fully initialized, either via field defaults or explicit assignments. +- Migrations and init: + - Existing fibers never run `init` during upgrades; they transition via `migrate current -> next` only. + - Creation when `state next` exists: + - Construct `state current` using its field defaults and `init`. + - Apply the migration `current -> next`. + - Only after migration completes is the fiber considered created; `main`/handlers may run thereafter. + - `init` executes against the `current` shape; fields introduced in `next` must be initialized in `migrate current -> next`. + - For effectful bootstrapping at creation, use a bootstrap message pattern; keep `init` pure. + +### Constructor parameters +- Fiber constructor parameters (identity, handles like queues, config) are read-only fields of the fiber instance. +- Access them as `self.` inside the fiber (e.g., `self.name`, `self.inbox_queue`). +- Parameters cannot be reassigned; locals remain bare identifiers. Shadowing parameter names is not allowed. +- In `select`, the sugar `self.queue.await` is valid when `self.queue` is a `RecvQueue<_>` (or `DuplexQueue<_>`, though using `RecvQueue` is preferred) and desugars to `await recv(self.queue)`. + +### State migrations (two‑version model) +- Syntax: `migrate current -> next { /* transforms */ }` placed inside the `fiber` block before `state next`. +- Scope: + - `from.`: read-only view of the previous `current` state snapshot. 
+ - `self.`: the `next` state you must initialize. +- Rules: + - Explicit init: every newly introduced or type-changed field in `state next` must be assigned; unchanged fields carry forward implicitly. + - Determinism: transformations must be deterministic and terminate. + - Type changes: allowed if you provide an explicit transform; otherwise keep the same type. + - Collections: when changing key types in `Map`/`Set`, ensure canonical encoding order is preserved by re-encoding keys. + +Design note (v1 scope): migrations are defined per‑fiber using a two‑version model (`current` and `next`). Alternative models (e.g., per‑type/era migrations applied across instances) are under consideration and may be adopted if they provide better ergonomics without sacrificing determinism. + +Example (direct rename + add default): + ```dsl + migrate current -> next { + self.count = from.seen; + self.last_input = None; + } + + state next { count: U64, last_input: Option<String> } + ``` + +#### Finalize/Promotion +- After migration completes and code uses the `next` fields, promote `next` -> `current` by removing the old `current` and the `migrate` block, leaving only `state current { ... }`. +- At any time, there must be at most two states present (`current` and optionally `next`). + +### Interface Schema Versioning (queues/gateways) +- Priority: define versioning for fiber interfaces (cross‑fiber queues and external gateway APIs) so producers/consumers can upgrade safely. +- Compatibility: interface types evolve via eras/versions with explicit upgrade rules; mixed‑version communication must be either rejected or mediated via canonical transforms. +- Storage vs. interface: internal fiber state shape is important, but interface compatibility governs safe rolling deploys and should be specified first. + +## Transactions and IDs +- Each transaction has a unique `idempotency ID` (assigned by the gateway layer). +- Idempotency by design: re-applying a transaction yields the same result. 
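
One way to picture "idempotency by design" is a ledger that remembers the result recorded for each idempotency ID. The Rust sketch below is only an illustration under stated assumptions: `Ledger` and `apply_deposit` are hypothetical names, the ID is modeled as a plain `u64`, and nothing here reflects how the Maroon runtime actually stores results.

```rust
use std::collections::BTreeMap;

/// Hypothetical per-fiber ledger: remembers the result produced for each
/// idempotency ID so a replayed transaction cannot change state twice.
#[derive(Default)]
struct Ledger {
    balance: u64,
    /// idempotency ID -> balance recorded when that transaction first ran
    results: BTreeMap<u64, u64>,
}

impl Ledger {
    /// Applies a deposit at most once per idempotency ID.
    /// Returns the resulting balance, or `None` if the addition overflows.
    fn apply_deposit(&mut self, idempotency_id: u64, amount: u64) -> Option<u64> {
        if let Some(&previous) = self.results.get(&idempotency_id) {
            // Replay: return the recorded result without a new effect.
            return Some(previous);
        }
        // Checked arithmetic, matching the spec's default overflow mode.
        let new_balance = self.balance.checked_add(amount)?;
        self.balance = new_balance;
        self.results.insert(idempotency_id, new_balance);
        Some(new_balance)
    }
}
```

Applying the same ID twice returns the same value and leaves `balance` unchanged after the first call, while a different ID advances the state as usual.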
+ +## Resource Limits +- We track CPU/memory/I/O “cost”. +- Per-transaction and per-fiber limits apply with prioritization classes; the system avoids starving main business logic. Hitting a limit triggers backpressure on lower‑priority work first. + +## Build Pipeline +- DSL -> typed AST -> Maroon IR -> generated code that the runtime executes. +- You’ll see types like `Value`, `CreatePrimitiveValue`, `SetPrimitiveValue`, `SelectArm`, `State` in the generated layer. +- Runtime input/output: `Input = (LogicalTimeAbsoluteMs, Vec)`, results are `(UniqueU64BlobId, Value)`. + +## Static Checks (examples) +- `pure` functions can’t call effects or timers. +- `Map`/`Set` keys must be orderable (canonical encoding available). +- `select` cases must be cancellable or time-bounded; if a waitable is non‑cancellable it must be a single‑step arm. +- Reads/writes are validated against the active schema version during upgrades. + +## Core Building Blocks (maps 1:1 to runtime) +- Values: `Unit | Bool | I64 | U64 | Decimal{s} | Bytes | String | Vec | Map | Set | struct | enum`. +- Queues: named FIFO channels of `Value`, with directional capabilities (`RecvQueue`, `SendQueue`, and `DuplexQueue` when both are required). +- Futures/Timers: one-shot futures; `after(ms)` creates a timer; `await` resolves. +- Fibers: define a fiber type with parameters (identity) and its private `state` and handlers. + +### Queue Capability Types and API +- Types: + - `RecvQueue<T>`: receive-only capability for a named channel of `T`. + - `SendQueue<T>`: send-only capability for a named channel of `T`. + - `DuplexQueue<T>`: full capability (both send and receive). Prefer directional types for clarity; reserve `DuplexQueue<T>` for cases that truly need both directions. +- API: + - Send: `queue.send(value)` where `queue: SendQueue<T> | DuplexQueue<T>` and `value: T`. + - Receive (canonical): `recv(queue) -> Future<T>` where `queue: RecvQueue<T> | DuplexQueue<T>`. + - Await (outside select): `let v: T = await recv(queue)`. 
+ - Await sugar (inside select): `queue.await` is shorthand for `await recv(queue)` and requires `queue: RecvQueue<T> | DuplexQueue<T>`. +- Conversions/helpers (conceptual): + - `split(q: DuplexQueue<T>) -> (RecvQueue<T>, SendQueue<T>)`. + - `join(rx: RecvQueue<T>, tx: SendQueue<T>) -> DuplexQueue<T>` if they reference the same named channel. +- Alias (optional for ergonomics/back-compat in docs): `type Queue<T> = DuplexQueue<T>`. + +### Select Syntax and Waitables +- Waitables: any `Future<T>` can be selected: `recv(queue)`, `after(ms)`, `external(...)`, etc. +- Canonical arms: + - `case await recv(queue) as v: T => { ... }` + - `case await after(ms) => { ... } // T = Unit` +- Concise let-binding arms (sugar, only in `select`): + - `let v: T = queue.await => { ... } // requires RecvQueue<_> (or DuplexQueue<_>); desugars to case await recv(queue)` + - `let _: Unit = after(ms).await => { ... } // desugars to case await after(ms)` +- Semantics: first-ready arm runs; other pending waitables are cancelled deterministically. +### Rust Interop (design note) +- To remain Rust‑friendly, we expose `await`/`select` as macros (e.g., `await!`, `maroon_await!`) when embedding; the core primitive is `select` and `await` is its single‑arm form. + +## Open Questions +- Exact syntax vs. minimal IR friendliness. +- Standard library scope (safe math, codecs, collection helpers). +- Effect capabilities and per-call metering. +- Formalization scope (how much of the core we specify/prove). + +## Next Steps +- Compile a small example to today’s IR and run it on the runtime. +- Finalize byte encodings for all base types and composite keys. +- Implement basic static checks (bounded loops, determinism guards, purity) in the DSL frontend. + +## Examples +- See `docs/dsl-examples-echo.md` for small, focused examples (fibers, queues, select sugar, and schema migration).
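
For readers who think in Rust, the two-version migration from the Echo example can be modeled as a pure function from the old snapshot to the new state. This is a sketch under assumptions, not generated Maroon code: `StateCurrent`, `StateNext`, and `migrate` are hypothetical names, and the `String` payload of `last_input` is inferred from the Echo example's message type.

```rust
/// Mirrors `state current { seen: U64 }` from the Echo example.
#[derive(Debug, Clone, PartialEq)]
struct StateCurrent {
    seen: u64,
}

/// Mirrors the promoted shape: `count` replaces `seen`, `last_input` is new.
/// (The `String` payload is an assumption based on the Echo message type.)
#[derive(Debug, Clone, PartialEq)]
struct StateNext {
    count: u64,
    last_input: Option<String>,
}

/// Pure migration: reads only the old snapshot and assigns every new field.
/// No I/O, no waiting, so replaying it always yields the same state.
fn migrate(from: &StateCurrent) -> StateNext {
    StateNext {
        count: from.seen, // rename: seen -> count
        last_input: None, // explicit default for the new field
    }
}
```

Because `migrate` takes the old state by shared reference and returns a fresh value, it cannot mutate the snapshot it reads from, which loosely corresponds to the read-only `from.` view in the DSL.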