9 changes: 8 additions & 1 deletion CHANGELOG.md
@@ -7,7 +7,14 @@ since = "849762245925cce325c04da1d604088370ec3723"

## Unreleased (v0.8.4)

- TBD
- feat(gambit): add `createDefaultedRuntime` and defaulted `runDeck` wrapper
with CLI-equivalent provider/model routing for library callers
- refactor(gambit): route CLI runtime/provider setup through shared
`default_runtime` construction path
- feat(demo-runner): migrate demo test-deck prompt generation to Gambit default
runtime wrapper (no hardwired OpenRouter provider)
- docs(gambit): add migration guidance for `runDeck` wrapper and `runDeckCore`
replacement mapping

## v0.8.3

78 changes: 74 additions & 4 deletions README.md
@@ -100,10 +100,10 @@ Drop into a REPL (streams by default):
npx @bolt-foundry/gambit repl <deck>
```

Run a persona against a root deck (test bot):
Run a persona against a root deck (scenario):

```
npx @bolt-foundry/gambit test-bot <root-deck> --test-deck <persona-deck>
npx @bolt-foundry/gambit scenario <root-deck> --test-deck <persona-deck>
```

Grade a saved session:
@@ -124,6 +124,23 @@ Tracing and state:
`--verbose` to print events\
`--state <file>` to persist a session.

### Worker sandbox defaults

- Deck-executing CLI surfaces default to worker sandbox execution.
- Use `--no-worker-sandbox` (or `--legacy-exec`) to force legacy in-process
execution.
- `--worker-sandbox` explicitly forces worker execution on.
- `--sandbox` / `--no-sandbox` are deprecated aliases.
- `gambit.toml` equivalent:
```toml
[execution]
worker_sandbox = false # same as --no-worker-sandbox
# legacy_exec = true # equivalent rollback toggle
```

The npm launcher (`npx @bolt-foundry/gambit ...`) runs the Gambit CLI binary for
your platform, so these defaults and flags apply there as well.
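
The flag semantics listed above can be sketched as a small resolver. This is an illustration only, not Gambit's actual parser: `resolveWorkerSandbox` and `ExecutionConfig` are hypothetical names, and both "CLI flags override `gambit.toml`" and "the last flag on the command line wins" are assumptions that the text above does not confirm.

```ts
// Hypothetical sketch of the documented worker sandbox flags.
// Not Gambit's real implementation; precedence rules are assumptions.
type ExecutionConfig = { worker_sandbox?: boolean; legacy_exec?: boolean };

const FLAG_VALUES = new Map<string, boolean>([
  ["--worker-sandbox", true],
  ["--no-worker-sandbox", false],
  ["--legacy-exec", false], // documented alias for --no-worker-sandbox
  ["--sandbox", true], // deprecated alias for --worker-sandbox
  ["--no-sandbox", false], // deprecated alias for --no-worker-sandbox
]);

function resolveWorkerSandbox(
  argv: string[],
  config: ExecutionConfig = {},
): boolean {
  // Documented default: deck-executing surfaces use the worker sandbox.
  let enabled = true;
  // [execution] values from gambit.toml (assumed to apply before CLI flags).
  if (config.legacy_exec === true) enabled = false;
  if (config.worker_sandbox !== undefined) enabled = config.worker_sandbox;
  // CLI flags override config; a later flag overrides an earlier one (assumed).
  for (const arg of argv) {
    const value = FLAG_VALUES.get(arg);
    if (value !== undefined) enabled = value;
  }
  return enabled;
}
```

Under these assumptions, `resolveWorkerSandbox(["--legacy-exec"])` yields `false`, while an empty argument list keeps the default of `true`.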

## Using the Simulator

The simulator is the local Debug UI that streams runs and renders traces.
@@ -173,6 +190,59 @@ Define `contextSchema`/`responseSchema` with Zod to validate IO, and implement\
`ctx.spawnAndWait({ path, input })`. Emit structured trace events with\
`ctx.log(...)`.

### Runtime defaults for programmatic `runDeck`

`runDeck` from `@bolt-foundry/gambit` now uses CLI-equivalent provider/model
defaults (alias expansion, provider routing, fallback behavior).

Before (direct-provider setup in each caller):

```ts
import { createOpenRouterProvider, runDeck } from "jsr:@bolt-foundry/gambit";

const provider = createOpenRouterProvider({
  apiKey: Deno.env.get("OPENROUTER_API_KEY")!,
});
await runDeck({
  path: "./root.deck.md",
  input: { message: "hi" },
  modelProvider: provider,
});
```

After (defaulted wrapper):

```ts
import { runDeck } from "jsr:@bolt-foundry/gambit";

await runDeck({
  path: "./root.deck.md",
  input: { message: "hi" },
});
```

Per-runtime override (shared runtime object):

```ts
import { createDefaultedRuntime, runDeck } from "jsr:@bolt-foundry/gambit";

const runtime = await createDefaultedRuntime({
  fallbackProvider: "codex-cli",
});

await runDeck({
  runtime,
  path: "./root.deck.md",
  input: { message: "hi" },
});
```

Replacement mapping:

- Legacy direct core passthrough export: `runDeck` -> `runDeckCore`
- Defaulted wrapper export: `runDeck`
- Runtime builder: `createDefaultedRuntime`

---

## Author your first deck
@@ -271,8 +341,8 @@ npx @bolt-foundry/gambit serve ./examples/respond_flow/decks/root.deck.ts --port
Then:

1. Open `http://localhost:8000/test`, pick the **Escalation persona**, and run
it. Leave the “Use test deck input for init” toggle on to see persona data
seed the init form automatically.
it. Leave the “Use scenario deck input for init” toggle on to see persona
data seed the init form automatically.
2. Switch to the Debug tab to inspect the session—the child deck emits a
`gambit_respond` payload that now shows up as a structured assistant turn.
3. Head to the Calibrate tab and run the **Respond payload grader** to exercise
2 changes: 1 addition & 1 deletion deno.jsonc
@@ -25,7 +25,7 @@
"bundle:sim:sourcemap": "deno run -A scripts/bundle_simulator_ui.ts --sourcemap=external",
"bundle:sim:web": "deno run -A scripts/bundle_simulator_ui.ts --platform=browser",
"bundle:sim:web:sourcemap": "deno run -A scripts/bundle_simulator_ui.ts --platform=browser --sourcemap=external",
"serve:bot": "mkdir -p /tmp/gambit-bot-root && GAMBIT_BOT_ROOT=/tmp/gambit-bot-root deno run -A src/cli.ts serve src/decks/gambit-bot/PROMPT.md --bundle --port 8000",
"serve:bot": "mkdir -p /tmp/gambit-bot-root && GAMBIT_SIMULATOR_BUILD_BOT_ROOT=/tmp/gambit-bot-root GAMBIT_BOT_ROOT=/tmp/gambit-bot-root deno run -A src/cli.ts serve src/decks/gambit-bot/PROMPT.md --bundle --port 8000",
"serve:bot:sandbox": "deno run -A scripts/serve_bot_sandbox.ts",
"build_npm": "deno run -A scripts/build_npm.ts"
},
2 changes: 1 addition & 1 deletion docs/external/concepts/runtime.md
@@ -31,7 +31,7 @@ safe/observable.
- `gambit_end`: enable with `![end](gambit://cards/end.card.md)` in Markdown (or
`allowEnd: true` in TypeScript decks). Calling it returns a sentinel
`{ __gambitEnd: true, payload?, status?, message?, code?, meta? }` so
CLI/test-bot loops stop reinjecting the closing assistant turn.
CLI/scenario loops stop reinjecting the closing assistant turn.
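
To illustrate how a driving loop might recognize that sentinel, here is a minimal sketch. The field names are taken from the shape shown above; the field types, the `isGambitEnd` guard, and the `countTurnsUntilEnd` loop are hypothetical and are not Gambit's actual API.

```ts
// Field names follow the documented sentinel; the types are assumptions,
// because the docs list only the names.
interface GambitEndSentinel {
  __gambitEnd: true;
  payload?: unknown;
  status?: unknown;
  message?: unknown;
  code?: unknown;
  meta?: unknown;
}

// Hypothetical type guard: true only for objects with `__gambitEnd === true`.
function isGambitEnd(value: unknown): value is GambitEndSentinel {
  return typeof value === "object" && value !== null &&
    (value as { __gambitEnd?: unknown }).__gambitEnd === true;
}

// Hypothetical loop: count turns and stop once a result is the sentinel,
// instead of reinjecting the closing assistant turn.
function countTurnsUntilEnd(results: unknown[]): number {
  let turns = 0;
  for (const result of results) {
    turns++;
    if (isGambitEnd(result)) break;
  }
  return turns;
}
```

The guard checks for strict `true`, so a payload that merely contains a truthy `__gambitEnd` string would not stop the loop.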

## State and turn order

22 changes: 11 additions & 11 deletions docs/external/guides/authoring.md
@@ -12,10 +12,10 @@ verification.
references (action/test/grader) and schema fragments into the parent deck.
- Action decks are child decks exposed as model tools. Names must match
`^[A-Za-z_][A-Za-z0-9_]*$` and avoid the `gambit_` prefix (reserved).
- Persona/test decks may accept free-form user turns. Use the `acceptsUserTurns`
flag to control this behavior: root decks default to `true`, while action
decks default to `false`. Set it explicitly to `true` for persona/bot decks or
to `false` for workflow-only decks.
- Persona/scenario decks may accept free-form user turns. Use the
`acceptsUserTurns` flag to control this behavior: root decks default to
`true`, while action decks default to `false`. Set it explicitly to `true` for
persona/bot decks or to `false` for workflow-only decks.
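
The naming rule for action decks can be checked with a short validator. The pattern and the reserved `gambit_` prefix come from the bullet above; the function name `validateActionDeckName` and its error messages are hypothetical, and Gambit's own validation may behave differently.

```ts
// Pattern and reserved prefix as stated in the authoring guide.
const ACTION_NAME_PATTERN = /^[A-Za-z_][A-Za-z0-9_]*$/;
const RESERVED_PREFIX = "gambit_";

// Hypothetical helper: returns null when the name is acceptable,
// otherwise a human-readable reason.
function validateActionDeckName(name: string): string | null {
  if (!ACTION_NAME_PATTERN.test(name)) {
    return `"${name}" must match ${ACTION_NAME_PATTERN.source}`;
  }
  if (name.startsWith(RESERVED_PREFIX)) {
    return `"${name}" uses the reserved "${RESERVED_PREFIX}" prefix`;
  }
  return null;
}
```

For example, `get_time` passes, while `gambit_end` and `9lives` are rejected.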

## Pick a format

@@ -77,7 +77,7 @@ migrate a repository, run:
deno run -A packages/gambit/scripts/migrate-schema-terms.ts <repo-root>
```

## Action decks, test decks, grader decks
## Action decks, scenario decks, grader decks

- Add action decks in front matter or TS definitions:
`actionDecks = [{ name = "get_time", path = "./get_time.deck.ts" }]`.
@@ -101,10 +101,10 @@ deno run -A packages/gambit/scripts/migrate-schema-terms.ts <repo-root>
should set `acceptsUserTurns = true` and may declare its own `contextSchema`
(for example `contextSchema = "../schemas/my_persona_test.zod.ts"`) so the
Test tab renders a schema-driven “Scenario” form for that persona.
- For persona/test decks, you can embed
- For persona/scenario decks, you can embed
`![generate-test-input](gambit://cards/generate-test-input.card.md)` to
include the Test Bot init-fill contract instructions.
- Test Bot init fill: when a Test Bot run is missing required init fields, the
include the scenario init-fill contract instructions.
- Scenario init fill: when a scenario run is missing required init fields, the
selected persona deck is asked to supply only the missing values before the
run begins. The persona receives a single user message containing a JSON
payload like:
@@ -133,8 +133,8 @@ deno run -A packages/gambit/scripts/migrate-schema-terms.ts <repo-root>
- Markdown roots default to `true`; TypeScript decks default to `false`
everywhere. Set it to `false` for any workflow deck that should never accept
user turns (regardless of how it's run).
- Persona/test decks should set `acceptsUserTurns = true` so they can receive
messages even when invoked as non-root bots.
- Persona/scenario decks should set `acceptsUserTurns = true` so they can
receive messages even when invoked as non-root bots.

## Synthetic tools and handlers

@@ -170,7 +170,7 @@ deno run -A packages/gambit/scripts/migrate-schema-terms.ts <repo-root>
http://localhost:8000/debug.
- Tracing: add `--verbose` for console traces or `--trace out.jsonl` to persist
events; use `--state state.json` with `run` to persist conversation state
between turns. When `--state` is omitted, test-bot/serve sessions default to
between turns. When `--state` is omitted, scenario/serve sessions default to
`<project-root>/.gambit/sessions/...` where each session includes `state.json`
(materialized snapshot) plus append-only `events.jsonl`, `feedback.jsonl`, and
`grading.jsonl` for downstream ingestion. The project root is the nearest
29 changes: 19 additions & 10 deletions docs/external/reference/cli.md
@@ -11,22 +11,22 @@ How to run Gambit, the agent harness framework, locally and observe runs.
- Command help: `deno run -A src/cli.ts help <command>` (or
`deno run -A src/cli.ts <command> -h`).
- Run once:
`deno run -A src/cli.ts run <deck> [--context <json|string>] [--message <json|string>] [--model <id>] [--model-force <id>] [--trace <file>] [--state <file>] [--stream] [--responses] [--verbose]`
`deno run -A src/cli.ts run <deck> [--context <json|string>] [--message <json|string>] [--model <id>] [--model-force <id>] [--trace <file>] [--state <file>] [--stream] [--responses] [--verbose] [--worker-sandbox|--no-worker-sandbox|--legacy-exec]`
- Check models: `deno run -A src/cli.ts check <deck>`
- REPL: `deno run -A src/cli.ts repl <deck>` (defaults to
`src/decks/gambit-assistant.deck.md` in a local checkout). Streams by default
and keeps state in memory for the session.
- Test bot (CLI):
`deno run -A src/cli.ts test-bot <root-deck> --test-deck <persona-deck> [--context <json|string>] [--bot-input <json|string>] [--message <json|string>] [--max-turns <n>] [--state <file>] [--grade <grader-deck> ...] [--trace <file>] [--responses] [--verbose]`
- Scenario (CLI):
`deno run -A src/cli.ts scenario <root-deck> --test-deck <persona-deck> [--context <json|string>] [--bot-input <json|string>] [--message <json|string>] [--max-turns <n>] [--state <file>] [--grade <grader-deck> ...] [--trace <file>] [--responses] [--verbose] [--worker-sandbox|--no-worker-sandbox|--legacy-exec]`
- Grade (CLI):
`deno run -A src/cli.ts grade <grader-deck> --state <file> [--model <id>] [--model-force <id>] [--trace <file>] [--responses] [--verbose]`
`deno run -A src/cli.ts grade <grader-deck> --state <file> [--model <id>] [--model-force <id>] [--trace <file>] [--responses] [--verbose] [--worker-sandbox|--no-worker-sandbox|--legacy-exec]`
- Export bundle (CLI):
`deno run -A src/cli.ts export [<deck>] --state <file> --out <bundle.tar.gz>`
- Debug UI: `deno run -A src/cli.ts serve <deck> --port 8000` then open
http://localhost:8000/. This serves a multi-page UI:

- Debug (default): `http://localhost:8000/debug`
- Test: `http://localhost:8000/test-bot`
- Test: `http://localhost:8000/test`
- Calibrate: `http://localhost:8000/calibrate`

The WebSocket server streams turns, traces, and status updates.
@@ -46,15 +46,24 @@ How to run Gambit, the agent harness framework, locally and observe runs.
- `GAMBIT_RESPONSES_MODE=1`: env alternative to `--responses` for runtime/state.
- `GAMBIT_OPENROUTER_RESPONSES=1`: route OpenRouter calls through the Responses
API (experimental; chat remains the default path).
- Worker execution defaults on for deck-executing surfaces. Use
`--no-worker-sandbox` (or `--legacy-exec`) to roll back to legacy in-process
execution. `--sandbox/--no-sandbox` still work as deprecated aliases.
- `gambit.toml` config equivalent:
```toml
[execution]
worker_sandbox = false # same as --no-worker-sandbox
# legacy_exec = true # equivalent rollback toggle
```

## State and tracing

- `--state <file>` (run/test-bot/grade/export): load/persist messages so you can
- `--state <file>` (run/scenario/grade/export): load/persist messages so you can
continue a conversation; skips `gambit_context` on resume. `grade` writes
`meta.gradingRuns` back into the session state, while `export` reads the state
file to build the bundle.
- `--out <file>` (export): bundle output path (tar.gz).
- `--grade <grader-deck>` (test-bot): can be repeated; graders run in the order
- `--grade <grader-deck>` (scenario): can be repeated; graders run in the order
provided and append results to `meta.gradingRuns` in the same session state
file.
- `--trace <file>` writes JSONL trace events; `--verbose` prints trace to
@@ -91,17 +100,17 @@ How to run Gambit, the agent harness framework, locally and observe runs.
`window.gambitFormatTrace` hook in the page; return a string or
`{role?, summary?, details?, depth?}` to override the entry that appears in
the Traces & Tools pane.
- The Test page reuses the same simulator runtime but drives persona/test-bot
- The Test page reuses the same simulator runtime but drives persona/scenario
decks so you can batch synthetic conversations, inspect per-turn scoring, and
export JSONL artifacts for later ingestion. List personas by declaring
`[[testDecks]]` entries in your root deck (for example
`gambit/examples/advanced/voice_front_desk/decks/root.deck.md`). Each entry’s
`path` should point to a persona deck (Markdown or TS) that includes
`acceptsUserTurns = true`; the persona deck’s own `contextSchema` and defaults
power the Scenario/Test Bot form (see
power the Scenario form (see
`gambit/examples/advanced/voice_front_desk/tests/new_patient_intake.deck.md`).
Editing those deck files is how you add/remove personas now—there is no
`.gambit/test-bot.md` override.
`.gambit/scenario.md` override.
- The Calibrate page is the regroup/diagnostics view for graders that run
against saved Debug/Test sessions; it currently serves as a placeholder until
the grading transport lands.
7 changes: 6 additions & 1 deletion docs/external/reference/cli/commands/bot.md
@@ -1,12 +1,17 @@
+++
command = "bot"
summary = "Run the Gambit bot assistant"
usage = "gambit bot [<dir>] [--bot-root <dir>] [--model <id>] [--model-force <id>] [--responses] [--verbose]"
usage = "gambit bot [<dir>] [--bot-root <dir>] [--model <id>] [--model-force <id>] [--responses] [--verbose] [--worker-sandbox|--no-worker-sandbox|--legacy-exec]"
flags = [
"--bot-root <dir> Allowed folder for bot file writes (defaults to workspace.decks if set; overrides <dir>)",
"--model <id> Default model id",
"--model-force <id> Override model id",
"--responses Run runtime/state in Responses mode",
"--worker-sandbox Force worker execution on",
"--no-worker-sandbox Force worker execution off",
"--legacy-exec Alias for --no-worker-sandbox",
"--sandbox Deprecated alias for --worker-sandbox",
"--no-sandbox Deprecated alias for --no-worker-sandbox",
"--verbose Print trace events to console",
]
+++
7 changes: 6 additions & 1 deletion docs/external/reference/cli/commands/grade.md
@@ -1,14 +1,19 @@
+++
command = "grade"
summary = "Grade a saved state file"
usage = "gambit grade <grader-deck.(ts|md)> --state <file> [--model <id>] [--model-force <id>] [--trace <file>] [--responses] [--verbose]"
usage = "gambit grade <grader-deck.(ts|md)> --state <file> [--model <id>] [--model-force <id>] [--trace <file>] [--responses] [--verbose] [--worker-sandbox|--no-worker-sandbox|--legacy-exec]"
flags = [
"--grader <path> Grader deck path (overrides positional)",
"--state <file> Load/persist state",
"--model <id> Default model id",
"--model-force <id> Override model id",
"--trace <file> Write trace events to file (JSONL)",
"--responses Run runtime/state in Responses mode",
"--worker-sandbox Force worker execution on",
"--no-worker-sandbox Force worker execution off",
"--legacy-exec Alias for --no-worker-sandbox",
"--sandbox Deprecated alias for --worker-sandbox",
"--no-sandbox Deprecated alias for --no-worker-sandbox",
"--verbose Print trace events to console",
]
+++
13 changes: 12 additions & 1 deletion docs/external/reference/cli/commands/repl.md
@@ -1,14 +1,25 @@
+++
command = "repl"
summary = "Start an interactive REPL"
usage = "gambit repl <deck.(ts|md)> [--context <json|string>] [--message <json|string>] [--model <id>] [--model-force <id>] [--responses] [--verbose]"
usage = "gambit repl <deck.(ts|md)> [--context <json|string>] [--message <json|string>] [--model <id>] [--model-force <id>] [--responses] [--verbose] [-A|--allow-all|--allow-<kind>] [--worker-sandbox|--no-worker-sandbox|--legacy-exec]"
flags = [
"--context <json|string> Context payload (seeds gambit_context; legacy --init still works)",
"--message <json|string> Initial user message (sent before assistant speaks)",
"--model <id> Default model id",
"--model-force <id> Override model id",
"--responses Run runtime/state in Responses mode",
"--verbose Print trace events to console",
"-A, --allow-all Allow all session permissions (read/write/run/net/env)",
"--allow-read[=<paths>] Session read override (all when value omitted)",
"--allow-write[=<paths>] Session write override (all when value omitted)",
"--allow-run[=<entries>] Session run override (all when value omitted)",
"--allow-net[=<hosts>] Session net override (all when value omitted)",
"--allow-env[=<names>] Session env override (all when value omitted)",
"--worker-sandbox Force worker execution on",
"--no-worker-sandbox Force worker execution off",
"--legacy-exec Alias for --no-worker-sandbox",
"--sandbox Deprecated alias for --worker-sandbox",
"--no-sandbox Deprecated alias for --no-worker-sandbox",
]
+++

13 changes: 12 additions & 1 deletion docs/external/reference/cli/commands/run.md
@@ -1,7 +1,7 @@
+++
command = "run"
summary = "Run a deck once"
usage = "gambit run [<deck.(ts|md)>] [--context <json|string>] [--message <json|string>] [--model <id>] [--model-force <id>] [--trace <file>] [--state <file>] [--stream] [--responses] [--verbose]"
usage = "gambit run [<deck.(ts|md)>] [--context <json|string>] [--message <json|string>] [--model <id>] [--model-force <id>] [--trace <file>] [--state <file>] [--stream] [--responses] [--verbose] [-A|--allow-all|--allow-<kind>] [--worker-sandbox|--no-worker-sandbox|--legacy-exec]"
flags = [
"--context <json|string> Context payload (seeds gambit_context; legacy --init still works)",
"--message <json|string> Initial user message (sent before assistant speaks)",
@@ -12,6 +12,17 @@ flags = [
"--stream Enable streaming responses",
"--responses Run runtime/state in Responses mode",
"--verbose Print trace events to console",
"-A, --allow-all Allow all session permissions (read/write/run/net/env)",
"--allow-read[=<paths>] Session read override (all when value omitted)",
"--allow-write[=<paths>] Session write override (all when value omitted)",
"--allow-run[=<entries>] Session run override (all when value omitted)",
"--allow-net[=<hosts>] Session net override (all when value omitted)",
"--allow-env[=<names>] Session env override (all when value omitted)",
"--worker-sandbox Force worker execution on",
"--no-worker-sandbox Force worker execution off",
"--legacy-exec Alias for --no-worker-sandbox",
"--sandbox Deprecated alias for --worker-sandbox",
"--no-sandbox Deprecated alias for --no-worker-sandbox",
]
+++
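
The session permission flags listed above (`-A`/`--allow-all` and `--allow-<kind>[=<value>]`, with "all when value omitted") could be interpreted roughly as follows. This is a sketch under stated assumptions, not Gambit's parser: `parseAllowFlags` and its types are hypothetical, and splitting values on commas is an assumption borrowed from Deno-style permission flags, since the flag descriptions do not name a delimiter.

```ts
type PermissionKind = "read" | "write" | "run" | "net" | "env";
// "all" means the flag was given without a value.
type PermissionOverride = "all" | string[];

const KINDS: PermissionKind[] = ["read", "write", "run", "net", "env"];

// Hypothetical parser for the documented session permission flags.
function parseAllowFlags(
  argv: string[],
): Partial<Record<PermissionKind, PermissionOverride>> {
  const overrides: Partial<Record<PermissionKind, PermissionOverride>> = {};
  for (const arg of argv) {
    if (arg === "-A" || arg === "--allow-all") {
      // Documented: allow all of read/write/run/net/env.
      for (const kind of KINDS) overrides[kind] = "all";
      continue;
    }
    const match = /^--allow-(read|write|run|net|env)(?:=(.*))?$/.exec(arg);
    if (!match) continue; // unrelated argument
    const kind = match[1] as PermissionKind;
    const value = match[2];
    // Value omitted means "all"; comma-separated entries are an assumption.
    overrides[kind] = value === undefined
      ? "all"
      : value.split(",").filter((entry) => entry.length > 0);
  }
  return overrides;
}
```

With these assumptions, `--allow-read` alone grants all reads, and `--allow-net=example.com,localhost` restricts network access to the two listed hosts.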
