lab becomes a trace-first product.
The primary object is not a tab, endpoint, or homepage. It is a run artifact: a trace showing what code ran, what capabilities it had, what happened, and what came out.
The current five modes stay, but as examples/presets:
- Sandbox
- KV Read
- Chain
- Generate
- Spawn
They should teach and seed usage, not define the main information architecture.
Shipped baseline (keep tightening):
- Homepage is marketing + curl; Compose (`/compose`) runs all five modes in the browser and returns `resultId` links to `/results/{id}`.
- Trace viewer shows mode-specific sections (KV snapshot note, spawn depth note, collapsible JSON for large payloads).
- Fork on a trace hydrates Compose via `sessionStorage`.
- Examples (`/examples`) and trace schema (`/docs/result-schema`, `docs/result-schema.md`) support external consumers.
- Saved recipes (named playbooks) and a richer gallery are still out of scope until trace URLs prove sticky.
People will share and revisit traces, not tabs.
The best artifact is a permalink to a specific run:
- exact code or prompt
- exact capabilities
- exact output
- exact timing
- exact orchestration structure
This should be understandable in seconds by someone who did not create it.
- Trace is the object
  - Every meaningful run should produce a trace.
  - The viewer should feel like a durable artifact, not debug output.
- Examples are onboarding, not product structure
  - Existing phases remain useful.
  - They move behind examples/presets/tutorial framing.
- Capabilities stay visible
  - What code can do should always be legible.
  - Capability boundaries are part of the product value.
- One visual grammar across all modes
  - Sandbox, KV, Chain, Generate, and Spawn should all resolve to the same trace mental model.
  - Each mode can specialize the presentation, but not invent a new product shape.
- Shareability beats more surface area
  - A shared run is more valuable than another demo tab.
  - New top-level sections need a clear artifact or workflow payoff.
- Keep it simple
  - No accounts unless proven necessary.
  - No speculative collaboration features.
  - No new abstraction unless it directly improves sharing, reading, or composing traces.
Users should be able to:
- understand the capability model quickly
- run or inspect an example without reading much
- share a specific run with a short link
- read a trace and understand what happened
- compose their own run after seeing examples
- fork an existing trace into a new run
Do not grow the current tab bar into more phases.
Preferred top-level structure:
- Explore
- Compose
- Traces
- Docs
Meaning:
- Explore: examples, presets, tutorials, authored starting points
- Compose: write code or prompts, configure capabilities, run
- Traces: durable trace viewer, later gallery/directory
- Docs: API reference, architecture, capability model, deployment
Safer incremental structure if needed:
- Examples
- Compose
- Viewer
The homepage should move from "textarea playground with tabs" to "product overview with a primary compose/share path."
Target sections:
- Hero
  - one-line value prop
  - primary CTA to compose
  - secondary CTA to example traces
- Core primitives
  - isolates
  - capabilities
  - traces
- Example runs
  - sandbox
  - kv read
  - chain
  - generate
  - spawn
- Trace anatomy
  - code or prompt
  - capabilities
  - result
  - timing
  - structure
- API/docs links
  - routes
  - architecture
  - GitHub
  - self-host/deploy
Goal:
Make every run produce a durable artifact.
Scope:
- define a trace schema shared across run types
- persist traces in KV with a short id
- return `resultId` from run endpoints
- add a read-only trace viewer route
- add share affordance in UI
Acceptance criteria:
- every successful run returns a `resultId`
- a trace URL renders without requiring editor context
- viewer shows enough information to understand the run by itself
- trace URLs can be copied and opened directly
Notes:
- start without auth
- use expiry only if cost needs it
- keep viewer read-only
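
The scope above (shared schema, short id, `resultId` returned from every run) can be sketched roughly as follows. This is a minimal illustration, not the Worker's actual code: `TraceSketch`, `persistTrace`, `loadTrace`, the `trace:` key prefix, and the mode names are all hypothetical, and a plain `Map` stands in for KV (real KV calls are asynchronous).

```typescript
// Hypothetical envelope shared by all run types; the real StoredTrace may differ.
type Mode = "sandbox" | "kv-read" | "chain" | "generate" | "spawn";

interface TraceSketch {
  id: string;                       // short id used in the trace URL
  mode: Mode;
  request: Record<string, unknown>; // code or prompt plus mode-specific fields
  capabilities: string[];           // what the code was allowed to do
  result: unknown;
  timingMs: number;
  createdAt: string;                // ISO timestamp
}

type NewTrace = Omit<TraceSketch, "id" | "createdAt">;

const ID_ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789";

// Builds a short, URL-safe id. The random source is injectable for tests.
function shortId(length: number = 8, random: () => number = Math.random): string {
  let id = "";
  for (let i = 0; i < length; i++) {
    id += ID_ALPHABET.charAt(Math.floor(random() * ID_ALPHABET.length));
  }
  return id;
}

// Stores the trace as JSON under a fresh id and returns the id to the caller.
function persistTrace(store: Map<string, string>, run: NewTrace): { resultId: string } {
  let id = shortId();
  while (store.has(`trace:${id}`)) id = shortId(); // retry on the rare collision
  const trace: TraceSketch = { ...run, id, createdAt: new Date().toISOString() };
  store.set(`trace:${id}`, JSON.stringify(trace));
  return { resultId: id };
}

// Read path for a read-only viewer: returns null when the id is unknown.
function loadTrace(store: Map<string, string>, id: string): TraceSketch | null {
  const raw = store.get(`trace:${id}`);
  return raw === undefined ? null : (JSON.parse(raw) as TraceSketch);
}
```

Keeping the write path and the read path this small is one way to hold the viewer to "read-only": nothing in the sketch mutates a trace after it is stored.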
Goal:
Make all modes feel like one product.
Scope:
- define common trace sections
- add mode-specific rendering patterns:
  - sandbox: code, caps, result, timing
  - kv read: code, caps, result, timing, snapshot note
  - chain: ordered pipeline
  - generate: prompt -> generated code -> result
  - spawn: tree view with depth and children
- reduce raw JSON dumping in UI
Acceptance criteria:
- all five modes render into trace-shaped output
- users can compare different run types with the same mental model
- capability visibility is consistent
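
One way to read "common sections plus mode-specific patterns" is a single function that maps each mode to an ordered list of viewer sections. The sketch below is illustrative only: the section names and `sectionsFor` are hypothetical, and it assumes chain and spawn also show the common capabilities/result/timing sections, which the list above does not state explicitly.

```typescript
type Mode = "sandbox" | "kv-read" | "chain" | "generate" | "spawn";

type Section =
  | "code"
  | "prompt"
  | "generated-code"
  | "pipeline"
  | "tree"
  | "capabilities"
  | "result"
  | "timing"
  | "snapshot-note";

// Sections every mode shares, so capability visibility stays consistent.
const COMMON: Section[] = ["capabilities", "result", "timing"];

// Each mode specializes what comes before or after the common sections,
// but never drops them or invents a different page shape.
function sectionsFor(mode: Mode): Section[] {
  switch (mode) {
    case "sandbox":
      return ["code", ...COMMON];
    case "kv-read":
      return ["code", ...COMMON, "snapshot-note"];
    case "chain":
      return ["pipeline", ...COMMON];
    case "generate":
      return ["prompt", "generated-code", ...COMMON];
    case "spawn":
      return ["tree", ...COMMON];
  }
}
```

Because the switch is exhaustive over `Mode`, adding a sixth mode fails type checking until it declares its sections, which keeps new modes inside the same trace shape.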
Goal:
Restructure the product shell around traces and composition.
Scope:
- replace phase-first tab framing on the homepage
- add clearer hero and information hierarchy
- surface examples as cards/presets
- surface trace viewer as a first-class destination
Acceptance criteria:
- homepage communicates product value without requiring interaction first
- examples feel secondary to the trace object
- nav reflects user goals, not internal implementation phases
Goal:
Turn the product into a real tool, not just a viewer.
Scope:
- editor/workspace for custom code
- capability picker
- prompt mode for generate
- run inline, then open/share resulting trace
Acceptance criteria:
- user can create a custom run without editing source code
- result always resolves to a trace
- compose flow is simpler than using the raw API
Goal:
Make traces generative, not just inspectable.
Scope:
- fork button from trace viewer
- hydrate compose view from trace contents
- preserve code/prompt/capabilities as starting point
Acceptance criteria:
- any trace can seed a new run
- viewer -> compose is one click
- examples can also be implemented as forkable traces
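
The fork flow can be pictured as a small hand-off through browser storage: the viewer stashes a seed, and Compose consumes it once on load. The sketch below assumes a storage object with the `getItem`/`setItem`/`removeItem` subset of the Web Storage API; `ForkSeed`, `stashFork`, `takeFork`, and the `lab:fork-seed` key are hypothetical names, not the app's actual ones.

```typescript
// Subset of the Web Storage API, so the logic runs outside a browser too.
interface StorageLike {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
  removeItem(key: string): void;
}

// What Compose needs to start from an existing trace (hypothetical shape).
interface ForkSeed {
  mode: string;
  source: string;         // code or prompt
  capabilities: string[];
  forkedFrom: string;     // id of the trace being forked
}

const FORK_KEY = "lab:fork-seed"; // hypothetical key name

// Called from the trace viewer just before navigating to Compose.
function stashFork(storage: StorageLike, seed: ForkSeed): void {
  storage.setItem(FORK_KEY, JSON.stringify(seed));
}

// Called once when Compose loads; returns null if there is nothing valid to hydrate.
function takeFork(storage: StorageLike): ForkSeed | null {
  const raw = storage.getItem(FORK_KEY);
  if (raw === null) return null;
  storage.removeItem(FORK_KEY); // consume once so a reload starts clean
  try {
    const parsed = JSON.parse(raw) as Partial<ForkSeed>;
    if (
      typeof parsed.mode !== "string" ||
      typeof parsed.source !== "string" ||
      typeof parsed.forkedFrom !== "string" ||
      !Array.isArray(parsed.capabilities)
    ) {
      return null;
    }
    return {
      mode: parsed.mode,
      source: parsed.source,
      capabilities: parsed.capabilities.map(String),
      forkedFrom: parsed.forkedFrom,
    };
  } catch {
    return null; // malformed JSON or a non-object payload
  }
}

// In-memory stand-in for sessionStorage, useful in tests.
class MemoryStorage implements StorageLike {
  private items = new Map<string, string>();
  getItem(key: string): string | null {
    return this.items.get(key) ?? null;
  }
  setItem(key: string, value: string): void {
    this.items.set(key, value);
  }
  removeItem(key: string): void {
    this.items.delete(key);
  }
}
```

In a browser, passing `sessionStorage` as the `storage` argument should satisfy `StorageLike`; validating on read means a stale or hand-edited entry degrades to an empty Compose view instead of an error.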
Goal:
Make traces useful beyond the UI.
Scope:
- machine-readable trace route
- stable trace schema docs
- easy embedding/citation path
Acceptance criteria:
- traces can be fetched as JSON
- schema is documented
- other tools can consume trace artifacts directly
Goal:
Create a public browsing surface for good traces.
Scope:
- curated example traces
- categories by capability/pattern
- later, public directory if warranted
Acceptance criteria:
- new users can browse representative traces quickly
- examples reinforce product value rather than distract from it
Work one active item at a time, but keep the larger list visible.
Recommended order:
- trace schema
- trace persistence
- trace viewer route
- share button and URL handling
- unified render treatment for all run types
- homepage IA refresh
- compose workspace
- fork from trace
- trace JSON route
- curated example gallery
Based on the current code:
- Trace creation lives in `worker/index.ts`: `respondWithTrace` + `saveTrace` + Effect-wrapped run paths.
- Chain defines the richest `trace[]` execution shape; other modes use the same `StoredTrace` envelope with type-specific `request` fields.
- App shell is SvelteKit: `src/routes/compose/+page.svelte`, `src/routes/results/[id]/+page.svelte`, `src/routes/data.remote.ts` for remote calls to the Worker.
- Docs: `docs/result-schema.md` and `README.md` should stay aligned with the Worker’s persisted JSON.
Questions to answer before or during phase 1:
- Should traces expire by default?
- Should failed runs get trace ids too?
- Should traces always store full input/output, or should truncation happen in storage rather than only in rendering?
- Do we want example traces to be authored by code in repo or seeded into storage?
- Does the share URL point to HTML only, or HTML plus JSON from day one?
- How opinionated should the trace schema be about mode-specific fields?
Avoid:
- adding more primary tabs/phases as the main product move
- adding accounts before shareability proves value
- building collaborative features before trace viewing/forking exists
- building a complex editor before the trace object is solid
- over-designing a gallery before trace URLs are actually useful
Use this to evaluate future changes:
- Does this strengthen the trace as the main artifact?
- Does this make the product more useful to revisit or share?
- Is capability visibility preserved?
- Is the result understandable without reading code first?
- Does the UI feel like artifact viewing, not debug output?
- Is navigation organized around user goals?
- Are examples clearly examples, not the entire product?
- Does this reuse the existing route architecture simply?
- Does this avoid speculative abstraction?
- Does this keep the path to compose/fork straightforward?
Preferred way to work:
- maintain this larger prioritized plan in Markdown
- keep only one active implementation item at a time
- re-rank when new evidence appears
- avoid drifting into unrelated polish not justified by the plan
Phase 1 foundation is in place. Next highest-leverage gaps (pick one thread at a time):
- Saved recipes (phase “later” in habit loop): named, loadable flows in KV, if product validation says yes.
- Gallery (phase 7): curated trace IDs + categories; optional `PUBLIC_EXAMPLE_TRACE_ID` for a stable demo link.
- Unified rendering polish (phase 2): spawn tree only if Worker returns structured child metadata; otherwise keep shallow spawn notes.
- Resolve open product questions (expiry, truncation) once KV cost or privacy needs force a decision.
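
For the spawn item in the list above, the "tree only if child metadata exists, otherwise a shallow note" rule could look something like this. It is a sketch under assumptions: `SpawnNode`, its `label` and `children` fields, and both function names are hypothetical, since the Worker's actual spawn output shape is not described here.

```typescript
// Hypothetical shape for structured child metadata returned by a spawn run.
interface SpawnNode {
  label: string;
  children?: SpawnNode[]; // absent when the Worker returns no child metadata
}

interface SpawnRow {
  depth: number;
  label: string;
}

// Depth-first walk that flattens the tree into rows a viewer can indent.
// maxDepth guards against unexpectedly deep or runaway trees.
function spawnRows(root: SpawnNode, maxDepth: number = 8): SpawnRow[] {
  const rows: SpawnRow[] = [];
  const visit = (node: SpawnNode, depth: number): void => {
    rows.push({ depth, label: node.label });
    if (depth >= maxDepth) return;
    for (const child of node.children ?? []) visit(child, depth + 1);
  };
  visit(root, 0);
  return rows;
}

// Falls back to a one-line note when there is no child metadata to draw.
function renderSpawn(root: SpawnNode): string {
  if (root.children === undefined) {
    return `${root.label} (no child metadata; shallow note only)`;
  }
  return spawnRows(root)
    .map((row) => `${"  ".repeat(row.depth)}${row.label}`)
    .join("\n");
}
```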