diff --git a/docs/README.md b/docs/README.md index 6604b23e..23257677 100644 --- a/docs/README.md +++ b/docs/README.md @@ -1,58 +1,120 @@ --- -title: "Perstack: Expert Stack for Agent-first Development" +title: "Perstack: The Agent Runtime" --- -# Perstack: Expert Stack for Agent-first Development +# Perstack: The Agent Runtime -Perstack is a package manager and runtime for agent-first development. -Define modular micro-agents as Experts in TOML, publish them to a registry, and compose them like npm packages. +Define AI agents as declarative **Experts** in TOML. Execute them with deterministic, event-derived tracking. Each Expert runs in its own isolated context — no shared state, no prompt bloat, full execution history. -**Perstack isn't another agent framework — it's npm/npx for agents.** - -- [Get Started →](./getting-started.md) +- [Getting Started →](./getting-started.md) - [Browse Registry →](https://platform.perstack.ai/) -## Key Features - -- **Agent-first development toolkit** - - Declarative Expert definitions as modular micro-agents - - Dependency management for composing Experts - - Public registry for reusing Experts instead of rebuilding them -- **Sandbox-ready runtime** - - Secure execution designed for sandbox integration - - Observable, event-driven architecture - - Reproducible, checkpoint-based history - -## Why Perstack? 
- -AI agent developers struggle with: -- Complex, monolithic agent apps -- Little to no reusability -- No dependency management -- Context windows that explode at scale - -Perstack fixes these with proven software engineering principles: -- **Isolation** — clear separation of concerns per Expert -- **Observability** — every step is visible and traceable -- **Reusability** — Experts compose declaratively through the runtime - -## Expert Stack - -Perstack provides **Expert Stack**: -- **Experts** — modular micro-agents -- **Runtime** — executes Experts -- **Registry** — shares Experts -- **Sandbox Integration** — safe production execution - -> [!NOTE] -> The name "Perstack" combines the Latin word "perītus" (expert) and "stack". - -## Next Steps - -- [Getting Started](./getting-started.md) — run your first Expert in 5 minutes -- [Understanding Perstack](./understanding-perstack/concept.md) — learn core concepts -- [Making Experts](./making-experts/README.md) — define and develop Experts -- [Using Experts](./using-experts/README.md) — run and integrate Experts +## What Perstack Solves + +Agentic app development has five structural problems. Perstack is designed to address each of them: + +| Problem | Perstack's Approach | +| :--- | :--- | +| **Tight coupling** | **A runtime** that separates Expert definitions from app code, tools, prompts, and models | +| **Broken feedback loops** | **CLI tools** to execute and analyze Experts from day one. Expert and app evolve independently. | +| **The developer owns everything** | Expert definitions in **`perstack.toml`** are written by domain experts using natural language. Developers focus on integration, not prompt engineering. 
| +| **No sustained behavior** | **Event-derived execution** and **step-level checkpoints** help maintain reproducible behavior, even across model or provider changes | +| **No real isolation** | **Isolation** is built into the runtime architecture — workspace boundaries, environment sandboxing, and tool whitelisting — so your platform can enforce security at the infrastructure level | + +For the full rationale, see the [root README](../README.md#why-perstack). + +## How It Works + +**1. Define** — Describe Experts in `perstack.toml` using natural language: + +```toml +[experts."fitness-assistant"] +description = "Manages fitness records and suggests training menus" +instruction = """ +Conduct interview sessions and manage records in `./fitness-log.md`. +Collaborate with `pro-trainer` for professional training menus. +""" +delegates = ["pro-trainer"] +``` + +Or generate one interactively: + +```bash +npx create-expert "Create a fitness assistant that delegates to a pro trainer" +``` + +**2. Execute** — Run from the CLI with real-time feedback: + +```bash +npx perstack start fitness-assistant "Start today's session" +``` + +**3. 
Integrate** — Embed into your application via the runtime API or Execution API: + +```typescript +import { run } from "@perstack/runtime" + +const checkpoint = await run({ + setting: { + model: "claude-sonnet-4-5-20250929", + providerConfig: { providerName: "anthropic" }, + expertKey: "fitness-assistant", + input: { text: "Start today's session" }, + }, +}) +``` + +## Documentation + +### Learn + +- [Getting Started](./getting-started.md) — create your first Expert and walk through the core workflow +- **Understanding Perstack** + - [Concept](./understanding-perstack/concept.md) — the architecture behind the runtime + - [Experts](./understanding-perstack/experts.md) — what Experts are and how they work + - [Runtime](./understanding-perstack/runtime.md) — execution model and event system + - [Sandbox Integration](./understanding-perstack/sandbox-integration.md) — infrastructure-level isolation + - [Boundary Model](./understanding-perstack/boundary-model.md) — trust boundaries between components + - [Registry](./understanding-perstack/registry.md) — publishing and discovering Experts + +### Build + +- **[Making Experts](./making-experts/README.md)** — complete guide to Expert definitions + - [Examples](./making-experts/examples.md) — real-world use cases + - [Best Practices](./making-experts/best-practices.md) — design guidelines for effective Experts + - [Skills](./making-experts/skills.md) — adding MCP tools to your Experts + - [Base Skill](./making-experts/base-skill.md) — built-in tools provided by the runtime + - [Testing](./making-experts/testing.md) — strategies for testing Experts + - [Publishing](./making-experts/publishing.md) — share Experts via the Registry +- **[Guides](./guides/README.md)** — task-oriented walkthroughs + - [Rapid Prototyping](./guides/rapid-prototyping.md) — validate ideas without writing code + - [Taming Prompt Sprawl](./guides/taming-prompt-sprawl.md) — fix bloated prompts with modular Experts + - [Extending with 
Tools](./guides/extending-with-tools.md) — give your Experts real-world capabilities via MCP + - [Adding AI to Your App](./guides/adding-ai-to-your-app.md) — integrate Experts into existing applications + - [Going to Production](./guides/going-to-production.md) — deploy safely with container isolation + +### Run + +- **[Using Experts](./using-experts/README.md)** — run and integrate Experts + - [Running Experts](./using-experts/running-experts.md) — CLI commands and runtime API + - [Workspace](./using-experts/workspace.md) — file system layout and conventions + - [State Management](./using-experts/state-management.md) — checkpoints, pausing, resuming + - [Error Handling](./using-experts/error-handling.md) — handling failures gracefully +- **Operating Experts** — production operations + - [Isolation by Design](./operating-experts/isolation-by-design.md) — production security patterns + - [Observing](./operating-experts/observing.md) — monitoring and debugging in production + - [Skill Management](./operating-experts/skill-management.md) — managing MCP skill dependencies + - [Deployment](./operating-experts/deployment.md) — deployment strategies and infrastructure + +### Reference + +- **References** + - [CLI Reference](./references/cli.md) — all commands and options + - [perstack.toml Reference](./references/perstack-toml.md) — complete configuration spec + - [Providers and Models](./references/providers-and-models.md) — supported LLM providers + - [Events](./references/events.md) — runtime event schema +- **Contributing** + - [Roadmap](./contributing/roadmap.md) — what's planned and how to contribute ## Community diff --git a/docs/getting-started.md b/docs/getting-started.md index a3a18db6..f0b987de 100644 --- a/docs/getting-started.md +++ b/docs/getting-started.md @@ -4,177 +4,283 @@ title: "Getting Started" # Getting Started -Ever tried to reuse an agent you built 3 months ago? You probably couldn't. - -Perstack fixes this. 
Define Experts once, reuse them everywhere — like npm packages for agents. +This walkthrough takes you from zero to production integration. Each step addresses one or more of the [five structural problems](../README.md#why-perstack) in agentic app development. ## Prerequisites -- Node.js 22+ -- LLM provider API key (Anthropic, OpenAI, Google, etc.) +- [Node.js 22+](https://nodejs.org/) +- An LLM provider API key (see [Providers and Models](./references/providers-and-models.md)) ```bash export ANTHROPIC_API_KEY=sk-ant-... ``` -See [Providers and Models](./references/providers-and-models.md) for other providers. +## Step 1: Create an Expert + +> [!TIP] +> **Addresses**: Tight coupling, The developer owns everything -## Try It +Generate an Expert definition interactively: ```bash -npx perstack start tic-tac-toe "Let's play!" +npx create-expert "Create a fitness assistant that delegates to a pro trainer" ``` -`perstack start` starts an interactive session with the Expert. -The runtime fetches the Expert from Registry and runs it with the given query. - -## Build Your Own - -Here's where it gets interesting. Let's build a fitness assistant that delegates to a professional trainer. - -**Traditional approach:** One monolithic agent with a bloated prompt trying to do everything. +`create-expert` does more than scaffold a file — it: +- generates Expert definitions in `perstack.toml` based on your description +- tests them against real-world scenarios +- analyzes execution history and output to evaluate the definitions +- iterates on definitions until behavior stabilizes +- reports capabilities and limitations -**Perstack approach:** Two focused Experts, each doing one thing well. 
+The result is a `perstack.toml` ready to use: ```toml -# ./perstack.toml +# perstack.toml [experts."fitness-assistant"] -description = "Assists users with their fitness journey" - +description = "Manages fitness records and suggests training menus" instruction = """ -As a personal fitness assistant: -1. Conduct interview sessions with the user -2. Manage records in `./fitness-log.md` -3. Delegate to `pro-trainer` for professional training menus +Conduct interview sessions and manage records in `./fitness-log.md`. +Collaborate with `pro-trainer` for professional training menus. """ - delegates = ["pro-trainer"] [experts."pro-trainer"] -description = """ -Suggests training menus based on user history and current physical condition. -""" - -instruction = """ -Provide training menu suggestions with scientifically verified effects. -Tailor recommendations to the user's condition and training history. -""" +description = "Suggests scientifically-backed training menus" +instruction = "Provide split routines and HIIT plans tailored to user history." ``` -Run it: +You can also write `perstack.toml` manually — `create-expert` is a convenient starting point, not a requirement. + +## Step 2: Run Your Expert + +> [!TIP] +> **Addresses**: Broken feedback loops + +### Interactive mode ```bash npx perstack start fitness-assistant "Start today's session" ``` -What just happened: - -| Aspect | What Perstack Does | -| ----------------- | -------------------------------------------------------------------------------- | -| **Isolation** | Each Expert has its own context window. No prompt bloat. | -| **Collaboration** | `fitness-assistant` delegates to `pro-trainer` autonomously. | -| **State** | Both Experts share the workspace (`./fitness-log.md`), not conversation history. | -| **Reusability** | `pro-trainer` can be reused in other projects as-is. | +`perstack start` opens a text-based UI for developing and testing Experts. 
You get real-time feedback and can iterate on definitions without deploying anything. -The benefit of separating concerns: +### Headless mode -- **Role clarity** — `fitness-assistant` handles interaction and records; `pro-trainer` focuses on training menus -- **Automatic context** — `fitness-assistant` passes relevant history to `pro-trainer` so users don't repeat themselves -- **Better output** — specialists collaborate to produce accurate, personalized suggestions +```bash +npx perstack run fitness-assistant "Start today's session" +``` -> [!NOTE] -> By default, Perstack searches for `perstack.toml` from the current directory upward. -> Use `--config ` to specify a custom location. +`perstack run` outputs JSON events to stdout — designed for automation and CI pipelines. -## Add MCP Skills +### What just happened -Experts can use external tools through MCP (Model Context Protocol). +| Aspect | What Perstack Does | +| --- | --- | +| **Isolation** | Each Expert has its own context window. No prompt bloat. | +| **Collaboration** | `fitness-assistant` delegates to `pro-trainer` autonomously. | +| **Observability** | Every step is visible as a structured event. | +| **State** | Both Experts share the workspace (`./fitness-log.md`), not conversation history. | -Here's an example using [exa-mcp-server](https://github.com/exa-labs/exa-mcp-server) for web search: +## Step 3: Analyze Execution -```toml -[experts."news-researcher"] -description = "Researches latest news on a topic" +> [!TIP] +> **Addresses**: No sustained behavior (observability) -instruction = """ -1. Search the web for recent news on the given topic -2. Summarize findings with source URLs -3. 
Save report to `./reports/news.md` -""" +After running an Expert, inspect what happened: -[experts."news-researcher".skills."web-search"] -type = "mcpStdioSkill" -command = "npx" -packageName = "exa-mcp-server" -requiredEnv = ["EXA_API_KEY"] +```bash +npx perstack log ``` +By default, this shows a summary of the latest job — the Expert that ran, the steps it took, and any errors. + +Key options for deeper inspection: + +| Option | Purpose | +| --- | --- | +| `--errors` | Show only error-related events | +| `--tools` | Show only tool call events | +| `--step "5-10"` | Filter by step range | +| `--summary` | Show summarized view | +| `--json` | Machine-readable output | + +This matters because debugging agents across model changes, requirement changes, and prompt iterations requires visibility into every decision the agent made. `perstack log` gives you that visibility without adding instrumentation code. + +See [CLI Reference](./references/cli.md#perstack-log) for the full list of options. + +## Step 4: Lock for Reproducibility + +> [!TIP] +> **Addresses**: No sustained behavior (reproducibility) + ```bash -export EXA_API_KEY=your_exa_key -npx perstack start news-researcher "AI startup funding this week" +npx perstack install ``` -You just declare what you need — the runtime handles the rest: +This creates a `perstack.lock` file that caches tool schemas for all MCP skills. Without the lockfile, Perstack initializes MCP skills at runtime to discover their tool definitions — which can add 500ms–6s startup latency per skill. + +**Workflow:** +1. **Develop** without a lockfile — MCP skills are resolved dynamically +2. **Run `perstack install`** before deploying — tool schemas are cached +3. **Deploy** with `perstack.lock` — the runtime starts LLM inference immediately + +**When to re-run:** after adding or modifying skills in `perstack.toml`, or after updating MCP server dependencies. + +The lockfile is optional. 
If not present, skills are initialized at runtime as usual. + +## Step 5: Integrate into Your Application + +> [!TIP] +> **Addresses**: No real isolation + +The CLI is for prototyping. For production, integrate Experts into your application via the Execution API, sandbox providers, or runtime embedding. -- **Package resolution**: `exa-mcp-server` is fetched and spawned automatically -- **Lifecycle management**: MCP servers start with the Expert, shut down when done -- **Environment isolation**: Only `requiredEnv` variables are passed to the MCP server -- **Error recovery**: MCP failures are fed back to the LLM, not thrown as runtime errors +### Perstack Execution API -For more on skills, see [Skills](./making-experts/skills.md). +The Execution API is the primary path for production integration. Your application starts jobs, streams events, and sends follow-up queries over HTTP. -## Run in Production +#### REST API -For production, run Experts in isolated containers: +**Start a job:** -```Dockerfile -FROM node:22-slim -RUN npm install -g perstack -COPY perstack.toml /app/perstack.toml -WORKDIR /workspace -ENTRYPOINT ["perstack", "start", "--config", "/app/perstack.toml", "fitness-assistant"] +```bash +curl -X POST https://api.perstack.ai/api/v1/jobs \ + -H "Authorization: Bearer $PERSTACK_API_KEY" \ + -H "Content-Type: application/json" \ + -d '{ + "applicationId": "your-app-id", + "expertKey": "fitness-assistant", + "query": "Start today'\''s session", + "provider": "anthropic" + }' ``` +**Stream events (SSE):** + ```bash -docker build -t fitness-assistant . 
-docker run --rm \ - -e ANTHROPIC_API_KEY \ - -v $(pwd)/workspace:/workspace \ - fitness-assistant "Start today's session" +curl -N https://api.perstack.ai/api/v1/jobs/{jobId}/stream \ + -H "Authorization: Bearer $PERSTACK_API_KEY" ``` -This pattern maps one container to one Expert: -- `ENTRYPOINT` fixes the Expert — callers just pass the query -- Workspace is volume-mounted, so `--continue` works across container restarts -- Mount per-user workspaces for personalization (e.g., `-v /data/users/alice:/workspace`) -- Container image becomes a deployable, versioned artifact +The stream emits Server-Sent Events: `message` events contain `PerstackEvent` payloads, `error` events signal failures, and `complete` events indicate the job finished. -Perstack is sandbox-first by design: -- Agents have larger attack surfaces than typical apps — sandbox the environment, not the agent's capabilities -- JSON events to stdout by default — no direct external messaging from within the sandbox -- Embed the runtime with custom event listeners for programmatic control +**Continue a job:** -See [Sandbox Integration](./understanding-perstack/sandbox-integration.md) for the full rationale. +```bash +curl -X POST https://api.perstack.ai/api/v1/jobs/{jobId}/continue \ + -H "Authorization: Bearer $PERSTACK_API_KEY" \ + -H "Content-Type: application/json" \ + -d '{ + "query": "Now create a weekly schedule" + }' +``` -## What's Next +#### TypeScript SDK (`@perstack/api-client`) -You've seen the basics. 
Here's where to go from here: +```typescript +import { createApiClient } from "@perstack/api-client" -**Do something specific:** -- [Rapid Prototyping](./guides/rapid-prototyping.md) — validate ideas without writing code -- [Taming Prompt Sprawl](./guides/taming-prompt-sprawl.md) — fix bloated prompts with modular Experts -- [Adding AI to Your App](./guides/adding-ai-to-your-app.md) — integrate Experts into existing applications -- [Going to Production](./guides/going-to-production.md) — deploy safely with container isolation +const client = createApiClient({ + apiKey: process.env.PERSTACK_API_KEY, +}) -**Understand the architecture:** -- [Concept](./understanding-perstack/concept.md) — why isolation and observability matter -- [Experts](./understanding-perstack/experts.md) — best practices for Expert design +// Start a job +const result = await client.jobs.start({ + applicationId: "your-app-id", + expertKey: "fitness-assistant", + query: "Start today's session", + provider: "anthropic", +}) + +if (!result.ok) { + // result.error.type: "http" | "network" | "timeout" | "validation" | "abort" + console.error(result.error.message) + process.exit(1) +} + +const jobId = result.data.data.job.id + +// Stream events +const stream = await client.jobs.stream(jobId) + +if (stream.ok) { + for await (const event of stream.data.events) { + console.log(event.type, event) + } +} + +// Continue with a follow-up +await client.jobs.continue(jobId, { + query: "Now create a weekly schedule", +}) +``` + +Every method returns an `ApiResult` — either `{ ok: true, data }` or `{ ok: false, error }`. Error types are: `"http"`, `"network"`, `"timeout"`, `"validation"`, and `"abort"`. + +### Other Sandbox Providers + +Perstack's isolation model maps naturally to container and serverless platforms: + +- Docker +- AWS ECS +- Google Cloud Run +- Kubernetes +- Cloudflare Workers + +Each Expert runs in its own sandboxed environment. 
See [Going to Production](./guides/going-to-production.md) for the Docker setup pattern. Detailed guides for other providers are coming soon. + +### Runtime Embedding (`@perstack/runtime`) + +For tighter integration, embed the runtime directly in your TypeScript/JavaScript application: + +```typescript +import { run } from "@perstack/runtime" + +const checkpoint = await run({ + setting: { + model: "claude-sonnet-4-5-20250929", + providerConfig: { providerName: "anthropic" }, + expertKey: "fitness-assistant", + input: { text: "Start today's session" }, + }, +}) +``` + +You can also listen for events during execution: + +```typescript +import { run } from "@perstack/runtime" + +const checkpoint = await run({ + setting: { + model: "claude-sonnet-4-5-20250929", + providerConfig: { providerName: "anthropic" }, + expertKey: "fitness-assistant", + input: { text: "Start today's session" }, + }, + eventListener: (event) => { + console.log(event.type, event) + }, +}) +``` + +The CLI is for prototyping. The runtime API is for production. Both use the same `perstack.toml`. 
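Beyond simple logging, the `eventListener` hook can drive your own metrics. As a minimal sketch (not part of the Perstack API), here is a listener factory that tallies events by `type` so you can summarize a run after the checkpoint resolves. Only `event.type` is assumed from the examples above; the event names in the demo are hypothetical stand-ins, not real Perstack event types.

```typescript
// Minimal structural type: the only field this sketch relies on.
type PerstackEventLike = { type: string }

// Returns a listener you can pass as `eventListener`, plus a summary helper.
function createEventTally() {
  const counts = new Map<string, number>()
  const listener = (event: PerstackEventLike) => {
    // Increment the count for this event type.
    counts.set(event.type, (counts.get(event.type) ?? 0) + 1)
  }
  // Sorted, plain-object view of the tally for logging or assertions.
  const summary = () => Object.fromEntries([...counts.entries()].sort())
  return { listener, summary }
}

// Demo with hypothetical stand-in events (a real run emits many more types):
const tally = createEventTally()
for (const e of [{ type: "stepStart" }, { type: "toolCall" }, { type: "stepStart" }]) {
  tally.listener(e)
}
console.log(tally.summary()) // { stepStart: 2, toolCall: 1 }
```

In practice you would pass `tally.listener` as the `eventListener` option to `run(...)` and call `tally.summary()` after the returned promise resolves — useful for quick per-job dashboards without extra instrumentation.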
+ +## What's Next **Build more:** - [Making Experts](./making-experts/README.md) — full `perstack.toml` guide - [Skills](./making-experts/skills.md) — MCP integration patterns +- [Taming Prompt Sprawl](./guides/taming-prompt-sprawl.md) — break monolithic prompts into collaborating Experts + +**Understand the architecture:** +- [Concept](./understanding-perstack/concept.md) — why isolation and observability matter +- [Experts](./understanding-perstack/experts.md) — best practices for Expert design +- [Sandbox Integration](./understanding-perstack/sandbox-integration.md) — infrastructure-level isolation **Reference:** - [CLI Reference](./references/cli.md) — all commands and options - [perstack.toml Reference](./references/perstack-toml.md) — complete configuration spec +- [Events](./references/events.md) — runtime event schema diff --git a/docs/making-experts/README.md b/docs/making-experts/README.md index a10d1ec5..08a2fa05 100644 --- a/docs/making-experts/README.md +++ b/docs/making-experts/README.md @@ -99,6 +99,6 @@ This means: Before making Experts, understand the core concepts: - [Experts](../understanding-perstack/experts.md) — what Experts are and how they work -- [Getting Started](../getting-started.md) — build your first Expert +- [Getting Started](../getting-started.md) — create your first Expert and walk through the core workflow For running and integrating Experts, see [Using Experts](../using-experts/README.md).