diff --git a/README.md b/README.md
index c18c093c..1b6b947e 100644
--- a/README.md
+++ b/README.md
@@ -1,233 +1,148 @@
-# Perstack: Expert Stack for Agent-first Development
+# Perstack: The Agent Runtime

Perstack Demo

- 📚 Documentation •
- 🚀 Getting Started •
- 𝕏 Twitter
+ npm version
+ npm downloads
+ License

-## Overview
-
-Perstack is a package manager and runtime for agent-first development.
-Define modular micro-agents as Experts in TOML, publish them to a registry, and compose them like npm packages.
+

+ Documentation ·
+ Getting Started ·
+ Twitter
+

-Perstack isn't another agent framework - it's npm/npx for agents.
+Define AI agents as declarative **Experts** in TOML. Execute them with deterministic, event-derived tracking. Each Expert runs in its own isolated context - no shared state, no prompt bloat, full execution history.
 ## Quick Start
-Prerequisites:
-- Node.js 22+
-- Provider API Credentials
-  - See [LLM Providers](./docs/references/providers-and-models.md)
-
-**Run a demo Expert:**
-
-Set ANTHROPIC_API_KEY (or any provider key you use):
+Prerequisites: [Node.js 22+](https://nodejs.org/) and an [LLM provider API key](./docs/references/providers-and-models.md).
 ```bash
-$ ANTHROPIC_API_KEY= npx perstack start tic-tac-toe "Game start!"
+npx create-expert "Create saas-developer expert that can build a SaaS product"
 ```
-**Run in headless mode (no TUI):**
+This creates a new Expert in `perstack.toml`, then:
+- tests it against real-world scenarios
+- analyzes execution history and output to evaluate the definition
+- iterates on the definition until behavior stabilizes
+- reports capabilities and limitations
+
+Run the Expert from the CLI:
 ```bash
-$ ANTHROPIC_API_KEY= npx perstack run tic-tac-toe "Game start!"
+npx perstack start saas-developer "Build an agentic CRM with Perstack"
 ```
-**What's next?**
-- [Rapid Prototyping](./docs/guides/rapid-prototyping.md) - build your own Expert
-- [Taming Prompt Sprawl](./docs/guides/taming-prompt-sprawl.md) - fix bloated prompts with modular Experts
-- [Extending with Tools](./docs/guides/extending-with-tools.md) - add MCP skills to your Experts
-
-## Examples
-
-| Example | Description |
-| ------------------------------------------------ | ---------------------------------------------------------------- |
-| [bug-finder](./examples/bug-finder/) | Codebase analyzer that finds potential bugs |
-| [github-issue-bot](./examples/github-issue-bot/) | Automated GitHub issue responder with real-time activity logging |
-| [gmail-assistant](./examples/gmail-assistant/) | AI-powered email assistant with Gmail integration |
-
-## Key Features
+Or use it programmatically via [runtime embedding](./docs/guides/adding-ai-to-your-app.md#runtime-embedding-optional):
-- **Agent-first development toolkit**
-  - Declarative Expert definitions as modular micro-agents
-  - Dependency management for composing Experts
-  - Public registry for reusing Experts instead of rebuilding them
-- **Sandbox-ready runtime**
-  - Secure execution designed for sandbox integration
-  - Observable, event-driven architecture
-  - Reproducible, checkpoint-based history
+```typescript
+import { run } from "@perstack/runtime"
-### Safety by Design
-
-Perstack runtime is built for production-grade safety:
-- Designed to run under sandboxed infrastructure
-- Emits JSON events for every execution change
-- Can be embedded in your app to add stricter policies and isolation
-
-```bash
-$ npx perstack run my-expert "query"
+const checkpoint = await run({
+  setting: {
+    model: "claude-sonnet-4-5-20250929",
+    providerConfig: { providerName: "anthropic" },
+    expertKey: "saas-developer",
+    input: { text: "Build an agentic CRM with Perstack" },
+  },
+})
 ```
-For production deployments, use external sandboxing (Docker, VM, cloud platform) to provide:
-- Container isolation for file system access
-- Network restrictions
-- Environment variable isolation
-
+The CLI is for prototyping. The runtime API is for production. Both use the same `perstack.toml`.
 ## Why Perstack?
-AI agent developers struggle with:
-- Complex, monolithic agent apps
-- Little to no reusability
-- No dependency management
-- Context windows that explode at scale
-- Poor observability and debugging
+Agentic app development has five structural problems:
-Perstack fixes these with proven software engineering principles:
-- **Isolation** - clear separation of concerns per Expert
-- **Observability** - every step is visible and traceable
-- **Reusability** - Experts compose declaratively through the runtime
+1. **Tight coupling**: Frameworks bundle tools, code, and prompts together. Agent behavior is determined by all of them at once.
+2. **Broken feedback loops**: You only get feedback after shipping, and improving the agent means changing the entire app codebase.
+3. **The developer owns everything**: A single developer is responsible for the entire stack - from defining the agent to building the app around it.
+4. **No sustained behavior**: Today's working agent may break next year. When the model or framework changes, behavior must be redefined from scratch.
+5. **No real isolation**: Agents need filesystem sandboxing and infrastructure-level isolation, but most frameworks address isolation at the application level rather than the architecture level.
-Before/After:
+Perstack is designed to address these problems:
-| | Traditional Agent Dev | With Perstack |
-| :----------------- | :-------------------- | :-------------------------- |
-| **Architecture** | Monolithic & fragile | **Modular & composable** |
-| **State** | Global context window | **Isolated per Expert** |
-| **Reuse** | Copy-paste prompts | **npm-style dependencies** |
-| **Observability** | Hard to trace | **Full execution history** |
-| **Safe execution** | On you | **Sandbox-ready by design** |
+| Problem | Perstack's Approach |
+| :--- | :--- |
+| **Tight coupling** | **A runtime** that separates Expert definitions from app code, tools, prompts, and models |
+| **Broken feedback loops** | **CLI tools** to execute and analyze Experts from day one. Expert and app evolve independently. |
+| **The developer owns everything** | Expert definitions in **`perstack.toml`** are written by domain experts using natural language. Developers focus on integration, not prompt engineering. |
+| **No sustained behavior** | **Event-derived execution** and **step-level checkpoints** help maintain reproducible behavior, even across model or provider changes |
+| **No real isolation** | **Isolation** is built into the runtime architecture - workspace boundaries, environment sandboxing, and tool whitelisting - so your platform can enforce security at the infrastructure level |
-### Example: Fitness Assistant
+### How It Works
-Here is a small example showing how two Experts can collaborate: a fitness assistant and a professional trainer.
+Define Experts in `perstack.toml` and run them through the runtime:
 ```toml
-# ./perstack.toml
+# perstack.toml
 [experts."fitness-assistant"]
-description = """
-Assists users with their fitness journey by managing records and suggesting training menus.
-"""
-
+description = "Manages fitness records and suggests training menus"
 instruction = """
-As a personal fitness assistant, conduct interview sessions and manage records.
-Manage records in a file called `./fitness-log.md` and update it regularly.
-Collaborate with the `pro-trainer` expert to suggest professional training menus.
+Conduct interview sessions and manage records in `./fitness-log.md`.
+Collaborate with `pro-trainer` for professional training menus.
 """
-
-delegates = ["pro-trainer"]
+delegates = ["pro-trainer"] # This Expert can delegate to pro-trainer
 [experts."pro-trainer"]
-
-description = """
-Suggests training menus by considering the given user information and past training history.
-"""
-
-instruction = """
-Provide training menu suggestions with scientifically verified effects such as split routines and HIIT,
-tailored to the user's condition and mood, and the user's training history.
-"""
+description = "Suggests scientifically-backed training menus"
+instruction = "Provide split routines and HIIT plans tailored to user history."
 ```
-To run this example, execute the following command:
 ```bash
-$ npx perstack start fitness-assistant "Start today's session"
+npx perstack start fitness-assistant "Start today's session"
 ```
-This example shows:
-
-- **Componentization** - each Expert owns one role
-- **Isolation** - contexts are separate; shared data lives in the workspace
-- **Observability** - full, replayable execution history
-- **Reusability** - Experts collaborate declaratively via the runtime
-
-## Next Steps
-
-- [Guides](./docs/guides/README.md)
-- [Understanding Perstack](./docs/understanding-perstack/concept.md)
-- [Making Experts](./docs/making-experts/README.md)
-- [Using Experts](./docs/using-experts/README.md)
-- [Operating Experts](./docs/operating-experts/isolation-by-design.md)
-- [References](./docs/references/cli.md)
-
-## Motivation
-
-Perstack is built to tackle the core problems of agent development using software engineering best practices.
-
-It centers on three ideas: **Isolation**, **Observability**, and **Reusability**.
+Each Expert runs in its own isolated context - no shared state, no prompt bloat, full execution history.
-### Isolation
+### Key Capabilities
-Isolation means separating an agent from everything except its role - that's what makes it a true Expert.
+- **Step-level checkpoints** - resume from any step, not just the beginning. Debug, replay, and audit every decision the agent made.
+- **Multi-provider support** - 8 LLM providers (OpenAI, Anthropic, Google, and more). Switch with a single config change.
+- **Delegation** - Experts delegate to other Experts. Break monolithic agents into composable Experts that run in parallel.
+- **Draft & versioned definitions** - iterate on Expert behavior without affecting production. Promote when ready.
+- **Isolation by design** - each Expert runs in a sandboxed context with its own workspace, environment, and tool whitelist.
-Specifically:
-- **Model isolation:** the runtime mediates access to LLMs
-- **Role isolation:** each Expert focuses on one job
-- **Control isolation:** all controls live in tools; Experts only decide how to use them
-- **Dependency isolation:** collaboration is resolved by the runtime
-- **Context isolation:** context windows are never shared; data flows through runtime/workspace
-- **Sandbox support:** designed to align with infra-level isolation
-
-### Observability
-
-Observability means agent behavior is fully transparent and inspectable.
-
-Specifically:
-- **Prompt visibility:** no hidden instructions or context injection
-- **Definition visibility:** only perstack.toml or registry definitions execute
-- **Registry visibility:** write-once per version; text-only, fully auditable
-- **Tool visibility:** tools run through MCP; usage is explicit
-- **Internal state visibility:** state machines emit visible events
-- **Deterministic history:** checkpoints make runs reproducible
-
-### Reusability
-
-Reusability enables agents to collaborate as components - the path to more capable agentic apps.
-
-An Expert is a modular micro-agent:
-- Built for a specific purpose
-- Reliable at one thing
-- Modular and composable
-
-An agent represents a user.
-An Expert is a specialist component that helps an application achieve its goals.
+## Examples
-## Perstack Components
+| Example | Description |
+| ------- | ----------- |
+| [bug-finder](./examples/bug-finder/) | Codebase analyzer that finds potential bugs |
+| [github-issue-bot](./examples/github-issue-bot/) | Automated GitHub issue responder that reads your codebase to answer questions |
+| [gmail-assistant](./examples/gmail-assistant/) | AI-powered email assistant with Gmail integration |
-Perstack provides **Expert Stack**:
-- **Experts** - modular micro-agents
-- **Runtime** - executes Experts
-- **Registry** - shares Experts
-- **Sandbox Integration** - safe production execution
+## What to Read Next
-> [!NOTE]
-> The name "Perstack" is a combination of the Latin word "perītus" and the English word "stack". "perītus" means "expert", so Perstack means "expert stack".
+**Getting started:**
+1. **[Build your first Expert](./docs/guides/rapid-prototyping.md)** - 5 minutes to your first Expert
+2. **[Compose Experts together](./docs/guides/taming-prompt-sprawl.md)** - break monolithic prompts into collaborating Experts
+3. **[Add tools via MCP](./docs/guides/extending-with-tools.md)** - give your Experts real-world capabilities
-**[📚 Read the full documentation →](./docs/)**
+**Going deeper:**
+- [Core Concepts](./docs/understanding-perstack/concept.md) - the architecture behind the runtime
+- [Making Experts](./docs/making-experts/README.md) - complete guide to Expert definitions
+- [Isolation & Safety](./docs/operating-experts/isolation-by-design.md) - production deployment patterns
+- [CLI & API Reference](./docs/references/cli.md)
 ## FAQ
-### Is this an AI agent framework?
-
-No. The relationship is like Express vs npm, or Rails vs RubyGems.
+### Is this an agent framework?
-Agent frameworks help you build agent apps.
-Perstack helps you package, share, and compose the Experts that power them.
+No. Frameworks help you *build* agents. Perstack is a *runtime* that *executes* them. You define Experts declaratively in TOML; Perstack handles execution, isolation, and state.
-### Can Experts in the Registry be used with other AI agent frameworks?
+### Can Experts be used with other frameworks?
-Yes. Registry entries are plain text definitions, so other frameworks can consume them too.
-See the [API Reference](https://perstack.ai/api/v1/spec).
+Yes. Expert definitions are TOML files, so other tools can read and interpret them. See the [API Reference](https://perstack.ai/api/v1/spec).
-### Can Experts created with other AI agent frameworks be used with Perstack?
+### Can I convert existing agents to Experts?
-Not directly - but you can re-express their roles in `perstack.toml` to make them Perstack-compatible.
+Not directly. But you can redefine their behavior in `perstack.toml` to make them Perstack-compatible.
 ## Contributing
@@ -235,4 +150,4 @@ See [CONTRIBUTING.md](CONTRIBUTING.md).
 ## License
-Perstack Runtime is open-source under the Apache License 2.0.
+Apache License 2.0 - see [LICENSE](LICENSE) for details.
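
Note on the new "Key Capabilities" text: the README now describes event-derived execution and step-level checkpoints ("resume from any step, not just the beginning") without showing what that means mechanically. The sketch below illustrates the general idea only, under the assumption that a run is recorded as an append-only log with one event per step. Every name in it (`StepEvent`, `RunState`, `applyEvent`, `checkpointAt`) is hypothetical and is not part of `@perstack/runtime`; the actual event and checkpoint formats are not shown in this diff.

```typescript
// Hypothetical illustration of event-derived state with step-level checkpoints.
// None of these types or functions come from @perstack/runtime.

type StepEvent =
  | { step: number; kind: "message"; text: string }
  | { step: number; kind: "toolResult"; tool: string; output: string };

interface RunState {
  step: number; // last step folded into this state
  transcript: string[]; // everything recorded up to that step
}

const initialState: RunState = { step: 0, transcript: [] };

// Pure reducer: the same events in the same order always yield the same state.
function applyEvent(state: RunState, event: StepEvent): RunState {
  const line =
    event.kind === "message"
      ? `message: ${event.text}`
      : `tool ${event.tool}: ${event.output}`;
  return { step: event.step, transcript: [...state.transcript, line] };
}

// Rebuild the state as of `step` by replaying the log up to that step.
// Assumes the log is already ordered by step number.
function checkpointAt(events: StepEvent[], step: number): RunState {
  return events.filter((e) => e.step <= step).reduce(applyEvent, initialState);
}

const log: StepEvent[] = [
  { step: 1, kind: "message", text: "Start today's session" },
  { step: 2, kind: "toolResult", tool: "readFile", output: "fitness-log.md loaded" },
  { step: 3, kind: "message", text: "Suggest a HIIT plan" },
];

const atStep2 = checkpointAt(log, 2);
const replayed = checkpointAt(log, 2);

console.log(atStep2.step, atStep2.transcript.length); // prints: 2 2
console.log(JSON.stringify(atStep2) === JSON.stringify(replayed)); // prints: true
```

Because the state is derived only from the log, a checkpoint for any step can be rebuilt on demand, and resuming from step 2 amounts to applying new events on top of `checkpointAt(log, 2)`. Whether Perstack stores checkpoints or recomputes them this way is not something this diff establishes.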