257 changes: 86 additions & 171 deletions README.md
@@ -1,238 +1,153 @@
# Perstack: Expert Stack for Agent-first Development
# Perstack: The Agent Runtime

<p align="center">
<img src="demo/demo.gif" alt="Perstack Demo" width="600">
</p>

<p align="center">
<a href="./docs/"><strong>📚 Documentation</strong></a>
<a href="./docs/getting-started.md"><strong>🚀 Getting Started</strong></a>
<a href="https://twitter.com/perstack_ai"><strong>𝕏 Twitter</strong></a>
<a href="https://www.npmjs.com/package/perstack"><img src="https://img.shields.io/npm/v/perstack" alt="npm version"></a>
<a href="https://www.npmjs.com/package/perstack"><img src="https://img.shields.io/npm/dm/perstack" alt="npm downloads"></a>
<a href="https://github.com/perstack-ai/perstack/blob/main/LICENSE"><img src="https://img.shields.io/badge/license-Apache%202.0-blue" alt="License"></a>
</p>

## Overview

Perstack is a package manager and runtime for agent-first development.
Define modular micro-agents as Experts in TOML, publish them to a registry, and compose them like npm packages.
<p align="center">
<a href="./docs/"><strong>Documentation</strong></a> ·
<a href="./docs/getting-started.md"><strong>Getting Started</strong></a> ·
<a href="https://twitter.com/perstack_ai"><strong>Twitter</strong></a>
</p>

Perstack isn’t another agent framework — it’s npm/npx for agents.
Define AI agents as declarative **Experts** in TOML. Execute them with deterministic, event-derived tracking. Each Expert runs in its own isolated context — no shared state, no prompt bloat, full execution history.

## Quick Start

Prerequisites:
- Node.js 22+
- Provider API Credentials
  - See [LLM Providers](./docs/references/providers-and-models.md)

**Run a demo Expert:**

Set ANTHROPIC_API_KEY (or any provider key you use):
Prerequisites: [Node.js 22+](https://nodejs.org/) and an [LLM provider API key](./docs/references/providers-and-models.md).

```bash
$ ANTHROPIC_API_KEY=<YOUR_API_KEY> npx perstack start tic-tac-toe "Game start!"
npx create-expert "Create saas-developer expert that can build a SaaS product"
```

**Run in headless mode (no TUI):**
This creates a new Expert in `perstack.toml`, then:
- tests it against real-world scenarios
- analyzes execution history and output to evaluate the definition
- iterates on the definition until behavior stabilizes
- reports capabilities and limitations

Run the Expert from the CLI:

```bash
$ ANTHROPIC_API_KEY=<YOUR_API_KEY> npx perstack run tic-tac-toe "Game start!"
npx perstack start saas-developer "Build an agentic CRM with Perstack"
```

**What's next?**
- [Rapid Prototyping](./docs/guides/rapid-prototyping.md) — build your own Expert
- [Taming Prompt Sprawl](./docs/guides/taming-prompt-sprawl.md) — fix bloated prompts with modular Experts
- [Extending with Tools](./docs/guides/extending-with-tools.md) — add MCP skills to your Experts

## Examples

| Example | Description |
| ------------------------------------------------ | ---------------------------------------------------------------- |
| [bug-finder](./examples/bug-finder/) | Codebase analyzer that finds potential bugs |
| [github-issue-bot](./examples/github-issue-bot/) | Automated GitHub issue responder with real-time activity logging |
| [gmail-assistant](./examples/gmail-assistant/) | AI-powered email assistant with Gmail integration |

## Key Features
Or use it programmatically via [runtime embedding](./docs/guides/adding-ai-to-your-app.md#runtime-embedding-optional):

- **Agent-first development toolkit**
  - Declarative Expert definitions as modular micro-agents
  - Dependency management for composing Experts
  - Public registry for reusing Experts instead of rebuilding them
- **Sandbox-ready runtime**
  - Secure execution designed for sandbox integration
  - Observable, event-driven architecture
  - Reproducible, checkpoint-based history
```typescript
import { run } from "@perstack/runtime"

### Safety by Design

Perstack runtime is built for production-grade safety:
- Designed to run under sandboxed infrastructure
- Emits JSON events for every execution change
- Can be embedded in your app to add stricter policies and isolation

```bash
$ npx perstack run my-expert "query"
const checkpoint = await run({
  setting: {
    model: "claude-sonnet-4-5-20250929",
    providerConfig: { providerName: "anthropic" },
    expertKey: "saas-developer",
    input: { text: "Build an agentic CRM with Perstack" },
  },
})
```

For production deployments, use external sandboxing (Docker, VM, cloud platform) to provide:
- Container isolation for file system access
- Network restrictions
- Environment variable isolation

The CLI is for prototyping. The runtime API is for production. Both use the same `perstack.toml`.

## Why Perstack?

AI agent developers struggle with:
- Complex, monolithic agent apps
- Little to no reusability
- No dependency management
- Context windows that explode at scale
- Poor observability and debugging
Agentic app development has five structural problems:

Perstack fixes these with proven software engineering principles:
- **Isolation** — clear separation of concerns per Expert
- **Observability** — every step is visible and traceable
- **Reusability** — Experts compose declaratively through the runtime
1. **Tight coupling**: Frameworks bundle tools, code, and prompts together. Agent behavior is determined by all of them at once.
2. **Broken feedback loops**: You only get feedback after shipping, and improving the agent means changing the entire app codebase.
3. **The developer owns everything**: A single developer is responsible for the entire stack — from defining the agent to building the app around it.
4. **No sustained behavior**: Today's working agent may break next year. When the model or framework changes, behavior must be redefined from scratch.
5. **No real isolation**: Agents need filesystem sandboxing and infrastructure-level isolation, but most frameworks address isolation at the application level rather than the architecture level.

Before/After:
Perstack is designed to address these problems:

| | Traditional Agent Dev | With Perstack |
| :----------------- | :-------------------- | :-------------------------- |
| **Architecture** | Monolithic & fragile | **Modular & composable** |
| **State** | Global context window | **Isolated per Expert** |
| **Reuse** | Copy-paste prompts | **npm-style dependencies** |
| **Observability** | Hard to trace | **Full execution history** |
| **Safe execution** | On you | **Sandbox-ready by design** |
| Problem | Perstack's Approach |
| :--- | :--- |
| **Tight coupling** | **A runtime** that separates Expert definitions from app code, tools, prompts, and models |
| **Broken feedback loops** | **CLI tools** to execute and analyze Experts from day one. Expert and app evolve independently. |
| **The developer owns everything** | Expert definitions in **`perstack.toml`** are written by domain experts using natural language. Developers focus on integration, not prompt engineering. |
| **No sustained behavior** | **Event-derived execution** and **step-level checkpoints** help maintain reproducible behavior, even across model or provider changes |
| **No real isolation** | **Isolation** is built into the runtime architecture — workspace boundaries, environment sandboxing, and tool whitelisting — so your platform can enforce security at the infrastructure level |

### Example: Fitness Assistant
### How It Works

Here is a small example showing how two Experts can collaborate: a fitness assistant and a professional trainer.
Define Experts in `perstack.toml` and run them through the runtime:

```toml
# ./perstack.toml
# perstack.toml

[experts."fitness-assistant"]
description = """
Assists users with their fitness journey by managing records and suggesting training menus.
"""

description = "Manages fitness records and suggests training menus"
instruction = """
As a personal fitness assistant, conduct interview sessions and manage records.
Manage records in a file called `./fitness-log.md` and update it regularly.
Collaborate with the `pro-trainer` expert to suggest professional training menus.
Conduct interview sessions and manage records in `./fitness-log.md`.
Collaborate with `pro-trainer` for professional training menus.
"""

delegates = ["pro-trainer"]
delegates = ["pro-trainer"] # This Expert can delegate to pro-trainer

[experts."pro-trainer"]

description = """
Suggests training menus by considering the given user information and past training history.
"""

instruction = """
Provide training menu suggestions with scientifically verified effects such as split routines and HIIT,
tailored to the user's condition and mood, and the user's training history.
"""
description = "Suggests scientifically-backed training menus"
instruction = "Provide split routines and HIIT plans tailored to user history."
```

To run this example, execute the following command:
```bash
$ npx perstack start fitness-assistant "Start today's session"
npx perstack start fitness-assistant "Start today's session"
```

This example shows:

- **Componentization** — each Expert owns one role
- **Isolation** — contexts are separate; shared data lives in the workspace
- **Observability** — full, replayable execution history
- **Reusability** — Experts collaborate declaratively via the runtime
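
The isolation point above can be pictured with a small TypeScript model. This is only an illustrative sketch, not Perstack's runtime or API: the `Expert` interface, the `Workspace` type, and the `handle` method are all hypothetical names invented for this example. It assumes that delegation hands over a query and the shared workspace, and never the caller's own context.

```typescript
// Hypothetical model of delegation; none of these names come from Perstack.
type Workspace = Record<string, string>; // shared files: path -> contents

interface Expert {
  name: string;
  context: string[]; // private to this Expert, never handed to another one
  handle(query: string, workspace: Workspace): string;
}

const proTrainer: Expert = {
  name: "pro-trainer",
  context: [],
  handle(query, workspace) {
    this.context.push(query);
    // Shared data is read from the workspace, not from the caller's context.
    const history = workspace["./fitness-log.md"] || "no history";
    return `Menu based on: ${history}`;
  },
};

const fitnessAssistant: Expert = {
  name: "fitness-assistant",
  context: [],
  handle(query, workspace) {
    this.context.push(query);
    workspace["./fitness-log.md"] = "3 sessions last week";
    // Delegation passes only a query and the workspace.
    const menu = proTrainer.handle("Suggest a training menu", workspace);
    this.context.push(menu);
    return menu;
  },
};

const workspace: Workspace = {};
const reply = fitnessAssistant.handle("Start today's session", workspace);
console.log(reply);
console.log(proTrainer.context);
```

In this model `pro-trainer` never sees the user's original message. It sees only the delegated query and whatever the assistant wrote to the workspace.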

## Next Steps

- [Guides](./docs/guides/README.md)
- [Understanding Perstack](./docs/understanding-perstack/concept.md)
- [Making Experts](./docs/making-experts/README.md)
- [Using Experts](./docs/using-experts/README.md)
- [Operating Experts](./docs/operating-experts/isolation-by-design.md)
- [References](./docs/references/cli.md)

## Motivation

Perstack is built to tackle the core problems of agent development using software engineering best practices.

It centers on three ideas: **Isolation**, **Observability**, and **Reusability**.
Each Expert runs in its own isolated context — no shared state, no prompt bloat, full execution history.

### Isolation
### Key Capabilities

Isolation means separating an agent from everything except its role — that's what makes it a true Expert.
- **Step-level checkpoints** — resume from any step, not just the beginning. Debug, replay, and audit every decision the agent made.
- **Multi-provider support** — 8 LLM providers (OpenAI, Anthropic, Google, and more). Switch with a single config change.
- **Delegation** — Experts delegate to other Experts. Break monolithic agents into composable Experts that run in parallel.
- **Draft & versioned definitions** — iterate on Expert behavior without affecting production. Promote when ready.
- **Isolation by design** — each Expert runs in a sandboxed context with its own workspace, environment, and tool whitelist.
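
One way to picture step-level checkpoints and event-derived state is an event log that is replayed through a pure reducer. The sketch below is a generic event-sourcing illustration under that assumption, not Perstack's implementation: `StepEvent`, `RunState`, `applyEvent`, and `checkpointAt` are hypothetical names, and the real event and checkpoint formats may look quite different.

```typescript
// Hypothetical event and state shapes; these are NOT Perstack's actual types.
type StepEvent =
  | { step: number; kind: "message"; text: string }
  | { step: number; kind: "toolCall"; tool: string };

interface RunState {
  step: number;
  messages: string[];
  toolCalls: string[];
}

const initialState: RunState = { step: 0, messages: [], toolCalls: [] };

// Pure reducer: the next state depends only on the previous state and one event.
function applyEvent(state: RunState, event: StepEvent): RunState {
  if (event.kind === "message") {
    return { ...state, step: event.step, messages: [...state.messages, event.text] };
  }
  return { ...state, step: event.step, toolCalls: [...state.toolCalls, event.tool] };
}

// Rebuild the state as of `upToStep` by replaying the log from the start.
function checkpointAt(log: readonly StepEvent[], upToStep: number): RunState {
  return log
    .filter((event) => event.step <= upToStep)
    .reduce(applyEvent, initialState);
}

const log: StepEvent[] = [
  { step: 1, kind: "message", text: "Start today's session" },
  { step: 2, kind: "toolCall", tool: "readFile" },
  { step: 3, kind: "message", text: "Here is your training menu" },
];

const atStep2 = checkpointAt(log, 2);
console.log(atStep2.step, atStep2.messages.length, atStep2.toolCalls);
```

Because the reducer is pure, replaying the same log always gives the same state. In this model that is what makes resuming from, or auditing, an intermediate step possible.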

Specifically:
- **Model isolation:** the runtime mediates access to LLMs
- **Role isolation:** each Expert focuses on one job
- **Control isolation:** all controls live in tools; Experts only decide how to use them
- **Dependency isolation:** collaboration is resolved by the runtime
- **Context isolation:** context windows are never shared; data flows through runtime/workspace
- **Sandbox support:** designed to align with infra-level isolation

### Observability

Observability means agent behavior is fully transparent and inspectable.

Specifically:
- **Prompt visibility:** no hidden instructions or context injection
- **Definition visibility:** only perstack.toml or registry definitions execute
- **Registry visibility:** write-once per version; text-only, fully auditable
- **Tool visibility:** tools run through MCP; usage is explicit
- **Internal state visibility:** state machines emit visible events
- **Deterministic history:** checkpoints make runs reproducible

### Reusability

Reusability enables agents to collaborate as components — the path to more capable agentic apps.

An Expert is a modular micro-agent:
- Built for a specific purpose
- Reliable at one thing
- Modular and composable

An agent represents a user.
An Expert is a specialist component that helps an application achieve its goals.
## Examples

## Perstack Components
| Example | Description |
| ------- | ----------- |
| [bug-finder](./examples/bug-finder/) | Codebase analyzer that finds potential bugs |
| [github-issue-bot](./examples/github-issue-bot/) | Automated GitHub issue responder that reads your codebase to answer questions |
| [gmail-assistant](./examples/gmail-assistant/) | AI-powered email assistant with Gmail integration |

Perstack provides **Expert Stack**:
- **Experts** — modular micro-agents
- **Runtime** — executes Experts
- **Registry** — shares Experts
- **Sandbox Integration** — safe production execution
## What to Read Next

> [!NOTE]
> The name "Perstack" is a combination of the Latin word "perītus" and the English word "stack". "perītus" means "expert", so Perstack means "expert stack".
**Getting started:**
1. **[Build your first Expert](./docs/guides/rapid-prototyping.md)** — 5 minutes to your first Expert
2. **[Compose Experts together](./docs/guides/taming-prompt-sprawl.md)** — break monolithic prompts into collaborating Experts
3. **[Add tools via MCP](./docs/guides/extending-with-tools.md)** — give your Experts real-world capabilities

**[📚 Read the full documentation →](./docs/)**
**Going deeper:**
- [Core Concepts](./docs/understanding-perstack/concept.md) — the architecture behind the runtime
- [Making Experts](./docs/making-experts/README.md) — complete guide to Expert definitions
- [Isolation & Safety](./docs/operating-experts/isolation-by-design.md) — production deployment patterns
- [CLI & API Reference](./docs/references/cli.md)

## FAQ

### Is this an AI agent framework?

No. The relationship is like Express vs npm, or Rails vs RubyGems.
### Is this an agent framework?

Agent frameworks help you build agent apps.
Perstack helps you package, share, and compose the Experts that power them.
No. Frameworks help you *build* agents. Perstack is a *runtime* that *executes* them. You define Experts declaratively in TOML; Perstack handles execution, isolation, and state.

### Can Experts in the Registry be used with other AI agent frameworks?
### Can Experts be used with other frameworks?

Yes. Registry entries are plain text definitions, so other frameworks can consume them too.
See the [API Reference](https://perstack.ai/api/v1/spec).
Yes. Expert definitions are TOML files, so other tools can read and interpret them. See the [API Reference](https://perstack.ai/api/v1/spec).

### Can Experts created with other AI agent frameworks be used with Perstack?
### Can I convert existing agents to Experts?

Not directly — but you can re-express their roles in `perstack.toml` to make them Perstack-compatible.
Not directly. But you can redefine their behavior in `perstack.toml` to make them Perstack-compatible.

## Contributing

See [CONTRIBUTING.md](CONTRIBUTING.md).

## License

Perstack Runtime is open-source under the Apache License 2.0.
Apache License 2.0 — see [LICENSE](LICENSE) for details.