Merged
15 changes: 15 additions & 0 deletions .github/workflows/ci.yml
@@ -244,19 +244,34 @@ jobs:
with:
fetch-depth: 0

- name: Check if package source code changed
id: check_files
run: |
PACKAGE_CHANGES=$(git diff --name-only origin/main...HEAD -- apps/ packages/ | grep -v '\.md$' || true)
if [ -z "$PACKAGE_CHANGES" ]; then
echo "needs_changeset=false" >> "$GITHUB_OUTPUT"
echo "No package source changes — skipping changeset check"
else
echo "needs_changeset=true" >> "$GITHUB_OUTPUT"
fi

- name: Setup pnpm
if: steps.check_files.outputs.needs_changeset == 'true'
uses: pnpm/action-setup@v4
with:
version: ${{ env.PNPM_VERSION }}

- name: Setup Node.js
if: steps.check_files.outputs.needs_changeset == 'true'
uses: actions/setup-node@v4
with:
node-version: ${{ env.NODE_VERSION }}
cache: "pnpm"

- name: Install dependencies
if: steps.check_files.outputs.needs_changeset == 'true'
run: pnpm install --frozen-lockfile

- name: Check changeset
if: steps.check_files.outputs.needs_changeset == 'true'
run: pnpm changeset status --since=origin/main
2 changes: 1 addition & 1 deletion benchmarks/README.md
@@ -157,7 +157,7 @@ Skill Spawn Handshake ToolDisc Total
## Environment

Benchmarks run on:
- **Runtime**: local (not Docker)
- **Runtime**: local
- **Model**: claude-sonnet-4-5 (Anthropic)
- **Node.js**: v22+
- **Platform**: macOS (darwin 23.4.0)
8 changes: 4 additions & 4 deletions docs/guides/going-to-production.md
@@ -118,10 +118,10 @@ Each user gets isolated state. The Expert reads and writes to their workspace on
`perstack run` outputs JSON events to stdout. Each line is a structured event:

```json
{"type":"generation:start","timestamp":"2024-01-15T10:30:00Z"}
{"type":"tool:called","toolName":"search","input":{"query":"..."},"timestamp":"..."}
{"type":"generation:complete","content":"Based on my search...","timestamp":"..."}
{"type":"complete","timestamp":"..."}
{"type":"startRun","timestamp":1705312200000,"runId":"abc123",...}
{"type":"startGeneration","timestamp":1705312201000,...}
{"type":"callTools","timestamp":1705312202000,...}
{"type":"completeRun","timestamp":1705312203000,...}
```
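
A consumer of this stream can parse each stdout line as JSON before routing it. The sketch below is a minimal, hypothetical parser, assuming only the fields shown above (`type`, `timestamp`, `runId`); any other payload fields are treated as opaque:

```typescript
// Hypothetical sketch: parse one line of `perstack run` stdout into an event.
// Only `type` is validated; all other payload fields pass through untouched.
type PerstackEvent = { type: string; timestamp: number; [key: string]: unknown }

function parseEventLine(line: string): PerstackEvent | null {
  const trimmed = line.trim()
  if (trimmed === "") return null
  try {
    const event = JSON.parse(trimmed) as PerstackEvent
    // Ignore JSON values that are not shaped like an event
    return typeof event.type === "string" ? event : null
  } catch {
    return null // tolerate non-JSON noise mixed into stdout
  }
}
```

Returning `null` instead of throwing keeps the consumer resilient if other processes write to the same stream.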

Pipe these to your logging system:
9 changes: 8 additions & 1 deletion docs/operating-experts/deployment.md
@@ -87,7 +87,14 @@ export default {
const { query, expertKey } = await request.json()
const events: unknown[] = []
await run(
{ expertKey, query, providerConfig: { apiKey: env.ANTHROPIC_API_KEY } },
{
setting: {
model: "claude-sonnet-4-5",
providerConfig: { providerName: "anthropic", apiKey: env.ANTHROPIC_API_KEY },
expertKey,
input: { text: query },
},
},
{ eventListener: (event) => events.push(event) }
)
return Response.json(events)
122 changes: 15 additions & 107 deletions docs/references/cli.md
@@ -46,15 +46,21 @@ Both `start` and `run` accept the same options:
| `--provider <provider>` | LLM provider | `anthropic` |
| `--model <model>` | Model name | `claude-sonnet-4-5` |

Providers: `anthropic`, `google`, `openai`, `ollama`, `azure-openai`, `amazon-bedrock`, `google-vertex`
Providers: `anthropic`, `google`, `openai`, `deepseek`, `ollama`, `azure-openai`, `amazon-bedrock`, `google-vertex`

### Execution Control

| Option | Description | Default |
| ------------------- | -------------------------------------------- | --------- |
| `--max-steps <n>` | Maximum total steps across all Runs in a Job | unlimited |
| `--max-steps <n>` | Maximum total steps across all Runs in a Job | `100` |
| `--max-retries <n>` | Max retry attempts per generation | `5` |
| `--timeout <ms>` | Timeout per generation (ms) | `60000` |
| `--timeout <ms>` | Timeout per generation (ms) | `300000` |

### Reasoning

| Option | Description | Default |
| ------------------------------- | ------------------------------------------------------------------------ | ------- |
| `--reasoning-budget <budget>` | Reasoning budget for native LLM reasoning (`minimal`, `low`, `medium`, `high`, or token count) | - |

### Configuration

@@ -95,6 +101,12 @@ Providers: `anthropic`, `google`, `openai`, `ollama`, `azure-openai`, `amazon-be

Use with `--continue` to respond to interactive tool calls from the Coordinator Expert.

### Output Filtering (`run` only)

| Option | Description |
| ------------------ | ---------------------------------------------------------------------- |
| `--filter <types>` | Filter events by type (comma-separated, e.g., `completeRun,stopRunByError`) |

### Other

| Option | Description |
@@ -145,110 +157,6 @@ npx perstack run tic-tac-toe "Let's play!"
npx perstack run @org/expert@1.0.0 "query"
```

## Registry Management

### `perstack publish`

Publish an Expert to the registry.

```bash
perstack publish [expertName] [options]
```

**Arguments:**
- `[expertName]`: Expert name from `perstack.toml` (prompts if not provided)

**Options:**
| Option | Description |
| ----------------- | --------------------------- |
| `--config <path>` | Path to `perstack.toml` |
| `--dry-run` | Validate without publishing |

**Example:**
```bash
perstack publish my-expert
perstack publish my-expert --dry-run
```

Requires `PERSTACK_API_KEY` environment variable.

**Note:** Published Experts must use `npx` or `uvx` as skill commands. Arbitrary commands are not allowed for security reasons. See [Publishing](../making-experts/publishing.md#skill-requirements).

### `perstack unpublish`

Remove an Expert version from the registry.

```bash
perstack unpublish [expertKey] [options]
```

**Arguments:**
- `[expertKey]`: Expert key with version (e.g., `my-expert@1.0.0`)

**Options:**
| Option | Description |
| ----------------- | ------------------------------------------------ |
| `--config <path>` | Path to `perstack.toml` |
| `--force` | Skip confirmation (required for non-interactive) |

**Example:**
```bash
perstack unpublish # Interactive mode
perstack unpublish my-expert@1.0.0 --force # Non-interactive
```

### `perstack tag`

Add or update tags on an Expert version.

```bash
perstack tag [expertKey] [tags...] [options]
```

**Arguments:**
- `[expertKey]`: Expert key with version (e.g., `my-expert@1.0.0`)
- `[tags...]`: Tags to set (e.g., `stable`, `beta`)

**Options:**
| Option | Description |
| ----------------- | ----------------------- |
| `--config <path>` | Path to `perstack.toml` |

**Example:**
```bash
perstack tag # Interactive mode
perstack tag my-expert@1.0.0 stable beta # Set tags directly
```

### `perstack status`

Change the status of an Expert version.

```bash
perstack status [expertKey] [status] [options]
```

**Arguments:**
- `[expertKey]`: Expert key with version (e.g., `my-expert@1.0.0`)
- `[status]`: New status (`available`, `deprecated`, `disabled`)

**Options:**
| Option | Description |
| ----------------- | ----------------------- |
| `--config <path>` | Path to `perstack.toml` |

**Example:**
```bash
perstack status # Interactive mode
perstack status my-expert@1.0.0 deprecated
```

| Status | Meaning |
| ------------ | ---------------------------- |
| `available` | Normal, visible in registry |
| `deprecated` | Still usable but discouraged |
| `disabled` | Cannot be executed |

## Debugging and Inspection

### `perstack log`
30 changes: 10 additions & 20 deletions docs/references/events.md
@@ -75,14 +75,14 @@ interface BaseEvent {

| Event Type | Description | Key Payload |
| --------------------- | ---------------------------------------- | ---------------------------------------- |
| `callTools` | Regular tool calls | `newMessage`, `toolCalls`, `usage` |
| `callInteractiveTool` | Interactive tool call (needs user input) | `newMessage`, `toolCall`, `usage` |
| `callDelegate` | Delegation to another Expert | `newMessage`, `toolCalls`, `usage` |
| `resolveToolResults` | Tool results received | `toolResults` |
| `attemptCompletion` | Completion tool called | `toolResult` |
| `finishToolCall` | Single tool call finished | `newMessages` |
| `resumeToolCalls` | Resume pending tool calls | `pendingToolCalls`, `partialToolResults` |
| `finishAllToolCalls` | All tool calls finished | `newMessages` |
| `callTools` | Regular tool calls | `newMessage`, `toolCalls`, `usage` |
| `resolveToolResults` | Tool results received | `toolResults` |
| `attemptCompletion` | Completion tool called | `toolResult` |
| `finishToolCall` | Single tool call finished | `newMessages` |
| `resumeToolCalls` | Resume pending tool calls | `pendingToolCalls`, `partialToolResults` |
| `finishMcpTools` | All MCP tool calls finished | `newMessages` |
| `skipDelegates` | Delegates skipped | (empty) |
| `proceedToInteractiveTools` | Proceeding to interactive tool calls | `pendingToolCalls`, `partialToolResults` |

#### Step Transition Events

@@ -158,7 +158,7 @@ RuntimeEvent represents **infrastructure-level side effects** — the runtime en
### Characteristics

- Only the **latest state matters** — past RuntimeEvents are not meaningful
- Includes infrastructure-level information (skills, proxy)
- Includes infrastructure-level information (skills)
- Not tied to the agent loop state machine

### Base Properties
@@ -191,12 +191,6 @@ interface BaseRuntimeEvent {
| `skillStderr` | Skill stderr output | `skillName`, `message` |
| `skillDisconnected` | MCP skill disconnected | `skillName` |

#### Network Events

| Event Type | Description | Key Payload |
| ------------- | -------------------------- | ------------------------------------ |
| `proxyAccess` | Network access allow/block | `action`, `domain`, `port`, `reason` |

### Processing RuntimeEvents

RuntimeEvents should be processed as **current state** — only the latest value matters.
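
A latest-value consumer can be sketched as a map keyed by event type (and, for skill events, by skill name). This is a hypothetical illustration, not the runtime's API; the `skillName` field follows the tables above, and the composite-key scheme is an assumption:

```typescript
// Hypothetical sketch: retain only the most recent RuntimeEvent per key.
// Past RuntimeEvents carry no meaning, so each update overwrites the last.
type RuntimeEvent = { type: string; skillName?: string; [key: string]: unknown }

class RuntimeState {
  private latest = new Map<string, RuntimeEvent>()

  update(event: RuntimeEvent): void {
    // Key skill events per skill so unrelated skills do not overwrite each other
    const key = event.skillName ? `${event.type}:${event.skillName}` : event.type
    this.latest.set(key, event)
  }

  get(type: string, skillName?: string): RuntimeEvent | undefined {
    return this.latest.get(skillName ? `${type}:${skillName}` : type)
  }
}
```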
@@ -410,7 +404,7 @@ function ActivityLog({ activities }: { activities: Activity[] }) {
│ └──────────────────────┬───────────────────────────────┘ │
│ │ │
│ ┌──────────────────────┼───────────────────────────────┐ │
│ │ Skills, Proxy │ │
│ │ Skills │ │
│ │ │ │ │
│ │ RuntimeEvents │ │
│ │ (environment state) │ │
@@ -448,10 +442,6 @@ function formatEvent(event: Record<string, unknown>): string | null {
// RuntimeEvents
switch (type) {
case "skillConnected": return `Skill connected: ${event.skillName}`
case "proxyAccess": {
const action = event.action === "allowed" ? "✓" : "✗"
return `Proxy ${action} ${event.domain}:${event.port}`
}
}

return null
25 changes: 16 additions & 9 deletions docs/references/perstack-toml.md
@@ -108,14 +108,15 @@ headers = { "X-Custom-Header" = "value" }
| `maxSteps` | number | Maximum steps per run |
| `maxRetries` | number | Maximum retry attempts |
| `timeout` | number | Timeout per generation (ms) |
| `envPath` | string[] | Paths to environment files |

### Provider Configuration

Configure LLM provider under `[provider]` table.

```toml
[provider]
providerName = "anthropic" # Required: anthropic, google, openai, ollama, azure-openai, amazon-bedrock, google-vertex
providerName = "anthropic" # Required: anthropic, google, openai, deepseek, ollama, azure-openai, amazon-bedrock, google-vertex
[provider.setting]
# Provider-specific options (all optional)
```
@@ -214,8 +215,11 @@ delegates = ["other-expert", "@org/another-expert"]
| `minRuntimeVersion` | string | No | Minimum runtime version |
| `description` | string | No | Brief description (max 2048 chars) |
| `instruction` | string | **Yes** | Behavior instructions (max 20KB) |
| `delegates` | string[] | No | Experts this Expert can delegate to |
| `tags` | string[] | No | Tags for categorization |
| `delegates` | string[] | No | Experts this Expert can delegate to |
| `tags` | string[] | No | Tags for categorization |
| `providerTools` | string[] | No | Provider-specific tool names (e.g., `["webSearch", "codeExecution"]`) |
| `providerSkills` | array | No | Anthropic Agent Skills (builtin or custom) |
| `providerToolOptions` | object | No | Provider tool options (e.g., webSearch maxUses, allowedDomains) |

## Skill Definition

@@ -244,9 +248,11 @@ requiredEnv = ["API_KEY"]
| `command` | string | **Yes** | Command to execute (`npx`, `python`, `uvx`) |
| `packageName` | string | No | Package name (for `npx`) |
| `args` | string[] | No | Command-line arguments |
| `pick` | string[] | No | Tools to include (whitelist) |
| `omit` | string[] | No | Tools to exclude (blacklist) |
| `requiredEnv` | string[] | No | Required environment variables |
| `pick` | string[] | No | Tools to include (whitelist) |
| `omit` | string[] | No | Tools to exclude (blacklist) |
| `requiredEnv` | string[] | No | Required environment variables |
| `allowedDomains` | string[] | No | Allowed domain patterns for network access |
| `lazyInit` | boolean | No | Delay initialization until first use (default: `false`) |

### MCP SSE Skill

@@ -264,9 +270,10 @@ omit = ["tool2"]
| `type` | literal | **Yes** | `"mcpSseSkill"` |
| `description` | string | No | Skill description |
| `rule` | string | No | Additional usage guidelines |
| `endpoint` | string | **Yes** | MCP server URL |
| `pick` | string[] | No | Tools to include |
| `omit` | string[] | No | Tools to exclude |
| `endpoint` | string | **Yes** | MCP server URL |
| `pick` | string[] | No | Tools to include |
| `omit` | string[] | No | Tools to exclude |
| `allowedDomains` | string[] | No | Allowed domain patterns for network access |

### Interactive Skill

18 changes: 9 additions & 9 deletions docs/understanding-perstack/runtime.md
@@ -237,18 +237,18 @@ The runtime supports real-time streaming of LLM output through fire-and-forget e

### Event sequence

| Phase | Events | Description |
| --------- | ------------------------------------------------------------- | ------------------------ |
| Reasoning | `startReasoning` → `streamReasoning...` → `completeReasoning` | Extended thinking output |
| Result | `startRunResult` → `streamRunResult...` → `completeRun` | Final completion text |
| Phase | Events | Description |
| --------- | --------------------------------------------------------------------------------- | ------------------------ |
| Reasoning | `startStreamingReasoning` → `streamReasoning...` → `completeStreamingReasoning` | Extended thinking output |
| Result | `startStreamingRunResult` → `streamRunResult...` → `completeStreamingRunResult` | Final completion text |

### Streaming vs state machine events

| Event type | State transition? | Purpose |
| ----------- | ------------------ | ------------------------------------- |
| `start*` | No | Marks stream beginning (display hint) |
| `stream*` | No | Incremental delta (fire-and-forget) |
| `complete*` | `completeRun` only | Final result with full text |
| Event type | State transition? | Purpose |
| ----------- | ----------------- | ------------------------------------- |
| `start*` | No | Marks stream beginning (display hint) |
| `stream*` | No | Incremental delta (fire-and-forget) |
| `complete*` | Streaming: No; `completeRun` (ExpertStateEvent): Yes | Full text / final result |
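
Since `stream*` events are fire-and-forget deltas, a display layer can simply concatenate them until the matching `complete*` marker arrives. The sketch below assumes a `delta` payload field on `streamRunResult`, which is an illustrative guess rather than a documented property:

```typescript
// Hypothetical sketch: rebuild the final completion text from stream deltas.
// `delta` as the incremental-text field name is an assumption.
function accumulateStream(events: { type: string; delta?: string }[]): string {
  let text = ""
  for (const event of events) {
    // start*/complete* are display markers; only stream* carries text
    if (event.type === "streamRunResult" && event.delta) text += event.delta
  }
  return text
}
```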

### When streaming is used
