From 0674e7d1a228c5d28c01557f5b20e68e7576c3d5 Mon Sep 17 00:00:00 2001 From: Onur Solmaz <2453968+osolmaz@users.noreply.github.com> Date: Wed, 25 Mar 2026 14:40:18 +0100 Subject: [PATCH 01/22] docs: add acpx flows implementation plan --- ...26-03-25-acpx-flows-implementation-plan.md | 746 ++++++++++++++++++ 1 file changed, 746 insertions(+) create mode 100644 docs/2026-03-25-acpx-flows-implementation-plan.md diff --git a/docs/2026-03-25-acpx-flows-implementation-plan.md b/docs/2026-03-25-acpx-flows-implementation-plan.md new file mode 100644 index 0000000..19baad3 --- /dev/null +++ b/docs/2026-03-25-acpx-flows-implementation-plan.md @@ -0,0 +1,746 @@ +--- +title: acpx Flows Implementation Plan +description: Monorepo plan for adding a general workflow library and CLI to acpx for orchestrating ACP workers with simple primitives. +author: OpenClaw Team +date: 2026-03-25 +--- + +# acpx Flows Implementation Plan + +## Why this document exists + +`acpx` already has the hard parts of ACP execution: + +- ACP transport over stdio +- agent spawning and lifecycle handling +- persistent session storage +- queue ownership and prompt serialization +- machine-readable output +- MCP server attachment on session setup + +What it does not have yet is a general workflow layer that can orchestrate ACP +workers step by step with: + +- explicit graphs +- programmable branching +- selective context visibility +- persistent workflow state outside the worker +- reusable sessions where continuity helps +- fresh sessions where blind judgment is required + +This document defines that plan. + +It assumes `acpx` will move to a monorepo, but all code will remain in the same +repository and under the same product family. + +## Core position + +`acpx` should become a swiss army knife for ACP, but it should do that through +small, composable primitives rather than one undifferentiated blob. 
+ +The correct split is: + +- one repo +- one product family +- multiple packages +- one clear runtime boundary + +The worker is not the workflow engine. + +The workflow runtime owns: + +- graph execution +- branching +- retries +- wait states +- checkpointing +- selective context visibility +- bindings to persistent `acpx` sessions + +The ACP worker only executes one step at a time. + +## Goals + +- Add a general workflow library for ACP workers, not a PR-specific automation tool. +- Keep workflow definitions readable as TypeScript modules with object-shaped graphs. +- Support arbitrary branching and forks/joins with deterministic routing outside the worker. +- Reuse existing `acpx` session persistence for conversations instead of duplicating transcripts. +- Keep the first implementation simple enough to land incrementally in the current codebase. +- Preserve a coherent CLI surface under the `acpx` name. + +## Non-goals + +- No ACP protocol redesign. +- No requirement to introduce a distributed scheduler. +- No visual builder. +- No giant custom DSL. +- No requirement that every result must come back through a custom MCP tool on day one. +- No transcript duplication into a second workflow database. + +## Design principles + +### 1. Graph topology should read like data + +The default authoring format should be: + +- plain object for graph topology +- code only for node-local logic + +This keeps flows inspectable, serializable, and renderable. + +### 2. Routing must be deterministic outside the worker + +Workers produce outputs. + +The runtime chooses: + +- next node +- retry vs fail +- fan-out +- join behavior +- wait states + +Never route on prose alone. + +### 3. Context visibility is a first-class primitive + +Each node should receive only what its `read(...)` projection returns. + +If a step should not know earlier conclusions, that must be enforced by: + +- a narrow `read(...)` +- a fresh ACP session + +### 4. 
Session continuity is a policy, not a side effect + +Each ACP node should explicitly choose: + +- `fresh` +- `sticky(key)` +- `inherit` + +### 5. Conversations stay in the existing session store + +`acpx` already stores persistent ACP conversations in `~/.acpx/sessions/*.json`. + +The workflow layer should store: + +- run state +- node state +- branch state +- session references +- artifacts + +It should not store duplicate full transcripts. + +### 6. Start with the existing runtime, not the CLI + +The flow engine should call the current runtime functions directly: + +- `runOnce` +- `createSession` +- `ensureSession` +- `sendSession` +- cancel and control operations + +It should not shell out to `acpx` as a subprocess. + +## Target monorepo shape + +The repository should become a workspace monorepo with these packages: + +- `packages/acpx` +- `packages/core` +- `packages/flows` + +Recommended responsibilities: + +### `packages/acpx` + +Published package name: `acpx` + +Responsibilities: + +- CLI binary +- public umbrella exports +- `acpx/core` subpath export +- `acpx/flows` subpath export + +This package is the user-facing umbrella. + +### `packages/core` + +Internal workspace package for the reusable ACP runtime. + +Responsibilities: + +- ACP transport +- agent spawning +- session lifecycle +- session persistence +- queue runtime +- output formatters +- config loading +- prompt content helpers +- agent registry and capability helpers + +This is where the current `src/client.ts`, `src/session-runtime.ts`, +`src/session-persistence/**`, `src/output.ts`, and related files should move +over time. + +### `packages/flows` + +Internal workspace package for the workflow library. 
+ +Responsibilities: + +- flow graph types +- graph validation +- flow loader +- run store +- graph executor +- branching and fork/join runtime +- checkpoint/resume +- step result extraction and validation +- optional flow-specific helpers + +### Why this shape + +This gives `acpx` the swiss-army-knife product shape while keeping the code +modular: + +- one repo +- one public brand +- separate runtime layers + +## Public package surface + +The public API should present a single umbrella: + +- `acpx` +- `acpx/core` +- `acpx/flows` + +That means `packages/acpx` should re-export the public surfaces from the +workspace libraries rather than forcing users to import package-internal names. + +## Flow authoring model + +Flows are `.ts` files. + +Each file exports one flow definition. + +The canonical authoring style is: + +```ts +import { defineFlow, acp, compute, action, checkpoint } from "acpx/flows"; + +export default defineFlow({ + name: "triage", + input: InputSchema, + nodes: { + facts: acp({ ... }), + judge: acp({ ... }), + route: compute({ ... }), + external: checkpoint(), + continue_work: action({ ... }), + }, + edges: [ + { from: "facts", to: "judge" }, + { from: "judge", to: "route" }, + { + from: "route", + switch: { + on: "$.next", + cases: { + external: "external", + continue: "continue_work", + }, + }, + }, + ], +}); +``` + +### Why object-shaped graphs + +This format is better than a fluent chain for: + +- readability +- validation +- static analysis +- visualization +- IR generation +- tooling + +### Canonical execution model + +Authoring format: + +- TypeScript module + +Execution format: + +- normalized graph IR + +The engine should normalize every flow into one internal representation before +execution. + +## Core primitives + +Keep the primitive set small. + +### `acp(...)` + +Run one ACP worker step. + +Use this for any step executed by Codex, OpenClaw, Claude, Pi, or another +ACP-compatible worker. + +### `compute(...)` + +Pure local transformation. 
+ +Used for: + +- result normalization +- reducers +- route preparation +- branch aggregation + +### `action(...)` + +Explicit local side effect. + +Used for: + +- GitHub writes +- file writes +- notifications +- external API calls + +### `checkpoint(...)` + +Pause and wait for an external actor or event. + +This is the correct primitive, not `human(...)`. + +The external actor may be: + +- a person +- another worker +- a CI system +- a webhook +- an operator action + +### Edge primitives + +Support: + +- linear edge +- `switch` +- `fork` +- `join` + +That is enough for most workflows. + +## Branching rules + +Branching must support two modes. + +### 1. Declarative branching + +For common structured cases: + +```ts +{ + from: "judge", + switch: { + on: "$.decision", + cases: { + yes: "yes_path", + no: "no_path", + }, + }, +} +``` + +### 2. Arbitrary code-based branching + +For custom logic, use a local `compute` router node: + +```ts +route: compute({ + run: ({ outputs }) => { + const answer = String(outputs.judge.answer).trim().toUpperCase(); + + if (answer === "Y") return { next: "yes_path" }; + if (answer === "N") return { next: "no_path" }; + return { next: "fallback_path" }; + }, +}); +``` + +Then branch declaratively on `route.next`. + +This keeps the graph readable while allowing arbitrary branching rules. + +## Session model + +Session policy is first-class on each `acp` node. + +Support exactly these policies: + +- `fresh` +- `sticky(key)` +- `inherit` + +### `fresh` + +Use a new ACP session for this node. + +Use for: + +- blind judgment +- independent critics +- isolated analysis + +### `sticky(key)` + +Reuse a persistent `acpx` session bound to the run and key. + +Use for: + +- implementation loops +- long-running review/fix cycles +- branch-local continuity + +### `inherit` + +Reuse the active session from an upstream sticky path. + +Use only when continuity is intentional. 
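These three policies can be modeled as a small tagged union. The sketch below is illustrative only; the names and shapes are assumptions, not the actual `acpx/flows` API:

```typescript
// Illustrative only: one way the three session policies might be represented.
// The real acpx/flows types may differ.
type SessionPolicy =
  | { kind: "fresh" } // new ACP session for this node
  | { kind: "sticky"; key: string } // persistent session bound to run + key
  | { kind: "inherit" }; // reuse the active session from an upstream sticky path

const sticky = (key: string): SessionPolicy => ({ kind: "sticky", key });

// Example assignment: a blind critic runs fresh, the implementation loop
// sticks to one key, and a follow-up fix step inherits that conversation.
const nodePolicies: Record<string, SessionPolicy> = {
  critic: { kind: "fresh" },
  implement: sticky("impl-loop"),
  fix: { kind: "inherit" },
};
```

Making the policy a plain data value keeps it serializable into the run record, which matters once runs must survive process exit and resume.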
+ +### Validation rules + +The flow validator should reject: + +- `inherit` when no inherited session can exist +- two concurrent branches writing to the same sticky key +- steps marked as blind/isolated while using `inherit` + +## Context visibility + +Each `acp` node gets a `read(...)` projection. + +The runtime state may be broad, but the node sees only the projected view. + +Example: + +- node A sees raw issue and diff +- node B sees extracted facts but not earlier verdicts +- node C sees the verdict and executes a side effect + +This is the main mechanism for reducing anchoring and confirmation bias. + +## Prompt model + +The workflow layer should build on the existing ACP prompt content model. + +`acpx` already has prompt helpers and validation for: + +- text blocks +- image blocks +- resource links +- embedded resources + +That should remain the base prompt type for flow steps rather than inventing a +second prompt representation. + +## Result capture + +Do not make a custom MCP result tool mandatory for the first implementation. + +The current runtime forwards MCP server config to `session/new` and +`session/load`, but it does not yet host a built-in MCP server runtime for flow +steps. The first implementation should respect that. + +### Initial result path + +Each `acp` node should specify: + +- a prompt +- an output schema +- a result extraction policy + +Default policy: + +- ask the worker to return a final structured JSON object +- capture the ACP output stream +- extract the final assistant payload +- parse JSON +- validate it + +### Future extension + +Later, a flow-specific MCP tool can be added behind the same abstraction for +more reliable structured returns. That should be an enhancement, not a +prerequisite. + +## Schema model + +The flow engine should not hard-require one validation library. 
+ +Accept any schema-like object that supports one of: + +- `parse(value)` +- `safeParse(value)` + +This keeps the core flexible and avoids baking a new large dependency into the +runtime contract. + +## Persistence model + +Do not use SQLite first. + +The current repo already uses file-based JSON and NDJSON persistence for +sessions. The workflow layer should match that style. + +### Conversation storage + +Persistent conversations remain in: + +- `~/.acpx/sessions/*.json` +- `~/.acpx/sessions/*.stream.ndjson` + +The workflow engine should reference those sessions by `acpxRecordId`. + +### Workflow storage + +Store workflow state under: + +- `~/.acpx/flows/` + +Recommended layout: + +- `~/.acpx/flows/index.json` +- `~/.acpx/flows/runs/.json` +- `~/.acpx/flows/runs/.events.ndjson` +- `~/.acpx/flows/runs/.lock` +- `~/.acpx/flows/artifacts//...` + +### What a run record should store + +- `runId` +- `flowName` +- `flowPath` +- `flowVersion` +- `status` +- `cwd` +- `createdAt` +- `updatedAt` +- `input` +- `nodeStates` +- `outputs` +- `activeBranches` +- `sessionBindings` +- `waitingOn` +- `artifacts` + +### What it should not store + +- duplicate conversation transcripts +- duplicate token usage copied out of session records + +## Run model + +Each run is a checkpointed state machine. + +The runtime should persist after every node transition: + +- node started +- node completed +- branch chosen +- checkpoint entered +- run resumed +- run failed +- run completed + +This is required for: + +- crash recovery +- inspectability +- replay +- long-lived checkpoints + +## CLI surface + +Add a new top-level command family: + +- `acpx flow run ` +- `acpx flow resume ` +- `acpx flow show ` +- `acpx flow graph ` +- `acpx flow validate ` + +### Important compatibility note + +Today, unknown first tokens are treated as agent names. Adding `flow` is +therefore a real top-level surface change and must be treated as a deliberate +reserved verb. 
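One way to make that reservation explicit in the dispatcher. The verb list and function names below are assumptions for illustration, not the current implementation:

```typescript
// Sketch of first-token dispatch with `flow` reserved as a verb.
// Illustrative only; the real CLI wiring lives in the commander setup.
const RESERVED_VERBS = new Set([
  "prompt",
  "exec",
  "cancel",
  "set-mode",
  "set",
  "sessions",
  "flow",
]);

type FirstToken =
  | { kind: "verb"; verb: string }
  | { kind: "agent"; agent: string };

function classifyFirstToken(token: string): FirstToken {
  return RESERVED_VERBS.has(token)
    ? { kind: "verb", verb: token }
    : { kind: "agent", agent: token }; // unknown tokens still resolve as agents
}
```

Keeping the reserved set in one place makes the compatibility impact auditable: any agent named `flow` would stop resolving, which is exactly the deliberate surface change this note describes.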
+ +### Loader model + +Flow files should be authored as `.ts`. + +The CLI should load them directly. + +That means the monorepo needs a dedicated runtime loader path for TypeScript +flow modules instead of pretending the current CLI-only build is enough. + +## Agent selection inside flows + +Do not rely on CLI-level `--agent` overrides for the first implementation. + +Flows may contain multiple `acp` nodes with different profiles, so one global +raw command override is ambiguous. + +Instead, flow nodes should name an agent profile resolved through the existing +config and registry layer. + +Example: + +- `profile: "codex"` +- `profile: "openclaw"` +- `profile: "claude"` + +Later, per-node raw command overrides can be added if they are actually needed. + +## Use of existing runtime + +The flow engine should build on the current runtime instead of duplicating it. + +Recommended mapping: + +- `fresh` node -> `runOnce` +- `sticky(key)` initial bind -> `ensureSession` +- sticky turn execution -> `sendSession` +- cancel/control -> existing session control functions + +This keeps ACP execution in one place. + +## Testing strategy + +Build on the existing mock ACP agent and integration test style. + +### Library tests + +Add tests for: + +- graph validation +- `fresh` vs `sticky` semantics +- `inherit` validation +- declarative branching +- arbitrary code-based routing via `compute` +- fork/join execution +- checkpoint persistence and resume +- run store locking +- result parsing failures + +### CLI tests + +Add integration tests for: + +- `acpx flow run ...` +- `acpx flow validate ...` +- `acpx flow graph ...` +- `acpx flow resume ...` +- reserved `flow` verb behavior + +### Mock worker coverage + +The existing mock worker should remain the base for flow tests so the workflow +layer is validated against ACP behavior, not ad-hoc stubs. + +## Implementation phases + +### 1. 
Monorepo cutover + +- create workspace structure +- add `packages/acpx` +- add `packages/core` +- add `packages/flows` +- move existing code into `packages/core` and `packages/acpx` with minimal logic changes +- keep published `acpx` CLI behavior unchanged + +### 2. Core library surface + +- define public `acpx/core` exports +- stop treating all runtime code as CLI-internal implementation detail +- expose stable session and prompt APIs for flow execution + +### 3. Flow graph and validator + +- implement `defineFlow` +- implement node and edge types +- normalize to internal graph IR +- validate graph structure and session-policy constraints + +### 4. File-based run store + +- add `~/.acpx/flows/` store +- implement run record persistence +- implement event log +- implement run locks +- implement checkpoint and resume + +### 5. Flow executor + +- execute `acp`, `compute`, `action`, `checkpoint` +- wire `acp` nodes into existing session runtime +- implement branch, fork, join, and failure semantics + +### 6. Result extraction + +- add structured final-result capture +- add validator bridge for schema parsing +- add normalized failure handling for malformed worker results + +### 7. CLI + +- add `flow` command family +- add TypeScript flow loader +- add run/validate/graph/show/resume commands + +### 8. Hardening + +- improve inspectability +- add graph rendering +- add richer artifacts +- evaluate whether a custom MCP return tool is worth adding + +## Resolved decisions + +- The repo becomes a monorepo. +- The public product family remains `acpx`. +- The first-class workflow API lives under `acpx/flows`. +- Graph topology is object-shaped, not fluent-first. +- Branching is fully programmable. +- `checkpoint` is the right primitive, not `human`. +- Conversations remain in the existing session store. +- Workflow state uses file-based persistence first. +- The flow runtime uses the current `acpx` runtime directly, not CLI subprocesses. 
+- A custom MCP result server is optional later, not required up front. + +## Success criteria + +This work is successful when all of the following are true: + +- a flow can be authored as one `.ts` file +- `acpx flow run file.ts` executes it end to end +- fresh and sticky session behavior are explicit and reliable +- blind steps do not inherit hidden worker memory accidentally +- arbitrary routing rules can be expressed cleanly +- fork/join works across multiple ACP workers +- run state survives process exit and resume +- worker conversations are still stored exactly once through the existing session model From 44ddf964d187962da273b9b58c7e8eb56cd69f17 Mon Sep 17 00:00:00 2001 From: Onur Solmaz <2453968+osolmaz@users.noreply.github.com> Date: Wed, 25 Mar 2026 15:05:17 +0100 Subject: [PATCH 02/22] docs: simplify acpx flows session model --- ...26-03-25-acpx-flows-implementation-plan.md | 148 ++++++++++-------- 1 file changed, 87 insertions(+), 61 deletions(-) diff --git a/docs/2026-03-25-acpx-flows-implementation-plan.md b/docs/2026-03-25-acpx-flows-implementation-plan.md index 19baad3..2f74ee5 100644 --- a/docs/2026-03-25-acpx-flows-implementation-plan.md +++ b/docs/2026-03-25-acpx-flows-implementation-plan.md @@ -25,8 +25,8 @@ workers step by step with: - programmable branching - selective context visibility - persistent workflow state outside the worker -- reusable sessions where continuity helps -- fresh sessions where blind judgment is required +- one main conversation by default +- explicit isolated conversations where blind judgment is required This document defines that plan. @@ -109,15 +109,16 @@ Each node should receive only what its `read(...)` projection returns. If a step should not know earlier conclusions, that must be enforced by: - a narrow `read(...)` -- a fresh ACP session +- an isolated ACP session -### 4. Session continuity is a policy, not a side effect +### 4. 
One main session by default, explicit extra sessions when needed -Each ACP node should explicitly choose: +Each flow run should get one implicit main ACP conversation. -- `fresh` -- `sticky(key)` -- `inherit` +Most `acp` nodes should just use that main conversation. + +If a step needs isolation or a separate line of work, the flow should ask for +that explicitly instead of relying on hidden session policy defaults. ### 5. Conversations stay in the existing session store @@ -150,8 +151,9 @@ It should not shell out to `acpx` as a subprocess. The repository should become a workspace monorepo with these packages: - `packages/acpx` -- `packages/core` - `packages/flows` +- `packages/core` if extracting the shared runtime into its own workspace + package proves useful Recommended responsibilities: @@ -163,7 +165,6 @@ Responsibilities: - CLI binary - public umbrella exports -- `acpx/core` subpath export - `acpx/flows` subpath export This package is the user-facing umbrella. @@ -215,14 +216,17 @@ modular: ## Public package surface -The public API should present a single umbrella: +The public API should start with a single umbrella: - `acpx` -- `acpx/core` - `acpx/flows` -That means `packages/acpx` should re-export the public surfaces from the -workspace libraries rather than forcing users to import package-internal names. +`acpx/core` can exist later if the lower-level runtime surface proves worth +stabilizing. It should not be forced into the first public compatibility +contract unless there is a clear need. + +`packages/acpx` should re-export the public surfaces from the workspace +libraries rather than forcing users to import package-internal names. ## Flow authoring model @@ -387,47 +391,66 @@ This keeps the graph readable while allowing arbitrary branching rules. ## Session model -Session policy is first-class on each `acp` node. +The public model should be simple. 
+ +### Default behavior -Support exactly these policies: +Each flow run gets one implicit main ACP session. -- `fresh` -- `sticky(key)` -- `inherit` +Every `acp` node uses that main session by default. -### `fresh` +That should be the common case for: -Use a new ACP session for this node. +- exploratory analysis +- implementation +- follow-up fixes +- review/fix loops -Use for: +### Isolated steps + +If a step must be independent, the flow should opt into an isolated session +explicitly. + +Use isolation for: - blind judgment - independent critics -- isolated analysis +- adversarial review +- any step that must not inherit earlier conversation state + +This should be expressed as a simple flow-level option such as "run this step in +its own session", not by forcing every author to learn internal session-policy +keywords. -### `sticky(key)` +### Extra long-lived sessions -Reuse a persistent `acpx` session bound to the run and key. +Most flows should not need to manually name sessions. -Use for: +If a workflow truly needs multiple persistent conversations, it may declare +additional session handles explicitly. That is an advanced case, not the +default. -- implementation loops -- long-running review/fix cycles -- branch-local continuity +The runtime should own the mapping from those logical handles to underlying +`acpxRecordId` and ACP session identifiers for the run. -### `inherit` +### Internal runtime model -Reuse the active session from an upstream sticky path. +Internally, the runtime will still need semantics equivalent to: -Use only when continuity is intentional. +- reuse the main session +- create an isolated one-off session +- continue a previously created non-main session + +Those are implementation concerns. They do not need to be the first public API +surface. 
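Those internal semantics reduce to a small dispatch over the existing runtime functions, following the mapping in "Use of existing runtime". The type and function names below are sketch-level assumptions:

```typescript
// Sketch: map a node's internal session need onto existing acpx runtime
// calls. Illustrative only; not a public API surface.
type SessionNeed =
  | { kind: "main" } // reuse the run's implicit main conversation
  | { kind: "isolated" } // one-off session with no inherited memory
  | { kind: "handle"; name: string }; // explicit extra persistent session

function planCalls(need: SessionNeed): string[] {
  switch (need.kind) {
    case "main":
      return ["ensureSession", "sendSession"];
    case "isolated":
      return ["runOnce"];
    case "handle":
      return ["ensureSession", "sendSession"]; // keyed by the handle name
  }
}
```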
### Validation rules The flow validator should reject: -- `inherit` when no inherited session can exist -- two concurrent branches writing to the same sticky key -- steps marked as blind/isolated while using `inherit` +- isolated or blind steps that try to reuse the main conversation +- concurrent branches that would interleave prompts into the same session +- explicit extra-session references that cannot be resolved for the run ## Context visibility @@ -443,6 +466,9 @@ Example: This is the main mechanism for reducing anchoring and confirmation bias. +It is not sufficient by itself for blind review. If a node must not inherit +earlier worker memory, it must use an isolated session as well. + ## Prompt model The workflow layer should build on the existing ACP prompt content model. @@ -467,13 +493,7 @@ steps. The first implementation should respect that. ### Initial result path -Each `acp` node should specify: - -- a prompt -- an output schema -- a result extraction policy - -Default policy: +The first implementation should support one result path: - ask the worker to return a final structured JSON object - capture the ACP output stream @@ -521,13 +541,15 @@ Store workflow state under: - `~/.acpx/flows/` -Recommended layout: +Recommended initial layout: + +- `~/.acpx/flows/runs//run.json` +- `~/.acpx/flows/runs//events.ndjson` +- `~/.acpx/flows/runs//lock` +- `~/.acpx/flows/runs//artifacts/...` -- `~/.acpx/flows/index.json` -- `~/.acpx/flows/runs/.json` -- `~/.acpx/flows/runs/.events.ndjson` -- `~/.acpx/flows/runs/.lock` -- `~/.acpx/flows/artifacts//...` +If later experience shows that fast global lookup is necessary, an index file +can be added then. It should not be required up front. 
### What a run record should store @@ -543,7 +565,7 @@ Recommended layout: - `nodeStates` - `outputs` - `activeBranches` -- `sessionBindings` +- `sessionBindings` mapping runtime-owned handles to persisted session ids - `waitingOn` - `artifacts` @@ -622,9 +644,9 @@ The flow engine should build on the current runtime instead of duplicating it. Recommended mapping: -- `fresh` node -> `runOnce` -- `sticky(key)` initial bind -> `ensureSession` -- sticky turn execution -> `sendSession` +- default main-session step -> `ensureSession` then `sendSession` +- isolated one-off step -> `runOnce` +- explicit extra persistent session -> `ensureSession` then `sendSession` - cancel/control -> existing session control functions This keeps ACP execution in one place. @@ -638,8 +660,9 @@ Build on the existing mock ACP agent and integration test style. Add tests for: - graph validation -- `fresh` vs `sticky` semantics -- `inherit` validation +- default main-session reuse +- isolated-step semantics +- explicit extra-session validation - declarative branching - arbitrary code-based routing via `compute` - fork/join execution @@ -668,23 +691,24 @@ layer is validated against ACP behavior, not ad-hoc stubs. - create workspace structure - add `packages/acpx` -- add `packages/core` - add `packages/flows` -- move existing code into `packages/core` and `packages/acpx` with minimal logic changes +- extract a shared runtime package only if it materially clarifies the split +- move existing code into the monorepo with minimal logic changes - keep published `acpx` CLI behavior unchanged ### 2. 
Core library surface -- define public `acpx/core` exports +- define the internal runtime surface that `acpx/flows` depends on - stop treating all runtime code as CLI-internal implementation detail -- expose stable session and prompt APIs for flow execution +- expose stable session and prompt APIs for flow execution inside the repo +- publish `acpx/core` only if that lower-level surface proves worth freezing ### 3. Flow graph and validator - implement `defineFlow` - implement node and edge types - normalize to internal graph IR -- validate graph structure and session-policy constraints +- validate graph structure and session-isolation constraints ### 4. File-based run store @@ -727,6 +751,8 @@ layer is validated against ACP behavior, not ad-hoc stubs. - Graph topology is object-shaped, not fluent-first. - Branching is fully programmable. - `checkpoint` is the right primitive, not `human`. +- Each flow run has one implicit main session by default. +- Extra sessions must be explicit. - Conversations remain in the existing session store. - Workflow state uses file-based persistence first. - The flow runtime uses the current `acpx` runtime directly, not CLI subprocesses. 
@@ -738,8 +764,8 @@ This work is successful when all of the following are true: - a flow can be authored as one `.ts` file - `acpx flow run file.ts` executes it end to end -- fresh and sticky session behavior are explicit and reliable -- blind steps do not inherit hidden worker memory accidentally +- the default main-session model is simple and reliable +- isolated steps do not inherit hidden worker memory accidentally - arbitrary routing rules can be expressed cleanly - fork/join works across multiple ACP workers - run state survives process exit and resume From 40f5ffd614f852bd05a8298a7e91926e7e815fb7 Mon Sep 17 00:00:00 2001 From: Onur Solmaz <2453968+osolmaz@users.noreply.github.com> Date: Wed, 25 Mar 2026 16:04:03 +0100 Subject: [PATCH 03/22] docs: place flow examples under workflows dir --- ...26-03-25-acpx-flows-implementation-plan.md | 19 +++++++++++++++++++ 1 file changed, 19 insertions(+) diff --git a/docs/2026-03-25-acpx-flows-implementation-plan.md b/docs/2026-03-25-acpx-flows-implementation-plan.md index 2f74ee5..2246135 100644 --- a/docs/2026-03-25-acpx-flows-implementation-plan.md +++ b/docs/2026-03-25-acpx-flows-implementation-plan.md @@ -266,6 +266,19 @@ export default defineFlow({ }); ``` +The recommended repository layout is: + +- library/runtime code under the package workspace +- user-authored and example flows under a repo-level `workflows/` directory + +Example: + +- `workflows/pr-triage.flow.ts` +- `workflows/review.flow.ts` + +That keeps the workflow library separate from the workflows it executes and +gives the CLI one obvious path shape for local development. + ### Why object-shaped graphs This format is better than a fluent chain for: @@ -617,6 +630,10 @@ Flow files should be authored as `.ts`. The CLI should load them directly. 
+The canonical local invocation should look like: + +- `acpx flow run workflows/pr-triage.flow.ts` + That means the monorepo needs a dedicated runtime loader path for TypeScript flow modules instead of pretending the current CLI-only build is enough. @@ -753,6 +770,8 @@ layer is validated against ACP behavior, not ad-hoc stubs. - `checkpoint` is the right primitive, not `human`. - Each flow run has one implicit main session by default. - Extra sessions must be explicit. +- Example and user-authored flows should live under a repo-level `workflows/` + directory rather than inside the library package tree. - Conversations remain in the existing session store. - Workflow state uses file-based persistence first. - The flow runtime uses the current `acpx` runtime directly, not CLI subprocesses. From b222872e07e2f5cd2ac314c3c53d243891257391 Mon Sep 17 00:00:00 2001 From: Onur Solmaz <2453968+osolmaz@users.noreply.github.com> Date: Wed, 25 Mar 2026 16:36:39 +0100 Subject: [PATCH 04/22] feat: add experimental flow runner --- CHANGELOG.md | 1 + README.md | 2 + docs/CLI.md | 15 + package.json | 12 +- pnpm-lock.yaml | 6 +- src/cli-core.ts | 30 ++ src/flows.ts | 23 ++ src/flows/cli.ts | 138 +++++++ src/flows/github.ts | 148 ++++++++ src/flows/json.ts | 104 ++++++ src/flows/runtime.ts | 599 ++++++++++++++++++++++++++++++ test/cli.test.ts | 14 + test/fixtures/flow-branch.flow.ts | 45 +++ test/flows.test.ts | 243 ++++++++++++ test/integration.test.ts | 45 +++ workflows/pr-triage.flow.ts | 202 ++++++++++ 16 files changed, 1621 insertions(+), 6 deletions(-) create mode 100644 src/flows.ts create mode 100644 src/flows/cli.ts create mode 100644 src/flows/github.ts create mode 100644 src/flows/json.ts create mode 100644 src/flows/runtime.ts create mode 100644 test/fixtures/flow-branch.flow.ts create mode 100644 test/flows.test.ts create mode 100644 workflows/pr-triage.flow.ts diff --git a/CHANGELOG.md b/CHANGELOG.md index 0d98e69..57ee89e 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ 
-8,6 +8,7 @@ Repo: https://github.com/openclaw/acpx - Conformance/ACP: add a data-driven ACP core v1 conformance suite with CI smoke coverage, nightly coverage, and a hardened runner that reports startup failures cleanly and scopes filesystem checks to the session cwd. (#130) Thanks @lynnzc. - Agents/droid: add `factory-droid` and `factorydroid` aliases for the built-in Factory Droid adapter and sync the built-in docs. Thanks @vincentkoc. +- Flows/workflows: add an initial `flow run` command, an `acpx/flows` runtime surface, file-backed flow run state under `~/.acpx/flows/runs`, and a repo-level PR triage example workflow. Thanks @osolmaz. ### Breaking diff --git a/README.md b/README.md index 397fc03..1226e5d 100644 --- a/README.md +++ b/README.md @@ -39,6 +39,7 @@ One command surface for Pi, OpenClaw ACP, Codex, Claude, and other ACP-compatibl - **Structured output**: typed ACP messages (thinking, tool calls, diffs) instead of ANSI scraping - **Any ACP agent**: built-in registry + `--agent` escape hatch for custom servers - **One-shot mode**: `exec` for stateless fire-and-forget tasks +- **Experimental flows**: `flow run ` for stepwise ACP workflows over multiple prompts ```bash $ acpx codex sessions new @@ -204,6 +205,7 @@ acpx --cwd ~/repos/backend codex 'review recent auth changes' acpx --format text codex 'summarize your findings' acpx --format json codex exec 'review changed files' acpx --format json --json-strict codex exec 'machine-safe JSON only' +acpx flow run workflows/pr-triage.flow.ts --input-json '{"repo":"openclaw/acpx","prNumber":174}' acpx --format quiet codex 'final recommendation only' acpx --timeout 90 codex 'investigate intermittent test timeout' diff --git a/docs/CLI.md b/docs/CLI.md index ef2bce7..684e84e 100644 --- a/docs/CLI.md +++ b/docs/CLI.md @@ -23,6 +23,7 @@ Global options apply to all commands. acpx [global_options] [prompt_text...] acpx [global_options] prompt [prompt_options] [prompt_text...] 
acpx [global_options] exec [prompt_options] [prompt_text...] +acpx [global_options] flow run <file> [--input-json <json> | --input-file <path>] [--default-agent <profile>] acpx [global_options] cancel [-s <session>] acpx [global_options] set-mode <mode> [-s <session>] acpx [global_options] set <key> <value> [-s <session>] @@ -59,10 +60,24 @@ Prompt options: Notes: - Top-level `prompt`, `exec`, `cancel`, `set-mode`, `set`, `sessions`, and bare `acpx <prompt_text>` default to `codex`. +- Top-level `flow run <file>` executes a workflow module and persists run state under `~/.acpx/flows/runs/`. - If a prompt argument is omitted, `acpx` reads prompt text from stdin when piped. - `--file` works for implicit prompt, `prompt`, and `exec` commands. - `acpx` with no args in an interactive terminal shows help. + +## `flow run` subcommand + +```bash +acpx [global_options] flow run <file> [--input-json <json> | --input-file <path>] [--default-agent <profile>] +``` + +- Runs a workflow module step by step through the `acpx/flows` runtime. +- Persists run artifacts under `~/.acpx/flows/runs/<run_id>/`. +- Reuses one implicit main ACP session by default for non-isolated `acp` nodes. +- `--input-json` passes flow input inline as JSON. +- `--input-file` reads flow input JSON from disk. +- `--default-agent` supplies the default agent profile for `acp` nodes that do not pin one.
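For reference, a sketch of the result object that `flow run` emits with `--format json`. The field names match the CLI's JSON result payload; the values below are illustrative placeholders only, and `waitingOn` appears only when a run pauses at a checkpoint:

```json
{
  "action": "flow_run_result",
  "runId": "2026-03-25T140000000Z-pr-triage-1a2b3c4d",
  "flowName": "pr-triage",
  "flowPath": "/home/user/acpx/workflows/pr-triage.flow.ts",
  "status": "completed",
  "runDir": "/home/user/.acpx/flows/runs/2026-03-25T140000000Z-pr-triage-1a2b3c4d",
  "outputs": {},
  "sessionBindings": {}
}
```

With `--format quiet`, only the `runId` line is printed, which makes the run directory easy to locate in scripts.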
+ ## Global options All global options: diff --git a/package.json b/package.json index 72183a9..87427e5 100644 --- a/package.json +++ b/package.json @@ -27,8 +27,14 @@ "LICENSE" ], "type": "module", + "exports": { + ".": "./dist/cli.js", + "./flows": "./dist/flows.js", + "./dist/*": "./dist/*", + "./package.json": "./package.json" + }, "scripts": { - "build": "tsdown src/cli.ts --format esm --dts --clean --platform node --target node22 --no-fixedExtension", + "build": "tsdown src/cli.ts src/flows.ts --format esm --dts --clean --platform node --target node22 --no-fixedExtension", "build:test": "node -e \"require('node:fs').rmSync('dist-test',{recursive:true,force:true})\" && tsc -p tsconfig.test.json", "check": "pnpm run format:check && pnpm run typecheck && pnpm run lint && pnpm run build && pnpm run test:coverage", "check:docs": "pnpm run format:docs:check && pnpm run lint:docs", @@ -56,7 +62,8 @@ "dependencies": { "@agentclientprotocol/sdk": "^0.15.0", "commander": "^14.0.3", - "skillflag": "^0.1.4" + "skillflag": "^0.1.4", + "tsx": "^4.0.0" }, "devDependencies": { "@types/node": "^25.3.5", @@ -68,7 +75,6 @@ "oxlint": "^1.51.0", "oxlint-tsgolint": "^0.16.0", "tsdown": "^0.21.0-beta.2", - "tsx": "^4.0.0", "typescript": "^5.7.0" }, "lint-staged": { diff --git a/pnpm-lock.yaml b/pnpm-lock.yaml index ba2722e..942c787 100644 --- a/pnpm-lock.yaml +++ b/pnpm-lock.yaml @@ -17,6 +17,9 @@ importers: skillflag: specifier: ^0.1.4 version: 0.1.4 + tsx: + specifier: ^4.0.0 + version: 4.21.0 devDependencies: '@types/node': specifier: ^25.3.5 @@ -45,9 +48,6 @@ importers: tsdown: specifier: ^0.21.0-beta.2 version: 0.21.1(@typescript/native-preview@7.0.0-dev.20260310.1)(typescript@5.9.3) - tsx: - specifier: ^4.0.0 - version: 4.21.0 typescript: specifier: ^5.7.0 version: 5.9.3 diff --git a/src/cli-core.ts b/src/cli-core.ts index 244cba4..59ddae5 100644 --- a/src/cli-core.ts +++ b/src/cli-core.ts @@ -70,10 +70,17 @@ class NoSessionError extends Error { } } +type FlowRunFlags = { + 
inputJson?: string; + inputFile?: string; + defaultAgent?: string; +}; + const TOP_LEVEL_VERBS = new Set([ "prompt", "exec", "cancel", + "flow", "set-mode", "set", "sessions", @@ -1302,6 +1309,28 @@ function registerConfigCommand(program: Command, config: ResolvedAcpxConfig): vo }); } +function registerFlowCommand(program: Command, config: ResolvedAcpxConfig): void { + const flowCommand = program + .command("flow") + .description("Run multi-step ACP workflows from flow files"); + + flowCommand + .command("run") + .description("Run a flow file") + .argument("<file>", "Flow module path") + .option("--input-json <json>", "Flow input as JSON") + .option("--input-file <path>", "Read flow input JSON from file") + .option( + "--default-agent <profile>", + "Default agent profile for ACP nodes without profile", + (value: string) => parseNonEmptyValue("Default agent", value), + ) + .action(async function (this: Command, file: string, flags: FlowRunFlags) { + const { handleFlowRun } = await import("./flows/cli.js"); + await handleFlowRun(file, flags, this, config); + }); +} + function registerDefaultCommands(program: Command, config: ResolvedAcpxConfig): void { registerSharedAgentSubcommands(program, undefined, config, { prompt: `Prompt using ${config.defaultAgent} by default`, @@ -1314,6 +1343,7 @@ function registerDefaultCommands(program: Command, config: ResolvedAcpxConfig): registerSessionsCommand(program, undefined, config); registerConfigCommand(program, config); + registerFlowCommand(program, config); } type AgentTokenScan = { diff --git a/src/flows.ts b/src/flows.ts new file mode 100644 index 0000000..7529b5e --- /dev/null +++ b/src/flows.ts @@ -0,0 +1,23 @@ +export { + FlowRunner, + acp, + action, + checkpoint, + compute, + defineFlow, + flowRunsBaseDir, + type AcpNodeDefinition, + type ActionNodeDefinition, + type CheckpointNodeDefinition, + type ComputeNodeDefinition, + type FlowDefinition, + type FlowEdge, + type FlowNodeContext, + type FlowNodeDefinition, + type FlowRunResult, + type
FlowRunState, + type FlowRunnerOptions, + type FlowSessionBinding, + type FlowStepRecord, +} from "./flows/runtime.js"; +export { extractJsonObject } from "./flows/json.js"; diff --git a/src/flows/cli.ts b/src/flows/cli.ts new file mode 100644 index 0000000..8d45ae5 --- /dev/null +++ b/src/flows/cli.ts @@ -0,0 +1,138 @@ +import fs from "node:fs/promises"; +import path from "node:path"; +import { pathToFileURL } from "node:url"; +import { InvalidArgumentError, type Command } from "commander"; +import { tsImport } from "tsx/esm/api"; +import { + resolveAgentInvocation, + resolveGlobalFlags, + resolveOutputPolicy, + resolvePermissionMode, + type GlobalFlags, +} from "../cli/flags.js"; +import type { ResolvedAcpxConfig } from "../config.js"; +import { type FlowDefinition, FlowRunner } from "../flows.js"; +import { GitHubFlowService } from "./github.js"; + +type FlowRunFlags = { + inputJson?: string; + inputFile?: string; + defaultAgent?: string; +}; + +export async function handleFlowRun( + flowFile: string, + flags: FlowRunFlags, + command: Command, + config: ResolvedAcpxConfig, +): Promise<void> { + const globalFlags = resolveGlobalFlags(command, config); + const permissionMode = resolvePermissionMode(globalFlags, config.defaultPermissions); + const outputPolicy = resolveOutputPolicy(globalFlags.format, globalFlags.jsonStrict === true); + const input = await readFlowInput(flags); + const flowPath = path.resolve(flowFile); + const flow = await loadFlowModule(flowPath); + + const runner = new FlowRunner({ + resolveAgent: (profile?: string) => { + return resolveAgentInvocation(profile ??
flags.defaultAgent, globalFlags, config); + }, + permissionMode, + mcpServers: config.mcpServers, + nonInteractivePermissions: globalFlags.nonInteractivePermissions, + authCredentials: config.auth, + authPolicy: globalFlags.authPolicy, + timeoutMs: globalFlags.timeout, + ttlMs: globalFlags.ttl, + verbose: globalFlags.verbose, + suppressSdkConsoleErrors: outputPolicy.suppressSdkConsoleErrors, + sessionOptions: { + model: globalFlags.model, + allowedTools: globalFlags.allowedTools, + maxTurns: globalFlags.maxTurns, + }, + services: { + github: new GitHubFlowService(), + }, + }); + + const result = await runner.run(flow, input, { + flowPath, + }); + + printFlowRunResult(result, globalFlags); +} + +async function readFlowInput(flags: FlowRunFlags): Promise<unknown> { + if (flags.inputJson && flags.inputFile) { + throw new InvalidArgumentError("Use only one of --input-json or --input-file"); + } + + if (flags.inputJson) { + return parseJsonInput(flags.inputJson, "--input-json"); + } + + if (flags.inputFile) { + const inputPath = path.resolve(flags.inputFile); + const payload = await fs.readFile(inputPath, "utf8"); + return parseJsonInput(payload, "--input-file"); + } + + return {}; } + +async function loadFlowModule(flowPath: string): Promise<FlowDefinition> { + const module = (await tsImport(pathToFileURL(flowPath).href, import.meta.url)) as { + default?: unknown; + }; + if (!module.default || typeof module.default !== "object") { + throw new Error(`Flow module must export a default flow object: ${flowPath}`); + } + return module.default as FlowDefinition; +} + +function parseJsonInput(raw: string, label: string): unknown { + try { + return JSON.parse(raw); + } catch (error) { + throw new InvalidArgumentError( + `${label} must contain valid JSON: ${error instanceof Error ?
error.message : String(error)}`, + ); + } +} + +function printFlowRunResult( + result: Awaited<ReturnType<FlowRunner["run"]>>, + globalFlags: GlobalFlags, +): void { + const payload = { + action: "flow_run_result", + runId: result.state.runId, + flowName: result.state.flowName, + flowPath: result.state.flowPath, + status: result.state.status, + waitingOn: result.state.waitingOn, + runDir: result.runDir, + outputs: result.state.outputs, + sessionBindings: result.state.sessionBindings, + }; + + if (globalFlags.format === "json") { + process.stdout.write(`${JSON.stringify(payload)}\n`); + return; + } + + if (globalFlags.format === "quiet") { + process.stdout.write(`${result.state.runId}\n`); + return; + } + + process.stdout.write(`runId: ${payload.runId}\n`); + process.stdout.write(`flow: ${payload.flowName}\n`); + process.stdout.write(`status: ${payload.status}\n`); + process.stdout.write(`runDir: ${payload.runDir}\n`); + if (payload.waitingOn) { + process.stdout.write(`waitingOn: ${payload.waitingOn}\n`); + } + process.stdout.write(`${JSON.stringify(payload.outputs, null, 2)}\n`); +} diff --git a/src/flows/github.ts b/src/flows/github.ts new file mode 100644 index 0000000..f5754ef --- /dev/null +++ b/src/flows/github.ts @@ -0,0 +1,148 @@ +import { execFile } from "node:child_process"; +import { promisify } from "node:util"; + +const execFileAsync = promisify(execFile); + +export type PullRequestContext = { + repo: string; + pr: Record<string, unknown>; + linkedIssue: Record<string, unknown> | null; + promptContext: string; +}; + +export class GitHubFlowService { + private readonly ghCommand: string; + private readonly maxDiffChars: number; + + constructor(options: { ghCommand?: string; maxDiffChars?: number } = {}) { + this.ghCommand = options.ghCommand ?? "gh"; + this.maxDiffChars = options.maxDiffChars ??
30_000; + } + + async loadPullRequestContext(options: { + repo: string; + prNumber: number; + }): Promise<PullRequestContext> { + const pr = await this.readJson([ + "pr", + "view", + String(options.prNumber), + "-R", + options.repo, + "--json", + "number,title,body,author,url,additions,deletions,changedFiles,files,baseRefName,headRefName", + ]); + + const linkedIssueNumber = findLinkedIssueNumber(typeof pr.body === "string" ? pr.body : ""); + const linkedIssue = linkedIssueNumber + ? await this.readJson([ + "issue", + "view", + String(linkedIssueNumber), + "-R", + options.repo, + "--json", + "number,title,body,url", + ]) + : null; + + const diff = await this.readText(["pr", "diff", String(options.prNumber), "-R", options.repo]); + const truncatedDiff = + diff.length > this.maxDiffChars + ? `${diff.slice(0, this.maxDiffChars)}\n\n[diff truncated at ${this.maxDiffChars} characters]` + : diff; + + return { + repo: options.repo, + pr, + linkedIssue, + promptContext: formatPromptContext({ + repo: options.repo, + pr, + linkedIssue, + diff: truncatedDiff, + }), + }; + } + + private async readJson(args: string[]): Promise<Record<string, unknown>> { + const stdout = await this.readText(args); + return JSON.parse(stdout) as Record<string, unknown>; + } + + private async readText(args: string[]): Promise<string> { + const result = await execFileAsync(this.ghCommand, args, { + maxBuffer: 10 * 1024 * 1024, + }); + return result.stdout.trim(); + } +} + +function findLinkedIssueNumber(body: string): number | null { + const match = body.match(/\b(?:close[sd]?|fix(?:e[sd])?|resolve[sd]?)\s+#(\d+)\b/i); + return match ? Number(match[1]) : null; +} + +function formatPromptContext(options: { + repo: string; + pr: Record<string, unknown>; + linkedIssue: Record<string, unknown> | null; + diff: string; +}): string { + const files = Array.isArray(options.pr.files) + ?
options.pr.files + .map((file) => { + if (!file || typeof file !== "object") { + return null; + } + const record = file as Record<string, unknown>; + return `- ${asString(record.path, "unknown")} (+${asNumber(record.additions, 0)} / -${asNumber(record.deletions, 0)})`; + }) + .filter((line): line is string => Boolean(line)) + .join("\n") + : ""; + + const issueSection = options.linkedIssue + ? `Linked issue #${asNumber(options.linkedIssue.number, 0)}: ${asString(options.linkedIssue.title)}\n${asString(options.linkedIssue.body)}` + : "No linked issue was found in the PR body."; + + return [ + `Repository: ${options.repo}`, + `PR #${asNumber(options.pr.number, 0)}: ${asString(options.pr.title)}`, + `URL: ${asString(options.pr.url)}`, + `Author: ${asAuthorLogin(options.pr.author)}`, + `Base: ${asString(options.pr.baseRefName)}`, + `Head: ${asString(options.pr.headRefName)}`, + `Changed files: ${asNumber(options.pr.changedFiles, 0)}`, + `Additions: ${asNumber(options.pr.additions, 0)}`, + `Deletions: ${asNumber(options.pr.deletions, 0)}`, + "", + "PR body:", + asString(options.pr.body, "(empty)"), + "", + "Changed files:", + files || "(none)", + "", + "Underlying issue:", + issueSection, + "", + "Diff:", + options.diff || "(empty diff)", + ].join("\n"); +} + +function asString(value: unknown, fallback = ""): string { + return typeof value === "string" ? value : fallback; +} + +function asNumber(value: unknown, fallback: number): number { + return typeof value === "number" && Number.isFinite(value) ? value : fallback; +} + +function asAuthorLogin(value: unknown): string { + if (!value || typeof value !== "object") { + return "unknown"; + } + + return asString((value as { login?: unknown }).login, "unknown"); +} diff --git a/src/flows/json.ts b/src/flows/json.ts new file mode 100644 index 0000000..a8d7d3d --- /dev/null +++ b/src/flows/json.ts @@ -0,0 +1,104 @@ +export function extractJsonObject(text: string): unknown { + const trimmed = String(text ??
"").trim(); + if (!trimmed) { + throw new Error("Expected JSON output, got empty text"); + } + + const direct = tryParse(trimmed); + if (direct.ok) { + return direct.value; + } + + const fencedMatch = trimmed.match(/```(?:json)?\s*([\s\S]*?)```/i); + if (fencedMatch) { + const fenced = tryParse(fencedMatch[1].trim()); + if (fenced.ok) { + return fenced.value; + } + } + + const balanced = extractBalancedJson(trimmed); + if (balanced) { + const parsed = tryParse(balanced); + if (parsed.ok) { + return parsed.value; + } + } + + throw new Error(`Could not parse JSON from assistant output:\n${trimmed}`); +} + +function tryParse(text: string): { ok: true; value: unknown } | { ok: false } { + try { + return { + ok: true, + value: JSON.parse(text), + }; + } catch { + return { + ok: false, + }; + } +} + +function extractBalancedJson(text: string): string | null { + for (let index = 0; index < text.length; index += 1) { + if (text[index] !== "{" && text[index] !== "[") { + continue; + } + + const result = scanBalanced(text, index); + if (result) { + return result; + } + } + + return null; +} + +function scanBalanced(text: string, startIndex: number): string | null { + const stack: string[] = []; + let inString = false; + let escaped = false; + + for (let index = startIndex; index < text.length; index += 1) { + const char = text[index]; + + if (inString) { + if (escaped) { + escaped = false; + } else if (char === "\\") { + escaped = true; + } else if (char === '"') { + inString = false; + } + continue; + } + + if (char === '"') { + inString = true; + continue; + } + + if (char === "{" || char === "[") { + stack.push(char); + continue; + } + + if (char !== "}" && char !== "]") { + continue; + } + + const last = stack.at(-1); + if ((last === "{" && char !== "}") || (last === "[" && char !== "]")) { + return null; + } + + stack.pop(); + if (stack.length === 0) { + return text.slice(startIndex, index + 1); + } + } + + return null; +} diff --git a/src/flows/runtime.ts 
b/src/flows/runtime.ts new file mode 100644 index 0000000..1f98dfa --- /dev/null +++ b/src/flows/runtime.ts @@ -0,0 +1,599 @@ +import { randomUUID } from "node:crypto"; +import fs from "node:fs/promises"; +import os from "node:os"; +import path from "node:path"; +import { createOutputFormatter } from "../output.js"; +import { promptToDisplayText, textPrompt } from "../prompt-content.js"; +import { resolveSessionRecord } from "../session-persistence.js"; +import { createSession, runOnce, sendSession, type SessionAgentOptions } from "../session.js"; +import type { + AuthPolicy, + McpServer, + NonInteractivePermissionPolicy, + PermissionMode, + PromptInput, +} from "../types.js"; + +type MaybePromise<T> = T | Promise<T>; + +export type FlowNodeContext<TInput = unknown> = { + input: TInput; + outputs: Record<string, unknown>; + state: FlowRunState; + services: Record<string, unknown>; +}; + +export type FlowEdge = + | { + from: string; + to: string; + } + | { + from: string; + switch: { + on: string; + cases: Record<string, string>; + }; + }; + +export type AcpNodeDefinition = { + kind: "acp"; + profile?: string; + session?: { + handle?: string; + isolated?: boolean; + }; + prompt: (context: FlowNodeContext) => MaybePromise<PromptInput | string>; + parse?: (text: string, context: FlowNodeContext) => MaybePromise<unknown>; +}; + +export type ComputeNodeDefinition = { + kind: "compute"; + run: (context: FlowNodeContext) => MaybePromise<unknown>; +}; + +export type ActionNodeDefinition = { + kind: "action"; + run: (context: FlowNodeContext) => MaybePromise<unknown>; +}; + +export type CheckpointNodeDefinition = { + kind: "checkpoint"; + summary?: string; + run?: (context: FlowNodeContext) => MaybePromise<unknown>; +}; + +export type FlowNodeDefinition = + | AcpNodeDefinition + | ComputeNodeDefinition + | ActionNodeDefinition + | CheckpointNodeDefinition; + +export type FlowDefinition = { + name: string; + startAt: string; + nodes: Record<string, FlowNodeDefinition>; + edges: FlowEdge[]; +}; + +export type FlowStepRecord = { + nodeId: string; + kind: FlowNodeDefinition["kind"]; + startedAt: string; + finishedAt: string; +
promptText: string | null; + rawText: string | null; + output: unknown; + session: FlowSessionBinding | null; + agent: { + agentName: string; + agentCommand: string; + cwd: string; + } | null; +}; + +export type FlowSessionBinding = { + key: string; + handle: string; + name: string; + profile?: string; + agentName: string; + agentCommand: string; + cwd: string; + acpxRecordId: string; + acpSessionId: string; + agentSessionId?: string; +}; + +export type FlowRunState = { + runId: string; + flowName: string; + flowPath?: string; + startedAt: string; + finishedAt?: string; + updatedAt: string; + status: "running" | "waiting" | "completed" | "failed"; + input: unknown; + outputs: Record<string, unknown>; + steps: FlowStepRecord[]; + sessionBindings: Record<string, FlowSessionBinding>; + waitingOn?: string; + error?: string; +}; + +export type FlowRunResult = { + runDir: string; + state: FlowRunState; +}; + +type MemoryWritable = { + write(chunk: string): void; +}; + +export type FlowRunnerOptions = { + resolveAgent: (profile?: string) => { + agentName: string; + agentCommand: string; + cwd: string; + }; + permissionMode: PermissionMode; + mcpServers?: McpServer[]; + nonInteractivePermissions?: NonInteractivePermissionPolicy; + authCredentials?: Record<string, string>; + authPolicy?: AuthPolicy; + timeoutMs?: number; + ttlMs?: number; + verbose?: boolean; + suppressSdkConsoleErrors?: boolean; + sessionOptions?: SessionAgentOptions; + services?: Record<string, unknown>; + outputRoot?: string; +}; + +export function defineFlow<TFlow extends FlowDefinition>(definition: TFlow): TFlow { + return definition; +} + +export function acp(definition: Omit<AcpNodeDefinition, "kind">): AcpNodeDefinition { + return { + kind: "acp", + ...definition, + }; +} + +export function compute(definition: Omit<ComputeNodeDefinition, "kind">): ComputeNodeDefinition { + return { + kind: "compute", + ...definition, + }; +} + +export function action(definition: Omit<ActionNodeDefinition, "kind">): ActionNodeDefinition { + return { + kind: "action", + ...definition, + }; +} + +export function checkpoint( + definition: Omit<CheckpointNodeDefinition, "kind"> = {}, +): CheckpointNodeDefinition { + return { + kind: "checkpoint",
+ ...definition, + }; +} + +export function flowRunsBaseDir(homeDir: string = os.homedir()): string { + return path.join(homeDir, ".acpx", "flows", "runs"); +} + +export class FlowRunner { + private readonly resolveAgent; + private readonly permissionMode; + private readonly mcpServers?; + private readonly nonInteractivePermissions?; + private readonly authCredentials?; + private readonly authPolicy?; + private readonly timeoutMs?; + private readonly ttlMs?; + private readonly verbose?; + private readonly suppressSdkConsoleErrors?; + private readonly sessionOptions?; + private readonly services; + private readonly outputRoot; + + constructor(options: FlowRunnerOptions) { + this.resolveAgent = options.resolveAgent; + this.permissionMode = options.permissionMode; + this.mcpServers = options.mcpServers; + this.nonInteractivePermissions = options.nonInteractivePermissions; + this.authCredentials = options.authCredentials; + this.authPolicy = options.authPolicy; + this.timeoutMs = options.timeoutMs; + this.ttlMs = options.ttlMs; + this.verbose = options.verbose; + this.suppressSdkConsoleErrors = options.suppressSdkConsoleErrors; + this.sessionOptions = options.sessionOptions; + this.services = options.services ?? {}; + this.outputRoot = options.outputRoot ?? 
flowRunsBaseDir(); + } + + async run( + flow: FlowDefinition, + input: unknown, + options: { flowPath?: string } = {}, + ): Promise<FlowRunResult> { + validateFlowDefinition(flow); + + const runId = createRunId(flow.name); + const runDir = path.join(this.outputRoot, runId); + const state: FlowRunState = { + runId, + flowName: flow.name, + flowPath: options.flowPath, + startedAt: isoNow(), + updatedAt: isoNow(), + status: "running", + input, + outputs: {}, + steps: [], + sessionBindings: {}, + }; + + await fs.mkdir(runDir, { recursive: true }); + await this.persist(runDir, state, { + type: "run_started", + flowName: flow.name, + flowPath: options.flowPath, + }); + + let current: string | null = flow.startAt; + + try { + while (current) { + const node = flow.nodes[current]; + if (!node) { + throw new Error(`Unknown flow node: ${current}`); + } + + const startedAt = isoNow(); + const context = this.makeContext(state, input); + let output: unknown; + let promptText: string | null = null; + let rawText: string | null = null; + let sessionInfo: FlowSessionBinding | null = null; + let agentInfo: ReturnType<FlowRunnerOptions["resolveAgent"]> | null = null; + + switch (node.kind) { + case "compute": { + output = await node.run(context); + break; + } + case "action": { + output = await node.run(context); + break; + } + case "checkpoint": { + output = + typeof node.run === "function" + ? await node.run(context) + : { + checkpoint: current, + summary: node.summary ??
current, + }; + state.outputs[current] = output; + state.waitingOn = current; + state.updatedAt = isoNow(); + state.status = "waiting"; + state.steps.push({ + nodeId: current, + kind: node.kind, + startedAt, + finishedAt: isoNow(), + promptText, + rawText, + output, + session: null, + agent: null, + }); + await this.persist(runDir, state, { + type: "checkpoint_entered", + nodeId: current, + output, + }); + return { + runDir, + state, + }; + } + case "acp": { + agentInfo = this.resolveAgent(node.profile); + const prompt = normalizePromptInput(await node.prompt(context)); + promptText = promptToDisplayText(prompt); + if (node.session?.isolated) { + rawText = await this.runIsolatedPrompt(agentInfo, prompt); + } else { + sessionInfo = await this.ensureSessionBinding(state, flow, node, agentInfo); + rawText = await this.runPersistentPrompt(sessionInfo, prompt); + sessionInfo = await this.refreshSessionBinding(sessionInfo); + state.sessionBindings[sessionInfo.key] = sessionInfo; + } + output = node.parse ? await node.parse(rawText, context) : rawText; + break; + } + default: { + const exhaustive: never = node; + throw new Error(`Unsupported flow node: ${String(exhaustive)}`); + } + } + + state.outputs[current] = output; + state.updatedAt = isoNow(); + state.steps.push({ + nodeId: current, + kind: node.kind, + startedAt, + finishedAt: isoNow(), + promptText, + rawText, + output, + session: sessionInfo, + agent: agentInfo, + }); + + await this.persist(runDir, state, { + type: "node_completed", + nodeId: current, + output, + }); + + current = resolveNext(flow.edges, current, output); + } + + state.status = "completed"; + state.finishedAt = isoNow(); + state.updatedAt = state.finishedAt; + await this.persist(runDir, state, { type: "run_completed" }); + return { + runDir, + state, + }; + } catch (error) { + state.status = "failed"; + state.updatedAt = isoNow(); + state.finishedAt = state.updatedAt; + state.error = error instanceof Error ? 
error.message : String(error); + await this.persist(runDir, state, { + type: "run_failed", + error: state.error, + }); + throw error; + } + } + + private makeContext(state: FlowRunState, input: unknown): FlowNodeContext { + return { + input, + outputs: state.outputs, + state, + services: this.services, + }; + } + + private async ensureSessionBinding( + state: FlowRunState, + flow: FlowDefinition, + node: AcpNodeDefinition, + agent: ReturnType<FlowRunnerOptions["resolveAgent"]>, + ): Promise<FlowSessionBinding> { + const handle = node.session?.handle ?? "main"; + const key = `${agent.agentCommand}::${handle}`; + const existing = state.sessionBindings[key]; + if (existing) { + return existing; + } + + const name = `${flow.name}-${handle}-${state.runId.slice(-8)}`; + const created = await createSession({ + agentCommand: agent.agentCommand, + cwd: agent.cwd, + name, + mcpServers: this.mcpServers, + permissionMode: this.permissionMode, + nonInteractivePermissions: this.nonInteractivePermissions, + authCredentials: this.authCredentials, + authPolicy: this.authPolicy, + timeoutMs: this.timeoutMs, + verbose: this.verbose, + sessionOptions: this.sessionOptions, + }); + + const binding: FlowSessionBinding = { + key, + handle, + name, + profile: node.profile, + agentName: agent.agentName, + agentCommand: agent.agentCommand, + cwd: agent.cwd, + acpxRecordId: created.acpxRecordId, + acpSessionId: created.acpSessionId, + agentSessionId: created.agentSessionId, + }; + state.sessionBindings[key] = binding; + return binding; + } + + private async refreshSessionBinding(binding: FlowSessionBinding): Promise<FlowSessionBinding> { + const record = await resolveSessionRecord(binding.acpxRecordId); + return { + ...binding, + acpSessionId: record.acpSessionId, + agentSessionId: record.agentSessionId, + }; + } + + private async runPersistentPrompt( + binding: FlowSessionBinding, + prompt: PromptInput, + ): Promise<string> { + const capture = createQuietCaptureOutput(); + await sendSession({ + sessionId: binding.acpxRecordId, + prompt, + mcpServers: this.mcpServers, +
permissionMode: this.permissionMode, + nonInteractivePermissions: this.nonInteractivePermissions, + authCredentials: this.authCredentials, + authPolicy: this.authPolicy, + outputFormatter: capture.formatter, + suppressSdkConsoleErrors: this.suppressSdkConsoleErrors, + timeoutMs: this.timeoutMs, + ttlMs: this.ttlMs, + verbose: this.verbose, + waitForCompletion: true, + }); + return capture.read(); + } + + private async runIsolatedPrompt( + agent: ReturnType<FlowRunnerOptions["resolveAgent"]>, + prompt: PromptInput, + ): Promise<string> { + const capture = createQuietCaptureOutput(); + await runOnce({ + agentCommand: agent.agentCommand, + cwd: agent.cwd, + prompt, + mcpServers: this.mcpServers, + permissionMode: this.permissionMode, + nonInteractivePermissions: this.nonInteractivePermissions, + authCredentials: this.authCredentials, + authPolicy: this.authPolicy, + outputFormatter: capture.formatter, + suppressSdkConsoleErrors: this.suppressSdkConsoleErrors, + timeoutMs: this.timeoutMs, + verbose: this.verbose, + sessionOptions: this.sessionOptions, + }); + return capture.read(); + } + + private async persist( + runDir: string, + state: FlowRunState, + event: Record<string, unknown>, + ): Promise<void> { + state.updatedAt = isoNow(); + const runPath = path.join(runDir, "run.json"); + const tempPath = `${runPath}.${process.pid}.${Date.now()}.tmp`; + const payload = JSON.stringify(state, null, 2); + await fs.writeFile(tempPath, `${payload}\n`, "utf8"); + await fs.rename(tempPath, runPath); + await fs.appendFile( + path.join(runDir, "events.ndjson"), + `${JSON.stringify({ at: isoNow(), ...event })}\n`, + "utf8", + ); + } +} + +function validateFlowDefinition(flow: FlowDefinition): void { + if (!flow.name.trim()) { + throw new Error("Flow name must not be empty"); + } + if (!flow.nodes[flow.startAt]) { + throw new Error(`Flow start node is missing: ${flow.startAt}`); + } + + for (const edge of flow.edges) { + if (!flow.nodes[edge.from]) { + throw new Error(`Flow edge references unknown from-node: ${edge.from}`); + } + if ("to" in edge)
{ + if (!flow.nodes[edge.to]) { + throw new Error(`Flow edge references unknown to-node: ${edge.to}`); + } + continue; + } + for (const target of Object.values(edge.switch.cases)) { + if (!flow.nodes[target]) { + throw new Error(`Flow switch references unknown to-node: ${target}`); + } + } + } +} + +function normalizePromptInput(prompt: PromptInput | string): PromptInput { + return typeof prompt === "string" ? textPrompt(prompt) : prompt; +} + +function resolveNext(edges: FlowEdge[], from: string, output: unknown): string | null { + const edge = edges.find((candidate) => candidate.from === from); + if (!edge) { + return null; + } + + if ("to" in edge) { + return edge.to; + } + + const value = getByPath(output, edge.switch.on); + if (typeof value !== "string" && typeof value !== "number" && typeof value !== "boolean") { + throw new Error(`Flow switch value must be scalar for ${edge.switch.on}`); + } + const next = edge.switch.cases[String(value)]; + if (!next) { + throw new Error(`No flow switch case for ${edge.switch.on}=${JSON.stringify(value)}`); + } + return next; +} + +function getByPath(value: unknown, jsonPath: string): unknown { + if (!jsonPath.startsWith("$.")) { + throw new Error(`Unsupported JSON path: ${jsonPath}`); + } + + return jsonPath + .slice(2) + .split(".") + .reduce((current, key) => { + if (current == null || typeof current !== "object") { + return undefined; + } + return (current as Record<string, unknown>)[key]; + }, value); +} + +function createQuietCaptureOutput(): { + formatter: ReturnType<typeof createOutputFormatter>; + read: () => string; +} { + const chunks: string[] = []; + const stdout: MemoryWritable = { + write(chunk: string) { + chunks.push(chunk); + }, + }; + + return { + formatter: createOutputFormatter("quiet", { + stdout, + }), + read: () => chunks.join("").trim(), + }; +} + +function createRunId(flowName: string): string { + const stamp = isoNow().replaceAll(":", "").replaceAll(".", ""); + const slug = flowName + .replace(/[^a-z0-9]+/gi, "-") + .replace(/^-+|-+$/g, "") +
.toLowerCase(); + return `${stamp}-${slug}-${randomUUID().slice(0, 8)}`; +} + +function isoNow(): string { + return new Date().toISOString(); +} diff --git a/test/cli.test.ts b/test/cli.test.ts index 5b0d721..b214109 100644 --- a/test/cli.test.ts +++ b/test/cli.test.ts @@ -241,6 +241,20 @@ test("sessions new command is present in help output", async () => { }); }); +test("flow run command is present in help output", async () => { + await withTempHome(async (homeDir) => { + const flowHelp = await runCli(["flow", "--help"], homeDir); + assert.equal(flowHelp.code, 0, flowHelp.stderr); + assert.match(flowHelp.stdout, /\brun\b/); + + const runHelp = await runCli(["flow", "run", "--help"], homeDir); + assert.equal(runHelp.code, 0, runHelp.stderr); + assert.match(runHelp.stdout, /--input-json <json>/); + assert.match(runHelp.stdout, /--input-file <path>/); + assert.match(runHelp.stdout, /--default-agent <profile>/); + }); +}); + test("sessions new --resume-session loads ACP session and stores resumed ids", async () => { await withTempHome(async (homeDir) => { const cwd = path.join(homeDir, "workspace"); diff --git a/test/fixtures/flow-branch.flow.ts b/test/fixtures/flow-branch.flow.ts new file mode 100644 index 0000000..978a36b --- /dev/null +++ b/test/fixtures/flow-branch.flow.ts @@ -0,0 +1,45 @@ +import { acp, compute, defineFlow, extractJsonObject } from "../../src/flows.js"; + +export default defineFlow({ + name: "fixture-branch", + startAt: "first", + nodes: { + first: acp({ + async prompt({ input }) { + return `echo ${JSON.stringify({ next: (input as { next: string }).next, turn: 1 })}`; + }, + parse: (text) => extractJsonObject(text), + }), + second: acp({ + async prompt() { + return 'echo {"turn":2}'; + }, + parse: (text) => extractJsonObject(text), + }), + route: compute({ + run: ({ input }) => ({ + next: (input as { next: string }).next, + }), + }), + yes_path: compute({ + run: () => ({ ok: true }), + }), + no_path: compute({ + run: () => ({ ok: false }), + }), + }, + edges: [ + {
from: "first", to: "second" }, + { from: "second", to: "route" }, + { + from: "route", + switch: { + on: "$.next", + cases: { + yes_path: "yes_path", + no_path: "no_path", + }, + }, + }, + ], +}); diff --git a/test/flows.test.ts b/test/flows.test.ts new file mode 100644 index 0000000..3c12109 --- /dev/null +++ b/test/flows.test.ts @@ -0,0 +1,243 @@ +import assert from "node:assert/strict"; +import fs from "node:fs/promises"; +import os from "node:os"; +import path from "node:path"; +import test from "node:test"; +import { fileURLToPath } from "node:url"; +import { + FlowRunner, + acp, + action, + checkpoint, + compute, + defineFlow, + extractJsonObject, + flowRunsBaseDir, +} from "../src/flows.js"; +import { GitHubFlowService } from "../src/flows/github.js"; + +const MOCK_AGENT_PATH = fileURLToPath(new URL("./mock-agent.js", import.meta.url)); +const MOCK_AGENT_COMMAND = `node ${JSON.stringify(MOCK_AGENT_PATH)}`; + +test("extractJsonObject parses direct, fenced, and embedded JSON", () => { + assert.deepEqual(extractJsonObject('{"ok":true}'), { ok: true }); + assert.deepEqual(extractJsonObject('```json\n{"ok":true}\n```'), { ok: true }); + assert.deepEqual(extractJsonObject('before {"ok":true} after'), { ok: true }); +}); + +test("FlowRunner executes isolated ACP nodes and branches deterministically", async () => { + await withTempHome(async (homeDir) => { + const cwd = await fs.mkdtemp(path.join(os.tmpdir(), "acpx-flow-cwd-")); + + try { + const runner = new FlowRunner({ + resolveAgent: () => ({ + agentName: "mock", + agentCommand: MOCK_AGENT_COMMAND, + cwd, + }), + permissionMode: "approve-all", + ttlMs: 1_000, + }); + + const flow = defineFlow({ + name: "branch-test", + startAt: "first", + nodes: { + first: acp({ + session: { + isolated: true, + }, + async prompt({ input }) { + const next = (input as { next: string }).next; + return `echo ${JSON.stringify({ next })}`; + }, + parse: (text) => extractJsonObject(text), + }), + second: acp({ + session: { + isolated: 
true, + }, + async prompt() { + return 'echo {"seen":"second"}'; + }, + parse: (text) => extractJsonObject(text), + }), + route: compute({ + run: ({ outputs }) => ({ + next: String((outputs.first as { next: string }).next), + }), + }), + yes: action({ + run: () => ({ ok: true }), + }), + no: action({ + run: () => ({ ok: false }), + }), + }, + edges: [ + { from: "first", to: "second" }, + { from: "second", to: "route" }, + { + from: "route", + switch: { + on: "$.next", + cases: { + yes: "yes", + no: "no", + }, + }, + }, + ], + }); + + const result = await runner.run(flow, { next: "yes" }); + assert.equal(result.state.status, "completed"); + assert.deepEqual(result.state.outputs.yes, { ok: true }); + assert.equal(result.state.outputs.no, undefined); + assert.match(result.runDir, new RegExp(escapeRegExp(flowRunsBaseDir(homeDir)))); + } finally { + await fs.rm(cwd, { recursive: true, force: true }); + } + }); +}); + +test("FlowRunner stops at checkpoint nodes and marks the run as waiting", async () => { + await withTempHome(async () => { + const runner = new FlowRunner({ + resolveAgent: () => ({ + agentName: "unused", + agentCommand: "unused", + cwd: process.cwd(), + }), + permissionMode: "approve-all", + outputRoot: await fs.mkdtemp(path.join(os.tmpdir(), "acpx-flow-store-")), + }); + + const flow = defineFlow({ + name: "checkpoint-test", + startAt: "prepare", + nodes: { + prepare: action({ + run: () => ({ prepared: true }), + }), + wait_for_human: checkpoint({ + summary: "needs review", + }), + after_wait: action({ + run: () => ({ unreachable: true }), + }), + }, + edges: [ + { from: "prepare", to: "wait_for_human" }, + { from: "wait_for_human", to: "after_wait" }, + ], + }); + + const result = await runner.run(flow, {}); + assert.equal(result.state.status, "waiting"); + assert.equal(result.state.waitingOn, "wait_for_human"); + assert.deepEqual(result.state.outputs.prepare, { prepared: true }); + assert.equal(result.state.outputs.after_wait, undefined); + }); +}); + 
+test("GitHubFlowService builds PR prompt context from gh CLI output", async () => { + const tempDir = await fs.mkdtemp(path.join(os.tmpdir(), "acpx-fake-gh-")); + const scriptPath = path.join(tempDir, "fake-gh.js"); + const launcherPath = + process.platform === "win32" ? path.join(tempDir, "gh.cmd") : path.join(tempDir, "gh"); + + await fs.writeFile( + scriptPath, + [ + "#!/usr/bin/env node", + "const args = process.argv.slice(2);", + 'if (args[0] === "pr" && args[1] === "view") {', + " process.stdout.write(JSON.stringify({", + " number: 7,", + ' title: "Flow PR",', + ' body: "Fixes #42",', + ' author: { login: "dev" },', + ' url: "https://example.test/pr/7",', + " additions: 10,", + " deletions: 2,", + " changedFiles: 1,", + ' files: [{ path: "src/flow.ts", additions: 10, deletions: 2 }],', + ' baseRefName: "main",', + ' headRefName: "feature/flow"', + " }));", + '} else if (args[0] === "issue" && args[1] === "view") {', + " process.stdout.write(JSON.stringify({", + " number: 42,", + ' title: "Underlying issue",', + ' body: "Make the flow runner reusable.",', + ' url: "https://example.test/issues/42"', + " }));", + '} else if (args[0] === "pr" && args[1] === "diff") {', + ' process.stdout.write("diff --git a/src/flow.ts b/src/flow.ts\\n+new behavior\\n+more");', + "} else {", + ' process.stderr.write(`unexpected args: ${args.join(" ")}`);', + " process.exit(1);", + "}", + "", + ].join("\n"), + { mode: 0o755 }, + ); + + if (process.platform === "win32") { + await fs.writeFile( + launcherPath, + [`@echo off`, `"${process.execPath}" "${scriptPath}" %*`, ""].join("\r\n"), + "utf8", + ); + } else { + await fs.writeFile( + launcherPath, + [`#!/bin/sh`, `exec "${process.execPath}" "${scriptPath}" "$@"`, ""].join("\n"), + { mode: 0o755 }, + ); + } + + try { + const service = new GitHubFlowService({ + ghCommand: launcherPath, + maxDiffChars: 20, + }); + + const context = await service.loadPullRequestContext({ + repo: "openclaw/acpx", + prNumber: 7, + }); + + 
assert.equal(context.repo, "openclaw/acpx");
+    assert.equal(context.pr.number, 7);
+    assert.equal(context.linkedIssue?.number, 42);
+    assert.match(context.promptContext, /Flow PR/);
+    assert.match(context.promptContext, /Linked issue #42/);
+    assert.match(context.promptContext, /\[diff truncated at 20 characters]/);
+  } finally {
+    await fs.rm(tempDir, { recursive: true, force: true });
+  }
+});
+
+async function withTempHome(run: (homeDir: string) => Promise<void>): Promise<void> {
+  const previousHome = process.env.HOME;
+  const homeDir = await fs.mkdtemp(path.join(os.tmpdir(), "acpx-flow-home-"));
+  process.env.HOME = homeDir;
+
+  try {
+    await run(homeDir);
+  } finally {
+    if (previousHome === undefined) {
+      delete process.env.HOME;
+    } else {
+      process.env.HOME = previousHome;
+    }
+    await fs.rm(homeDir, { recursive: true, force: true });
+  }
+}
+
+function escapeRegExp(value: string): string {
+  return value.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
+}
diff --git a/test/integration.test.ts b/test/integration.test.ts
index 137837c..269ae78 100644
--- a/test/integration.test.ts
+++ b/test/integration.test.ts
@@ -15,6 +15,7 @@ import { queuePaths } from "./queue-test-helpers.js";
 
 const CLI_PATH = fileURLToPath(new URL("../src/cli.js", import.meta.url));
 const MOCK_AGENT_PATH = fileURLToPath(new URL("./mock-agent.js", import.meta.url));
+const FLOW_FIXTURE_PATH = fileURLToPath(new URL("./fixtures/flow-branch.flow.js", import.meta.url));
 
 const MOCK_AGENT_COMMAND = `node ${JSON.stringify(MOCK_AGENT_PATH)}`;
 
 type CliRunResult = {
@@ -72,6 +73,50 @@ test("integration: built-in cursor agent resolves to cursor-agent acp", async ()
   });
 });
 
+test("integration: flow run executes multiple ACP steps in one session and branches", async () => {
+  await withTempHome(async (homeDir) => {
+    const cwd = await fs.mkdtemp(path.join(os.tmpdir(), "acpx-integration-cwd-"));
+
+    try {
+      const result = await runCli(
+        [
+          ...baseAgentArgs(cwd),
+          "--format",
+          "json",
+          "--ttl",
+          "1",
+          "flow",
+          
"run",
+          FLOW_FIXTURE_PATH,
+          "--input-json",
+          JSON.stringify({ next: "yes_path" }),
+        ],
+        homeDir,
+      );
+
+      assert.equal(result.code, 0, result.stderr);
+      const payload = JSON.parse(result.stdout.trim()) as {
+        action?: string;
+        status?: string;
+        outputs?: Record<string, unknown>;
+        sessionBindings?: Record<string, unknown>;
+      };
+
+      assert.equal(payload.action, "flow_run_result");
+      assert.equal(payload.status, "completed");
+      assert.deepEqual(payload.outputs?.yes_path, { ok: true });
+      assert.equal(payload.outputs?.no_path, undefined);
+      assert.equal(
+        Object.keys(payload.sessionBindings ?? {}).length,
+        1,
+        JSON.stringify(payload, null, 2),
+      );
+    } finally {
+      await fs.rm(cwd, { recursive: true, force: true });
+    }
+  });
+});
+
 test("integration: built-in droid agent resolves to droid exec --output-format acp", async () => {
   await withTempHome(async (homeDir) => {
     const cwd = await fs.mkdtemp(path.join(os.tmpdir(), "acpx-integration-cwd-"));
diff --git a/workflows/pr-triage.flow.ts b/workflows/pr-triage.flow.ts
new file mode 100644
index 0000000..bb6c821
--- /dev/null
+++ b/workflows/pr-triage.flow.ts
@@ -0,0 +1,202 @@
+import { acp, compute, defineFlow, extractJsonObject } from "../src/flows.js";
+
+type PullRequestContext = {
+  repo: string;
+  promptContext: string;
+};
+
+export default defineFlow({
+  name: "pr-triage",
+  startAt: "load_pr",
+  nodes: {
+    load_pr: compute({
+      run: async ({ input, services }) => {
+        const github = services.github as {
+          loadPullRequestContext(options: {
+            repo: string;
+            prNumber: number;
+          }): Promise<PullRequestContext>;
+        };
+        const flowInput = input as {
+          repo: string;
+          prNumber: number;
+        };
+        return await github.loadPullRequestContext({
+          repo: flowInput.repo,
+          prNumber: flowInput.prNumber,
+        });
+      },
+    }),
+
+    solution_fit: acp({
+      profile: "codex",
+      async prompt({ outputs }) {
+        const context = outputs.load_pr as PullRequestContext;
+        return [
+          "You are doing maintainability-first PR triage.",
+          "Question: is this the right solution for the underlying issue, 
or is it only a localized fix that does not address the real problem?", + "Use only the PR context below.", + "Return exactly one JSON object with this shape:", + "{", + ' "verdict": "right_solution" | "localized_fix" | "wrong_problem" | "unclear",', + ' "confidence": 0.0,', + ' "reason": "short explanation",', + ' "evidence": ["short bullet", "short bullet"]', + "}", + "", + context.promptContext, + ].join("\n"); + }, + parse: (text) => extractJsonObject(text), + }), + + issue_clarity: acp({ + profile: "codex", + async prompt() { + return [ + "Use the PR context already in this session.", + "Judge whether the underlying issue is clearly framed enough for safe autonomous continuation.", + "If there is no linked issue, decide whether the PR body still makes the underlying problem clear.", + "Return exactly one JSON object with this shape:", + "{", + ' "verdict": "clear" | "ambiguous" | "conflicting",', + ' "confidence": 0.0,', + ' "reason": "short explanation"', + "}", + ].join("\n"); + }, + parse: (text) => extractJsonObject(text), + }), + + scope_assessment: acp({ + profile: "codex", + async prompt() { + return [ + "Use the PR context and earlier reasoning already in this session.", + "Judge whether the scope is appropriately shaped for the codebase.", + "Return exactly one JSON object with this shape:", + "{", + ' "scope": "appropriately_local" | "too_local" | "cross_cutting_needed",', + ' "refactor_needed": "none" | "superficial" | "fundamental",', + ' "human_judgment_needed": true | false,', + ' "reason": "short explanation"', + "}", + ].join("\n"); + }, + parse: (text) => extractJsonObject(text), + }), + + route: compute({ + run: ({ outputs }) => { + const reasons: string[] = []; + const solutionFit = outputs.solution_fit as { + verdict?: string; + }; + const issueClarity = outputs.issue_clarity as { + verdict?: string; + }; + const scopeAssessment = outputs.scope_assessment as { + scope?: string; + refactor_needed?: string; + human_judgment_needed?: boolean; 
+ }; + + if (solutionFit.verdict !== "right_solution") { + reasons.push(`solution_fit=${solutionFit.verdict ?? "unknown"}`); + } + if (issueClarity.verdict !== "clear") { + reasons.push(`issue_clarity=${issueClarity.verdict ?? "unknown"}`); + } + if (scopeAssessment.scope !== "appropriately_local") { + reasons.push(`scope=${scopeAssessment.scope ?? "unknown"}`); + } + if (scopeAssessment.refactor_needed === "fundamental") { + reasons.push("refactor_needed=fundamental"); + } + if (scopeAssessment.human_judgment_needed) { + reasons.push("human_judgment_needed=true"); + } + + return { + next: reasons.length > 0 ? "human_review" : "continue_lane", + reasons, + }; + }, + }), + + continue_lane: acp({ + profile: "codex", + async prompt({ outputs }) { + return [ + "We are continuing on the autonomous lane.", + "The runtime routed here because the earlier checks did not raise blockers.", + "Return exactly one JSON object with this shape:", + "{", + ' "route": "continue",', + ' "summary": "short explanation",', + ' "next_actions": ["action", "action"],', + ' "residual_risks": ["risk", "risk"]', + "}", + "", + `Runtime reasons: ${JSON.stringify((outputs.route as { reasons?: string[] }).reasons ?? [])}`, + ].join("\n"); + }, + parse: (text) => extractJsonObject(text), + }), + + human_review: acp({ + profile: "codex", + async prompt({ outputs }) { + return [ + "We are routing this PR to human review.", + "Return exactly one JSON object with this shape:", + "{", + ' "route": "human_review",', + ' "summary": "short explanation",', + ' "blocking_reasons": ["reason", "reason"],', + ' "questions_for_human": ["question", "question"]', + "}", + "", + `Runtime reasons: ${JSON.stringify((outputs.route as { reasons?: string[] }).reasons ?? 
[])}`, + ].join("\n"); + }, + parse: (text) => extractJsonObject(text), + }), + + finalize: compute({ + run: ({ outputs, state }) => { + const route = outputs.route as { + next: string; + reasons?: string[]; + }; + const branch = + route.next === "continue_lane" ? outputs.continue_lane : outputs.human_review; + + return { + route: (branch as { route?: string }).route, + routeReasons: route.reasons ?? [], + final: branch, + sessionBindings: state.sessionBindings, + }; + }, + }), + }, + edges: [ + { from: "load_pr", to: "solution_fit" }, + { from: "solution_fit", to: "issue_clarity" }, + { from: "issue_clarity", to: "scope_assessment" }, + { from: "scope_assessment", to: "route" }, + { + from: "route", + switch: { + on: "$.next", + cases: { + continue_lane: "continue_lane", + human_review: "human_review", + }, + }, + }, + { from: "continue_lane", to: "finalize" }, + { from: "human_review", to: "finalize" }, + ], +}); From 0eea5511f55aad86219114f634a7b081ccc12ab3 Mon Sep 17 00:00:00 2001 From: Onur Solmaz <2453968+osolmaz@users.noreply.github.com> Date: Wed, 25 Mar 2026 16:39:37 +0100 Subject: [PATCH 05/22] refactor: remove repo-specific flow examples --- CHANGELOG.md | 2 +- README.md | 4 +- ...26-03-25-acpx-flows-implementation-plan.md | 6 +- docs/CLI.md | 5 +- src/flows/cli.ts | 4 - src/flows/github.ts | 148 ------------- test/flows.test.ts | 80 ------- workflows/pr-triage.flow.ts | 202 ------------------ 8 files changed, 9 insertions(+), 442 deletions(-) delete mode 100644 src/flows/github.ts delete mode 100644 workflows/pr-triage.flow.ts diff --git a/CHANGELOG.md b/CHANGELOG.md index 57ee89e..c113d12 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -8,7 +8,7 @@ Repo: https://github.com/openclaw/acpx - Conformance/ACP: add a data-driven ACP core v1 conformance suite with CI smoke coverage, nightly coverage, and a hardened runner that reports startup failures cleanly and scopes filesystem checks to the session cwd. (#130) Thanks @lynnzc. 
- Agents/droid: add `factory-droid` and `factorydroid` aliases for the built-in Factory Droid adapter and sync the built-in docs. Thanks @vincentkoc. -- Flows/workflows: add an initial `flow run` command, an `acpx/flows` runtime surface, file-backed flow run state under `~/.acpx/flows/runs`, and a repo-level PR triage example workflow. Thanks @osolmaz. +- Flows/workflows: add an initial `flow run` command, an `acpx/flows` runtime surface, and file-backed flow run state under `~/.acpx/flows/runs` for user-authored workflow modules. Thanks @osolmaz. ### Breaking diff --git a/README.md b/README.md index 1226e5d..ff7ca6d 100644 --- a/README.md +++ b/README.md @@ -39,7 +39,7 @@ One command surface for Pi, OpenClaw ACP, Codex, Claude, and other ACP-compatibl - **Structured output**: typed ACP messages (thinking, tool calls, diffs) instead of ANSI scraping - **Any ACP agent**: built-in registry + `--agent` escape hatch for custom servers - **One-shot mode**: `exec` for stateless fire-and-forget tasks -- **Experimental flows**: `flow run ` for stepwise ACP workflows over multiple prompts +- **Experimental flows**: `flow run ` for user-authored ACP workflows over multiple prompts ```bash $ acpx codex sessions new @@ -205,7 +205,7 @@ acpx --cwd ~/repos/backend codex 'review recent auth changes' acpx --format text codex 'summarize your findings' acpx --format json codex exec 'review changed files' acpx --format json --json-strict codex exec 'machine-safe JSON only' -acpx flow run workflows/pr-triage.flow.ts --input-json '{"repo":"openclaw/acpx","prNumber":174}' +acpx flow run ./my-flow.ts --input-file ./flow-input.json acpx --format quiet codex 'final recommendation only' acpx --timeout 90 codex 'investigate intermittent test timeout' diff --git a/docs/2026-03-25-acpx-flows-implementation-plan.md b/docs/2026-03-25-acpx-flows-implementation-plan.md index 2246135..15319bc 100644 --- a/docs/2026-03-25-acpx-flows-implementation-plan.md +++ 
b/docs/2026-03-25-acpx-flows-implementation-plan.md @@ -273,7 +273,7 @@ The recommended repository layout is: Example: -- `workflows/pr-triage.flow.ts` +- `workflows/example.flow.ts` - `workflows/review.flow.ts` That keeps the workflow library separate from the workflows it executes and @@ -331,10 +331,10 @@ Explicit local side effect. Used for: -- GitHub writes - file writes - notifications - external API calls +- arbitrary local side effects ### `checkpoint(...)` @@ -632,7 +632,7 @@ The CLI should load them directly. The canonical local invocation should look like: -- `acpx flow run workflows/pr-triage.flow.ts` +- `acpx flow run workflows/example.flow.ts` That means the monorepo needs a dedicated runtime loader path for TypeScript flow modules instead of pretending the current CLI-only build is enough. diff --git a/docs/CLI.md b/docs/CLI.md index 684e84e..02f9839 100644 --- a/docs/CLI.md +++ b/docs/CLI.md @@ -60,7 +60,7 @@ Prompt options: Notes: - Top-level `prompt`, `exec`, `cancel`, `set-mode`, `set`, `sessions`, and bare `acpx ` default to `codex`. -- Top-level `flow run ` executes a workflow module and persists run state under `~/.acpx/flows/runs/`. +- Top-level `flow run ` executes a user-authored workflow module and persists run state under `~/.acpx/flows/runs/`. - If a prompt argument is omitted, `acpx` reads prompt text from stdin when piped. - `--file` works for implicit prompt, `prompt`, and `exec` commands. - `acpx` with no args in an interactive terminal shows help. @@ -71,12 +71,13 @@ Notes: acpx [global_options] flow run [--input-json | --input-file ] [--default-agent ] ``` -- Runs a workflow module step by step through the `acpx/flows` runtime. +- Runs a user-authored workflow module step by step through the `acpx/flows` runtime. - Persists run artifacts under `~/.acpx/flows/runs//`. - Reuses one implicit main ACP session by default for non-isolated `acp` nodes. - `--input-json` passes flow input inline as JSON. 
- `--input-file` reads flow input JSON from disk. - `--default-agent` supplies the default agent profile for `acp` nodes that do not pin one. +- `acpx` does not ship built-in workload-specific flows; the file is provided by the caller. ## Global options diff --git a/src/flows/cli.ts b/src/flows/cli.ts index 8d45ae5..8338021 100644 --- a/src/flows/cli.ts +++ b/src/flows/cli.ts @@ -12,7 +12,6 @@ import { } from "../cli/flags.js"; import type { ResolvedAcpxConfig } from "../config.js"; import { type FlowDefinition, FlowRunner } from "../flows.js"; -import { GitHubFlowService } from "./github.js"; type FlowRunFlags = { inputJson?: string; @@ -51,9 +50,6 @@ export async function handleFlowRun( allowedTools: globalFlags.allowedTools, maxTurns: globalFlags.maxTurns, }, - services: { - github: new GitHubFlowService(), - }, }); const result = await runner.run(flow, input, { diff --git a/src/flows/github.ts b/src/flows/github.ts deleted file mode 100644 index f5754ef..0000000 --- a/src/flows/github.ts +++ /dev/null @@ -1,148 +0,0 @@ -import { execFile } from "node:child_process"; -import { promisify } from "node:util"; - -const execFileAsync = promisify(execFile); - -export type PullRequestContext = { - repo: string; - pr: Record; - linkedIssue: Record | null; - promptContext: string; -}; - -export class GitHubFlowService { - private readonly ghCommand: string; - private readonly maxDiffChars: number; - - constructor(options: { ghCommand?: string; maxDiffChars?: number } = {}) { - this.ghCommand = options.ghCommand ?? "gh"; - this.maxDiffChars = options.maxDiffChars ?? 30_000; - } - - async loadPullRequestContext(options: { - repo: string; - prNumber: number; - }): Promise { - const pr = await this.readJson([ - "pr", - "view", - String(options.prNumber), - "-R", - options.repo, - "--json", - "number,title,body,author,url,additions,deletions,changedFiles,files,baseRefName,headRefName", - ]); - - const linkedIssueNumber = findLinkedIssueNumber(typeof pr.body === "string" ? 
pr.body : ""); - const linkedIssue = linkedIssueNumber - ? await this.readJson([ - "issue", - "view", - String(linkedIssueNumber), - "-R", - options.repo, - "--json", - "number,title,body,url", - ]) - : null; - - const diff = await this.readText(["pr", "diff", String(options.prNumber), "-R", options.repo]); - const truncatedDiff = - diff.length > this.maxDiffChars - ? `${diff.slice(0, this.maxDiffChars)}\n\n[diff truncated at ${this.maxDiffChars} characters]` - : diff; - - return { - repo: options.repo, - pr, - linkedIssue, - promptContext: formatPromptContext({ - repo: options.repo, - pr, - linkedIssue, - diff: truncatedDiff, - }), - }; - } - - private async readJson(args: string[]): Promise> { - const stdout = await this.readText(args); - return JSON.parse(stdout) as Record; - } - - private async readText(args: string[]): Promise { - const result = await execFileAsync(this.ghCommand, args, { - maxBuffer: 10 * 1024 * 1024, - }); - return result.stdout.trim(); - } -} - -function findLinkedIssueNumber(body: string): number | null { - const match = body.match(/\b(?:close[sd]?|fix(?:e[sd])?|resolve[sd]?)\s+#(\d+)\b/i); - return match ? Number(match[1]) : null; -} - -function formatPromptContext(options: { - repo: string; - pr: Record; - linkedIssue: Record | null; - diff: string; -}): string { - const files = Array.isArray(options.pr.files) - ? options.pr.files - .map((file) => { - if (!file || typeof file !== "object") { - return null; - } - const record = file as Record; - return `- ${asString(record.path, "unknown")} (+${asNumber(record.additions, 0)} / -${asNumber(record.deletions, 0)})`; - }) - .filter((line): line is string => Boolean(line)) - .join("\n") - : ""; - - const issueSection = options.linkedIssue - ? 
`Linked issue #${asNumber(options.linkedIssue.number, 0)}: ${asString(options.linkedIssue.title)}\n${asString(options.linkedIssue.body)}` - : "No linked issue was found in the PR body."; - - return [ - `Repository: ${options.repo}`, - `PR #${asNumber(options.pr.number, 0)}: ${asString(options.pr.title)}`, - `URL: ${asString(options.pr.url)}`, - `Author: ${asAuthorLogin(options.pr.author)}`, - `Base: ${asString(options.pr.baseRefName)}`, - `Head: ${asString(options.pr.headRefName)}`, - `Changed files: ${asNumber(options.pr.changedFiles, 0)}`, - `Additions: ${asNumber(options.pr.additions, 0)}`, - `Deletions: ${asNumber(options.pr.deletions, 0)}`, - "", - "PR body:", - asString(options.pr.body, "(empty)"), - "", - "Changed files:", - files || "(none)", - "", - "Underlying issue:", - issueSection, - "", - "Diff:", - options.diff || "(empty diff)", - ].join("\n"); -} - -function asString(value: unknown, fallback = ""): string { - return typeof value === "string" ? value : fallback; -} - -function asNumber(value: unknown, fallback: number): number { - return typeof value === "number" && Number.isFinite(value) ? 
value : fallback; -} - -function asAuthorLogin(value: unknown): string { - if (!value || typeof value !== "object") { - return "unknown"; - } - - return asString((value as { login?: unknown }).login, "unknown"); -} diff --git a/test/flows.test.ts b/test/flows.test.ts index 3c12109..9b061de 100644 --- a/test/flows.test.ts +++ b/test/flows.test.ts @@ -14,7 +14,6 @@ import { extractJsonObject, flowRunsBaseDir, } from "../src/flows.js"; -import { GitHubFlowService } from "../src/flows/github.js"; const MOCK_AGENT_PATH = fileURLToPath(new URL("./mock-agent.js", import.meta.url)); const MOCK_AGENT_COMMAND = `node ${JSON.stringify(MOCK_AGENT_PATH)}`; @@ -142,85 +141,6 @@ test("FlowRunner stops at checkpoint nodes and marks the run as waiting", async }); }); -test("GitHubFlowService builds PR prompt context from gh CLI output", async () => { - const tempDir = await fs.mkdtemp(path.join(os.tmpdir(), "acpx-fake-gh-")); - const scriptPath = path.join(tempDir, "fake-gh.js"); - const launcherPath = - process.platform === "win32" ? 
path.join(tempDir, "gh.cmd") : path.join(tempDir, "gh"); - - await fs.writeFile( - scriptPath, - [ - "#!/usr/bin/env node", - "const args = process.argv.slice(2);", - 'if (args[0] === "pr" && args[1] === "view") {', - " process.stdout.write(JSON.stringify({", - " number: 7,", - ' title: "Flow PR",', - ' body: "Fixes #42",', - ' author: { login: "dev" },', - ' url: "https://example.test/pr/7",', - " additions: 10,", - " deletions: 2,", - " changedFiles: 1,", - ' files: [{ path: "src/flow.ts", additions: 10, deletions: 2 }],', - ' baseRefName: "main",', - ' headRefName: "feature/flow"', - " }));", - '} else if (args[0] === "issue" && args[1] === "view") {', - " process.stdout.write(JSON.stringify({", - " number: 42,", - ' title: "Underlying issue",', - ' body: "Make the flow runner reusable.",', - ' url: "https://example.test/issues/42"', - " }));", - '} else if (args[0] === "pr" && args[1] === "diff") {', - ' process.stdout.write("diff --git a/src/flow.ts b/src/flow.ts\\n+new behavior\\n+more");', - "} else {", - ' process.stderr.write(`unexpected args: ${args.join(" ")}`);', - " process.exit(1);", - "}", - "", - ].join("\n"), - { mode: 0o755 }, - ); - - if (process.platform === "win32") { - await fs.writeFile( - launcherPath, - [`@echo off`, `"${process.execPath}" "${scriptPath}" %*`, ""].join("\r\n"), - "utf8", - ); - } else { - await fs.writeFile( - launcherPath, - [`#!/bin/sh`, `exec "${process.execPath}" "${scriptPath}" "$@"`, ""].join("\n"), - { mode: 0o755 }, - ); - } - - try { - const service = new GitHubFlowService({ - ghCommand: launcherPath, - maxDiffChars: 20, - }); - - const context = await service.loadPullRequestContext({ - repo: "openclaw/acpx", - prNumber: 7, - }); - - assert.equal(context.repo, "openclaw/acpx"); - assert.equal(context.pr.number, 7); - assert.equal(context.linkedIssue?.number, 42); - assert.match(context.promptContext, /Flow PR/); - assert.match(context.promptContext, /Linked issue #42/); - assert.match(context.promptContext, /\[diff 
truncated at 20 characters]/);
-  } finally {
-    await fs.rm(tempDir, { recursive: true, force: true });
-  }
-});
-
 async function withTempHome(run: (homeDir: string) => Promise<void>): Promise<void> {
   const previousHome = process.env.HOME;
   const homeDir = await fs.mkdtemp(path.join(os.tmpdir(), "acpx-flow-home-"));
diff --git a/workflows/pr-triage.flow.ts b/workflows/pr-triage.flow.ts
deleted file mode 100644
index bb6c821..0000000
--- a/workflows/pr-triage.flow.ts
+++ /dev/null
@@ -1,202 +0,0 @@
-import { acp, compute, defineFlow, extractJsonObject } from "../src/flows.js";
-
-type PullRequestContext = {
-  repo: string;
-  promptContext: string;
-};
-
-export default defineFlow({
-  name: "pr-triage",
-  startAt: "load_pr",
-  nodes: {
-    load_pr: compute({
-      run: async ({ input, services }) => {
-        const github = services.github as {
-          loadPullRequestContext(options: {
-            repo: string;
-            prNumber: number;
-          }): Promise<PullRequestContext>;
-        };
-        const flowInput = input as {
-          repo: string;
-          prNumber: number;
-        };
-        return await github.loadPullRequestContext({
-          repo: flowInput.repo,
-          prNumber: flowInput.prNumber,
-        });
-      },
-    }),
-
-    solution_fit: acp({
-      profile: "codex",
-      async prompt({ outputs }) {
-        const context = outputs.load_pr as PullRequestContext;
-        return [
-          "You are doing maintainability-first PR triage.",
-          "Question: is this the right solution for the underlying issue, or is it only a localized fix that does not address the real problem?",
-          "Use only the PR context below.",
-          "Return exactly one JSON object with this shape:",
-          "{",
-          ' "verdict": "right_solution" | "localized_fix" | "wrong_problem" | "unclear",',
-          ' "confidence": 0.0,',
-          ' "reason": "short explanation",',
-          ' "evidence": ["short bullet", "short bullet"]',
-          "}",
-          "",
-          context.promptContext,
-        ].join("\n");
-      },
-      parse: (text) => extractJsonObject(text),
-    }),
-
-    issue_clarity: acp({
-      profile: "codex",
-      async prompt() {
-        return [
-          "Use the PR context already in this session.",
-          "Judge whether the 
underlying issue is clearly framed enough for safe autonomous continuation.", - "If there is no linked issue, decide whether the PR body still makes the underlying problem clear.", - "Return exactly one JSON object with this shape:", - "{", - ' "verdict": "clear" | "ambiguous" | "conflicting",', - ' "confidence": 0.0,', - ' "reason": "short explanation"', - "}", - ].join("\n"); - }, - parse: (text) => extractJsonObject(text), - }), - - scope_assessment: acp({ - profile: "codex", - async prompt() { - return [ - "Use the PR context and earlier reasoning already in this session.", - "Judge whether the scope is appropriately shaped for the codebase.", - "Return exactly one JSON object with this shape:", - "{", - ' "scope": "appropriately_local" | "too_local" | "cross_cutting_needed",', - ' "refactor_needed": "none" | "superficial" | "fundamental",', - ' "human_judgment_needed": true | false,', - ' "reason": "short explanation"', - "}", - ].join("\n"); - }, - parse: (text) => extractJsonObject(text), - }), - - route: compute({ - run: ({ outputs }) => { - const reasons: string[] = []; - const solutionFit = outputs.solution_fit as { - verdict?: string; - }; - const issueClarity = outputs.issue_clarity as { - verdict?: string; - }; - const scopeAssessment = outputs.scope_assessment as { - scope?: string; - refactor_needed?: string; - human_judgment_needed?: boolean; - }; - - if (solutionFit.verdict !== "right_solution") { - reasons.push(`solution_fit=${solutionFit.verdict ?? "unknown"}`); - } - if (issueClarity.verdict !== "clear") { - reasons.push(`issue_clarity=${issueClarity.verdict ?? "unknown"}`); - } - if (scopeAssessment.scope !== "appropriately_local") { - reasons.push(`scope=${scopeAssessment.scope ?? "unknown"}`); - } - if (scopeAssessment.refactor_needed === "fundamental") { - reasons.push("refactor_needed=fundamental"); - } - if (scopeAssessment.human_judgment_needed) { - reasons.push("human_judgment_needed=true"); - } - - return { - next: reasons.length > 0 ? 
"human_review" : "continue_lane", - reasons, - }; - }, - }), - - continue_lane: acp({ - profile: "codex", - async prompt({ outputs }) { - return [ - "We are continuing on the autonomous lane.", - "The runtime routed here because the earlier checks did not raise blockers.", - "Return exactly one JSON object with this shape:", - "{", - ' "route": "continue",', - ' "summary": "short explanation",', - ' "next_actions": ["action", "action"],', - ' "residual_risks": ["risk", "risk"]', - "}", - "", - `Runtime reasons: ${JSON.stringify((outputs.route as { reasons?: string[] }).reasons ?? [])}`, - ].join("\n"); - }, - parse: (text) => extractJsonObject(text), - }), - - human_review: acp({ - profile: "codex", - async prompt({ outputs }) { - return [ - "We are routing this PR to human review.", - "Return exactly one JSON object with this shape:", - "{", - ' "route": "human_review",', - ' "summary": "short explanation",', - ' "blocking_reasons": ["reason", "reason"],', - ' "questions_for_human": ["question", "question"]', - "}", - "", - `Runtime reasons: ${JSON.stringify((outputs.route as { reasons?: string[] }).reasons ?? [])}`, - ].join("\n"); - }, - parse: (text) => extractJsonObject(text), - }), - - finalize: compute({ - run: ({ outputs, state }) => { - const route = outputs.route as { - next: string; - reasons?: string[]; - }; - const branch = - route.next === "continue_lane" ? outputs.continue_lane : outputs.human_review; - - return { - route: (branch as { route?: string }).route, - routeReasons: route.reasons ?? 
[], - final: branch, - sessionBindings: state.sessionBindings, - }; - }, - }), - }, - edges: [ - { from: "load_pr", to: "solution_fit" }, - { from: "solution_fit", to: "issue_clarity" }, - { from: "issue_clarity", to: "scope_assessment" }, - { from: "scope_assessment", to: "route" }, - { - from: "route", - switch: { - on: "$.next", - cases: { - continue_lane: "continue_lane", - human_review: "human_review", - }, - }, - }, - { from: "continue_lane", to: "finalize" }, - { from: "human_review", to: "finalize" }, - ], -}); From 4f7b25203dd59a952916f0fbb93bb5c65e392f2b Mon Sep 17 00:00:00 2001 From: Onur Solmaz <2453968+osolmaz@users.noreply.github.com> Date: Wed, 25 Mar 2026 16:44:59 +0100 Subject: [PATCH 06/22] docs: add generic flow examples --- docs/CLI.md | 1 + examples/flows/README.md | 22 +++++++++++ examples/flows/branch.flow.ts | 66 +++++++++++++++++++++++++++++++++ examples/flows/echo.flow.ts | 30 +++++++++++++++ examples/flows/two-turn.flow.ts | 49 ++++++++++++++++++++++++ 5 files changed, 168 insertions(+) create mode 100644 examples/flows/README.md create mode 100644 examples/flows/branch.flow.ts create mode 100644 examples/flows/echo.flow.ts create mode 100644 examples/flows/two-turn.flow.ts diff --git a/docs/CLI.md b/docs/CLI.md index 02f9839..e3c566a 100644 --- a/docs/CLI.md +++ b/docs/CLI.md @@ -78,6 +78,7 @@ acpx [global_options] flow run [--input-json | --input-file - `--input-file` reads flow input JSON from disk. - `--default-agent` supplies the default agent profile for `acp` nodes that do not pin one. - `acpx` does not ship built-in workload-specific flows; the file is provided by the caller. +- The source repo includes small generic examples under `examples/flows/`. ## Global options diff --git a/examples/flows/README.md b/examples/flows/README.md new file mode 100644 index 0000000..31cea21 --- /dev/null +++ b/examples/flows/README.md @@ -0,0 +1,22 @@ +# Flow Examples + +These are simple source-tree examples for `acpx flow run`. 
+ +- `echo.flow.ts`: one ACP step that returns a JSON reply +- `branch.flow.ts`: ACP classification followed by a deterministic branch into either `continue` or `checkpoint` +- `two-turn.flow.ts`: two ACP prompts in the same implicit main session + +Run them from the repo root: + +```bash +acpx flow run examples/flows/echo.flow.ts \ + --input-json '{"request":"Summarize this repository in one sentence."}' + +acpx flow run examples/flows/branch.flow.ts \ + --input-json '{"task":"FIX: add a regression test for the reconnect bug"}' + +acpx flow run examples/flows/two-turn.flow.ts \ + --input-json '{"topic":"How should we validate a new ACP adapter?"}' +``` + +These examples are generic. `acpx` does not ship workload-specific flows in core. diff --git a/examples/flows/branch.flow.ts b/examples/flows/branch.flow.ts new file mode 100644 index 0000000..d3baac4 --- /dev/null +++ b/examples/flows/branch.flow.ts @@ -0,0 +1,66 @@ +import { acp, checkpoint, defineFlow, extractJsonObject } from "../../src/flows.js"; + +type BranchInput = { + task?: string; +}; + +export default defineFlow({ + name: "example-branch", + startAt: "classify", + nodes: { + classify: acp({ + async prompt({ input }) { + const task = + (input as BranchInput).task ?? 
+ "Investigate a flaky test and decide whether the request is clear enough to continue."; + return [ + "Read the task below.", + "If it is concrete and scoped, route `continue`.", + "If it is ambiguous or needs clarification, route `checkpoint`.", + "Return exactly one JSON object with this shape:", + "{", + ' "route": "continue" | "checkpoint",', + ' "reason": "short explanation"', + "}", + "", + `Task: ${task}`, + ].join("\n"); + }, + parse: (text) => extractJsonObject(text), + }), + continue_lane: acp({ + async prompt({ outputs }) { + return [ + "We are on the continue path.", + "Return exactly one JSON object with this shape:", + "{", + ' "route": "continue",', + ' "summary": "short explanation"', + "}", + "", + `Decision: ${JSON.stringify(outputs.classify)}`, + ].join("\n"); + }, + parse: (text) => extractJsonObject(text), + }), + checkpoint_lane: checkpoint({ + summary: "needs clarification", + run: ({ outputs }) => ({ + route: "checkpoint", + summary: (outputs.classify as { reason?: string }).reason ?? "Needs clarification.", + }), + }), + }, + edges: [ + { + from: "classify", + switch: { + on: "$.route", + cases: { + continue: "continue_lane", + checkpoint: "checkpoint_lane", + }, + }, + }, + ], +}); diff --git a/examples/flows/echo.flow.ts b/examples/flows/echo.flow.ts new file mode 100644 index 0000000..6d918f0 --- /dev/null +++ b/examples/flows/echo.flow.ts @@ -0,0 +1,30 @@ +import { acp, compute, defineFlow, extractJsonObject } from "../../src/flows.js"; + +type EchoInput = { + request?: string; +}; + +export default defineFlow({ + name: "example-echo", + startAt: "reply", + nodes: { + reply: acp({ + async prompt({ input }) { + const request = (input as EchoInput).request ?? 
"Say hello in one short sentence."; + return [ + "Return exactly one JSON object with this shape:", + "{", + ' "reply": "short response"', + "}", + "", + `Request: ${request}`, + ].join("\n"); + }, + parse: (text) => extractJsonObject(text), + }), + finalize: compute({ + run: ({ outputs }) => outputs.reply, + }), + }, + edges: [{ from: "reply", to: "finalize" }], +}); diff --git a/examples/flows/two-turn.flow.ts b/examples/flows/two-turn.flow.ts new file mode 100644 index 0000000..486fd9f --- /dev/null +++ b/examples/flows/two-turn.flow.ts @@ -0,0 +1,49 @@ +import { acp, compute, defineFlow, extractJsonObject } from "../../src/flows.js"; + +type TwoTurnInput = { + topic?: string; +}; + +export default defineFlow({ + name: "example-two-turn", + startAt: "draft", + nodes: { + draft: acp({ + async prompt({ input }) { + const topic = (input as TwoTurnInput).topic ?? "How should we validate a new ACP adapter?"; + return [ + "Write a short draft answer about the topic below.", + "Return exactly one JSON object with this shape:", + "{", + ' "draft": "short paragraph"', + "}", + "", + `Topic: ${topic}`, + ].join("\n"); + }, + parse: (text) => extractJsonObject(text), + }), + refine: acp({ + async prompt({ outputs }) { + return [ + "Use the earlier draft already in this session.", + "Turn it into a concise checklist.", + "Return exactly one JSON object with this shape:", + "{", + ' "checklist": ["item", "item"]', + "}", + "", + `Draft: ${JSON.stringify(outputs.draft)}`, + ].join("\n"); + }, + parse: (text) => extractJsonObject(text), + }), + finalize: compute({ + run: ({ outputs }) => outputs.refine, + }), + }, + edges: [ + { from: "draft", to: "refine" }, + { from: "refine", to: "finalize" }, + ], +}); From 27288f401da19ce011090859f6246b2412db28b7 Mon Sep 17 00:00:00 2001 From: Onur Solmaz <2453968+osolmaz@users.noreply.github.com> Date: Wed, 25 Mar 2026 18:05:52 +0100 Subject: [PATCH 07/22] fix: unwrap external flow module exports --- src/flows/cli.ts | 51 
++++++++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 48 insertions(+), 3 deletions(-)

diff --git a/src/flows/cli.ts b/src/flows/cli.ts
index 8338021..bf685bf 100644
--- a/src/flows/cli.ts
+++ b/src/flows/cli.ts
@@ -80,11 +80,56 @@ async function readFlowInput(flags: FlowRunFlags): Promise {
 async function loadFlowModule(flowPath: string): Promise<FlowDefinition> {
   const module = (await tsImport(pathToFileURL(flowPath).href, import.meta.url)) as {
     default?: unknown;
+    "module.exports"?: unknown;
   };
-  if (!module.default || typeof module.default !== "object") {
-    throw new Error(`Flow module must export a default flow object: ${flowPath}`);
+
+  const candidate = findFlowDefinition(module);
+  if (!candidate) {
+    throw new Error(`Flow module must export a flow object: ${flowPath}`);
   }
-  return module.default as FlowDefinition;
+  return candidate;
+}
+
+function findFlowDefinition(module: {
+  default?: unknown;
+  "module.exports"?: unknown;
+}): FlowDefinition | null {
+  const candidates = [
+    module.default,
+    module["module.exports"],
+    getNestedDefault(module.default),
+    getNestedDefault(module["module.exports"]),
+  ];
+
+  for (const candidate of candidates) {
+    if (isFlowDefinition(candidate)) {
+      return candidate;
+    }
+  }
+
+  return null;
+}
+
+function getNestedDefault(value: unknown): unknown {
+  if (!value || typeof value !== "object" || !("default" in value)) {
+    return null;
+  }
+  return (value as { default?: unknown }).default ?? null;
+}
+
+function isFlowDefinition(value: unknown): value is FlowDefinition {
+  if (!value || typeof value !== "object") {
+    return false;
+  }
+
+  const candidate = value as Partial<FlowDefinition>;
+  return (
+    typeof candidate.name === "string" &&
+    typeof candidate.startAt === "string" &&
+    candidate.nodes !== undefined &&
+    typeof candidate.nodes === "object" &&
+    Array.isArray(candidate.edges)
+  );
 }
 
 function parseJsonInput(raw: string, label: string): unknown {

From 8c3a564f58d7149ec60a5f150d188388442c4867 Mon Sep 17 00:00:00 2001
From: Onur Solmaz <2453968+osolmaz@users.noreply.github.com>
Date: Wed, 25 Mar 2026 21:15:43 +0100
Subject: [PATCH 08/22] feat: add native flow action execution

---
 ...3-25-acpx-flows-production-architecture.md | 401 ++++++++++++++++++
 examples/flows/README.md                      |   4 +
 examples/flows/shell.flow.ts                  |  30 ++
 src/cli.ts                                    |   8 +-
 src/flows.ts                                  |   6 +
 src/flows/cli.ts                              |  11 +
 src/flows/runtime.ts                          | 369 +++++++++++++++-
 src/session-runtime/queue-owner-process.ts    |  13 +
 test/flows.test.ts                            | 157 +++++++
 test/queue-owner-process.test.ts              |  20 +
 10 files changed, 1001 insertions(+), 18 deletions(-)
 create mode 100644 docs/2026-03-25-acpx-flows-production-architecture.md
 create mode 100644 examples/flows/shell.flow.ts

diff --git a/docs/2026-03-25-acpx-flows-production-architecture.md b/docs/2026-03-25-acpx-flows-production-architecture.md
new file mode 100644
index 0000000..a9a39ba
--- /dev/null
+++ b/docs/2026-03-25-acpx-flows-production-architecture.md
@@ -0,0 +1,401 @@
+---
+title: acpx Flows Production Architecture
+description: Production-ready execution model for acpx workflows, with runtime-owned control, native actions, ACP reasoning steps, and strong liveness guarantees.
+author: OpenClaw Team
+date: 2026-03-25
+---
+
+# acpx Flows Production Architecture
+
+## Why this document exists
+
+The first experimental `acpx` flow runner proved that multi-step ACP workflows
+are viable, but it also exposed the wrong execution boundary.
+ +The clearest example was PR triage: + +- the flow itself was structurally fine +- the worker made good judgments +- the run still stalled because a long-running nested `codex review` subprocess + was launched inside an ACP turn and never returned a final result + +This document defines the production-ready architecture for `acpx` flows. + +The goal is not to make the worker do everything. + +The goal is to make the runtime own execution and liveness, while the ACP worker +owns reasoning, judgment, and code changes. + +## Core position + +The correct long-term shape is a hybrid workflow engine: + +- the runtime is the control plane +- ACP workers are reasoning workers +- deterministic mechanics run as native runtime actions + +In other words: + +- the runtime should own step execution, deadlines, retries, heartbeats, + cancellation, state, and side effects +- the worker should own analysis, coding, judgment, summarization, and + decisions that are genuinely model-shaped + +This is the cleanest and most production-ready boundary. + +It is also the most robust answer to the question raised by the prototype: + +why did the flow stall? + +Because a child tool run was hosted inside an ACP turn instead of being owned +and supervised by the runtime. + +## What went wrong in the prototype + +The current prototype runner executes ACP steps synchronously and waits for each +step to finish before persisting completion. + +That is acceptable for simple prompts, but it becomes fragile when an ACP step +tries to orchestrate external mechanics itself. + +In the PR triage case, the failure mode was: + +1. the runtime entered the review step +2. the worker decided to run `codex review` +3. that review launched as a nested subprocess inside the worker turn +4. the review got stuck on transport/runtime behavior +5. the parent ACP turn never returned structured output +6. 
the outer flow looked hung + +This exposed three separate issues: + +- the wrong boundary for deterministic actions +- no explicit per-step liveness signal in run state +- no reliable step deadline or timeout behavior at the flow layer + +## Production model + +### 1. The runtime is the control plane + +The flow runtime should own: + +- flow graph execution +- current node and next node +- step deadlines and timeouts +- retries and retry policy +- run persistence +- heartbeats and staleness detection +- cancellation +- side-effect execution +- idempotency and action receipts + +The runtime must always know: + +- which node is active +- how long it has been active +- whether it is making progress +- whether it timed out +- whether it is blocked on a human or an external dependency + +### 2. ACP steps are for reasoning, not orchestration + +ACP steps should be used for: + +- extracting intent +- judging solution shape +- classifying bug vs feature +- deciding whether refactor is needed +- deciding whether human escalation is required +- editing code when the change is genuinely model-driven +- summarizing findings for a final comment + +ACP steps should not be the place where the model is expected to supervise +long-running deterministic subprocesses. + +That means: + +- do not host `codex review` inside a Codex ACP turn +- do not host `gh api` polling loops inside an ACP turn +- do not host CI approval or CI inspection loops inside an ACP turn + +Those belong to the runtime. + +### 3. 
Native action steps should handle deterministic work + +The runtime should support native action steps for deterministic operations such +as: + +- `git_fetch` +- `checkout_pr` +- `gh_api` +- `codex_review` +- `approve_workflow_run` +- `post_pr_comment` +- `close_pr` +- `run_tests` +- `run_targeted_validation` + +These actions should be: + +- directly observable +- cancellable +- time-bounded +- resumable when possible +- recorded with machine-readable receipts + +The worker can still decide whether they should run, but the runtime should +actually execute them. + +### 4. One durable run state, updated while the step is still active + +The flow runtime must persist live state before awaiting a step result. + +At minimum, run state should include: + +- `status` +- `currentNode` +- `currentNodeKind` +- `currentNodeStartedAt` +- `lastHeartbeatAt` +- `statusDetail` +- `outputs` +- `steps` +- `sessionBindings` +- `waitingOn` +- `error` + +This avoids the current ambiguity where `run.json` only changes after a node +completes and a healthy run looks frozen. + +### 5. Every long-running step needs heartbeat, deadline, and cancellation + +Every `acp` or `action` step should support: + +- `timeoutMs` +- optional heartbeat updates +- cancellation on timeout +- explicit terminal result if timed out + +For ACP steps: + +- timeout should cancel the active session prompt if possible + +For native action steps: + +- timeout should kill the child process and mark the step `timed_out` + +This is the minimum production liveness contract. + +### 6. Side effects must be idempotent and recorded + +A production workflow runtime must assume retries and restarts. + +For effectful steps such as posting comments or closing PRs, the runtime should +store receipts such as: + +- GitHub comment id +- workflow run id +- CI approval id +- commit sha +- pushed branch sha + +That allows safe resume and retry behavior without duplicated actions. + +### 7. 
Session handling should stay simple + +The session model should remain: + +- one main ACP session by default +- explicit isolated sessions only when a step truly needs a blind or separate + conversation + +The runtime should track those bindings internally. + +The flow author should usually think in terms of: + +- main reasoning session +- isolated critic session when needed + +not in terms of queue-owner mechanics or persistence internals. + +## Recommended step model + +The core step kinds should stay small: + +- `acp` +- `compute` +- `action` +- `checkpoint` + +But the semantics should be tighter. + +### `acp` + +Use for model-shaped work: + +- judgment +- code generation +- summarization +- route recommendation + +### `compute` + +Use for local pure transforms: + +- normalizing outputs +- computing branch keys +- reducing multiple findings into one route + +### `action` + +Use for deterministic external work supervised by the runtime: + +- git commands +- GitHub API calls +- test execution +- local `codex review` +- comment posting +- CI approval + +### `checkpoint` + +Use for explicit wait states: + +- human approval +- external webhook +- workflow approval gate that the runtime cannot clear + +## PR triage under the production model + +The PR triage workflow should still follow the same logical flow, but some +current ACP steps should become native actions. + +A better shape is: + +1. `load_pr` — `action` +2. `prepare_workspace` — `action` +3. `extract_intent` — `acp` +4. `judge_implementation_or_solution` — `acp` +5. `bug_or_feature` — `acp` +6. `reproduce_bug_and_test_fix` or `test_feature_directly` — `action` +7. `judge_refactor` — `acp` +8. `collect_github_review_state` — `action` +9. `run_local_codex_review` — `action` +10. `judge_review_outcome` — `acp` or `compute` +11. `check_ci_state` — `action` +12. `fix_ci_failures` — `acp` plus `action` test steps as needed +13. `render_final_comment` — `acp` +14. 
`post_comment` / `close_pr` / `checkpoint` — `action` or `checkpoint` + +This keeps the worker in charge of judgment while making execution much more +reliable. + +## What the runtime should expose + +The runtime should eventually expose: + +- per-node timeout configuration +- per-node heartbeat policy +- per-action retry policy +- per-action idempotency keys +- live `flow status` +- `flow cancel` +- `flow resume` +- step receipts in run state + +This does not require a giant orchestration DSL. + +It requires a small set of strong primitives. + +## Failure model + +The runtime should distinguish these states clearly: + +- `running` +- `waiting` +- `completed` +- `failed` +- `timed_out` +- `cancelled` + +And these error classes should be surfaced distinctly when possible: + +- child process hung +- child process failed +- ACP prompt timed out +- external API failed +- blocked on permission gate +- blocked on human approval +- invalid step output + +That makes debugging and operator behavior much cleaner. + +## Incremental path from the current implementation + +The best migration path is: + +### Step 1: improve liveness and observability + +Add: + +- `currentNode` +- `currentNodeStartedAt` +- `lastHeartbeatAt` +- live `run.json` updates at node start +- per-node `timeoutMs` + +This should land first. + +### Step 2: add native action execution + +Keep the same graph model, but make deterministic work first-class: + +- command-backed actions +- GitHub-backed actions +- review/test actions + +This should land second. + +### Step 3: move recursive mechanics out of ACP prompts + +Refactor workflows so they stop asking the worker to supervise: + +- `codex review` +- `gh api` +- CI approval loops +- comment posting + +The runtime should do those directly. + +### Step 4: add receipts and idempotency + +This makes comment posting, PR closing, and CI approval safe under retries and +resume. 
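
The receipt check itself is small. A minimal sketch, assuming an in-memory store keyed by an idempotency key — `ReceiptStore`, `runEffectOnce`, and the key format here are illustrative, not an existing `acpx` API:

```typescript
type Receipt = { id: string; recordedAt: string };

// Minimal in-memory receipt store; a real runtime would persist this in run state.
class ReceiptStore {
  private receipts = new Map<string, Receipt>();

  get(key: string): Receipt | undefined {
    return this.receipts.get(key);
  }

  put(key: string, receipt: Receipt): void {
    this.receipts.set(key, receipt);
  }
}

// Run a side effect at most once per idempotency key; a retried or resumed
// run finds the stored receipt and skips the duplicate effect.
async function runEffectOnce(
  store: ReceiptStore,
  key: string,
  effect: () => Promise<Receipt>,
): Promise<Receipt> {
  const existing = store.get(key);
  if (existing) {
    return existing;
  }
  const receipt = await effect();
  store.put(key, receipt);
  return receipt;
}
```

The runtime would derive the key from stable run identity (for example, run id plus node id plus target), so a resumed run replays the recorded receipt instead of posting a second comment or approving CI twice.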
+ +## What not to do + +Do not move toward: + +- a single giant conversational agent that does everything +- recursive agent-inside-agent orchestration for core mechanics +- implicit run state that only exists in model context +- prose-only routing for effectful decisions + +That shape may feel flexible at first, but it is the least production-ready +option. + +## Final position + +The most production-ready `acpx` flow architecture is: + +- durable runtime-owned workflow execution +- native deterministic action steps +- ACP reasoning steps for judgment and coding +- explicit liveness, heartbeat, timeout, and cancellation +- idempotent recorded side effects + +That is the cleanest long-term model. + +It is also the most credible path if `acpx` wants flows that survive real, +long-running, autonomous workloads without turning the worker itself into a +fragile orchestration layer. diff --git a/examples/flows/README.md b/examples/flows/README.md index 31cea21..7091d0b 100644 --- a/examples/flows/README.md +++ b/examples/flows/README.md @@ -4,6 +4,7 @@ These are simple source-tree examples for `acpx flow run`. 
- `echo.flow.ts`: one ACP step that returns a JSON reply - `branch.flow.ts`: ACP classification followed by a deterministic branch into either `continue` or `checkpoint` +- `shell.flow.ts`: one native runtime-owned shell action that returns structured JSON - `two-turn.flow.ts`: two ACP prompts in the same implicit main session Run them from the repo root: @@ -15,6 +16,9 @@ acpx flow run examples/flows/echo.flow.ts \ acpx flow run examples/flows/branch.flow.ts \ --input-json '{"task":"FIX: add a regression test for the reconnect bug"}' +acpx flow run examples/flows/shell.flow.ts \ + --input-json '{"text":"hello from shell"}' + acpx flow run examples/flows/two-turn.flow.ts \ --input-json '{"topic":"How should we validate a new ACP adapter?"}' ``` diff --git a/examples/flows/shell.flow.ts b/examples/flows/shell.flow.ts new file mode 100644 index 0000000..59c4fae --- /dev/null +++ b/examples/flows/shell.flow.ts @@ -0,0 +1,30 @@ +import { compute, defineFlow, extractJsonObject, shell } from "../../src/flows.js"; + +type ShellInput = { + text?: string; +}; + +export default defineFlow({ + name: "example-shell", + startAt: "transform", + nodes: { + transform: shell({ + async exec({ input }) { + const text = (input as ShellInput).text ?? 
"hello from shell"; + return { + command: process.execPath, + args: [ + "-e", + `process.stdout.write(JSON.stringify({ original: ${JSON.stringify(text)}, upper: ${JSON.stringify(text.toUpperCase())} }))`, + ], + }; + }, + parse: (result) => extractJsonObject(result.stdout), + statusDetail: "Run native shell-backed action", + }), + finalize: compute({ + run: ({ outputs }) => outputs.transform, + }), + }, + edges: [{ from: "transform", to: "finalize" }], +}); diff --git a/src/cli.ts b/src/cli.ts index ab81ac2..91b099a 100644 --- a/src/cli.ts +++ b/src/cli.ts @@ -1,12 +1,18 @@ #!/usr/bin/env node import { realpathSync } from "node:fs"; -import { pathToFileURL } from "node:url"; +import { fileURLToPath, pathToFileURL } from "node:url"; import { main } from "./cli-core.js"; export { formatPromptSessionBannerLine } from "./cli-core.js"; export { parseAllowedTools, parseMaxTurns, parseTtlSeconds } from "./cli/flags.js"; +process.env.ACPX_QUEUE_OWNER_ARGS ??= JSON.stringify([ + ...process.execArgv, + fileURLToPath(import.meta.url), + "__queue-owner", +]); + function isCliEntrypoint(argv: string[]): boolean { const entry = argv[1]; if (!entry) { diff --git a/src/flows.ts b/src/flows.ts index 7529b5e..7ae3665 100644 --- a/src/flows.ts +++ b/src/flows.ts @@ -6,10 +6,13 @@ export { compute, defineFlow, flowRunsBaseDir, + shell, + type FlowNodeCommon, type AcpNodeDefinition, type ActionNodeDefinition, type CheckpointNodeDefinition, type ComputeNodeDefinition, + type FunctionActionNodeDefinition, type FlowDefinition, type FlowEdge, type FlowNodeContext, @@ -19,5 +22,8 @@ export { type FlowRunnerOptions, type FlowSessionBinding, type FlowStepRecord, + type ShellActionExecution, + type ShellActionNodeDefinition, + type ShellActionResult, } from "./flows/runtime.js"; export { extractJsonObject } from "./flows/json.js"; diff --git a/src/flows/cli.ts b/src/flows/cli.ts index bf685bf..6a8722e 100644 --- a/src/flows/cli.ts +++ b/src/flows/cli.ts @@ -152,6 +152,11 @@ function 
printFlowRunResult( flowName: result.state.flowName, flowPath: result.state.flowPath, status: result.state.status, + currentNode: result.state.currentNode, + currentNodeKind: result.state.currentNodeKind, + currentNodeStartedAt: result.state.currentNodeStartedAt, + lastHeartbeatAt: result.state.lastHeartbeatAt, + statusDetail: result.state.statusDetail, waitingOn: result.state.waitingOn, runDir: result.runDir, outputs: result.state.outputs, @@ -172,6 +177,12 @@ function printFlowRunResult( process.stdout.write(`flow: ${payload.flowName}\n`); process.stdout.write(`status: ${payload.status}\n`); process.stdout.write(`runDir: ${payload.runDir}\n`); + if (payload.currentNode) { + process.stdout.write(`currentNode: ${payload.currentNode}\n`); + } + if (payload.statusDetail) { + process.stdout.write(`statusDetail: ${payload.statusDetail}\n`); + } if (payload.waitingOn) { process.stdout.write(`waitingOn: ${payload.waitingOn}\n`); } diff --git a/src/flows/runtime.ts b/src/flows/runtime.ts index 1f98dfa..bd80b22 100644 --- a/src/flows/runtime.ts +++ b/src/flows/runtime.ts @@ -1,3 +1,4 @@ +import { spawn } from "node:child_process"; import { randomUUID } from "node:crypto"; import fs from "node:fs/promises"; import os from "node:os"; @@ -5,7 +6,14 @@ import path from "node:path"; import { createOutputFormatter } from "../output.js"; import { promptToDisplayText, textPrompt } from "../prompt-content.js"; import { resolveSessionRecord } from "../session-persistence.js"; -import { createSession, runOnce, sendSession, type SessionAgentOptions } from "../session.js"; +import { TimeoutError, withTimeout } from "../session-runtime-helpers.js"; +import { + cancelSessionPrompt, + createSession, + runOnce, + sendSession, + type SessionAgentOptions, +} from "../session.js"; import type { AuthPolicy, McpServer, @@ -15,6 +23,7 @@ import type { } from "../types.js"; type MaybePromise = T | Promise; +const DEFAULT_FLOW_HEARTBEAT_MS = 5_000; export type FlowNodeContext = { input: TInput; @@ 
-23,6 +32,12 @@ export type FlowNodeContext = { services: Record; }; +export type FlowNodeCommon = { + timeoutMs?: number; + heartbeatMs?: number; + statusDetail?: string; +}; + export type FlowEdge = | { from: string; @@ -36,7 +51,7 @@ export type FlowEdge = }; }; -export type AcpNodeDefinition = { +export type AcpNodeDefinition = FlowNodeCommon & { kind: "acp"; profile?: string; session?: { @@ -47,17 +62,48 @@ export type AcpNodeDefinition = { parse?: (text: string, context: FlowNodeContext) => MaybePromise; }; -export type ComputeNodeDefinition = { +export type ComputeNodeDefinition = FlowNodeCommon & { kind: "compute"; run: (context: FlowNodeContext) => MaybePromise; }; -export type ActionNodeDefinition = { +export type FunctionActionNodeDefinition = FlowNodeCommon & { kind: "action"; run: (context: FlowNodeContext) => MaybePromise; }; -export type CheckpointNodeDefinition = { +export type ShellActionExecution = { + command: string; + args?: string[]; + cwd?: string; + env?: Record; + stdin?: string; + shell?: boolean | string; + allowNonZeroExit?: boolean; + timeoutMs?: number; +}; + +export type ShellActionResult = { + command: string; + args: string[]; + cwd: string; + stdout: string; + stderr: string; + combinedOutput: string; + exitCode: number | null; + signal: NodeJS.Signals | null; + durationMs: number; +}; + +export type ShellActionNodeDefinition = FlowNodeCommon & { + kind: "action"; + exec: (context: FlowNodeContext) => MaybePromise; + parse?: (result: ShellActionResult, context: FlowNodeContext) => MaybePromise; +}; + +export type ActionNodeDefinition = FunctionActionNodeDefinition | ShellActionNodeDefinition; + +export type CheckpointNodeDefinition = FlowNodeCommon & { kind: "checkpoint"; summary?: string; run?: (context: FlowNodeContext) => MaybePromise; @@ -112,11 +158,16 @@ export type FlowRunState = { startedAt: string; finishedAt?: string; updatedAt: string; - status: "running" | "waiting" | "completed" | "failed"; + status: "running" | 
"waiting" | "completed" | "failed" | "timed_out"; input: unknown; outputs: Record; steps: FlowStepRecord[]; sessionBindings: Record; + currentNode?: string; + currentNodeKind?: FlowNodeDefinition["kind"]; + currentNodeStartedAt?: string; + lastHeartbeatAt?: string; + statusDetail?: string; waitingOn?: string; error?: string; }; @@ -168,7 +219,24 @@ export function compute(definition: Omit): Comput }; } -export function action(definition: Omit): ActionNodeDefinition { +export function action( + definition: Omit, +): FunctionActionNodeDefinition; +export function action( + definition: Omit, +): ShellActionNodeDefinition; +export function action( + definition: Omit | Omit, +): ActionNodeDefinition { + return { + kind: "action", + ...definition, + } as ActionNodeDefinition; +} + +export function shell( + definition: Omit, +): ShellActionNodeDefinition { return { kind: "action", ...definition, @@ -264,14 +332,48 @@ export class FlowRunner { let rawText: string | null = null; let sessionInfo: FlowSessionBinding | null = null; let agentInfo: ReturnType | null = null; + this.markNodeStarted(state, current, node.kind, startedAt, node.statusDetail); + await this.persist(runDir, state, { + type: "node_started", + nodeId: current, + kind: node.kind, + }); switch (node.kind) { case "compute": { - output = await node.run(context); + output = await this.runWithHeartbeat(runDir, state, current, node, async () => { + return await withTimeout( + Promise.resolve(node.run(context)), + node.timeoutMs ?? this.timeoutMs, + ); + }); break; } case "action": { - output = await node.run(context); + if ("run" in node) { + output = await this.runWithHeartbeat(runDir, state, current, node, async () => { + return await withTimeout( + Promise.resolve(node.run(context)), + node.timeoutMs ?? this.timeoutMs, + ); + }); + } else { + const execution = await Promise.resolve(node.exec(context)); + const effectiveExecution: ShellActionExecution = { + ...execution, + timeoutMs: execution.timeoutMs ?? 
node.timeoutMs ?? this.timeoutMs, + }; + this.updateStatusDetail(state, formatShellActionSummary(effectiveExecution)); + await this.persist(runDir, state, { + type: "node_detail", + nodeId: current, + detail: state.statusDetail, + }); + const result = await this.runWithHeartbeat(runDir, state, current, node, async () => { + return await runShellAction(effectiveExecution); + }); + output = node.parse ? await node.parse(result, context) : result; + } break; } case "checkpoint": { @@ -286,6 +388,7 @@ export class FlowRunner { state.waitingOn = current; state.updatedAt = isoNow(); state.status = "waiting"; + this.clearActiveNode(state, node.summary ?? current); state.steps.push({ nodeId: current, kind: node.kind, @@ -308,15 +411,50 @@ export class FlowRunner { }; } case "acp": { - agentInfo = this.resolveAgent(node.profile); + const resolvedAgent = this.resolveAgent(node.profile); + agentInfo = resolvedAgent; const prompt = normalizePromptInput(await node.prompt(context)); promptText = promptToDisplayText(prompt); + this.updateStatusDetail(state, summarizePrompt(promptText, node.statusDetail)); + await this.persist(runDir, state, { + type: "node_detail", + nodeId: current, + detail: state.statusDetail, + }); if (node.session?.isolated) { - rawText = await this.runIsolatedPrompt(agentInfo, prompt); + rawText = await this.runWithHeartbeat(runDir, state, current, node, async () => { + return await this.runIsolatedPrompt( + resolvedAgent, + prompt, + node.timeoutMs ?? 
this.timeoutMs, + ); + }); } else { - sessionInfo = await this.ensureSessionBinding(state, flow, node, agentInfo); - rawText = await this.runPersistentPrompt(sessionInfo, prompt); - sessionInfo = await this.refreshSessionBinding(sessionInfo); + const boundSession = await this.ensureSessionBinding( + state, + flow, + node, + resolvedAgent, + ); + sessionInfo = boundSession; + rawText = await this.runWithHeartbeat( + runDir, + state, + current, + node, + async () => + await this.runPersistentPrompt( + boundSession, + prompt, + node.timeoutMs ?? this.timeoutMs, + ), + async () => { + await cancelSessionPrompt({ + sessionId: boundSession.acpxRecordId, + }); + }, + ); + sessionInfo = await this.refreshSessionBinding(boundSession); state.sessionBindings[sessionInfo.key] = sessionInfo; } output = node.parse ? await node.parse(rawText, context) : rawText; @@ -330,6 +468,7 @@ export class FlowRunner { state.outputs[current] = output; state.updatedAt = isoNow(); + this.clearActiveNode(state); state.steps.push({ nodeId: current, kind: node.kind, @@ -354,16 +493,20 @@ export class FlowRunner { state.status = "completed"; state.finishedAt = isoNow(); state.updatedAt = state.finishedAt; + this.clearActiveNode(state); await this.persist(runDir, state, { type: "run_completed" }); return { runDir, state, }; } catch (error) { - state.status = "failed"; + state.status = error instanceof TimeoutError ? "timed_out" : "failed"; state.updatedAt = isoNow(); state.finishedAt = state.updatedAt; state.error = error instanceof Error ? error.message : String(error); + state.statusDetail = state.currentNode + ? 
`Failed in ${state.currentNode}: ${state.error}`
+        : state.error;
       await this.persist(runDir, state, {
         type: "run_failed",
         error: state.error,
@@ -381,6 +524,83 @@ export class FlowRunner {
     };
   }
 
+  private markNodeStarted(
+    state: FlowRunState,
+    nodeId: string,
+    kind: FlowNodeDefinition["kind"],
+    startedAt: string,
+    detail?: string,
+  ): void {
+    state.status = "running";
+    state.waitingOn = undefined;
+    state.currentNode = nodeId;
+    state.currentNodeKind = kind;
+    state.currentNodeStartedAt = startedAt;
+    state.lastHeartbeatAt = startedAt;
+    state.statusDetail = detail ?? `Running ${kind} node ${nodeId}`;
+  }
+
+  private clearActiveNode(state: FlowRunState, detail?: string): void {
+    state.currentNode = undefined;
+    state.currentNodeKind = undefined;
+    state.currentNodeStartedAt = undefined;
+    state.lastHeartbeatAt = undefined;
+    state.statusDetail = detail;
+  }
+
+  private updateStatusDetail(state: FlowRunState, detail?: string): void {
+    if (!detail) {
+      return;
+    }
+    state.statusDetail = detail;
+  }
+
+  private async runWithHeartbeat<T>(
+    runDir: string,
+    state: FlowRunState,
+    nodeId: string,
+    node: FlowNodeCommon,
+    run: () => Promise<T>,
+    onTimeout?: () => Promise<void>,
+  ): Promise<T> {
+    const heartbeatMs = Math.max(0, Math.round(node.heartbeatMs ?? 
DEFAULT_FLOW_HEARTBEAT_MS));
+    let timer: NodeJS.Timeout | undefined;
+    let active = true;
+    const heartbeat = async (): Promise<void> => {
+      if (!active) {
+        return;
+      }
+      state.lastHeartbeatAt = isoNow();
+      state.updatedAt = state.lastHeartbeatAt;
+      await this.persist(runDir, state, {
+        type: "node_heartbeat",
+        nodeId,
+      });
+    };
+
+    if (heartbeatMs > 0) {
+      timer = setInterval(() => {
+        void heartbeat();
+      }, heartbeatMs);
+    }
+
+    try {
+      return await run();
+    } catch (error) {
+      if (error instanceof TimeoutError && onTimeout) {
+        await onTimeout().catch(() => {
+          // best effort cancellation only
+        });
+      }
+      throw error;
+    } finally {
+      active = false;
+      if (timer) {
+        clearInterval(timer);
+      }
+    }
+  }
+
   private async ensureSessionBinding(
     state: FlowRunState,
     flow: FlowDefinition,
@@ -437,6 +657,7 @@ export class FlowRunner {
   private async runPersistentPrompt(
     binding: FlowSessionBinding,
     prompt: PromptInput,
+    timeoutMs?: number,
   ): Promise<string> {
     const capture = createQuietCaptureOutput();
     await sendSession({
@@ -449,7 +670,7 @@ export class FlowRunner {
       authPolicy: this.authPolicy,
       outputFormatter: capture.formatter,
       suppressSdkConsoleErrors: this.suppressSdkConsoleErrors,
-      timeoutMs: this.timeoutMs,
+      timeoutMs,
       ttlMs: this.ttlMs,
       verbose: this.verbose,
       waitForCompletion: true,
@@ -460,6 +681,7 @@ export class FlowRunner {
   private async runIsolatedPrompt(
     agent: ReturnType,
     prompt: PromptInput,
+    timeoutMs?: number,
  ): Promise<string> {
     const capture = createQuietCaptureOutput();
     await runOnce({
@@ -473,7 +695,7 @@ export class FlowRunner {
       authPolicy: this.authPolicy,
       outputFormatter: capture.formatter,
       suppressSdkConsoleErrors: this.suppressSdkConsoleErrors,
-      timeoutMs: this.timeoutMs,
+      timeoutMs,
       verbose: this.verbose,
       sessionOptions: this.sessionOptions,
     });
@@ -566,6 +788,119 @@ function getByPath(value: unknown, jsonPath: string): unknown {
   }, value);
 }
 
+function summarizePrompt(promptText: string, explicitDetail?: string): string {
+  if (explicitDetail) {
+    return 
explicitDetail;
+  }
+
+  const line = promptText
+    .split("\n")
+    .map((candidate) => candidate.trim())
+    .find((candidate) => candidate.length > 0);
+
+  if (!line) {
+    return "Running ACP prompt";
+  }
+
+  const truncated = line.length > 120 ? `${line.slice(0, 117)}...` : line;
+  return `ACP: ${truncated}`;
+}
+
+function formatShellActionSummary(spec: ShellActionExecution): string {
+  return `shell: ${renderShellCommand(spec.command, spec.args ?? [])}`;
+}
+
+function renderShellCommand(command: string, args: string[]): string {
+  const renderedArgs = args.map((arg) => JSON.stringify(arg)).join(" ");
+  return renderedArgs.length > 0 ? `${command} ${renderedArgs}` : command;
+}
+
+async function runShellAction(spec: ShellActionExecution): Promise<ShellActionResult> {
+  const cwd = spec.cwd ?? process.cwd();
+  const args = spec.args ?? [];
+  const startMs = Date.now();
+  const child = spawn(spec.command, args, {
+    cwd,
+    env: {
+      ...process.env,
+      ...spec.env,
+    },
+    shell: spec.shell,
+    stdio: ["pipe", "pipe", "pipe"],
+    windowsHide: true,
+  });
+
+  let stdout = "";
+  let stderr = "";
+  let timedOut = false;
+  let timeout: NodeJS.Timeout | undefined;
+
+  const finish = new Promise<ShellActionResult>((resolve, reject) => {
+    child.stdout.setEncoding("utf8");
+    child.stderr.setEncoding("utf8");
+    child.stdout.on("data", (chunk: string) => {
+      stdout += chunk;
+    });
+    child.stderr.on("data", (chunk: string) => {
+      stderr += chunk;
+    });
+
+    child.once("error", reject);
+    child.once("exit", (exitCode, signal) => {
+      const result: ShellActionResult = {
+        command: spec.command,
+        args,
+        cwd,
+        stdout,
+        stderr,
+        combinedOutput: `${stdout}${stderr}`,
+        exitCode,
+        signal,
+        durationMs: Date.now() - startMs,
+      };
+
+      if (timedOut) {
+        reject(new TimeoutError(spec.timeoutMs ?? 0));
+        return;
+      }
+
+      if ((exitCode ?? 0) !== 0 && spec.allowNonZeroExit !== true) {
+        reject(
+          new Error(
+            `Shell action failed (${renderShellCommand(spec.command, args)}): exit ${String(exitCode)}${stderr.length > 0 ? 
`\n${stderr.trim()}` : ""}`,
+          ),
+        );
+        return;
+      }
+
+      resolve(result);
+    });
+  });
+
+  if (spec.stdin != null) {
+    child.stdin.write(spec.stdin);
+  }
+  child.stdin.end();
+
+  if (spec.timeoutMs != null && spec.timeoutMs > 0) {
+    timeout = setTimeout(() => {
+      timedOut = true;
+      child.kill("SIGTERM");
+      setTimeout(() => {
+        child.kill("SIGKILL");
+      }, 1_000).unref();
+    }, spec.timeoutMs);
+  }
+
+  try {
+    return await finish;
+  } finally {
+    if (timeout) {
+      clearTimeout(timeout);
+    }
+  }
+}
+
 function createQuietCaptureOutput(): {
   formatter: ReturnType<typeof createOutputFormatter>;
   read: () => string;
diff --git a/src/session-runtime/queue-owner-process.ts b/src/session-runtime/queue-owner-process.ts
index 0e67693..527ed97 100644
--- a/src/session-runtime/queue-owner-process.ts
+++ b/src/session-runtime/queue-owner-process.ts
@@ -34,6 +34,19 @@ type SessionSendLike = {
 };
 
 export function resolveQueueOwnerSpawnArgs(argv: readonly string[] = process.argv): string[] {
+  const override = process.env.ACPX_QUEUE_OWNER_ARGS;
+  if (override) {
+    const parsed = JSON.parse(override) as unknown;
+    if (
+      Array.isArray(parsed) &&
+      parsed.length > 0 &&
+      parsed.every((value) => typeof value === "string" && value.length > 0)
+    ) {
+      return [...parsed];
+    }
+    throw new Error("acpx self-spawn failed: invalid ACPX_QUEUE_OWNER_ARGS");
+  }
+
   const entry = argv[1];
   if (!entry || entry.trim().length === 0) {
     throw new Error("acpx self-spawn failed: missing CLI entry path");
diff --git a/test/flows.test.ts b/test/flows.test.ts
index 9b061de..0e35f07 100644
--- a/test/flows.test.ts
+++ b/test/flows.test.ts
@@ -13,7 +13,9 @@ import {
   defineFlow,
   extractJsonObject,
   flowRunsBaseDir,
+  shell,
 } from "../src/flows.js";
+import { TimeoutError } from "../src/session-runtime-helpers.js";
 
 const MOCK_AGENT_PATH = fileURLToPath(new URL("./mock-agent.js", import.meta.url));
 const MOCK_AGENT_COMMAND = `node ${JSON.stringify(MOCK_AGENT_PATH)}`;
@@ -141,6 +143,129 @@ test("FlowRunner stops at checkpoint nodes and marks the run 
as waiting", async }); }); +test("FlowRunner executes native shell actions and parses structured output", async () => { + await withTempHome(async () => { + const runner = new FlowRunner({ + resolveAgent: () => ({ + agentName: "unused", + agentCommand: "unused", + cwd: process.cwd(), + }), + permissionMode: "approve-all", + outputRoot: await fs.mkdtemp(path.join(os.tmpdir(), "acpx-flow-store-")), + }); + + const flow = defineFlow({ + name: "shell-test", + startAt: "transform", + nodes: { + transform: shell({ + exec: () => ({ + command: process.execPath, + args: ["-e", 'process.stdout.write(JSON.stringify({ok:true, value:"shell"}))'], + }), + parse: (result) => extractJsonObject(result.stdout), + }), + }, + edges: [], + }); + + const result = await runner.run(flow, {}); + assert.equal(result.state.status, "completed"); + assert.deepEqual(result.state.outputs.transform, { ok: true, value: "shell" }); + }); +}); + +test("FlowRunner persists active node state while a shell step is running", async () => { + await withTempHome(async () => { + const outputRoot = await fs.mkdtemp(path.join(os.tmpdir(), "acpx-flow-store-")); + const runner = new FlowRunner({ + resolveAgent: () => ({ + agentName: "unused", + agentCommand: "unused", + cwd: process.cwd(), + }), + permissionMode: "approve-all", + outputRoot, + }); + + const flow = defineFlow({ + name: "heartbeat-test", + startAt: "slow", + nodes: { + slow: shell({ + heartbeatMs: 25, + exec: () => ({ + command: process.execPath, + args: [ + "-e", + "setTimeout(() => process.stdout.write(JSON.stringify({done:true})), 150)", + ], + }), + parse: (result) => extractJsonObject(result.stdout), + }), + }, + edges: [], + }); + + const runPromise = runner.run(flow, {}); + const runDir = await waitForRunDir(outputRoot, "heartbeat-test"); + const activeState = await waitFor(async () => { + const state = await readRunJson(runDir); + if (state.currentNode === "slow" && state.status === "running") { + return state; + } + return null; + }, 
2_000);
+
+    assert.equal(activeState.currentNode, "slow");
+    assert.equal(activeState.currentNodeKind, "action");
+    assert.ok(typeof activeState.currentNodeStartedAt === "string");
+    assert.ok(typeof activeState.lastHeartbeatAt === "string");
+
+    const result = await runPromise;
+    assert.equal(result.state.status, "completed");
+    assert.equal(result.state.currentNode, undefined);
+  });
+});
+
+test("FlowRunner marks timed out shell steps explicitly", async () => {
+  await withTempHome(async () => {
+    const outputRoot = await fs.mkdtemp(path.join(os.tmpdir(), "acpx-flow-store-"));
+    const runner = new FlowRunner({
+      resolveAgent: () => ({
+        agentName: "unused",
+        agentCommand: "unused",
+        cwd: process.cwd(),
+      }),
+      permissionMode: "approve-all",
+      outputRoot,
+    });
+
+    const flow = defineFlow({
+      name: "timeout-test",
+      startAt: "slow",
+      nodes: {
+        slow: shell({
+          exec: () => ({
+            command: process.execPath,
+            args: ["-e", "setTimeout(() => {}, 1000)"],
+            timeoutMs: 50,
+          }),
+        }),
+      },
+      edges: [],
+    });
+
+    await assert.rejects(async () => await runner.run(flow, {}), TimeoutError);
+    const runDir = await waitForRunDir(outputRoot, "timeout-test");
+    const state = await readRunJson(runDir);
+    assert.equal(state.status, "timed_out");
+    assert.equal(state.currentNode, "slow");
+    assert.match(String(state.error), /Timed out after 50ms/);
+  });
+});
+
 async function withTempHome(run: (homeDir: string) => Promise<void>): Promise<void> {
   const previousHome = process.env.HOME;
   const homeDir = await fs.mkdtemp(path.join(os.tmpdir(), "acpx-flow-home-"));
@@ -161,3 +286,35 @@ async function withTempHome(run: (homeDir: string) => Promise<void>): Promise<void> {
+
+async function waitForRunDir(outputRoot: string, flowName: string): Promise<string> {
+  return await waitFor(async () => {
+    const entries = await fs.readdir(outputRoot);
+    const match = entries.find((entry) => entry.includes(flowName));
+    return match ? 
path.join(outputRoot, match) : null;
+  }, 2_000);
+}
+
+async function readRunJson(runDir: string): Promise<Record<string, unknown>> {
+  const payload = await fs.readFile(path.join(runDir, "run.json"), "utf8");
+  return JSON.parse(payload) as Record<string, unknown>;
+}
+
+async function waitFor<T>(fn: () => Promise<T | null>, timeoutMs: number): Promise<T> {
+  const deadline = Date.now() + timeoutMs;
+  let lastError: unknown;
+
+  while (Date.now() < deadline) {
+    try {
+      const value = await fn();
+      if (value != null) {
+        return value;
+      }
+    } catch (error) {
+      lastError = error;
+    }
+    await new Promise((resolve) => setTimeout(resolve, 20));
+  }
+
+  throw lastError instanceof Error ? lastError : new Error("Timed out waiting for condition");
+}
diff --git a/test/queue-owner-process.test.ts b/test/queue-owner-process.test.ts
index c72fe48..8a11ab2 100644
--- a/test/queue-owner-process.test.ts
+++ b/test/queue-owner-process.test.ts
@@ -16,6 +16,26 @@ async function withTempDir(run: (dir: string) => Promise<void>): Promise<void> {
 }
 
 describe("resolveQueueOwnerSpawnArgs", () => {
+  it("prefers ACPX_QUEUE_OWNER_ARGS when provided", () => {
+    const previous = process.env.ACPX_QUEUE_OWNER_ARGS;
+    process.env.ACPX_QUEUE_OWNER_ARGS = JSON.stringify([
+      "--import",
+      "tsx",
+      "src/cli.ts",
+      "__queue-owner",
+    ]);
+    try {
+      const args = resolveQueueOwnerSpawnArgs(["node", "ignored.js"]);
+      assert.deepEqual(args, ["--import", "tsx", "src/cli.ts", "__queue-owner"]);
+    } finally {
+      if (previous === undefined) {
+        delete process.env.ACPX_QUEUE_OWNER_ARGS;
+      } else {
+        process.env.ACPX_QUEUE_OWNER_ARGS = previous;
+      }
+    }
+  });
+
   it("returns <entry> and __queue-owner", async () => {
     await withTempDir(async (dir) => {
       const cliFile = path.join(dir, "cli.js");
From 69348c11f9fe2b09af0ab65331e072ad93f13854 Mon Sep 17 00:00:00 2001
From: Onur Solmaz <2453968+osolmaz@users.noreply.github.com>
Date: Wed, 25 Mar 2026 21:26:34 +0100
Subject: [PATCH 09/22] refactor: split flow runtime concerns

---
 src/flows.ts | 2 +-
 src/flows/executors/shell.ts | 98 +++++++
src/flows/runtime.ts | 491 +++++++++++++++++------------------
 src/flows/store.ts | 94 +++++++
 4 files changed, 427 insertions(+), 258 deletions(-)
 create mode 100644 src/flows/executors/shell.ts
 create mode 100644 src/flows/store.ts

diff --git a/src/flows.ts b/src/flows.ts
index 7ae3665..4845899 100644
--- a/src/flows.ts
+++ b/src/flows.ts
@@ -5,7 +5,6 @@ export {
   checkpoint,
   compute,
   defineFlow,
-  flowRunsBaseDir,
   shell,
   type FlowNodeCommon,
   type AcpNodeDefinition,
@@ -26,4 +25,5 @@ export {
   type ShellActionNodeDefinition,
   type ShellActionResult,
 } from "./flows/runtime.js";
+export { flowRunsBaseDir } from "./flows/store.js";
 export { extractJsonObject } from "./flows/json.js";
diff --git a/src/flows/executors/shell.ts b/src/flows/executors/shell.ts
new file mode 100644
index 0000000..3400a9d
--- /dev/null
+++ b/src/flows/executors/shell.ts
@@ -0,0 +1,98 @@
+import { spawn } from "node:child_process";
+import { TimeoutError } from "../../session-runtime-helpers.js";
+import type { ShellActionExecution, ShellActionResult } from "../runtime.js";
+
+export function formatShellActionSummary(spec: ShellActionExecution): string {
+  return `shell: ${renderShellCommand(spec.command, spec.args ?? [])}`;
+}
+
+export function renderShellCommand(command: string, args: string[]): string {
+  const renderedArgs = args.map((arg) => JSON.stringify(arg)).join(" ");
+  return renderedArgs.length > 0 ? `${command} ${renderedArgs}` : command;
+}
+
+export async function runShellAction(spec: ShellActionExecution): Promise<ShellActionResult> {
+  const cwd = spec.cwd ?? process.cwd();
+  const args = spec.args ?? 
[];
+  const startMs = Date.now();
+  const child = spawn(spec.command, args, {
+    cwd,
+    env: {
+      ...process.env,
+      ...spec.env,
+    },
+    shell: spec.shell,
+    stdio: ["pipe", "pipe", "pipe"],
+    windowsHide: true,
+  });
+
+  let stdout = "";
+  let stderr = "";
+  let timedOut = false;
+  let timeout: NodeJS.Timeout | undefined;
+
+  const finish = new Promise<ShellActionResult>((resolve, reject) => {
+    child.stdout.setEncoding("utf8");
+    child.stderr.setEncoding("utf8");
+    child.stdout.on("data", (chunk: string) => {
+      stdout += chunk;
+    });
+    child.stderr.on("data", (chunk: string) => {
+      stderr += chunk;
+    });
+
+    child.once("error", reject);
+    child.once("exit", (exitCode, signal) => {
+      const result: ShellActionResult = {
+        command: spec.command,
+        args,
+        cwd,
+        stdout,
+        stderr,
+        combinedOutput: `${stdout}${stderr}`,
+        exitCode,
+        signal,
+        durationMs: Date.now() - startMs,
+      };
+
+      if (timedOut) {
+        reject(new TimeoutError(spec.timeoutMs ?? 0));
+        return;
+      }
+
+      if ((exitCode ?? 0) !== 0 && spec.allowNonZeroExit !== true) {
+        reject(
+          new Error(
+            `Shell action failed (${renderShellCommand(spec.command, args)}): exit ${String(exitCode)}${stderr.length > 0 ? 
`\n${stderr.trim()}` : ""}`,
+          ),
+        );
+        return;
+      }
+
+      resolve(result);
+    });
+  });
+
+  if (spec.stdin != null) {
+    child.stdin.write(spec.stdin);
+  }
+  child.stdin.end();
+
+  if (spec.timeoutMs != null && spec.timeoutMs > 0) {
+    timeout = setTimeout(() => {
+      timedOut = true;
+      child.kill("SIGTERM");
+      setTimeout(() => {
+        child.kill("SIGKILL");
+      }, 1_000).unref();
+    }, spec.timeoutMs);
+  }
+
+  try {
+    return await finish;
+  } finally {
+    if (timeout) {
+      clearTimeout(timeout);
+    }
+  }
+}
diff --git a/src/flows/runtime.ts b/src/flows/runtime.ts
index bd80b22..8180c36 100644
--- a/src/flows/runtime.ts
+++ b/src/flows/runtime.ts
@@ -1,8 +1,4 @@
-import { spawn } from "node:child_process";
 import { randomUUID } from "node:crypto";
-import fs from "node:fs/promises";
-import os from "node:os";
-import path from "node:path";
 import { createOutputFormatter } from "../output.js";
 import { promptToDisplayText, textPrompt } from "../prompt-content.js";
 import { resolveSessionRecord } from "../session-persistence.js";
@@ -21,6 +17,8 @@ import type {
   PermissionMode,
   PromptInput,
 } from "../types.js";
+import { formatShellActionSummary, runShellAction } from "./executors/shell.js";
+import { FlowRunStore, flowRunsBaseDir } from "./store.js";
 
 type MaybePromise<T> = T | Promise<T>;
 const DEFAULT_FLOW_HEARTBEAT_MS = 5_000;
@@ -181,6 +179,14 @@ type MemoryWritable = {
   write(chunk: string): void;
 };
 
+type FlowNodeExecutionResult = {
+  output: unknown;
+  promptText: string | null;
+  rawText: string | null;
+  sessionInfo: FlowSessionBinding | null;
+  agentInfo: ReturnType | null;
+};
+
 export type FlowRunnerOptions = {
   resolveAgent: (profile?: string) => {
     agentName: string;
@@ -252,10 +258,6 @@ export function checkpoint(
   };
 }
 
-export function flowRunsBaseDir(homeDir: string = os.homedir()): string {
-  return path.join(homeDir, ".acpx", "flows", "runs");
-}
-
 export class FlowRunner {
   private readonly resolveAgent;
   private readonly permissionMode;
@@ -269,7 +271,7 @@ export class 
FlowRunner { private readonly suppressSdkConsoleErrors?; private readonly sessionOptions?; private readonly services; - private readonly outputRoot; + private readonly store; constructor(options: FlowRunnerOptions) { this.resolveAgent = options.resolveAgent; @@ -284,7 +286,7 @@ export class FlowRunner { this.suppressSdkConsoleErrors = options.suppressSdkConsoleErrors; this.sessionOptions = options.sessionOptions; this.services = options.services ?? {}; - this.outputRoot = options.outputRoot ?? flowRunsBaseDir(); + this.store = new FlowRunStore(options.outputRoot ?? flowRunsBaseDir()); } async run( @@ -295,7 +297,7 @@ export class FlowRunner { validateFlowDefinition(flow); const runId = createRunId(flow.name); - const runDir = path.join(this.outputRoot, runId); + const runDir = await this.store.createRunDir(runId); const state: FlowRunState = { runId, flowName: flow.name, @@ -309,8 +311,7 @@ export class FlowRunner { sessionBindings: {}, }; - await fs.mkdir(runDir, { recursive: true }); - await this.persist(runDir, state, { + await this.store.writeSnapshot(runDir, state, { type: "run_started", flowName: flow.name, flowPath: options.flowPath, @@ -333,137 +334,46 @@ export class FlowRunner { let sessionInfo: FlowSessionBinding | null = null; let agentInfo: ReturnType | null = null; this.markNodeStarted(state, current, node.kind, startedAt, node.statusDetail); - await this.persist(runDir, state, { + await this.store.writeSnapshot(runDir, state, { type: "node_started", nodeId: current, kind: node.kind, }); - - switch (node.kind) { - case "compute": { - output = await this.runWithHeartbeat(runDir, state, current, node, async () => { - return await withTimeout( - Promise.resolve(node.run(context)), - node.timeoutMs ?? this.timeoutMs, - ); - }); - break; - } - case "action": { - if ("run" in node) { - output = await this.runWithHeartbeat(runDir, state, current, node, async () => { - return await withTimeout( - Promise.resolve(node.run(context)), - node.timeoutMs ?? 
this.timeoutMs, - ); - }); - } else { - const execution = await Promise.resolve(node.exec(context)); - const effectiveExecution: ShellActionExecution = { - ...execution, - timeoutMs: execution.timeoutMs ?? node.timeoutMs ?? this.timeoutMs, - }; - this.updateStatusDetail(state, formatShellActionSummary(effectiveExecution)); - await this.persist(runDir, state, { - type: "node_detail", - nodeId: current, - detail: state.statusDetail, - }); - const result = await this.runWithHeartbeat(runDir, state, current, node, async () => { - return await runShellAction(effectiveExecution); - }); - output = node.parse ? await node.parse(result, context) : result; - } - break; - } - case "checkpoint": { - output = - typeof node.run === "function" - ? await node.run(context) - : { - checkpoint: current, - summary: node.summary ?? current, - }; - state.outputs[current] = output; - state.waitingOn = current; - state.updatedAt = isoNow(); - state.status = "waiting"; - this.clearActiveNode(state, node.summary ?? current); - state.steps.push({ - nodeId: current, - kind: node.kind, - startedAt, - finishedAt: isoNow(), - promptText, - rawText, - output, - session: null, - agent: null, - }); - await this.persist(runDir, state, { - type: "checkpoint_entered", - nodeId: current, - output, - }); - return { - runDir, - state, - }; - } - case "acp": { - const resolvedAgent = this.resolveAgent(node.profile); - agentInfo = resolvedAgent; - const prompt = normalizePromptInput(await node.prompt(context)); - promptText = promptToDisplayText(prompt); - this.updateStatusDetail(state, summarizePrompt(promptText, node.statusDetail)); - await this.persist(runDir, state, { - type: "node_detail", - nodeId: current, - detail: state.statusDetail, - }); - if (node.session?.isolated) { - rawText = await this.runWithHeartbeat(runDir, state, current, node, async () => { - return await this.runIsolatedPrompt( - resolvedAgent, - prompt, - node.timeoutMs ?? 
this.timeoutMs, - ); - }); - } else { - const boundSession = await this.ensureSessionBinding( - state, - flow, - node, - resolvedAgent, - ); - sessionInfo = boundSession; - rawText = await this.runWithHeartbeat( - runDir, - state, - current, - node, - async () => - await this.runPersistentPrompt( - boundSession, - prompt, - node.timeoutMs ?? this.timeoutMs, - ), - async () => { - await cancelSessionPrompt({ - sessionId: boundSession.acpxRecordId, - }); - }, - ); - sessionInfo = await this.refreshSessionBinding(boundSession); - state.sessionBindings[sessionInfo.key] = sessionInfo; - } - output = node.parse ? await node.parse(rawText, context) : rawText; - break; - } - default: { - const exhaustive: never = node; - throw new Error(`Unsupported flow node: ${String(exhaustive)}`); - } + ({ output, promptText, rawText, sessionInfo, agentInfo } = await this.executeNode( + runDir, + state, + flow, + current, + node, + context, + )); + + if (node.kind === "checkpoint") { + state.outputs[current] = output; + state.waitingOn = current; + state.updatedAt = isoNow(); + state.status = "waiting"; + this.clearActiveNode(state, (output as { summary?: string } | null)?.summary ?? 
current);
+        state.steps.push({
+          nodeId: current,
+          kind: node.kind,
+          startedAt,
+          finishedAt: isoNow(),
+          promptText,
+          rawText,
+          output,
+          session: null,
+          agent: null,
+        });
+        await this.store.writeSnapshot(runDir, state, {
+          type: "checkpoint_entered",
+          nodeId: current,
+          output,
+        });
+        return {
+          runDir,
+          state,
+        };
       }
 
       state.outputs[current] = output;
@@ -481,7 +391,7 @@ export class FlowRunner {
         agent: agentInfo,
       });
 
-      await this.persist(runDir, state, {
+      await this.store.writeSnapshot(runDir, state, {
         type: "node_completed",
         nodeId: current,
         output,
@@ -494,7 +404,7 @@ export class FlowRunner {
       state.finishedAt = isoNow();
       state.updatedAt = state.finishedAt;
       this.clearActiveNode(state);
-      await this.persist(runDir, state, { type: "run_completed" });
+      await this.store.writeSnapshot(runDir, state, { type: "run_completed" });
       return {
         runDir,
         state,
@@ -507,7 +417,7 @@ export class FlowRunner {
       state.statusDetail = state.currentNode
         ? `Failed in ${state.currentNode}: ${state.error}`
         : state.error;
-      await this.persist(runDir, state, {
+      await this.store.writeSnapshot(runDir, state, {
         type: "run_failed",
         error: state.error,
       });
@@ -524,6 +434,186 @@ export class FlowRunner {
     };
   }
 
+  private async executeNode(
+    runDir: string,
+    state: FlowRunState,
+    flow: FlowDefinition,
+    nodeId: string,
+    node: FlowNodeDefinition,
+    context: FlowNodeContext,
+  ): Promise<FlowNodeExecutionResult> {
+    switch (node.kind) {
+      case "compute":
+        return await this.executeComputeNode(runDir, state, node, context);
+      case "action":
+        return await this.executeActionNode(runDir, state, node, context);
+      case "checkpoint":
+        return await this.executeCheckpointNode(nodeId, node, context);
+      case "acp":
+        return await this.executeAcpNode(runDir, state, flow, node, context);
+      default: {
+        const exhaustive: never = node;
+        throw new Error(`Unsupported flow node: ${String(exhaustive)}`);
+      }
+    }
+  }
+
+  private async executeComputeNode(
+    runDir: string,
+    state: FlowRunState,
+    node: ComputeNodeDefinition,
+    
context: FlowNodeContext,
+  ): Promise<FlowNodeExecutionResult> {
+    const output = await this.runWithHeartbeat(
+      runDir,
+      state,
+      state.currentNode ?? "",
+      node,
+      async () =>
+        await withTimeout(Promise.resolve(node.run(context)), node.timeoutMs ?? this.timeoutMs),
+    );
+    return {
+      output,
+      promptText: null,
+      rawText: null,
+      sessionInfo: null,
+      agentInfo: null,
+    };
+  }
+
+  private async executeActionNode(
+    runDir: string,
+    state: FlowRunState,
+    node: ActionNodeDefinition,
+    context: FlowNodeContext,
+  ): Promise<FlowNodeExecutionResult> {
+    if ("run" in node) {
+      const output = await this.runWithHeartbeat(
+        runDir,
+        state,
+        state.currentNode ?? "",
+        node,
+        async () =>
+          await withTimeout(Promise.resolve(node.run(context)), node.timeoutMs ?? this.timeoutMs),
+      );
+      return {
+        output,
+        promptText: null,
+        rawText: null,
+        sessionInfo: null,
+        agentInfo: null,
+      };
+    }
+
+    const execution = await Promise.resolve(node.exec(context));
+    const effectiveExecution: ShellActionExecution = {
+      ...execution,
+      timeoutMs: execution.timeoutMs ?? node.timeoutMs ?? this.timeoutMs,
+    };
+    this.updateStatusDetail(state, formatShellActionSummary(effectiveExecution));
+    await this.store.writeLive(runDir, state, {
+      type: "node_detail",
+      nodeId: state.currentNode,
+      detail: state.statusDetail,
+    });
+    const result = await this.runWithHeartbeat(
+      runDir,
+      state,
+      state.currentNode ?? "",
+      node,
+      async () => await runShellAction(effectiveExecution),
+    );
+    const output = node.parse ? await node.parse(result, context) : result;
+    return {
+      output,
+      promptText: null,
+      rawText: result.combinedOutput,
+      sessionInfo: null,
+      agentInfo: null,
+    };
+  }
+
+  private async executeCheckpointNode(
+    nodeId: string,
+    node: CheckpointNodeDefinition,
+    context: FlowNodeContext,
+  ): Promise<FlowNodeExecutionResult> {
+    const output =
+      typeof node.run === "function"
+        ? await node.run(context)
+        : {
+            checkpoint: nodeId,
+            summary: node.summary ?? 
nodeId,
+          };
+    return {
+      output,
+      promptText: null,
+      rawText: null,
+      sessionInfo: null,
+      agentInfo: null,
+    };
+  }
+
+  private async executeAcpNode(
+    runDir: string,
+    state: FlowRunState,
+    flow: FlowDefinition,
+    node: AcpNodeDefinition,
+    context: FlowNodeContext,
+  ): Promise<FlowNodeExecutionResult> {
+    const agentInfo = this.resolveAgent(node.profile);
+    const prompt = normalizePromptInput(await node.prompt(context));
+    const promptText = promptToDisplayText(prompt);
+    this.updateStatusDetail(state, summarizePrompt(promptText, node.statusDetail));
+    await this.store.writeLive(runDir, state, {
+      type: "node_detail",
+      nodeId: state.currentNode,
+      detail: state.statusDetail,
+    });
+
+    if (node.session?.isolated) {
+      const rawText = await this.runWithHeartbeat(
+        runDir,
+        state,
+        state.currentNode ?? "",
+        node,
+        async () =>
+          await this.runIsolatedPrompt(agentInfo, prompt, node.timeoutMs ?? this.timeoutMs),
+      );
+      return {
+        output: node.parse ? await node.parse(rawText, context) : rawText,
+        promptText,
+        rawText,
+        sessionInfo: null,
+        agentInfo,
+      };
+    }
+
+    const boundSession = await this.ensureSessionBinding(state, flow, node, agentInfo);
+    const rawText = await this.runWithHeartbeat(
+      runDir,
+      state,
+      state.currentNode ?? "",
+      node,
+      async () =>
+        await this.runPersistentPrompt(boundSession, prompt, node.timeoutMs ?? this.timeoutMs),
+      async () => {
+        await cancelSessionPrompt({
+          sessionId: boundSession.acpxRecordId,
+        });
+      },
+    );
+    const sessionInfo = await this.refreshSessionBinding(boundSession);
+    state.sessionBindings[sessionInfo.key] = sessionInfo;
+    return {
+      output: node.parse ? 
await node.parse(rawText, context) : rawText, + promptText, + rawText, + sessionInfo, + agentInfo, + }; + } + private markNodeStarted( state: FlowRunState, nodeId: string, @@ -572,7 +662,7 @@ export class FlowRunner { } state.lastHeartbeatAt = isoNow(); state.updatedAt = state.lastHeartbeatAt; - await this.persist(runDir, state, { + await this.store.writeLive(runDir, state, { type: "node_heartbeat", nodeId, }); @@ -701,24 +791,6 @@ export class FlowRunner { }); return capture.read(); } - - private async persist( - runDir: string, - state: FlowRunState, - event: Record, - ): Promise { - state.updatedAt = isoNow(); - const runPath = path.join(runDir, "run.json"); - const tempPath = `${runPath}.${process.pid}.${Date.now()}.tmp`; - const payload = JSON.stringify(state, null, 2); - await fs.writeFile(tempPath, `${payload}\n`, "utf8"); - await fs.rename(tempPath, runPath); - await fs.appendFile( - path.join(runDir, "events.ndjson"), - `${JSON.stringify({ at: isoNow(), ...event })}\n`, - "utf8", - ); - } } function validateFlowDefinition(flow: FlowDefinition): void { @@ -806,101 +878,6 @@ function summarizePrompt(promptText: string, explicitDetail?: string): string { return `ACP: ${truncated}`; } -function formatShellActionSummary(spec: ShellActionExecution): string { - return `shell: ${renderShellCommand(spec.command, spec.args ?? [])}`; -} - -function renderShellCommand(command: string, args: string[]): string { - const renderedArgs = args.map((arg) => JSON.stringify(arg)).join(" "); - return renderedArgs.length > 0 ? `${command} ${renderedArgs}` : command; -} - -async function runShellAction(spec: ShellActionExecution): Promise { - const cwd = spec.cwd ?? process.cwd(); - const args = spec.args ?? 
[]; - const startMs = Date.now(); - const child = spawn(spec.command, args, { - cwd, - env: { - ...process.env, - ...spec.env, - }, - shell: spec.shell, - stdio: ["pipe", "pipe", "pipe"], - windowsHide: true, - }); - - let stdout = ""; - let stderr = ""; - let timedOut = false; - let timeout: NodeJS.Timeout | undefined; - - const finish = new Promise((resolve, reject) => { - child.stdout.setEncoding("utf8"); - child.stderr.setEncoding("utf8"); - child.stdout.on("data", (chunk: string) => { - stdout += chunk; - }); - child.stderr.on("data", (chunk: string) => { - stderr += chunk; - }); - - child.once("error", reject); - child.once("exit", (exitCode, signal) => { - const result: ShellActionResult = { - command: spec.command, - args, - cwd, - stdout, - stderr, - combinedOutput: `${stdout}${stderr}`, - exitCode, - signal, - durationMs: Date.now() - startMs, - }; - - if (timedOut) { - reject(new TimeoutError(spec.timeoutMs ?? 0)); - return; - } - - if ((exitCode ?? 0) !== 0 && spec.allowNonZeroExit !== true) { - reject( - new Error( - `Shell action failed (${renderShellCommand(spec.command, args)}): exit ${String(exitCode)}${stderr.length > 0 ? 
`\n${stderr.trim()}` : ""}`,
-          ),
-        );
-        return;
-      }
-
-      resolve(result);
-    });
-  });
-
-  if (spec.stdin != null) {
-    child.stdin.write(spec.stdin);
-  }
-  child.stdin.end();
-
-  if (spec.timeoutMs != null && spec.timeoutMs > 0) {
-    timeout = setTimeout(() => {
-      timedOut = true;
-      child.kill("SIGTERM");
-      setTimeout(() => {
-        child.kill("SIGKILL");
-      }, 1_000).unref();
-    }, spec.timeoutMs);
-  }
-
-  try {
-    return await finish;
-  } finally {
-    if (timeout) {
-      clearTimeout(timeout);
-    }
-  }
-}
-
 function createQuietCaptureOutput(): {
   formatter: ReturnType<typeof createOutputFormatter>;
   read: () => string;
diff --git a/src/flows/store.ts b/src/flows/store.ts
new file mode 100644
index 0000000..0967d0b
--- /dev/null
+++ b/src/flows/store.ts
@@ -0,0 +1,94 @@
+import fs from "node:fs/promises";
+import os from "node:os";
+import path from "node:path";
+import type { FlowRunState } from "./runtime.js";
+
+export type FlowStoreEvent = Record<string, unknown>;
+
+export function flowRunsBaseDir(homeDir: string = os.homedir()): string {
+  return path.join(homeDir, ".acpx", "flows", "runs");
+}
+
+export class FlowRunStore {
+  readonly outputRoot: string;
+
+  constructor(outputRoot: string = flowRunsBaseDir()) {
+    this.outputRoot = outputRoot;
+  }
+
+  async createRunDir(runId: string): Promise<string> {
+    const runDir = path.join(this.outputRoot, runId);
+    await fs.mkdir(runDir, { recursive: true });
+    return runDir;
+  }
+
+  async writeSnapshot(runDir: string, state: FlowRunState, event: FlowStoreEvent): Promise<void> {
+    state.updatedAt = isoNow();
+    await writeJsonAtomic(path.join(runDir, "run.json"), state);
+    await writeJsonAtomic(path.join(runDir, "live.json"), createLiveState(state));
+    await appendEvent(runDir, event);
+  }
+
+  async writeLive(runDir: string, state: FlowRunState, event: FlowStoreEvent): Promise<void> {
+    state.updatedAt = isoNow();
+    await writeJsonAtomic(path.join(runDir, "live.json"), createLiveState(state));
+    await appendEvent(runDir, event);
+  }
+}
+
+type FlowLiveState = {
+  runId: string;
+  flowName: string;
+  
flowPath?: string; + startedAt: string; + finishedAt?: string; + updatedAt: string; + status: FlowRunState["status"]; + currentNode?: string; + currentNodeKind?: FlowRunState["currentNodeKind"]; + currentNodeStartedAt?: string; + lastHeartbeatAt?: string; + statusDetail?: string; + waitingOn?: string; + error?: string; + sessionBindings: FlowRunState["sessionBindings"]; +}; + +function createLiveState(state: FlowRunState): FlowLiveState { + return { + runId: state.runId, + flowName: state.flowName, + flowPath: state.flowPath, + startedAt: state.startedAt, + finishedAt: state.finishedAt, + updatedAt: state.updatedAt, + status: state.status, + currentNode: state.currentNode, + currentNodeKind: state.currentNodeKind, + currentNodeStartedAt: state.currentNodeStartedAt, + lastHeartbeatAt: state.lastHeartbeatAt, + statusDetail: state.statusDetail, + waitingOn: state.waitingOn, + error: state.error, + sessionBindings: state.sessionBindings, + }; +} + +async function writeJsonAtomic(filePath: string, value: unknown): Promise { + const tempPath = `${filePath}.${process.pid}.${Date.now()}.tmp`; + const payload = JSON.stringify(value, null, 2); + await fs.writeFile(tempPath, `${payload}\n`, "utf8"); + await fs.rename(tempPath, filePath); +} + +async function appendEvent(runDir: string, event: FlowStoreEvent): Promise { + await fs.appendFile( + path.join(runDir, "events.ndjson"), + `${JSON.stringify({ at: isoNow(), ...event })}\n`, + "utf8", + ); +} + +function isoNow(): string { + return new Date().toISOString(); +} From 393e8ca1ca8f25ba8fef42e198d5ffc7ed5f8e4a Mon Sep 17 00:00:00 2001 From: Onur Solmaz <2453968+osolmaz@users.noreply.github.com> Date: Wed, 25 Mar 2026 22:05:10 +0100 Subject: [PATCH 10/22] feat: harden flow runtime workspaces and step bounds --- CHANGELOG.md | 1 + README.md | 3 + docs/CLI.md | 2 + examples/flows/README.md | 3 + examples/flows/workdir.flow.ts | 52 ++++++++++++ src/flows.ts | 7 +- src/flows/json.ts | 42 ++++++--- src/flows/runtime.ts | 70 
++++++++++++--- test/fixtures/flow-workdir.flow.ts | 44 ++++++++++ test/flows.test.ts | 131 +++++++++++++++++++++++++++++ test/integration.test.ts | 49 +++++++++++ 11 files changed, 382 insertions(+), 22 deletions(-) create mode 100644 examples/flows/workdir.flow.ts create mode 100644 test/fixtures/flow-workdir.flow.ts diff --git a/CHANGELOG.md b/CHANGELOG.md index c113d12..8b5e5e2 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -9,6 +9,7 @@ Repo: https://github.com/openclaw/acpx - Conformance/ACP: add a data-driven ACP core v1 conformance suite with CI smoke coverage, nightly coverage, and a hardened runner that reports startup failures cleanly and scopes filesystem checks to the session cwd. (#130) Thanks @lynnzc. - Agents/droid: add `factory-droid` and `factorydroid` aliases for the built-in Factory Droid adapter and sync the built-in docs. Thanks @vincentkoc. - Flows/workflows: add an initial `flow run` command, an `acpx/flows` runtime surface, and file-backed flow run state under `~/.acpx/flows/runs` for user-authored workflow modules. Thanks @osolmaz. +- Flows/workspaces: let `acp` nodes bind to an explicit per-step cwd, add a native isolated-workspace example, and default active flow steps to a 15 minute timeout unless overridden. Thanks @osolmaz. 
### Breaking diff --git a/README.md b/README.md index ff7ca6d..6e3c058 100644 --- a/README.md +++ b/README.md @@ -40,6 +40,8 @@ One command surface for Pi, OpenClaw ACP, Codex, Claude, and other ACP-compatibl - **Any ACP agent**: built-in registry + `--agent` escape hatch for custom servers - **One-shot mode**: `exec` for stateless fire-and-forget tasks - **Experimental flows**: `flow run ` for user-authored ACP workflows over multiple prompts +- **Runtime-owned flow actions**: shell-backed action steps can prepare workspaces and other deterministic mechanics outside the agent turn +- **Flow workspace isolation**: `acp` nodes can target an explicit per-step cwd, so flows can keep agent work inside disposable worktrees ```bash $ acpx codex sessions new @@ -206,6 +208,7 @@ acpx --format text codex 'summarize your findings' acpx --format json codex exec 'review changed files' acpx --format json --json-strict codex exec 'machine-safe JSON only' acpx flow run ./my-flow.ts --input-file ./flow-input.json +acpx --timeout 1800 flow run ./my-flow.ts acpx --format quiet codex 'final recommendation only' acpx --timeout 90 codex 'investigate intermittent test timeout' diff --git a/docs/CLI.md b/docs/CLI.md index e3c566a..79f9a63 100644 --- a/docs/CLI.md +++ b/docs/CLI.md @@ -74,6 +74,8 @@ acpx [global_options] flow run [--input-json | --input-file - Runs a user-authored workflow module step by step through the `acpx/flows` runtime. - Persists run artifacts under `~/.acpx/flows/runs//`. - Reuses one implicit main ACP session by default for non-isolated `acp` nodes. +- `acp` nodes may override their working directory per step, which lets flows prepare an isolated workspace with an action node and then keep the agent session inside that cwd. +- `acp` and `action` nodes use the global `--timeout` value as their default step timeout. If `--timeout` is omitted, flows default to 15 minutes per active step. - `--input-json` passes flow input inline as JSON. 
- `--input-file` reads flow input JSON from disk. - `--default-agent` supplies the default agent profile for `acp` nodes that do not pin one. diff --git a/examples/flows/README.md b/examples/flows/README.md index 7091d0b..1e46367 100644 --- a/examples/flows/README.md +++ b/examples/flows/README.md @@ -5,6 +5,7 @@ These are simple source-tree examples for `acpx flow run`. - `echo.flow.ts`: one ACP step that returns a JSON reply - `branch.flow.ts`: ACP classification followed by a deterministic branch into either `continue` or `checkpoint` - `shell.flow.ts`: one native runtime-owned shell action that returns structured JSON +- `workdir.flow.ts`: native workspace prep followed by an ACP step that runs inside that isolated cwd - `two-turn.flow.ts`: two ACP prompts in the same implicit main session Run them from the repo root: @@ -19,6 +20,8 @@ acpx flow run examples/flows/branch.flow.ts \ acpx flow run examples/flows/shell.flow.ts \ --input-json '{"text":"hello from shell"}' +acpx flow run examples/flows/workdir.flow.ts + acpx flow run examples/flows/two-turn.flow.ts \ --input-json '{"topic":"How should we validate a new ACP adapter?"}' ``` diff --git a/examples/flows/workdir.flow.ts b/examples/flows/workdir.flow.ts new file mode 100644 index 0000000..5bc2cf8 --- /dev/null +++ b/examples/flows/workdir.flow.ts @@ -0,0 +1,52 @@ +import { acp, compute, defineFlow, extractJsonObject, shell } from "../../src/flows.js"; + +export default defineFlow({ + name: "example-workdir", + startAt: "prepare_workspace", + nodes: { + prepare_workspace: shell({ + exec: () => ({ + command: process.execPath, + args: [ + "-e", + [ + "const fs = require('node:fs/promises');", + "const os = require('node:os');", + "const path = require('node:path');", + "(async () => {", + " const workdir = await fs.mkdtemp(path.join(os.tmpdir(), 'acpx-flow-workdir-'));", + " await fs.writeFile(path.join(workdir, 'note.txt'), 'hello from isolated workspace\\n', 'utf8');", + " 
process.stdout.write(JSON.stringify({ workdir }));", + "})().catch((error) => {", + " console.error(error);", + " process.exitCode = 1;", + "});", + ].join(" "), + ], + }), + parse: (result) => extractJsonObject(result.stdout), + statusDetail: "Create isolated workspace for later ACP steps", + }), + inspect_workspace: acp({ + cwd: ({ outputs }) => (outputs.prepare_workspace as { workdir: string }).workdir, + async prompt() { + return [ + "You are already inside an isolated workspace created by the flow runtime.", + "Read note.txt from the current working directory and return exactly one JSON object with this shape:", + "{", + ' "cwd": "current working directory",', + ' "note": "contents of note.txt"', + "}", + ].join("\n"); + }, + parse: (text) => extractJsonObject(text), + }), + finalize: compute({ + run: ({ outputs }) => outputs.inspect_workspace, + }), + }, + edges: [ + { from: "prepare_workspace", to: "inspect_workspace" }, + { from: "inspect_workspace", to: "finalize" }, + ], +}); diff --git a/src/flows.ts b/src/flows.ts index 4845899..ad7c63c 100644 --- a/src/flows.ts +++ b/src/flows.ts @@ -26,4 +26,9 @@ export { type ShellActionResult, } from "./flows/runtime.js"; export { flowRunsBaseDir } from "./flows/store.js"; -export { extractJsonObject } from "./flows/json.js"; +export { + extractJsonObject, + parseJsonObject, + parseStrictJsonObject, + type JsonObjectParseMode, +} from "./flows/json.js"; diff --git a/src/flows/json.ts b/src/flows/json.ts index a8d7d3d..30089ba 100644 --- a/src/flows/json.ts +++ b/src/flows/json.ts @@ -1,33 +1,53 @@ -export function extractJsonObject(text: string): unknown { +export type JsonObjectParseMode = "strict" | "fenced" | "compat"; + +export function parseJsonObject( + text: string, + options: { + mode?: JsonObjectParseMode; + } = {}, +): unknown { const trimmed = String(text ?? "").trim(); if (!trimmed) { throw new Error("Expected JSON output, got empty text"); } + const mode = options.mode ?? 
"compat"; const direct = tryParse(trimmed); if (direct.ok) { return direct.value; } - const fencedMatch = trimmed.match(/```(?:json)?\s*([\s\S]*?)```/i); - if (fencedMatch) { - const fenced = tryParse(fencedMatch[1].trim()); - if (fenced.ok) { - return fenced.value; + if (mode === "fenced" || mode === "compat") { + const fencedMatch = trimmed.match(/```(?:json)?\s*([\s\S]*?)```/i); + if (fencedMatch) { + const fenced = tryParse(fencedMatch[1].trim()); + if (fenced.ok) { + return fenced.value; + } } } - const balanced = extractBalancedJson(trimmed); - if (balanced) { - const parsed = tryParse(balanced); - if (parsed.ok) { - return parsed.value; + if (mode === "compat") { + const balanced = extractBalancedJson(trimmed); + if (balanced) { + const parsed = tryParse(balanced); + if (parsed.ok) { + return parsed.value; + } } } throw new Error(`Could not parse JSON from assistant output:\n${trimmed}`); } +export function parseStrictJsonObject(text: string): unknown { + return parseJsonObject(text, { mode: "strict" }); +} + +export function extractJsonObject(text: string): unknown { + return parseJsonObject(text, { mode: "compat" }); +} + function tryParse(text: string): { ok: true; value: unknown } | { ok: false } { try { return { diff --git a/src/flows/runtime.ts b/src/flows/runtime.ts index 8180c36..10ffa4d 100644 --- a/src/flows/runtime.ts +++ b/src/flows/runtime.ts @@ -1,4 +1,5 @@ -import { randomUUID } from "node:crypto"; +import { createHash, randomUUID } from "node:crypto"; +import path from "node:path"; import { createOutputFormatter } from "../output.js"; import { promptToDisplayText, textPrompt } from "../prompt-content.js"; import { resolveSessionRecord } from "../session-persistence.js"; @@ -22,6 +23,7 @@ import { FlowRunStore, flowRunsBaseDir } from "./store.js"; type MaybePromise = T | Promise; const DEFAULT_FLOW_HEARTBEAT_MS = 5_000; +const DEFAULT_FLOW_STEP_TIMEOUT_MS = 15 * 60_000; export type FlowNodeContext = { input: TInput; @@ -52,6 +54,7 @@ export 
type FlowEdge = export type AcpNodeDefinition = FlowNodeCommon & { kind: "acp"; profile?: string; + cwd?: string | ((context: FlowNodeContext) => MaybePromise); session?: { handle?: string; isolated?: boolean; @@ -199,6 +202,7 @@ export type FlowRunnerOptions = { authCredentials?: Record; authPolicy?: AuthPolicy; timeoutMs?: number; + defaultNodeTimeoutMs?: number; ttlMs?: number; verbose?: boolean; suppressSdkConsoleErrors?: boolean; @@ -266,6 +270,7 @@ export class FlowRunner { private readonly authCredentials?; private readonly authPolicy?; private readonly timeoutMs?; + private readonly defaultNodeTimeoutMs; private readonly ttlMs?; private readonly verbose?; private readonly suppressSdkConsoleErrors?; @@ -281,6 +286,8 @@ export class FlowRunner { this.authCredentials = options.authCredentials; this.authPolicy = options.authPolicy; this.timeoutMs = options.timeoutMs; + this.defaultNodeTimeoutMs = + options.defaultNodeTimeoutMs ?? options.timeoutMs ?? DEFAULT_FLOW_STEP_TIMEOUT_MS; this.ttlMs = options.ttlMs; this.verbose = options.verbose; this.suppressSdkConsoleErrors = options.suppressSdkConsoleErrors; @@ -470,7 +477,10 @@ export class FlowRunner { state.currentNode ?? "", node, async () => - await withTimeout(Promise.resolve(node.run(context)), node.timeoutMs ?? this.timeoutMs), + await withTimeout( + Promise.resolve(node.run(context)), + node.timeoutMs ?? this.defaultNodeTimeoutMs, + ), ); return { output, @@ -494,7 +504,10 @@ export class FlowRunner { state.currentNode ?? "", node, async () => - await withTimeout(Promise.resolve(node.run(context)), node.timeoutMs ?? this.timeoutMs), + await withTimeout( + Promise.resolve(node.run(context)), + node.timeoutMs ?? this.defaultNodeTimeoutMs, + ), ); return { output, @@ -508,7 +521,7 @@ export class FlowRunner { const execution = await Promise.resolve(node.exec(context)); const effectiveExecution: ShellActionExecution = { ...execution, - timeoutMs: execution.timeoutMs ?? node.timeoutMs ?? 
this.timeoutMs, + timeoutMs: execution.timeoutMs ?? node.timeoutMs ?? this.defaultNodeTimeoutMs, }; this.updateStatusDetail(state, formatShellActionSummary(effectiveExecution)); await this.store.writeLive(runDir, state, { @@ -561,7 +574,11 @@ export class FlowRunner { node: AcpNodeDefinition, context: FlowNodeContext, ): Promise { - const agentInfo = this.resolveAgent(node.profile); + const resolvedAgent = this.resolveAgent(node.profile); + const agentInfo = { + ...resolvedAgent, + cwd: await resolveNodeCwd(resolvedAgent.cwd, node.cwd, context), + }; const prompt = normalizePromptInput(await node.prompt(context)); const promptText = promptToDisplayText(prompt); this.updateStatusDetail(state, summarizePrompt(promptText, node.statusDetail)); @@ -578,7 +595,11 @@ export class FlowRunner { state.currentNode ?? "", node, async () => - await this.runIsolatedPrompt(agentInfo, prompt, node.timeoutMs ?? this.timeoutMs), + await this.runIsolatedPrompt( + agentInfo, + prompt, + node.timeoutMs ?? this.defaultNodeTimeoutMs, + ), ); return { output: node.parse ? await node.parse(rawText, context) : rawText, @@ -596,7 +617,11 @@ export class FlowRunner { state.currentNode ?? "", node, async () => - await this.runPersistentPrompt(boundSession, prompt, node.timeoutMs ?? this.timeoutMs), + await this.runPersistentPrompt( + boundSession, + prompt, + node.timeoutMs ?? this.defaultNodeTimeoutMs, + ), async () => { await cancelSessionPrompt({ sessionId: boundSession.acpxRecordId, @@ -698,13 +723,13 @@ export class FlowRunner { agent: ReturnType, ): Promise { const handle = node.session?.handle ?? 
"main"; - const key = `${agent.agentCommand}::${handle}`; + const key = createSessionBindingKey(agent.agentCommand, agent.cwd, handle); const existing = state.sessionBindings[key]; if (existing) { return existing; } - const name = `${flow.name}-${handle}-${state.runId.slice(-8)}`; + const name = createSessionName(flow.name, handle, agent.cwd, state.runId); const created = await createSession({ agentCommand: agent.agentCommand, cwd: agent.cwd, @@ -714,7 +739,7 @@ export class FlowRunner { nonInteractivePermissions: this.nonInteractivePermissions, authCredentials: this.authCredentials, authPolicy: this.authPolicy, - timeoutMs: this.timeoutMs, + timeoutMs: this.defaultNodeTimeoutMs, verbose: this.verbose, sessionOptions: this.sessionOptions, }); @@ -823,6 +848,18 @@ function normalizePromptInput(prompt: PromptInput | string): PromptInput { return typeof prompt === "string" ? textPrompt(prompt) : prompt; } +async function resolveNodeCwd( + defaultCwd: string, + cwd: string | ((context: FlowNodeContext) => MaybePromise) | undefined, + context: FlowNodeContext, +): Promise { + if (typeof cwd === "function") { + const resolved = (await cwd(context)) ?? defaultCwd; + return path.resolve(defaultCwd, resolved); + } + return path.resolve(defaultCwd, cwd ?? 
defaultCwd); +} + function resolveNext(edges: FlowEdge[], from: string, output: unknown): string | null { const edge = edges.find((candidate) => candidate.from === from); if (!edge) { @@ -906,6 +943,19 @@ function createRunId(flowName: string): string { return `${stamp}-${slug}-${randomUUID().slice(0, 8)}`; } +function createSessionBindingKey(agentCommand: string, cwd: string, handle: string): string { + return `${agentCommand}::${cwd}::${handle}`; +} + +function createSessionName(flowName: string, handle: string, cwd: string, runId: string): string { + const stamp = stableShortHash(cwd); + return `${flowName}-${handle}-${stamp}-${runId.slice(-8)}`; +} + +function stableShortHash(value: string): string { + return createHash("sha1").update(value).digest("hex").slice(0, 8); +} + function isoNow(): string { return new Date().toISOString(); } diff --git a/test/fixtures/flow-workdir.flow.ts b/test/fixtures/flow-workdir.flow.ts new file mode 100644 index 0000000..c87de73 --- /dev/null +++ b/test/fixtures/flow-workdir.flow.ts @@ -0,0 +1,44 @@ +import { acp, compute, defineFlow, extractJsonObject, shell } from "../../src/flows.js"; + +export default defineFlow({ + name: "fixture-workdir", + startAt: "prepare", + nodes: { + prepare: shell({ + exec: () => ({ + command: process.execPath, + args: [ + "-e", + [ + "const fs = require('node:fs/promises');", + "const os = require('node:os');", + "const path = require('node:path');", + "(async () => {", + " const workdir = await fs.mkdtemp(path.join(os.tmpdir(), 'acpx-fixture-workdir-'));", + " process.stdout.write(JSON.stringify({ workdir }));", + "})().catch((error) => {", + " console.error(error);", + " process.exitCode = 1;", + "});", + ].join(" "), + ], + }), + parse: (result) => extractJsonObject(result.stdout), + }), + inspect: acp({ + cwd: ({ outputs }) => (outputs.prepare as { workdir: string }).workdir, + prompt: () => { + const script = "process.stdout.write(JSON.stringify({ cwd: process.cwd() }))"; + return `terminal 
${JSON.stringify(process.execPath)} -e ${JSON.stringify(script)}`; + }, + parse: (text) => extractJsonObject(text), + }), + finalize: compute({ + run: ({ outputs }) => outputs.inspect, + }), + }, + edges: [ + { from: "prepare", to: "inspect" }, + { from: "inspect", to: "finalize" }, + ], +}); diff --git a/test/flows.test.ts b/test/flows.test.ts index 0e35f07..f441a48 100644 --- a/test/flows.test.ts +++ b/test/flows.test.ts @@ -13,12 +13,16 @@ import { defineFlow, extractJsonObject, flowRunsBaseDir, + parseJsonObject, + parseStrictJsonObject, shell, } from "../src/flows.js"; import { TimeoutError } from "../src/session-runtime-helpers.js"; const MOCK_AGENT_PATH = fileURLToPath(new URL("./mock-agent.js", import.meta.url)); const MOCK_AGENT_COMMAND = `node ${JSON.stringify(MOCK_AGENT_PATH)}`; +const TEST_CLI_PATH = fileURLToPath(new URL("../src/cli.js", import.meta.url)); +const TEST_QUEUE_OWNER_ARGS = JSON.stringify([TEST_CLI_PATH, "__queue-owner"]); test("extractJsonObject parses direct, fenced, and embedded JSON", () => { assert.deepEqual(extractJsonObject('{"ok":true}'), { ok: true }); @@ -26,6 +30,18 @@ test("extractJsonObject parses direct, fenced, and embedded JSON", () => { assert.deepEqual(extractJsonObject('before {"ok":true} after'), { ok: true }); }); +test("parseJsonObject supports strict and fenced-only modes", () => { + assert.deepEqual(parseStrictJsonObject('{"ok":true}'), { ok: true }); + assert.deepEqual(parseJsonObject('```json\n{"ok":true}\n```', { mode: "fenced" }), { + ok: true, + }); + assert.throws(() => parseStrictJsonObject('before {"ok":true} after'), /Could not parse JSON/); + assert.throws( + () => parseJsonObject('before {"ok":true} after', { mode: "fenced" }), + /Could not parse JSON/, + ); +}); + test("FlowRunner executes isolated ACP nodes and branches deterministically", async () => { await withTempHome(async (homeDir) => { const cwd = await fs.mkdtemp(path.join(os.tmpdir(), "acpx-flow-cwd-")); @@ -229,6 +245,114 @@ test("FlowRunner 
persists active node state while a shell step is running", asyn }); }); +test("FlowRunner lets ACP nodes run in a dynamic working directory", async () => { + await withTempHome(async () => { + const baseCwd = await fs.mkdtemp(path.join(os.tmpdir(), "acpx-flow-base-cwd-")); + const worktree = await fs.mkdtemp(path.join(os.tmpdir(), "acpx-flow-worktree-")); + + try { + const runner = new FlowRunner({ + resolveAgent: () => ({ + agentName: "mock", + agentCommand: MOCK_AGENT_COMMAND, + cwd: baseCwd, + }), + permissionMode: "approve-all", + outputRoot: await fs.mkdtemp(path.join(os.tmpdir(), "acpx-flow-store-")), + }); + + const flow = defineFlow({ + name: "dynamic-cwd-test", + startAt: "prepare", + nodes: { + prepare: action({ + run: () => ({ worktree }), + }), + inspect: acp({ + cwd: ({ outputs }) => (outputs.prepare as { worktree: string }).worktree, + prompt: () => { + const script = "process.stdout.write(process.cwd())"; + return `terminal ${JSON.stringify(process.execPath)} -e ${JSON.stringify(script)}`; + }, + parse: (text) => text.trim().split("\n")[0] ?? "", + }), + }, + edges: [{ from: "prepare", to: "inspect" }], + }); + + const result = await runner.run(flow, {}); + assert.equal(result.state.status, "completed"); + assert.equal( + await fs.realpath(String(result.state.outputs.inspect)), + await fs.realpath(worktree), + ); + const bindings = Object.values(result.state.sessionBindings); + assert.equal(bindings.length, 1); + assert.equal(await fs.realpath(bindings[0]?.cwd ?? 
""), await fs.realpath(worktree)); + } finally { + await fs.rm(baseCwd, { recursive: true, force: true }); + await fs.rm(worktree, { recursive: true, force: true }); + } + }); +}); + +test("FlowRunner keeps same session handles isolated by working directory", async () => { + await withTempHome(async () => { + const baseCwd = await fs.mkdtemp(path.join(os.tmpdir(), "acpx-flow-base-cwd-")); + const worktreeA = await fs.mkdtemp(path.join(os.tmpdir(), "acpx-flow-worktree-a-")); + const worktreeB = await fs.mkdtemp(path.join(os.tmpdir(), "acpx-flow-worktree-b-")); + + try { + const runner = new FlowRunner({ + resolveAgent: () => ({ + agentName: "mock", + agentCommand: MOCK_AGENT_COMMAND, + cwd: baseCwd, + }), + permissionMode: "approve-all", + outputRoot: await fs.mkdtemp(path.join(os.tmpdir(), "acpx-flow-store-")), + }); + + const flow = defineFlow({ + name: "session-cwd-split-test", + startAt: "first", + nodes: { + first: acp({ + session: { + handle: "main", + }, + cwd: () => worktreeA, + prompt: () => 'echo {"where":"A"}', + parse: (text) => extractJsonObject(text), + }), + second: acp({ + session: { + handle: "main", + }, + cwd: () => worktreeB, + prompt: () => 'echo {"where":"B"}', + parse: (text) => extractJsonObject(text), + }), + }, + edges: [{ from: "first", to: "second" }], + }); + + const result = await runner.run(flow, {}); + assert.equal(result.state.status, "completed"); + assert.deepEqual(result.state.outputs.first, { where: "A" }); + assert.deepEqual(result.state.outputs.second, { where: "B" }); + const bindings = Object.values(result.state.sessionBindings); + assert.equal(bindings.length, 2); + const bindingCwds = new Set(bindings.map((binding) => binding.cwd)); + assert.deepEqual(bindingCwds, new Set([worktreeA, worktreeB])); + } finally { + await fs.rm(baseCwd, { recursive: true, force: true }); + await fs.rm(worktreeA, { recursive: true, force: true }); + await fs.rm(worktreeB, { recursive: true, force: true }); + } + }); +}); + test("FlowRunner 
marks timed out shell steps explicitly", async () => { await withTempHome(async () => { const outputRoot = await fs.mkdtemp(path.join(os.tmpdir(), "acpx-flow-store-")); @@ -268,8 +392,10 @@ test("FlowRunner marks timed out shell steps explicitly", async () => { async function withTempHome(run: (homeDir: string) => Promise): Promise { const previousHome = process.env.HOME; + const previousQueueOwnerArgs = process.env.ACPX_QUEUE_OWNER_ARGS; const homeDir = await fs.mkdtemp(path.join(os.tmpdir(), "acpx-flow-home-")); process.env.HOME = homeDir; + process.env.ACPX_QUEUE_OWNER_ARGS = TEST_QUEUE_OWNER_ARGS; try { await run(homeDir); @@ -279,6 +405,11 @@ async function withTempHome(run: (homeDir: string) => Promise): Promise { + await withTempHome(async (homeDir) => { + const cwd = await fs.mkdtemp(path.join(os.tmpdir(), "acpx-integration-cwd-")); + + try { + const result = await runCli( + [ + ...baseAgentArgs(cwd), + "--format", + "json", + "--ttl", + "1", + "flow", + "run", + FLOW_WORKDIR_FIXTURE_PATH, + ], + homeDir, + ); + + assert.equal(result.code, 0, result.stderr); + const payload = JSON.parse(result.stdout.trim()) as { + action?: string; + status?: string; + outputs?: { + prepare?: { workdir: string }; + finalize?: { cwd: string }; + }; + sessionBindings?: Record; + }; + + assert.equal(payload.action, "flow_run_result"); + assert.equal(payload.status, "completed"); + const workdir = payload.outputs?.prepare?.workdir; + const finalCwd = payload.outputs?.finalize?.cwd; + assert.equal(typeof workdir, "string"); + assert.equal(typeof finalCwd, "string"); + assert.equal(await fs.realpath(String(finalCwd)), await fs.realpath(String(workdir))); + const bindings = Object.values(payload.sessionBindings ?? {}); + assert.equal(bindings.length, 1); + assert.equal(await fs.realpath(bindings[0]?.cwd ?? 
""), await fs.realpath(String(workdir))); + } finally { + await fs.rm(cwd, { recursive: true, force: true }); + } + }); +}); + test("integration: built-in droid agent resolves to droid exec --output-format acp", async () => { await withTempHome(async (homeDir) => { const cwd = await fs.mkdtemp(path.join(os.tmpdir(), "acpx-integration-cwd-")); From f2d498d5d44f1a55ecbd0ced1b933343414a93c8 Mon Sep 17 00:00:00 2001 From: Onur Solmaz <2453968+osolmaz@users.noreply.github.com> Date: Wed, 25 Mar 2026 22:32:17 +0100 Subject: [PATCH 11/22] docs: clarify flow JSON simplicity rules --- ...3-25-acpx-flows-production-architecture.md | 30 +++++++++++++++++++ src/flows/json.ts | 6 ++++ 2 files changed, 36 insertions(+) diff --git a/docs/2026-03-25-acpx-flows-production-architecture.md b/docs/2026-03-25-acpx-flows-production-architecture.md index a9a39ba..e948cbc 100644 --- a/docs/2026-03-25-acpx-flows-production-architecture.md +++ b/docs/2026-03-25-acpx-flows-production-architecture.md @@ -265,6 +265,36 @@ Use for explicit wait states: - external webhook - workflow approval gate that the runtime cannot clear +## Simplicity rules + +The runtime should stay boring. 
+ +That means: + +- keep the core node set small +- prefer generic primitives over workload-specific helpers +- add fewer conventions, not more + +Some concrete examples: + +- a per-step `cwd` override is enough; `acpx` does not need a built-in + `git_worktree_for_pr` primitive +- a shell-backed `action` step is enough for many deterministic mechanics; do + not rush to add a new first-class node type for every external tool +- keep JSON parsing simple: + - use compatibility parsing by default for real workflows, because models do + sometimes wrap valid JSON in extra chatter + - use strict JSON parsing only when the contract truly must fail on any extra + text + - do not turn structured-output handling into a giant parser framework + +The right bias is: + +- generic runtime capabilities in `acpx` +- workload-specific policy in user-authored workflow files + +That keeps the library production-ready without making it heavy. + ## PR triage under the production model The PR triage workflow should still follow the same logical flow, but some diff --git a/src/flows/json.ts b/src/flows/json.ts index 30089ba..1ff82fb 100644 --- a/src/flows/json.ts +++ b/src/flows/json.ts @@ -1,5 +1,7 @@ export type JsonObjectParseMode = "strict" | "fenced" | "compat"; +// The generic entrypoint when a workflow wants to choose its tolerance level +// explicitly. Most callers should still use one of the small helpers below. export function parseJsonObject( text: string, options: { @@ -40,10 +42,14 @@ export function parseJsonObject( throw new Error(`Could not parse JSON from assistant output:\n${trimmed}`); } +// Use this when the model contract must be exact JSON and any extra text +// should fail the step immediately. export function parseStrictJsonObject(text: string): unknown { return parseJsonObject(text, { mode: "strict" }); } +// Default workflow parser: direct JSON first, fenced JSON second, and finally +// a balanced embedded object for compatibility with chatty model output. 
export function extractJsonObject(text: string): unknown { return parseJsonObject(text, { mode: "compat" }); } From 0a7c72045c11c320209baade30277569f3733336 Mon Sep 17 00:00:00 2001 From: Onur Solmaz <2453968+osolmaz@users.noreply.github.com> Date: Thu, 26 Mar 2026 00:17:08 +0100 Subject: [PATCH 12/22] Unify flows architecture docs --- docs/2026-03-25-acpx-flows-architecture.md | 357 ++++++++ ...26-03-25-acpx-flows-implementation-plan.md | 791 ------------------ ...3-25-acpx-flows-production-architecture.md | 431 ---------- 3 files changed, 357 insertions(+), 1222 deletions(-) create mode 100644 docs/2026-03-25-acpx-flows-architecture.md delete mode 100644 docs/2026-03-25-acpx-flows-implementation-plan.md delete mode 100644 docs/2026-03-25-acpx-flows-production-architecture.md diff --git a/docs/2026-03-25-acpx-flows-architecture.md b/docs/2026-03-25-acpx-flows-architecture.md new file mode 100644 index 0000000..8465339 --- /dev/null +++ b/docs/2026-03-25-acpx-flows-architecture.md @@ -0,0 +1,357 @@ +--- +title: acpx Flows Architecture +description: Execution model, runtime boundary, and design principles for acpx flows. +author: OpenClaw Team +date: 2026-03-25 +--- + +# acpx Flows Architecture + +## Why this document exists + +`acpx` flows add a small workflow layer on top of the existing ACP runtime. + +That workflow layer exists to make multi-step ACP work practical without +turning one long agent conversation into the workflow engine. + +This document describes the shape that `acpx` flows use: + +- flows are TypeScript modules +- the runtime owns graph execution and liveness +- ACP steps are used for model-shaped work +- deterministic mechanics can run as runtime actions +- conversations stay in the existing `~/.acpx/sessions/*.json` store + +## Core position + +`acpx` should stay a small ACP client with composable primitives. 
+ +Flows fit that goal when they keep the boundary clear: + +- the runtime owns execution, persistence, routing, and liveness +- ACP workers own reasoning, judgment, summarization, and code changes + +The worker is not the workflow engine. + +## Goals + +- Make multi-step ACP workflows first-class in `acpx` +- Keep flow definitions readable and inspectable +- Keep branching deterministic outside the worker +- Reuse the existing session runtime and session store +- Support both pure ACP workflows and hybrid workflows when deterministic steps + are better supervised by the runtime + +## Non-goals + +- No ACP protocol redesign +- No large custom DSL +- No built-in GitHub or PR-specific workflow language in core +- No duplicate transcript store for flow conversations +- No visual builder + +## Flow model + +Flows are `.ts` modules that export a graph definition. + +The topology should read like data: + +- `nodes` +- `edges` +- declarative routing + +Node-local behavior can still be code. + +Typical authoring shape: + +```ts +import { defineFlow, acp, action, compute, checkpoint } from "acpx/flows"; + +export default defineFlow({ + name: "example", + nodes: { + analyze: acp({ ... }), + route: compute({ ... }), + run_check: action({ ... 
}), + wait: checkpoint(), + }, + edges: [ + { from: "analyze", to: "route" }, + { + from: "route", + switch: { + on: "$.next", + cases: { + run_check: "run_check", + wait: "wait", + }, + }, + }, + ], +}); +``` + +## Step kinds + +Keep the primitive set small: + +- `acp` +- `action` +- `compute` +- `checkpoint` + +### `acp` + +Use `acp` for model-shaped work: + +- extract intent +- judge solution shape +- classify bug vs feature +- decide whether refactor is needed +- summarize findings +- write human-facing output +- make code changes when the work is genuinely model-driven + +### `action` + +Use `action` for deterministic work supervised by the runtime: + +- prepare an isolated workspace +- run shell commands +- call `gh api` +- run tests +- run local `codex review` +- post a comment +- close a PR + +`shell(...)` is just a convenience form of `action(...)`. + +### `compute` + +Use `compute` for pure local transforms: + +- normalize earlier outputs +- derive the next route +- reduce multiple signals into one decision key + +### `checkpoint` + +Use `checkpoint` when the flow must pause for something outside the runtime: + +- a human decision +- an external event +- a later resume + +## Routing + +Routing must stay deterministic outside the worker. + +Workers produce outputs. + +The runtime decides: + +- the next node +- whether to retry +- whether to wait +- whether to fork or join + +Do not route on prose alone. + +Prefer: + +- structured ACP outputs +- declarative `switch` edges +- `compute` nodes for custom routing logic + +## Session model + +Each flow run gets one main ACP session by default. + +Most `acp` nodes should use that main conversation. + +If a flow truly needs a separate or isolated conversation, it should ask for it +explicitly. The runtime tracks those bindings internally. + +The flow author should usually think in terms of: + +- the main reasoning session +- optional isolated side sessions + +not low-level persistence details. 
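As a sketch of the deterministic routing described above — direct edges plus declarative `switch` — the resolver outside the worker could look roughly like this. The edge shapes mirror the authoring example; `routeNext` and the `default` case are illustrative names, not the runtime's actual implementation:

```typescript
// Hypothetical sketch of deterministic edge resolution outside the worker.
// Edge shapes mirror the authoring example above; names are illustrative.
type FlowEdge =
  | { from: string; to: string }
  | { from: string; switch: { on: string; cases: Record<string, string>; default?: string } };

function routeNext(
  edges: FlowEdge[],
  from: string,
  output: Record<string, unknown>,
): string | null {
  const edge = edges.find((candidate) => candidate.from === from);
  if (!edge) {
    return null; // terminal node: nothing left to run
  }
  if ("to" in edge) {
    return edge.to;
  }
  // "$.next"-style selector: read one key from the structured step output.
  const key = edge.switch.on.replace(/^\$\./, "");
  const value = String(output[key] ?? "");
  return edge.switch.cases[value] ?? edge.switch.default ?? null;
}
```

The point of the sketch is that the worker only produced `output`; every branch decision is a plain lookup the runtime owns.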
+ +## Working directories + +`cwd` already exists in `acpx` session handling. + +Flows extend that by allowing each node to choose its own working directory, +including dynamically from earlier outputs. + +That means a flow can: + +1. create an isolated temp clone or worktree in an `action` step +2. run later `acp` nodes inside that directory +3. keep the main repo checkout untouched + +Session bindings include `cwd`, so different workspaces do not accidentally +share one persisted ACP session. + +## Runtime boundary + +The important boundary is: + +- ACP for reasoning +- runtime for supervision + +That boundary matters most when a workflow would otherwise ask the model to do +open-ended orchestration inside one prompt turn. + +Examples of mechanics that are usually better owned by the runtime: + +- `git fetch` +- `gh api` calls +- local `codex review` +- targeted test execution +- posting comments + +This does not make ACP less important. + +It keeps ACP focused on the part it is good at while giving the flow runtime +direct ownership of timeouts, heartbeats, and side-effect execution. + +## Persistence + +Conversation state stays in the existing `acpx` session store: + +- `~/.acpx/sessions/*.json` + +Flow state lives separately under: + +- `~/.acpx/flows/runs/` + +The flow store keeps orchestration state such as: + +- run status +- current node +- outputs +- step history +- session bindings +- errors +- live liveness state + +The flow layer should reference session records, not duplicate full ACP +transcripts. + +## Liveness + +Long-running steps need explicit liveness. + +Flows should persist live state while a step is active, not only after it +finishes. + +Important live fields include: + +- `status` +- `currentNode` +- `currentNodeStartedAt` +- `lastHeartbeatAt` +- `statusDetail` +- `error` + +`acp` and `action` steps should support timeouts, heartbeats, and cancellation. + +That keeps a healthy run distinguishable from a hung run. 
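
The timeout half of that contract can be sketched as a race between the step and a deadline. The helper name is hypothetical, not part of the shipped `acpx` API:

```ts
// Sketch: a runtime-owned step deadline (hypothetical helper, not the
// shipped acpx API). The runtime races the step against a timer so a
// hung worker turn becomes a distinguishable "timed_out" failure.
async function withStepTimeout<T>(step: Promise<T>, timeoutMs: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const deadline = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error("timed_out")), timeoutMs);
  });
  try {
    return await Promise.race([step, deadline]);
  } finally {
    clearTimeout(timer); // always clear so the process can exit cleanly
  }
}
```

A real implementation would also cancel the underlying ACP prompt or child process on timeout, not just abandon the promise.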
+ +## JSON output handling + +Flows often need structured model output. + +`acpx` supports a forgiving default because models sometimes wrap JSON with +extra text. + +The intended parsing layers are: + +- `extractJsonObject(...)` for compatibility +- `parseStrictJsonObject(...)` when the contract must be exact +- `parseJsonObject(..., { mode })` when a flow needs explicit control + +Default rule: + +- use compatibility JSON unless the workflow truly needs strict parsing + +Do not turn output parsing into a large framework. + +## Simplicity rules + +- Keep the node set small +- Keep `acpx` generic +- Prefer clear runtime boundaries over specialized built-ins +- Add fewer conventions, not more +- Use one main session by default +- Keep workload-specific logic in user flow files, not in `acpx` core +- Use compatibility JSON by default and strict JSON only when it pays for itself + +## PR triage example shape + +A maintainability-first PR triage workflow can fit this model cleanly: + +1. `action`: prepare isolated workspace +2. `acp`: extract intent +3. `acp`: judge implementation or solution +4. `acp`: classify bug vs feature +5. `action`: run validation mechanics +6. `acp`: judge refactor need +7. `action`: collect review mechanics +8. `acp`: decide whether blocking findings remain +9. `action`: collect CI mechanics +10. `acp`: decide whether to continue, close, or escalate +11. `action`: post the final comment or take the final GitHub action + +This keeps the reasoning in ACP while keeping the mechanics observable and +bounded. 
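
The compatibility-first parsing described under "JSON output handling" can be sketched as a balanced-brace scan. This is an illustrative stand-in for `extractJsonObject(...)`; the real helper may differ:

```ts
// Sketch of compatibility-first extraction (illustrative, not the shipped
// acpx helper): find the first balanced top-level object in prose-wrapped
// model output, ignoring braces that appear inside JSON strings.
function extractFirstJsonObject(text: string): unknown {
  const start = text.indexOf("{");
  if (start === -1) throw new Error("no JSON object found");
  let depth = 0;
  let inString = false;
  let escaped = false;
  for (let i = start; i < text.length; i++) {
    const ch = text[i];
    if (inString) {
      if (escaped) escaped = false;
      else if (ch === "\\") escaped = true;
      else if (ch === '"') inString = false;
    } else if (ch === '"') {
      inString = true;
    } else if (ch === "{") {
      depth++;
    } else if (ch === "}") {
      depth--;
      if (depth === 0) return JSON.parse(text.slice(start, i + 1));
    }
  }
  throw new Error("unbalanced JSON object");
}
```

Strict parsing would instead reject any input that is not exactly one JSON object.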

## CLI shape

The main user-facing entrypoint is:

```bash
acpx flow run <file> [--input-json <json> | --input-file <path>]
```

Related commands:

- `acpx flow resume <runId>`
- `acpx flow show <runId>`
- `acpx flow graph <file>`
- `acpx flow validate <file>`

## What belongs in core

Core flow support in `acpx` should stay generic:

- graph execution
- ACP step execution
- runtime actions
- run persistence
- liveness
- session bindings
- parsing helpers

What should stay outside core:

- PR-triage policy
- repository-specific prompts
- workload-specific route logic
- GitHub-specific business rules beyond generic command execution

## Current direction

The implemented direction in this branch is:

- TypeScript flow modules
- small node set
- runtime-owned liveness and persistence
- optional runtime actions for deterministic work
- per-node `cwd`
- one main ACP session by default

That is the shape flows should continue to follow.
diff --git a/docs/2026-03-25-acpx-flows-implementation-plan.md b/docs/2026-03-25-acpx-flows-implementation-plan.md
deleted file mode 100644
index 15319bc..0000000
--- a/docs/2026-03-25-acpx-flows-implementation-plan.md
+++ /dev/null
@@ -1,791 +0,0 @@
----
-title: acpx Flows Implementation Plan
-description: Monorepo plan for adding a general workflow library and CLI to acpx for orchestrating ACP workers with simple primitives.
-author: OpenClaw Team -date: 2026-03-25 ---- - -# acpx Flows Implementation Plan - -## Why this document exists - -`acpx` already has the hard parts of ACP execution: - -- ACP transport over stdio -- agent spawning and lifecycle handling -- persistent session storage -- queue ownership and prompt serialization -- machine-readable output -- MCP server attachment on session setup - -What it does not have yet is a general workflow layer that can orchestrate ACP -workers step by step with: - -- explicit graphs -- programmable branching -- selective context visibility -- persistent workflow state outside the worker -- one main conversation by default -- explicit isolated conversations where blind judgment is required - -This document defines that plan. - -It assumes `acpx` will move to a monorepo, but all code will remain in the same -repository and under the same product family. - -## Core position - -`acpx` should become a swiss army knife for ACP, but it should do that through -small, composable primitives rather than one undifferentiated blob. - -The correct split is: - -- one repo -- one product family -- multiple packages -- one clear runtime boundary - -The worker is not the workflow engine. - -The workflow runtime owns: - -- graph execution -- branching -- retries -- wait states -- checkpointing -- selective context visibility -- bindings to persistent `acpx` sessions - -The ACP worker only executes one step at a time. - -## Goals - -- Add a general workflow library for ACP workers, not a PR-specific automation tool. -- Keep workflow definitions readable as TypeScript modules with object-shaped graphs. -- Support arbitrary branching and forks/joins with deterministic routing outside the worker. -- Reuse existing `acpx` session persistence for conversations instead of duplicating transcripts. -- Keep the first implementation simple enough to land incrementally in the current codebase. -- Preserve a coherent CLI surface under the `acpx` name. 
- -## Non-goals - -- No ACP protocol redesign. -- No requirement to introduce a distributed scheduler. -- No visual builder. -- No giant custom DSL. -- No requirement that every result must come back through a custom MCP tool on day one. -- No transcript duplication into a second workflow database. - -## Design principles - -### 1. Graph topology should read like data - -The default authoring format should be: - -- plain object for graph topology -- code only for node-local logic - -This keeps flows inspectable, serializable, and renderable. - -### 2. Routing must be deterministic outside the worker - -Workers produce outputs. - -The runtime chooses: - -- next node -- retry vs fail -- fan-out -- join behavior -- wait states - -Never route on prose alone. - -### 3. Context visibility is a first-class primitive - -Each node should receive only what its `read(...)` projection returns. - -If a step should not know earlier conclusions, that must be enforced by: - -- a narrow `read(...)` -- an isolated ACP session - -### 4. One main session by default, explicit extra sessions when needed - -Each flow run should get one implicit main ACP conversation. - -Most `acp` nodes should just use that main conversation. - -If a step needs isolation or a separate line of work, the flow should ask for -that explicitly instead of relying on hidden session policy defaults. - -### 5. Conversations stay in the existing session store - -`acpx` already stores persistent ACP conversations in `~/.acpx/sessions/*.json`. - -The workflow layer should store: - -- run state -- node state -- branch state -- session references -- artifacts - -It should not store duplicate full transcripts. - -### 6. Start with the existing runtime, not the CLI - -The flow engine should call the current runtime functions directly: - -- `runOnce` -- `createSession` -- `ensureSession` -- `sendSession` -- cancel and control operations - -It should not shell out to `acpx` as a subprocess. 
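
The `read(...)` projection from principle 3 can be sketched with plain types. The shapes are illustrative, not the real `acpx/flows` API:

```ts
// Sketch of a read(...) projection (assumed shapes, not the real
// acpx/flows types): the node receives only the projected view, so a
// blind judge cannot see earlier conclusions.
type RunState = { outputs: Record<string, unknown> };

function project<T>(state: RunState, read: (s: RunState) => T): T {
  return read(state);
}

const state: RunState = {
  outputs: {
    facts: { kind: "bug" },
    verdict: { block: true }, // must stay invisible to the blind judge
  },
};

// Narrow projection: the node sees facts only, never the verdict.
const view = project(state, (s) => ({ facts: s.outputs.facts }));
```

Pairing a narrow projection like this with an isolated session is what actually enforces blind review.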
- -## Target monorepo shape - -The repository should become a workspace monorepo with these packages: - -- `packages/acpx` -- `packages/flows` -- `packages/core` if extracting the shared runtime into its own workspace - package proves useful - -Recommended responsibilities: - -### `packages/acpx` - -Published package name: `acpx` - -Responsibilities: - -- CLI binary -- public umbrella exports -- `acpx/flows` subpath export - -This package is the user-facing umbrella. - -### `packages/core` - -Internal workspace package for the reusable ACP runtime. - -Responsibilities: - -- ACP transport -- agent spawning -- session lifecycle -- session persistence -- queue runtime -- output formatters -- config loading -- prompt content helpers -- agent registry and capability helpers - -This is where the current `src/client.ts`, `src/session-runtime.ts`, -`src/session-persistence/**`, `src/output.ts`, and related files should move -over time. - -### `packages/flows` - -Internal workspace package for the workflow library. - -Responsibilities: - -- flow graph types -- graph validation -- flow loader -- run store -- graph executor -- branching and fork/join runtime -- checkpoint/resume -- step result extraction and validation -- optional flow-specific helpers - -### Why this shape - -This gives `acpx` the swiss-army-knife product shape while keeping the code -modular: - -- one repo -- one public brand -- separate runtime layers - -## Public package surface - -The public API should start with a single umbrella: - -- `acpx` -- `acpx/flows` - -`acpx/core` can exist later if the lower-level runtime surface proves worth -stabilizing. It should not be forced into the first public compatibility -contract unless there is a clear need. - -`packages/acpx` should re-export the public surfaces from the workspace -libraries rather than forcing users to import package-internal names. - -## Flow authoring model - -Flows are `.ts` files. - -Each file exports one flow definition. 
- -The canonical authoring style is: - -```ts -import { defineFlow, acp, compute, action, checkpoint } from "acpx/flows"; - -export default defineFlow({ - name: "triage", - input: InputSchema, - nodes: { - facts: acp({ ... }), - judge: acp({ ... }), - route: compute({ ... }), - external: checkpoint(), - continue_work: action({ ... }), - }, - edges: [ - { from: "facts", to: "judge" }, - { from: "judge", to: "route" }, - { - from: "route", - switch: { - on: "$.next", - cases: { - external: "external", - continue: "continue_work", - }, - }, - }, - ], -}); -``` - -The recommended repository layout is: - -- library/runtime code under the package workspace -- user-authored and example flows under a repo-level `workflows/` directory - -Example: - -- `workflows/example.flow.ts` -- `workflows/review.flow.ts` - -That keeps the workflow library separate from the workflows it executes and -gives the CLI one obvious path shape for local development. - -### Why object-shaped graphs - -This format is better than a fluent chain for: - -- readability -- validation -- static analysis -- visualization -- IR generation -- tooling - -### Canonical execution model - -Authoring format: - -- TypeScript module - -Execution format: - -- normalized graph IR - -The engine should normalize every flow into one internal representation before -execution. - -## Core primitives - -Keep the primitive set small. - -### `acp(...)` - -Run one ACP worker step. - -Use this for any step executed by Codex, OpenClaw, Claude, Pi, or another -ACP-compatible worker. - -### `compute(...)` - -Pure local transformation. - -Used for: - -- result normalization -- reducers -- route preparation -- branch aggregation - -### `action(...)` - -Explicit local side effect. - -Used for: - -- file writes -- notifications -- external API calls -- arbitrary local side effects - -### `checkpoint(...)` - -Pause and wait for an external actor or event. - -This is the correct primitive, not `human(...)`. 
- -The external actor may be: - -- a person -- another worker -- a CI system -- a webhook -- an operator action - -### Edge primitives - -Support: - -- linear edge -- `switch` -- `fork` -- `join` - -That is enough for most workflows. - -## Branching rules - -Branching must support two modes. - -### 1. Declarative branching - -For common structured cases: - -```ts -{ - from: "judge", - switch: { - on: "$.decision", - cases: { - yes: "yes_path", - no: "no_path", - }, - }, -} -``` - -### 2. Arbitrary code-based branching - -For custom logic, use a local `compute` router node: - -```ts -route: compute({ - run: ({ outputs }) => { - const answer = String(outputs.judge.answer).trim().toUpperCase(); - - if (answer === "Y") return { next: "yes_path" }; - if (answer === "N") return { next: "no_path" }; - return { next: "fallback_path" }; - }, -}); -``` - -Then branch declaratively on `route.next`. - -This keeps the graph readable while allowing arbitrary branching rules. - -## Session model - -The public model should be simple. - -### Default behavior - -Each flow run gets one implicit main ACP session. - -Every `acp` node uses that main session by default. - -That should be the common case for: - -- exploratory analysis -- implementation -- follow-up fixes -- review/fix loops - -### Isolated steps - -If a step must be independent, the flow should opt into an isolated session -explicitly. - -Use isolation for: - -- blind judgment -- independent critics -- adversarial review -- any step that must not inherit earlier conversation state - -This should be expressed as a simple flow-level option such as "run this step in -its own session", not by forcing every author to learn internal session-policy -keywords. - -### Extra long-lived sessions - -Most flows should not need to manually name sessions. - -If a workflow truly needs multiple persistent conversations, it may declare -additional session handles explicitly. That is an advanced case, not the -default. 
- -The runtime should own the mapping from those logical handles to underlying -`acpxRecordId` and ACP session identifiers for the run. - -### Internal runtime model - -Internally, the runtime will still need semantics equivalent to: - -- reuse the main session -- create an isolated one-off session -- continue a previously created non-main session - -Those are implementation concerns. They do not need to be the first public API -surface. - -### Validation rules - -The flow validator should reject: - -- isolated or blind steps that try to reuse the main conversation -- concurrent branches that would interleave prompts into the same session -- explicit extra-session references that cannot be resolved for the run - -## Context visibility - -Each `acp` node gets a `read(...)` projection. - -The runtime state may be broad, but the node sees only the projected view. - -Example: - -- node A sees raw issue and diff -- node B sees extracted facts but not earlier verdicts -- node C sees the verdict and executes a side effect - -This is the main mechanism for reducing anchoring and confirmation bias. - -It is not sufficient by itself for blind review. If a node must not inherit -earlier worker memory, it must use an isolated session as well. - -## Prompt model - -The workflow layer should build on the existing ACP prompt content model. - -`acpx` already has prompt helpers and validation for: - -- text blocks -- image blocks -- resource links -- embedded resources - -That should remain the base prompt type for flow steps rather than inventing a -second prompt representation. - -## Result capture - -Do not make a custom MCP result tool mandatory for the first implementation. - -The current runtime forwards MCP server config to `session/new` and -`session/load`, but it does not yet host a built-in MCP server runtime for flow -steps. The first implementation should respect that. 
-
-### Initial result path
-
-The first implementation should support one result path:
-
-- ask the worker to return a final structured JSON object
-- capture the ACP output stream
-- extract the final assistant payload
-- parse JSON
-- validate it
-
-### Future extension
-
-Later, a flow-specific MCP tool can be added behind the same abstraction for
-more reliable structured returns. That should be an enhancement, not a
-prerequisite.
-
-## Schema model
-
-The flow engine should not hard-require one validation library.
-
-Accept any schema-like object that supports one of:
-
-- `parse(value)`
-- `safeParse(value)`
-
-This keeps the core flexible and avoids baking a new large dependency into the
-runtime contract.
-
-## Persistence model
-
-Do not use SQLite first.
-
-The current repo already uses file-based JSON and NDJSON persistence for
-sessions. The workflow layer should match that style.
-
-### Conversation storage
-
-Persistent conversations remain in:
-
-- `~/.acpx/sessions/*.json`
-- `~/.acpx/sessions/*.stream.ndjson`
-
-The workflow engine should reference those sessions by `acpxRecordId`.
-
-### Workflow storage
-
-Store workflow state under:
-
-- `~/.acpx/flows/`
-
-Recommended initial layout:
-
-- `~/.acpx/flows/runs/<runId>/run.json`
-- `~/.acpx/flows/runs/<runId>/events.ndjson`
-- `~/.acpx/flows/runs/<runId>/lock`
-- `~/.acpx/flows/runs/<runId>/artifacts/...`
-
-If later experience shows that fast global lookup is necessary, an index file
-can be added then. It should not be required up front.
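
The `parse` / `safeParse` contract from the schema model above can be duck-typed like this. The helper and type names are illustrative, not part of the shipped API:

```ts
// Sketch: accept any schema-like validator (hypothetical helper, not the
// shipped acpx API). Duck-typing on parse/safeParse keeps zod-style and
// hand-rolled validators equally usable without a hard dependency.
type SchemaLike<T> =
  | { parse: (value: unknown) => T }
  | { safeParse: (value: unknown) => { success: boolean; data?: T } };

function runSchema<T>(schema: SchemaLike<T>, value: unknown): T {
  if ("safeParse" in schema) {
    const result = schema.safeParse(value);
    if (!result.success) throw new Error("schema validation failed");
    return result.data as T;
  }
  return schema.parse(value);
}
```

Either style of validator works unchanged, which is the point of not freezing one library into the runtime contract.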
-
-### What a run record should store
-
-- `runId`
-- `flowName`
-- `flowPath`
-- `flowVersion`
-- `status`
-- `cwd`
-- `createdAt`
-- `updatedAt`
-- `input`
-- `nodeStates`
-- `outputs`
-- `activeBranches`
-- `sessionBindings` mapping runtime-owned handles to persisted session ids
-- `waitingOn`
-- `artifacts`
-
-### What it should not store
-
-- duplicate conversation transcripts
-- duplicate token usage copied out of session records
-
-## Run model
-
-Each run is a checkpointed state machine.
-
-The runtime should persist after every node transition:
-
-- node started
-- node completed
-- branch chosen
-- checkpoint entered
-- run resumed
-- run failed
-- run completed
-
-This is required for:
-
-- crash recovery
-- inspectability
-- replay
-- long-lived checkpoints
-
-## CLI surface
-
-Add a new top-level command family:
-
-- `acpx flow run <file>`
-- `acpx flow resume <runId>`
-- `acpx flow show <runId>`
-- `acpx flow graph <file>`
-- `acpx flow validate <file>`
-
-### Important compatibility note
-
-Today, unknown first tokens are treated as agent names. Adding `flow` is
-therefore a real top-level surface change and must be treated as a deliberate
-reserved verb.
-
-### Loader model
-
-Flow files should be authored as `.ts`.
-
-The CLI should load them directly.
-
-The canonical local invocation should look like:
-
-- `acpx flow run workflows/example.flow.ts`
-
-That means the monorepo needs a dedicated runtime loader path for TypeScript
-flow modules instead of pretending the current CLI-only build is enough.
-
-## Agent selection inside flows
-
-Do not rely on CLI-level `--agent` overrides for the first implementation.
-
-Flows may contain multiple `acp` nodes with different profiles, so one global
-raw command override is ambiguous.
-
-Instead, flow nodes should name an agent profile resolved through the existing
-config and registry layer.
- -Example: - -- `profile: "codex"` -- `profile: "openclaw"` -- `profile: "claude"` - -Later, per-node raw command overrides can be added if they are actually needed. - -## Use of existing runtime - -The flow engine should build on the current runtime instead of duplicating it. - -Recommended mapping: - -- default main-session step -> `ensureSession` then `sendSession` -- isolated one-off step -> `runOnce` -- explicit extra persistent session -> `ensureSession` then `sendSession` -- cancel/control -> existing session control functions - -This keeps ACP execution in one place. - -## Testing strategy - -Build on the existing mock ACP agent and integration test style. - -### Library tests - -Add tests for: - -- graph validation -- default main-session reuse -- isolated-step semantics -- explicit extra-session validation -- declarative branching -- arbitrary code-based routing via `compute` -- fork/join execution -- checkpoint persistence and resume -- run store locking -- result parsing failures - -### CLI tests - -Add integration tests for: - -- `acpx flow run ...` -- `acpx flow validate ...` -- `acpx flow graph ...` -- `acpx flow resume ...` -- reserved `flow` verb behavior - -### Mock worker coverage - -The existing mock worker should remain the base for flow tests so the workflow -layer is validated against ACP behavior, not ad-hoc stubs. - -## Implementation phases - -### 1. Monorepo cutover - -- create workspace structure -- add `packages/acpx` -- add `packages/flows` -- extract a shared runtime package only if it materially clarifies the split -- move existing code into the monorepo with minimal logic changes -- keep published `acpx` CLI behavior unchanged - -### 2. 
Core library surface - -- define the internal runtime surface that `acpx/flows` depends on -- stop treating all runtime code as CLI-internal implementation detail -- expose stable session and prompt APIs for flow execution inside the repo -- publish `acpx/core` only if that lower-level surface proves worth freezing - -### 3. Flow graph and validator - -- implement `defineFlow` -- implement node and edge types -- normalize to internal graph IR -- validate graph structure and session-isolation constraints - -### 4. File-based run store - -- add `~/.acpx/flows/` store -- implement run record persistence -- implement event log -- implement run locks -- implement checkpoint and resume - -### 5. Flow executor - -- execute `acp`, `compute`, `action`, `checkpoint` -- wire `acp` nodes into existing session runtime -- implement branch, fork, join, and failure semantics - -### 6. Result extraction - -- add structured final-result capture -- add validator bridge for schema parsing -- add normalized failure handling for malformed worker results - -### 7. CLI - -- add `flow` command family -- add TypeScript flow loader -- add run/validate/graph/show/resume commands - -### 8. Hardening - -- improve inspectability -- add graph rendering -- add richer artifacts -- evaluate whether a custom MCP return tool is worth adding - -## Resolved decisions - -- The repo becomes a monorepo. -- The public product family remains `acpx`. -- The first-class workflow API lives under `acpx/flows`. -- Graph topology is object-shaped, not fluent-first. -- Branching is fully programmable. -- `checkpoint` is the right primitive, not `human`. -- Each flow run has one implicit main session by default. -- Extra sessions must be explicit. -- Example and user-authored flows should live under a repo-level `workflows/` - directory rather than inside the library package tree. -- Conversations remain in the existing session store. -- Workflow state uses file-based persistence first. 
-- The flow runtime uses the current `acpx` runtime directly, not CLI subprocesses. -- A custom MCP result server is optional later, not required up front. - -## Success criteria - -This work is successful when all of the following are true: - -- a flow can be authored as one `.ts` file -- `acpx flow run file.ts` executes it end to end -- the default main-session model is simple and reliable -- isolated steps do not inherit hidden worker memory accidentally -- arbitrary routing rules can be expressed cleanly -- fork/join works across multiple ACP workers -- run state survives process exit and resume -- worker conversations are still stored exactly once through the existing session model diff --git a/docs/2026-03-25-acpx-flows-production-architecture.md b/docs/2026-03-25-acpx-flows-production-architecture.md deleted file mode 100644 index e948cbc..0000000 --- a/docs/2026-03-25-acpx-flows-production-architecture.md +++ /dev/null @@ -1,431 +0,0 @@ ---- -title: acpx Flows Production Architecture -description: Production-ready execution model for acpx workflows, with runtime-owned control, native actions, ACP reasoning steps, and strong liveness guarantees. -author: OpenClaw Team -date: 2026-03-25 ---- - -# acpx Flows Production Architecture - -## Why this document exists - -The first experimental `acpx` flow runner proved that multi-step ACP workflows -are viable, but it also exposed the wrong execution boundary. - -The clearest example was PR triage: - -- the flow itself was structurally fine -- the worker made good judgments -- the run still stalled because a long-running nested `codex review` subprocess - was launched inside an ACP turn and never returned a final result - -This document defines the production-ready architecture for `acpx` flows. - -The goal is not to make the worker do everything. - -The goal is to make the runtime own execution and liveness, while the ACP worker -owns reasoning, judgment, and code changes. 
- -## Core position - -The correct long-term shape is a hybrid workflow engine: - -- the runtime is the control plane -- ACP workers are reasoning workers -- deterministic mechanics run as native runtime actions - -In other words: - -- the runtime should own step execution, deadlines, retries, heartbeats, - cancellation, state, and side effects -- the worker should own analysis, coding, judgment, summarization, and - decisions that are genuinely model-shaped - -This is the cleanest and most production-ready boundary. - -It is also the most robust answer to the question raised by the prototype: - -why did the flow stall? - -Because a child tool run was hosted inside an ACP turn instead of being owned -and supervised by the runtime. - -## What went wrong in the prototype - -The current prototype runner executes ACP steps synchronously and waits for each -step to finish before persisting completion. - -That is acceptable for simple prompts, but it becomes fragile when an ACP step -tries to orchestrate external mechanics itself. - -In the PR triage case, the failure mode was: - -1. the runtime entered the review step -2. the worker decided to run `codex review` -3. that review launched as a nested subprocess inside the worker turn -4. the review got stuck on transport/runtime behavior -5. the parent ACP turn never returned structured output -6. the outer flow looked hung - -This exposed three separate issues: - -- the wrong boundary for deterministic actions -- no explicit per-step liveness signal in run state -- no reliable step deadline or timeout behavior at the flow layer - -## Production model - -### 1. 
The runtime is the control plane - -The flow runtime should own: - -- flow graph execution -- current node and next node -- step deadlines and timeouts -- retries and retry policy -- run persistence -- heartbeats and staleness detection -- cancellation -- side-effect execution -- idempotency and action receipts - -The runtime must always know: - -- which node is active -- how long it has been active -- whether it is making progress -- whether it timed out -- whether it is blocked on a human or an external dependency - -### 2. ACP steps are for reasoning, not orchestration - -ACP steps should be used for: - -- extracting intent -- judging solution shape -- classifying bug vs feature -- deciding whether refactor is needed -- deciding whether human escalation is required -- editing code when the change is genuinely model-driven -- summarizing findings for a final comment - -ACP steps should not be the place where the model is expected to supervise -long-running deterministic subprocesses. - -That means: - -- do not host `codex review` inside a Codex ACP turn -- do not host `gh api` polling loops inside an ACP turn -- do not host CI approval or CI inspection loops inside an ACP turn - -Those belong to the runtime. - -### 3. Native action steps should handle deterministic work - -The runtime should support native action steps for deterministic operations such -as: - -- `git_fetch` -- `checkout_pr` -- `gh_api` -- `codex_review` -- `approve_workflow_run` -- `post_pr_comment` -- `close_pr` -- `run_tests` -- `run_targeted_validation` - -These actions should be: - -- directly observable -- cancellable -- time-bounded -- resumable when possible -- recorded with machine-readable receipts - -The worker can still decide whether they should run, but the runtime should -actually execute them. - -### 4. One durable run state, updated while the step is still active - -The flow runtime must persist live state before awaiting a step result. 
- -At minimum, run state should include: - -- `status` -- `currentNode` -- `currentNodeKind` -- `currentNodeStartedAt` -- `lastHeartbeatAt` -- `statusDetail` -- `outputs` -- `steps` -- `sessionBindings` -- `waitingOn` -- `error` - -This avoids the current ambiguity where `run.json` only changes after a node -completes and a healthy run looks frozen. - -### 5. Every long-running step needs heartbeat, deadline, and cancellation - -Every `acp` or `action` step should support: - -- `timeoutMs` -- optional heartbeat updates -- cancellation on timeout -- explicit terminal result if timed out - -For ACP steps: - -- timeout should cancel the active session prompt if possible - -For native action steps: - -- timeout should kill the child process and mark the step `timed_out` - -This is the minimum production liveness contract. - -### 6. Side effects must be idempotent and recorded - -A production workflow runtime must assume retries and restarts. - -For effectful steps such as posting comments or closing PRs, the runtime should -store receipts such as: - -- GitHub comment id -- workflow run id -- CI approval id -- commit sha -- pushed branch sha - -That allows safe resume and retry behavior without duplicated actions. - -### 7. Session handling should stay simple - -The session model should remain: - -- one main ACP session by default -- explicit isolated sessions only when a step truly needs a blind or separate - conversation - -The runtime should track those bindings internally. - -The flow author should usually think in terms of: - -- main reasoning session -- isolated critic session when needed - -not in terms of queue-owner mechanics or persistence internals. - -## Recommended step model - -The core step kinds should stay small: - -- `acp` -- `compute` -- `action` -- `checkpoint` - -But the semantics should be tighter. 
- -### `acp` - -Use for model-shaped work: - -- judgment -- code generation -- summarization -- route recommendation - -### `compute` - -Use for local pure transforms: - -- normalizing outputs -- computing branch keys -- reducing multiple findings into one route - -### `action` - -Use for deterministic external work supervised by the runtime: - -- git commands -- GitHub API calls -- test execution -- local `codex review` -- comment posting -- CI approval - -### `checkpoint` - -Use for explicit wait states: - -- human approval -- external webhook -- workflow approval gate that the runtime cannot clear - -## Simplicity rules - -The runtime should stay boring. - -That means: - -- keep the core node set small -- prefer generic primitives over workload-specific helpers -- add fewer conventions, not more - -Some concrete examples: - -- a per-step `cwd` override is enough; `acpx` does not need a built-in - `git_worktree_for_pr` primitive -- a shell-backed `action` step is enough for many deterministic mechanics; do - not rush to add a new first-class node type for every external tool -- keep JSON parsing simple: - - use compatibility parsing by default for real workflows, because models do - sometimes wrap valid JSON in extra chatter - - use strict JSON parsing only when the contract truly must fail on any extra - text - - do not turn structured-output handling into a giant parser framework - -The right bias is: - -- generic runtime capabilities in `acpx` -- workload-specific policy in user-authored workflow files - -That keeps the library production-ready without making it heavy. - -## PR triage under the production model - -The PR triage workflow should still follow the same logical flow, but some -current ACP steps should become native actions. - -A better shape is: - -1. `load_pr` — `action` -2. `prepare_workspace` — `action` -3. `extract_intent` — `acp` -4. `judge_implementation_or_solution` — `acp` -5. `bug_or_feature` — `acp` -6. 
`reproduce_bug_and_test_fix` or `test_feature_directly` — `action` -7. `judge_refactor` — `acp` -8. `collect_github_review_state` — `action` -9. `run_local_codex_review` — `action` -10. `judge_review_outcome` — `acp` or `compute` -11. `check_ci_state` — `action` -12. `fix_ci_failures` — `acp` plus `action` test steps as needed -13. `render_final_comment` — `acp` -14. `post_comment` / `close_pr` / `checkpoint` — `action` or `checkpoint` - -This keeps the worker in charge of judgment while making execution much more -reliable. - -## What the runtime should expose - -The runtime should eventually expose: - -- per-node timeout configuration -- per-node heartbeat policy -- per-action retry policy -- per-action idempotency keys -- live `flow status` -- `flow cancel` -- `flow resume` -- step receipts in run state - -This does not require a giant orchestration DSL. - -It requires a small set of strong primitives. - -## Failure model - -The runtime should distinguish these states clearly: - -- `running` -- `waiting` -- `completed` -- `failed` -- `timed_out` -- `cancelled` - -And these error classes should be surfaced distinctly when possible: - -- child process hung -- child process failed -- ACP prompt timed out -- external API failed -- blocked on permission gate -- blocked on human approval -- invalid step output - -That makes debugging and operator behavior much cleaner. - -## Incremental path from the current implementation - -The best migration path is: - -### Step 1: improve liveness and observability - -Add: - -- `currentNode` -- `currentNodeStartedAt` -- `lastHeartbeatAt` -- live `run.json` updates at node start -- per-node `timeoutMs` - -This should land first. - -### Step 2: add native action execution - -Keep the same graph model, but make deterministic work first-class: - -- command-backed actions -- GitHub-backed actions -- review/test actions - -This should land second. 
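Step 1's per-node `timeoutMs`, combined with the failure model above, can be sketched as a small wrapper that distinguishes `timed_out` from `failed`. This is an illustrative shape under the plan's naming, not the shipped runtime (heartbeats and live `run.json` updates are omitted):

```typescript
// Sketch only: status names follow the plan's failure model.
type StepStatus = "completed" | "failed" | "timed_out";

interface StepResult {
  status: StepStatus;
  output?: unknown;
  error?: string;
}

async function runWithTimeout(
  step: (signal: AbortSignal) => Promise<unknown>,
  timeoutMs: number,
): Promise<StepResult> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    const output = await step(controller.signal);
    return { status: "completed", output };
  } catch (error) {
    // An abort raised by our own deadline is a timeout, not a plain failure.
    if (controller.signal.aborted) {
      return { status: "timed_out", error: `step exceeded ${timeoutMs}ms` };
    }
    return { status: "failed", error: String(error) };
  } finally {
    clearTimeout(timer);
  }
}
```

A cooperative step observes the signal and rejects on abort; for a native action step the same deadline would instead kill the child process, and for an ACP step it would cancel the active session prompt where possible.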
- -### Step 3: move recursive mechanics out of ACP prompts - -Refactor workflows so they stop asking the worker to supervise: - -- `codex review` -- `gh api` -- CI approval loops -- comment posting - -The runtime should do those directly. - -### Step 4: add receipts and idempotency - -This makes comment posting, PR closing, and CI approval safe under retries and -resume. - -## What not to do - -Do not move toward: - -- a single giant conversational agent that does everything -- recursive agent-inside-agent orchestration for core mechanics -- implicit run state that only exists in model context -- prose-only routing for effectful decisions - -That shape may feel flexible at first, but it is the least production-ready -option. - -## Final position - -The most production-ready `acpx` flow architecture is: - -- durable runtime-owned workflow execution -- native deterministic action steps -- ACP reasoning steps for judgment and coding -- explicit liveness, heartbeat, timeout, and cancellation -- idempotent recorded side effects - -That is the cleanest long-term model. - -It is also the most credible path if `acpx` wants flows that survive real, -long-running, autonomous workloads without turning the worker itself into a -fragile orchestration layer. 
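The receipts-and-idempotency contract described above (design principle 6 and migration Step 4) can be illustrated with a minimal in-memory receipt store. The names and the in-memory `Map` are assumptions for the sketch; the real runtime would persist receipts in run state:

```typescript
// Illustrative in-memory receipt store; the on-disk format and API
// are assumptions, not the shipped implementation.
interface Receipt {
  key: string;
  result: unknown;
  recordedAt: string;
}

class ReceiptStore {
  private receipts = new Map<string, Receipt>();

  // Run an effect at most once per idempotency key. On retry or resume,
  // the recorded receipt is returned instead of re-running the side effect.
  async runOnce(key: string, effect: () => Promise<unknown>): Promise<Receipt> {
    const existing = this.receipts.get(key);
    if (existing) return existing;
    const result = await effect();
    const receipt: Receipt = { key, result, recordedAt: new Date().toISOString() };
    this.receipts.set(key, receipt);
    return receipt;
  }
}
```

With this shape, an effectful step such as posting a PR comment keyed by something like `run-id:post_comment` survives a crash-and-resume without double posting: the resumed run finds the receipt and skips the effect.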
From 30f535296d518c95c1ccfdb3bf9d25f20c7fccb3 Mon Sep 17 00:00:00 2001 From: Onur Solmaz <2453968+osolmaz@users.noreply.github.com> Date: Thu, 26 Mar 2026 09:26:00 +0100 Subject: [PATCH 13/22] Tighten flow validation and queue-owner args --- src/cli.ts | 3 +- src/flows/runtime.ts | 5 +++ src/session-runtime/queue-owner-process.ts | 25 ++++++++++++++ test/flows.test.ts | 39 ++++++++++++++++++++++ test/queue-owner-process.test.ts | 23 ++++++++++++- 5 files changed, 93 insertions(+), 2 deletions(-) diff --git a/src/cli.ts b/src/cli.ts index 91b099a..57e26e3 100644 --- a/src/cli.ts +++ b/src/cli.ts @@ -3,12 +3,13 @@ import { realpathSync } from "node:fs"; import { fileURLToPath, pathToFileURL } from "node:url"; import { main } from "./cli-core.js"; +import { sanitizeQueueOwnerExecArgv } from "./session-runtime/queue-owner-process.js"; export { formatPromptSessionBannerLine } from "./cli-core.js"; export { parseAllowedTools, parseMaxTurns, parseTtlSeconds } from "./cli/flags.js"; process.env.ACPX_QUEUE_OWNER_ARGS ??= JSON.stringify([ - ...process.execArgv, + ...sanitizeQueueOwnerExecArgv(), fileURLToPath(import.meta.url), "__queue-owner", ]); diff --git a/src/flows/runtime.ts b/src/flows/runtime.ts index 10ffa4d..665f18f 100644 --- a/src/flows/runtime.ts +++ b/src/flows/runtime.ts @@ -826,10 +826,15 @@ function validateFlowDefinition(flow: FlowDefinition): void { throw new Error(`Flow start node is missing: ${flow.startAt}`); } + const outgoingEdges = new Set(); for (const edge of flow.edges) { if (!flow.nodes[edge.from]) { throw new Error(`Flow edge references unknown from-node: ${edge.from}`); } + if (outgoingEdges.has(edge.from)) { + throw new Error(`Flow node must not declare multiple outgoing edges: ${edge.from}`); + } + outgoingEdges.add(edge.from); if ("to" in edge) { if (!flow.nodes[edge.to]) { throw new Error(`Flow edge references unknown to-node: ${edge.to}`); diff --git a/src/session-runtime/queue-owner-process.ts b/src/session-runtime/queue-owner-process.ts 
index 527ed97..df97f94 100644 --- a/src/session-runtime/queue-owner-process.ts +++ b/src/session-runtime/queue-owner-process.ts @@ -33,6 +33,31 @@ type SessionSendLike = { maxQueueDepth?: number; }; +export function sanitizeQueueOwnerExecArgv( + execArgv: readonly string[] = process.execArgv, +): string[] { + const sanitized: string[] = []; + for (let index = 0; index < execArgv.length; index += 1) { + const value = execArgv[index]; + if (value === "--experimental-test-coverage" || value === "--test") { + continue; + } + if ( + value === "--test-name-pattern" || + value === "--test-reporter" || + value === "--test-reporter-destination" + ) { + index += 1; + continue; + } + if (value.startsWith("--test-")) { + continue; + } + sanitized.push(value); + } + return sanitized; +} + export function resolveQueueOwnerSpawnArgs(argv: readonly string[] = process.argv): string[] { const override = process.env.ACPX_QUEUE_OWNER_ARGS; if (override) { diff --git a/test/flows.test.ts b/test/flows.test.ts index f441a48..e27e797 100644 --- a/test/flows.test.ts +++ b/test/flows.test.ts @@ -192,6 +192,45 @@ test("FlowRunner executes native shell actions and parses structured output", as }); }); +test("FlowRunner rejects multiple outgoing edges from the same node", async () => { + await withTempHome(async () => { + const runner = new FlowRunner({ + resolveAgent: () => ({ + agentName: "unused", + agentCommand: "unused", + cwd: process.cwd(), + }), + permissionMode: "approve-all", + outputRoot: await fs.mkdtemp(path.join(os.tmpdir(), "acpx-flow-store-")), + }); + + const flow = defineFlow({ + name: "ambiguous-edges", + startAt: "start", + nodes: { + start: compute({ + run: () => ({ ok: true }), + }), + one: action({ + run: () => ({ branch: 1 }), + }), + two: action({ + run: () => ({ branch: 2 }), + }), + }, + edges: [ + { from: "start", to: "one" }, + { from: "start", to: "two" }, + ], + }); + + await assert.rejects( + async () => await runner.run(flow, {}), + /Flow node must not declare 
multiple outgoing edges: start/, + ); + }); +}); + test("FlowRunner persists active node state while a shell step is running", async () => { await withTempHome(async () => { const outputRoot = await fs.mkdtemp(path.join(os.tmpdir(), "acpx-flow-store-")); diff --git a/test/queue-owner-process.test.ts b/test/queue-owner-process.test.ts index 8a11ab2..e1d57d7 100644 --- a/test/queue-owner-process.test.ts +++ b/test/queue-owner-process.test.ts @@ -4,7 +4,10 @@ import { mkdtemp, rm, symlink, writeFile } from "node:fs/promises"; import os from "node:os"; import path from "node:path"; import { describe, it } from "node:test"; -import { resolveQueueOwnerSpawnArgs } from "../src/session-runtime/queue-owner-process.js"; +import { + resolveQueueOwnerSpawnArgs, + sanitizeQueueOwnerExecArgv, +} from "../src/session-runtime/queue-owner-process.js"; async function withTempDir(run: (dir: string) => Promise): Promise { const dir = await mkdtemp(path.join(os.tmpdir(), "acpx-queue-owner-path-")); @@ -54,3 +57,21 @@ describe("resolveQueueOwnerSpawnArgs", () => { }); }); }); + +describe("sanitizeQueueOwnerExecArgv", () => { + it("drops test runner coverage flags but keeps loader args", () => { + assert.deepEqual( + sanitizeQueueOwnerExecArgv([ + "--experimental-test-coverage", + "--test", + "--test-name-pattern", + "flow", + "--import", + "tsx", + "--loader", + "custom-loader", + ]), + ["--import", "tsx", "--loader", "custom-loader"], + ); + }); +}); From dd31087a9b82a8980a9ebbd3c578cd0cdbbc1bb9 Mon Sep 17 00:00:00 2001 From: Onur Solmaz <2453968+osolmaz@users.noreply.github.com> Date: Thu, 26 Mar 2026 10:10:07 +0100 Subject: [PATCH 14/22] test: cover flows and queue runtime --- src/cli.ts | 11 +- src/flows/cli.ts | 37 ++- src/session-runtime/queue-owner-process.ts | 11 + test/events.test.ts | 84 ++++- test/fixtures/flow-branch.flow.ts | 3 +- test/fixtures/flow-shell.flow.ts | 34 ++ test/fixtures/flow-wait.flow.ts | 27 ++ test/fixtures/flow-workdir.flow.ts | 3 +- 
test/flows-shell.test.ts | 68 ++++ test/flows-store.test.ts | 90 ++++++ test/flows.test.ts | 8 +- test/integration.test.ts | 97 ++++++ test/mcp-servers.test.ts | 169 ++++++++++ test/prompt-content.test.ts | 110 +++++++ test/queue-ipc-errors.test.ts | 54 ++++ test/queue-ipc-server.test.ts | 220 +++++++++++++ test/queue-lease-store.test.ts | 167 ++++++++++ test/queue-messages.test.ts | 356 +++++++++++++++++++++ test/queue-owner-process.test.ts | 22 ++ test/queue-paths.test.ts | 26 +- test/version.test.ts | 50 +++ 21 files changed, 1627 insertions(+), 20 deletions(-) create mode 100644 test/fixtures/flow-shell.flow.ts create mode 100644 test/fixtures/flow-wait.flow.ts create mode 100644 test/flows-shell.test.ts create mode 100644 test/flows-store.test.ts create mode 100644 test/mcp-servers.test.ts create mode 100644 test/queue-ipc-server.test.ts create mode 100644 test/queue-lease-store.test.ts diff --git a/src/cli.ts b/src/cli.ts index 57e26e3..a5ac717 100644 --- a/src/cli.ts +++ b/src/cli.ts @@ -3,16 +3,15 @@ import { realpathSync } from "node:fs"; import { fileURLToPath, pathToFileURL } from "node:url"; import { main } from "./cli-core.js"; -import { sanitizeQueueOwnerExecArgv } from "./session-runtime/queue-owner-process.js"; +import { buildQueueOwnerArgOverride } from "./session-runtime/queue-owner-process.js"; export { formatPromptSessionBannerLine } from "./cli-core.js"; export { parseAllowedTools, parseMaxTurns, parseTtlSeconds } from "./cli/flags.js"; -process.env.ACPX_QUEUE_OWNER_ARGS ??= JSON.stringify([ - ...sanitizeQueueOwnerExecArgv(), - fileURLToPath(import.meta.url), - "__queue-owner", -]); +const queueOwnerArgOverride = buildQueueOwnerArgOverride(fileURLToPath(import.meta.url)); +if (queueOwnerArgOverride) { + process.env.ACPX_QUEUE_OWNER_ARGS ??= queueOwnerArgOverride; +} function isCliEntrypoint(argv: string[]): boolean { const entry = argv[1]; diff --git a/src/flows/cli.ts b/src/flows/cli.ts index 6a8722e..b9df3cd 100644 --- a/src/flows/cli.ts +++ 
b/src/flows/cli.ts @@ -2,7 +2,6 @@ import fs from "node:fs/promises"; import path from "node:path"; import { pathToFileURL } from "node:url"; import { InvalidArgumentError, type Command } from "commander"; -import { tsImport } from "tsx/esm/api"; import { resolveAgentInvocation, resolveGlobalFlags, @@ -78,10 +77,9 @@ async function readFlowInput(flags: FlowRunFlags): Promise { } async function loadFlowModule(flowPath: string): Promise { - const module = (await tsImport(pathToFileURL(flowPath).href, import.meta.url)) as { - default?: unknown; - "module.exports"?: unknown; - }; + const flowUrl = pathToFileURL(flowPath).href; + const extension = path.extname(flowPath).toLowerCase(); + const module = await loadFlowRuntimeModule(flowUrl, extension); const candidate = findFlowDefinition(module); if (!candidate) { @@ -90,6 +88,35 @@ async function loadFlowModule(flowPath: string): Promise { return candidate; } +async function loadFlowRuntimeModule( + flowUrl: string, + extension: string, +): Promise<{ + default?: unknown; + "module.exports"?: unknown; +}> { + if (extension === ".ts" || extension === ".tsx" || extension === ".mts" || extension === ".cts") { + const { tsImport } = (await import("tsx/esm/api")) as { + tsImport: ( + specifier: string, + parentURL: string, + ) => Promise<{ + default?: unknown; + "module.exports"?: unknown; + }>; + }; + return (await tsImport(flowUrl, import.meta.url)) as { + default?: unknown; + "module.exports"?: unknown; + }; + } + + return (await import(flowUrl)) as { + default?: unknown; + "module.exports"?: unknown; + }; +} + function findFlowDefinition(module: { default?: unknown; "module.exports"?: unknown; diff --git a/src/session-runtime/queue-owner-process.ts b/src/session-runtime/queue-owner-process.ts index df97f94..6b3cde7 100644 --- a/src/session-runtime/queue-owner-process.ts +++ b/src/session-runtime/queue-owner-process.ts @@ -58,6 +58,17 @@ export function sanitizeQueueOwnerExecArgv( return sanitized; } +export function 
buildQueueOwnerArgOverride( + entryPath: string, + execArgv: readonly string[] = process.execArgv, +): string | null { + const sanitized = sanitizeQueueOwnerExecArgv(execArgv); + if (sanitized.length === 0) { + return null; + } + return JSON.stringify([...sanitized, entryPath, "__queue-owner"]); +} + export function resolveQueueOwnerSpawnArgs(argv: readonly string[] = process.argv): string[] { const override = process.env.ACPX_QUEUE_OWNER_ARGS; if (override) { diff --git a/test/events.test.ts b/test/events.test.ts index 7a31515..1a650d9 100644 --- a/test/events.test.ts +++ b/test/events.test.ts @@ -1,6 +1,12 @@ import assert from "node:assert/strict"; import test from "node:test"; -import { isAcpJsonRpcMessage, isSessionUpdateNotification } from "../src/acp-jsonrpc.js"; +import { + isAcpJsonRpcMessage, + isJsonRpcNotification, + isSessionUpdateNotification, + parseJsonRpcErrorMessage, + parsePromptStopReason, +} from "../src/acp-jsonrpc.js"; test("isAcpJsonRpcMessage accepts JSON-RPC request", () => { assert.equal( @@ -69,6 +75,42 @@ test("isAcpJsonRpcMessage rejects non-JSON-RPC payload", () => { ); }); +test("isAcpJsonRpcMessage rejects invalid request and response shapes", () => { + assert.equal( + isAcpJsonRpcMessage({ + jsonrpc: "2.0", + id: {}, + method: "session/prompt", + }), + false, + ); + + assert.equal( + isAcpJsonRpcMessage({ + jsonrpc: "2.0", + id: "req-1", + result: {}, + error: { + code: -32000, + message: "runtime error", + }, + }), + false, + ); + + assert.equal( + isAcpJsonRpcMessage({ + jsonrpc: "2.0", + id: "req-1", + error: { + code: "bad", + message: "runtime error", + }, + }), + false, + ); +}); + test("isAcpJsonRpcMessage accepts request/notification/response fixtures after roundtrip", () => { const fixtures: unknown[] = [ { @@ -150,3 +192,43 @@ test("isSessionUpdateNotification matches session/update notifications only", () false, ); }); + +test("notification and response helpers parse expected fields only", () => { + const notification = 
{ + jsonrpc: "2.0", + method: "session/update", + params: { + sessionId: "session-1", + }, + } as const; + assert.equal(isJsonRpcNotification(notification), true); + + const response = { + jsonrpc: "2.0", + id: "req-1", + result: { stopReason: "end_turn" }, + } as const; + assert.equal(parsePromptStopReason(response), "end_turn"); + assert.equal(parsePromptStopReason(notification as never), undefined); + + const errorResponse = { + jsonrpc: "2.0", + id: "req-2", + error: { + code: -32000, + message: "bad request", + }, + } as const; + assert.equal(parseJsonRpcErrorMessage(errorResponse), "bad request"); + assert.equal(parseJsonRpcErrorMessage(response as never), undefined); + assert.equal( + parseJsonRpcErrorMessage({ + jsonrpc: "2.0", + id: "req-3", + error: { + code: -32000, + }, + } as never), + undefined, + ); +}); diff --git a/test/fixtures/flow-branch.flow.ts b/test/fixtures/flow-branch.flow.ts index 978a36b..17c7055 100644 --- a/test/fixtures/flow-branch.flow.ts +++ b/test/fixtures/flow-branch.flow.ts @@ -1,4 +1,5 @@ -import { acp, compute, defineFlow, extractJsonObject } from "../../src/flows.js"; +import { extractJsonObject } from "../../src/flows/json.js"; +import { acp, compute, defineFlow } from "../../src/flows/runtime.js"; export default defineFlow({ name: "fixture-branch", diff --git a/test/fixtures/flow-shell.flow.ts b/test/fixtures/flow-shell.flow.ts new file mode 100644 index 0000000..fa648f4 --- /dev/null +++ b/test/fixtures/flow-shell.flow.ts @@ -0,0 +1,34 @@ +import { extractJsonObject } from "../../src/flows/json.js"; +import { action, defineFlow, shell } from "../../src/flows/runtime.js"; + +export default defineFlow({ + name: "fixture-shell", + startAt: "prepare", + nodes: { + prepare: action({ + run: ({ input }) => ({ + text: String((input as { text?: string }).text ?? 
"").toUpperCase(), + }), + }), + run_shell: shell({ + exec: ({ outputs }) => ({ + command: process.execPath, + args: [ + "-e", + "process.stdout.write(JSON.stringify({ value: process.env.FLOW_TEXT, cwd: process.cwd() }))", + ], + env: { + FLOW_TEXT: String((outputs.prepare as { text: string }).text), + }, + }), + parse: (result) => extractJsonObject(result.stdout), + }), + finalize: action({ + run: ({ outputs }) => outputs.run_shell, + }), + }, + edges: [ + { from: "prepare", to: "run_shell" }, + { from: "run_shell", to: "finalize" }, + ], +}); diff --git a/test/fixtures/flow-wait.flow.ts b/test/fixtures/flow-wait.flow.ts new file mode 100644 index 0000000..1613161 --- /dev/null +++ b/test/fixtures/flow-wait.flow.ts @@ -0,0 +1,27 @@ +import { action, checkpoint, defineFlow } from "../../src/flows/runtime.js"; + +export default defineFlow({ + name: "fixture-wait", + startAt: "prepare", + nodes: { + prepare: action({ + run: ({ input }) => ({ + ticket: String((input as { ticket?: string }).ticket ?? 
"review"), + }), + }), + wait_for_human: checkpoint({ + summary: "needs review", + run: ({ outputs }) => ({ + checkpoint: "wait_for_human", + summary: `review ${(outputs.prepare as { ticket: string }).ticket}`, + }), + }), + unreachable: action({ + run: () => ({ ok: false }), + }), + }, + edges: [ + { from: "prepare", to: "wait_for_human" }, + { from: "wait_for_human", to: "unreachable" }, + ], +}); diff --git a/test/fixtures/flow-workdir.flow.ts b/test/fixtures/flow-workdir.flow.ts index c87de73..067cb5a 100644 --- a/test/fixtures/flow-workdir.flow.ts +++ b/test/fixtures/flow-workdir.flow.ts @@ -1,4 +1,5 @@ -import { acp, compute, defineFlow, extractJsonObject, shell } from "../../src/flows.js"; +import { extractJsonObject } from "../../src/flows/json.js"; +import { acp, compute, defineFlow, shell } from "../../src/flows/runtime.js"; export default defineFlow({ name: "fixture-workdir", diff --git a/test/flows-shell.test.ts b/test/flows-shell.test.ts new file mode 100644 index 0000000..a0b395b --- /dev/null +++ b/test/flows-shell.test.ts @@ -0,0 +1,68 @@ +import assert from "node:assert/strict"; +import test from "node:test"; +import { + formatShellActionSummary, + renderShellCommand, + runShellAction, +} from "../src/flows/executors/shell.js"; +import { TimeoutError } from "../src/session-runtime-helpers.js"; + +test("renderShellCommand quotes arguments consistently", () => { + assert.equal(renderShellCommand("echo", ["hello", "two words"]), 'echo "hello" "two words"'); +}); + +test("formatShellActionSummary prefixes rendered commands", () => { + assert.equal( + formatShellActionSummary({ + command: "git", + args: ["status", "--short"], + }), + 'shell: git "status" "--short"', + ); +}); + +test("runShellAction captures stdout and stderr", async () => { + const result = await runShellAction({ + command: process.execPath, + args: ["-e", 'process.stdout.write("ok"); process.stderr.write("warn");'], + }); + + assert.equal(result.stdout, "ok"); + 
assert.equal(result.stderr, "warn"); + assert.equal(result.combinedOutput, "okwarn"); + assert.equal(result.exitCode, 0); + assert.equal(result.signal, null); +}); + +test("runShellAction allows non-zero exits when requested", async () => { + const result = await runShellAction({ + command: process.execPath, + args: ["-e", "process.exit(3)"], + allowNonZeroExit: true, + }); + + assert.equal(result.exitCode, 3); +}); + +test("runShellAction rejects non-zero exits by default", async () => { + await assert.rejects( + async () => + await runShellAction({ + command: process.execPath, + args: ["-e", 'process.stderr.write("boom"); process.exit(2)'], + }), + /Shell action failed/, + ); +}); + +test("runShellAction times out long-running commands", async () => { + await assert.rejects( + async () => + await runShellAction({ + command: process.execPath, + args: ["-e", "setTimeout(() => {}, 10_000)"], + timeoutMs: 50, + }), + (error: unknown) => error instanceof TimeoutError, + ); +}); diff --git a/test/flows-store.test.ts b/test/flows-store.test.ts new file mode 100644 index 0000000..e87c242 --- /dev/null +++ b/test/flows-store.test.ts @@ -0,0 +1,90 @@ +import assert from "node:assert/strict"; +import fs from "node:fs/promises"; +import os from "node:os"; +import path from "node:path"; +import test from "node:test"; +import type { FlowRunState } from "../src/flows/runtime.js"; +import { FlowRunStore, flowRunsBaseDir } from "../src/flows/store.js"; + +test("flowRunsBaseDir defaults under the acpx home directory", () => { + assert.equal(flowRunsBaseDir("/tmp/home"), path.join("/tmp/home", ".acpx", "flows", "runs")); +}); + +test("FlowRunStore writes snapshots, live state, and events", async () => { + const outputRoot = await fs.mkdtemp(path.join(os.tmpdir(), "acpx-flow-store-test-")); + + try { + const store = new FlowRunStore(outputRoot); + const runDir = await store.createRunDir("run-123"); + const state: FlowRunState = { + runId: "run-123", + flowName: "demo", + flowPath: 
"/tmp/demo.flow.ts", + startedAt: "2026-03-26T00:00:00.000Z", + updatedAt: "2026-03-26T00:00:00.000Z", + status: "running", + input: { ok: true }, + outputs: {}, + steps: [], + sessionBindings: {}, + currentNode: "prepare", + currentNodeKind: "action", + currentNodeStartedAt: "2026-03-26T00:00:01.000Z", + lastHeartbeatAt: "2026-03-26T00:00:01.000Z", + statusDetail: "Preparing", + }; + + await store.writeSnapshot(runDir, state, { + type: "run_started", + }); + + const snapshot = JSON.parse(await fs.readFile(path.join(runDir, "run.json"), "utf8")) as { + runId: string; + currentNode?: string; + statusDetail?: string; + }; + const live = JSON.parse(await fs.readFile(path.join(runDir, "live.json"), "utf8")) as { + runId: string; + currentNode?: string; + statusDetail?: string; + }; + const events = (await fs.readFile(path.join(runDir, "events.ndjson"), "utf8")) + .trim() + .split("\n") + .map((line) => JSON.parse(line) as { type?: string; at?: string }); + + assert.equal(snapshot.runId, "run-123"); + assert.equal(snapshot.currentNode, "prepare"); + assert.equal(live.runId, "run-123"); + assert.equal(live.statusDetail, "Preparing"); + assert.equal(events.length, 1); + assert.equal(events[0]?.type, "run_started"); + assert.equal(typeof events[0]?.at, "string"); + + state.lastHeartbeatAt = "2026-03-26T00:00:02.000Z"; + state.statusDetail = "Still preparing"; + await store.writeLive(runDir, state, { + type: "node_heartbeat", + nodeId: "prepare", + }); + + const liveAfterHeartbeat = JSON.parse( + await fs.readFile(path.join(runDir, "live.json"), "utf8"), + ) as { + lastHeartbeatAt?: string; + statusDetail?: string; + }; + const eventsAfterHeartbeat = (await fs.readFile(path.join(runDir, "events.ndjson"), "utf8")) + .trim() + .split("\n") + .map((line) => JSON.parse(line) as { type?: string; nodeId?: string }); + + assert.equal(liveAfterHeartbeat.lastHeartbeatAt, "2026-03-26T00:00:02.000Z"); + assert.equal(liveAfterHeartbeat.statusDetail, "Still preparing"); + 
assert.equal(eventsAfterHeartbeat.length, 2); + assert.equal(eventsAfterHeartbeat[1]?.type, "node_heartbeat"); + assert.equal(eventsAfterHeartbeat[1]?.nodeId, "prepare"); + } finally { + await fs.rm(outputRoot, { recursive: true, force: true }); + } +}); diff --git a/test/flows.test.ts b/test/flows.test.ts index e27e797..f1b03b5 100644 --- a/test/flows.test.ts +++ b/test/flows.test.ts @@ -4,6 +4,7 @@ import os from "node:os"; import path from "node:path"; import test from "node:test"; import { fileURLToPath } from "node:url"; +import { extractJsonObject, parseJsonObject, parseStrictJsonObject } from "../src/flows/json.js"; import { FlowRunner, acp, @@ -11,12 +12,9 @@ import { checkpoint, compute, defineFlow, - extractJsonObject, - flowRunsBaseDir, - parseJsonObject, - parseStrictJsonObject, shell, -} from "../src/flows.js"; +} from "../src/flows/runtime.js"; +import { flowRunsBaseDir } from "../src/flows/store.js"; import { TimeoutError } from "../src/session-runtime-helpers.js"; const MOCK_AGENT_PATH = fileURLToPath(new URL("./mock-agent.js", import.meta.url)); diff --git a/test/integration.test.ts b/test/integration.test.ts index 1fd2e2b..ce8c16c 100644 --- a/test/integration.test.ts +++ b/test/integration.test.ts @@ -16,6 +16,12 @@ import { queuePaths } from "./queue-test-helpers.js"; const CLI_PATH = fileURLToPath(new URL("../src/cli.js", import.meta.url)); const MOCK_AGENT_PATH = fileURLToPath(new URL("./mock-agent.js", import.meta.url)); const FLOW_FIXTURE_PATH = fileURLToPath(new URL("./fixtures/flow-branch.flow.js", import.meta.url)); +const FLOW_SHELL_FIXTURE_PATH = fileURLToPath( + new URL("./fixtures/flow-shell.flow.js", import.meta.url), +); +const FLOW_WAIT_FIXTURE_PATH = fileURLToPath( + new URL("./fixtures/flow-wait.flow.js", import.meta.url), +); const FLOW_WORKDIR_FIXTURE_PATH = fileURLToPath( new URL("./fixtures/flow-workdir.flow.js", import.meta.url), ); @@ -166,6 +172,97 @@ test("integration: flow run supports dynamic ACP working directories", 
async () }); }); +test("integration: flow run executes function and shell actions from --input-file", async () => { + await withTempHome(async (homeDir) => { + const cwd = await fs.mkdtemp(path.join(os.tmpdir(), "acpx-integration-cwd-")); + const inputPath = path.join(cwd, "input.json"); + + try { + await fs.writeFile(inputPath, JSON.stringify({ text: "smoke" }), "utf8"); + + const result = await runCli( + [ + "--approve-all", + "--cwd", + cwd, + "--format", + "json", + "flow", + "run", + FLOW_SHELL_FIXTURE_PATH, + "--input-file", + inputPath, + ], + homeDir, + ); + + assert.equal(result.code, 0, result.stderr); + const payload = JSON.parse(result.stdout.trim()) as { + action?: string; + status?: string; + outputs?: { + prepare?: { text: string }; + finalize?: { value: string; cwd: string }; + }; + }; + + assert.equal(payload.action, "flow_run_result"); + assert.equal(payload.status, "completed"); + assert.equal(payload.outputs?.prepare?.text, "SMOKE"); + assert.equal(payload.outputs?.finalize?.value, "SMOKE"); + assert.equal(typeof payload.outputs?.finalize?.cwd, "string"); + } finally { + await fs.rm(cwd, { recursive: true, force: true }); + } + }); +}); + +test("integration: flow run reports waiting checkpoints in json mode", async () => { + await withTempHome(async (homeDir) => { + const cwd = await fs.mkdtemp(path.join(os.tmpdir(), "acpx-integration-cwd-")); + + try { + const result = await runCli( + [ + "--approve-all", + "--cwd", + cwd, + "--format", + "json", + "flow", + "run", + FLOW_WAIT_FIXTURE_PATH, + "--input-json", + JSON.stringify({ ticket: "pr-174" }), + ], + homeDir, + ); + + assert.equal(result.code, 0, result.stderr); + const payload = JSON.parse(result.stdout.trim()) as { + action?: string; + status?: string; + waitingOn?: string; + outputs?: { + prepare?: { ticket: string }; + wait_for_human?: { checkpoint: string; summary: string }; + unreachable?: unknown; + }; + }; + + assert.equal(payload.action, "flow_run_result"); + 
assert.equal(payload.status, "waiting"); + assert.equal(payload.waitingOn, "wait_for_human"); + assert.equal(payload.outputs?.prepare?.ticket, "pr-174"); + assert.equal(payload.outputs?.wait_for_human?.checkpoint, "wait_for_human"); + assert.equal(payload.outputs?.wait_for_human?.summary, "review pr-174"); + assert.equal(payload.outputs?.unreachable, undefined); + } finally { + await fs.rm(cwd, { recursive: true, force: true }); + } + }); +}); + test("integration: built-in droid agent resolves to droid exec --output-format acp", async () => { await withTempHome(async (homeDir) => { const cwd = await fs.mkdtemp(path.join(os.tmpdir(), "acpx-integration-cwd-")); diff --git a/test/mcp-servers.test.ts b/test/mcp-servers.test.ts new file mode 100644 index 0000000..0a20515 --- /dev/null +++ b/test/mcp-servers.test.ts @@ -0,0 +1,169 @@ +import assert from "node:assert/strict"; +import test from "node:test"; +import { parseMcpServers, parseOptionalMcpServers } from "../src/mcp-servers.js"; + +test("parseOptionalMcpServers returns undefined for missing values", () => { + assert.equal(parseOptionalMcpServers(undefined, "config.json"), undefined); +}); + +test("parseMcpServers parses http, sse, and stdio servers", () => { + const servers = parseMcpServers( + [ + { + name: "http-server", + type: "http", + url: " https://example.com/mcp ", + headers: [{ name: "Authorization", value: "Bearer token" }], + _meta: { scope: "test" }, + }, + { + name: "sse-server", + type: "sse", + url: "https://example.com/sse", + }, + { + name: "stdio-server", + command: "node", + args: ["server.js"], + env: [{ name: "NODE_ENV", value: "test" }], + _meta: null, + }, + ], + "config.json", + ); + + assert.deepEqual(servers, [ + { + name: "http-server", + type: "http", + url: "https://example.com/mcp", + headers: [{ name: "Authorization", value: "Bearer token" }], + _meta: { scope: "test" }, + }, + { + name: "sse-server", + type: "sse", + url: "https://example.com/sse", + headers: [], + _meta: 
undefined, + }, + { + name: "stdio-server", + command: "node", + args: ["server.js"], + env: [{ name: "NODE_ENV", value: "test" }], + _meta: null, + }, + ]); +}); + +test("parseMcpServers rejects invalid top-level and entry fields", () => { + assert.throws(() => parseMcpServers({}, "config.json"), { + message: "Invalid mcpServers in config.json: expected array", + }); + + assert.throws( + () => + parseMcpServers( + [ + { + name: "broken", + type: "http", + url: "", + }, + ], + "config.json", + ), + { + message: "Invalid mcpServers[0] in config.json.url: expected non-empty string", + }, + ); + + assert.throws( + () => + parseMcpServers( + [ + { + name: "broken", + type: "udp", + url: "https://example.com", + }, + ], + "config.json", + ), + { + message: "Invalid mcpServers[0] in config.json.type: expected http, sse, or stdio", + }, + ); +}); + +test("parseMcpServers rejects invalid nested header, args, env, and meta values", () => { + assert.throws( + () => + parseMcpServers( + [ + { + name: "broken", + type: "http", + url: "https://example.com", + headers: [{ name: "X-Test", value: 123 }], + }, + ], + "config.json", + ), + { + message: "Invalid mcpServers[0] in config.json.headers[0].value: expected non-empty string", + }, + ); + + assert.throws( + () => + parseMcpServers( + [ + { + name: "broken", + command: "node", + args: ["ok", 123], + }, + ], + "config.json", + ), + { + message: "Invalid mcpServers[0] in config.json.args[1]: expected string", + }, + ); + + assert.throws( + () => + parseMcpServers( + [ + { + name: "broken", + command: "node", + env: [{ name: "X", value: "" }], + }, + ], + "config.json", + ), + { + message: "Invalid mcpServers[0] in config.json.env[0].value: expected non-empty string", + }, + ); + + assert.throws( + () => + parseMcpServers( + [ + { + name: "broken", + command: "node", + _meta: "bad", + }, + ], + "config.json", + ), + { + message: "Invalid mcpServers[0] in config.json._meta: expected object or null", + }, + ); +}); diff --git 
a/test/prompt-content.test.ts b/test/prompt-content.test.ts
index 2969d40..c98c27f 100644
--- a/test/prompt-content.test.ts
+++ b/test/prompt-content.test.ts
@@ -1,7 +1,10 @@
 import assert from "node:assert/strict";
 import test from "node:test";
 import {
+  isPromptInput,
+  mergePromptSourceWithText,
   parsePromptSource,
+  promptToDisplayText,
   PromptInputValidationError,
   textPrompt,
 } from "../src/prompt-content.js";
@@ -42,3 +45,110 @@ test("parsePromptSource keeps non-JSON bracket text as plain text", () => {
     textPrompt("[todo] validate image input"),
   );
 });
+
+test("parsePromptSource accepts resource and resource_link blocks", () => {
+  const prompt = parsePromptSource(
+    JSON.stringify([
+      {
+        type: "resource_link",
+        uri: "file:///tmp/spec.md",
+        name: "spec",
+        title: "Spec",
+      },
+      {
+        type: "resource",
+        resource: {
+          uri: "file:///tmp/context.txt",
+          text: "Context",
+        },
+      },
+    ]),
+  );
+
+  assert.deepEqual(prompt, [
+    {
+      type: "resource_link",
+      uri: "file:///tmp/spec.md",
+      name: "spec",
+      title: "Spec",
+    },
+    {
+      type: "resource",
+      resource: {
+        uri: "file:///tmp/context.txt",
+        text: "Context",
+      },
+    },
+  ]);
+  assert.equal(isPromptInput(prompt), true);
+});
+
+test("parsePromptSource rejects invalid text and resource block shapes", () => {
+  assert.throws(
+    () => parsePromptSource(JSON.stringify([{ type: "text", text: 123 }])),
+    (error: unknown) =>
+      error instanceof PromptInputValidationError &&
+      /text block must include a string text field/.test(error.message),
+  );
+
+  assert.throws(
+    () =>
+      parsePromptSource(
+        JSON.stringify([
+          {
+            type: "resource_link",
+            uri: "",
+          },
+        ]),
+      ),
+    (error: unknown) =>
+      error instanceof PromptInputValidationError &&
+      /resource_link block must include a non-empty uri/.test(error.message),
+  );
+
+  assert.throws(
+    () =>
+      parsePromptSource(
+        JSON.stringify([
+          {
+            type: "resource",
+            resource: {
+              uri: "file:///tmp/context.txt",
+              text: 123,
+            },
+          },
+        ]),
+      ),
+    (error: unknown) =>
+      error instanceof PromptInputValidationError &&
+      /resource block resource must include a non-empty uri and optional text/.test(error.message),
+  );
+});
+
+test("parsePromptSource returns an empty prompt for blank input", () => {
+  assert.deepEqual(parsePromptSource(" "), []);
+});
+
+test("mergePromptSourceWithText appends or creates prompt text", () => {
+  assert.deepEqual(
+    mergePromptSourceWithText(JSON.stringify([{ type: "text", text: "hello" }]), "world"),
+    [
+      { type: "text", text: "hello" },
+      { type: "text", text: "world" },
+    ],
+  );
+
+  assert.deepEqual(mergePromptSourceWithText(" ", "world"), [{ type: "text", text: "world" }]);
+  assert.deepEqual(mergePromptSourceWithText("hello", " "), [{ type: "text", text: "hello" }]);
+});
+
+test("promptToDisplayText renders text, resources, and images", () => {
+  const display = promptToDisplayText([
+    { type: "text", text: "hello" },
+    { type: "resource_link", uri: "file:///tmp/spec.md", name: "spec", title: "Spec" },
+    { type: "resource", resource: { uri: "file:///tmp/context.txt", text: "Context" } },
+    { type: "image", mimeType: "image/png", data: "aW1hZ2U=" },
+  ]);
+
+  assert.equal(display, "hello\n\nSpec\n\nContext\n\n[image] image/png");
+});
diff --git a/test/queue-ipc-errors.test.ts b/test/queue-ipc-errors.test.ts
index 5457ae2..e5799f9 100644
--- a/test/queue-ipc-errors.test.ts
+++ b/test/queue-ipc-errors.test.ts
@@ -8,6 +8,7 @@ import {
   SessionQueueOwner,
   releaseQueueOwnerLease,
   tryAcquireQueueOwnerLease,
+  trySetConfigOptionOnRunningOwner,
   trySetModeOnRunningOwner,
   trySubmitToRunningOwner,
 } from "../src/queue-ipc.js";
@@ -170,6 +171,59 @@ test("trySetModeOnRunningOwner propagates typed queue control errors", async ()
   });
 });
 
+test("trySetConfigOptionOnRunningOwner returns the queue owner response", async () => {
+  await withTempHome(async (homeDir) => {
+    const sessionId = "control-config-success-session";
+    const keeper = await startKeeperProcess();
+    const { lockPath, socketPath } = queuePaths(homeDir, sessionId);
+    await writeQueueOwnerLock({
+      lockPath,
+      pid: keeper.pid,
+      sessionId,
+      socketPath,
+    });
+
+    const server = createSingleRequestServer((socket, request) => {
+      assert.equal(request.type, "set_config_option");
+      socket.write(
+        `${JSON.stringify({
+          type: "accepted",
+          requestId: request.requestId,
+        })}\n`,
+      );
+      socket.write(
+        `${JSON.stringify({
+          type: "set_config_option_result",
+          requestId: request.requestId,
+          response: {
+            configOptions: [],
+          },
+        })}\n`,
+      );
+      socket.end();
+    });
+
+    await listenServer(server, socketPath);
+
+    try {
+      const response = await trySetConfigOptionOnRunningOwner(
+        sessionId,
+        "thinking_level",
+        "high",
+        1_000,
+        true,
+      );
+      assert.deepEqual(response, {
+        configOptions: [],
+      });
+    } finally {
+      await closeServer(server);
+      await cleanupOwnerArtifacts({ socketPath, lockPath });
+      stopProcess(keeper);
+    }
+  });
+});
+
 test("trySubmitToRunningOwner surfaces protocol invalid JSON detail code", async () => {
   await withTempHome(async (homeDir) => {
     const sessionId = "submit-invalid-json-session";
diff --git a/test/queue-ipc-server.test.ts b/test/queue-ipc-server.test.ts
new file mode 100644
index 0000000..72d602c
--- /dev/null
+++ b/test/queue-ipc-server.test.ts
@@ -0,0 +1,220 @@
+import assert from "node:assert/strict";
+import readline from "node:readline";
+import test from "node:test";
+import type { SetSessionConfigOptionResponse } from "@agentclientprotocol/sdk";
+import {
+  SessionQueueOwner,
+  releaseQueueOwnerLease,
+  tryAcquireQueueOwnerLease,
+} from "../src/queue-ipc.js";
+import { connectSocket, nextJsonLine, withTempHome } from "./queue-test-helpers.js";
+
+test("SessionQueueOwner handles control requests and nextTask timeouts", async () => {
+  await withTempHome(async () => {
+    const lease = await tryAcquireQueueOwnerLease("owner-control-success");
+    assert(lease);
+
+    let cancelled = 0;
+    const modes: string[] = [];
+    const configRequests: Array<{ id: string; value: string }> = [];
+
+    const owner = await SessionQueueOwner.start(lease, {
+      cancelPrompt: async () => {
+        cancelled += 1;
+        return true;
+      },
+      setSessionMode: async (modeId) => {
+        modes.push(modeId);
+      },
+      setSessionConfigOption: async (configId, value) => {
+        configRequests.push({ id: configId, value });
+        return {
+          configOptions: [],
+        } as SetSessionConfigOptionResponse;
+      },
+    });
+
+    try {
+      assert.equal(await owner.nextTask(10), undefined);
+
+      const cancelSocket = await connectSocket(lease.socketPath);
+      const cancelLines = readline.createInterface({ input: cancelSocket });
+      const cancelIterator = cancelLines[Symbol.asyncIterator]();
+      cancelSocket.write(
+        `${JSON.stringify({
+          type: "cancel_prompt",
+          requestId: "req-cancel",
+        })}\n`,
+      );
+
+      const cancelAccepted = (await nextJsonLine(cancelIterator)) as { type: string };
+      const cancelResult = (await nextJsonLine(cancelIterator)) as {
+        type: string;
+        cancelled: boolean;
+      };
+      assert.equal(cancelAccepted.type, "accepted");
+      assert.equal(cancelResult.type, "cancel_result");
+      assert.equal(cancelResult.cancelled, true);
+      cancelLines.close();
+      cancelSocket.destroy();
+
+      const modeSocket = await connectSocket(lease.socketPath);
+      const modeLines = readline.createInterface({ input: modeSocket });
+      const modeIterator = modeLines[Symbol.asyncIterator]();
+      modeSocket.write(
+        `${JSON.stringify({
+          type: "set_mode",
+          requestId: "req-mode",
+          modeId: "plan",
+          timeoutMs: 250,
+        })}\n`,
+      );
+
+      const modeAccepted = (await nextJsonLine(modeIterator)) as { type: string };
+      const modeResult = (await nextJsonLine(modeIterator)) as { type: string; modeId: string };
+      assert.equal(modeAccepted.type, "accepted");
+      assert.equal(modeResult.type, "set_mode_result");
+      assert.equal(modeResult.modeId, "plan");
+      modeLines.close();
+      modeSocket.destroy();
+
+      const configSocket = await connectSocket(lease.socketPath);
+      const configLines = readline.createInterface({ input: configSocket });
+      const configIterator = configLines[Symbol.asyncIterator]();
+      configSocket.write(
+        `${JSON.stringify({
+          type: "set_config_option",
+          requestId: "req-config",
+          configId: "thinking_level",
+          value: "high",
+          timeoutMs: 250,
+        })}\n`,
+      );
+
+      const configAccepted = (await nextJsonLine(configIterator)) as { type: string };
+      const configResult = (await nextJsonLine(configIterator)) as {
+        type: string;
+        response: { configOptions: unknown[] };
+      };
+      assert.equal(configAccepted.type, "accepted");
+      assert.equal(configResult.type, "set_config_option_result");
+      assert.deepEqual(configResult.response.configOptions, []);
+      configLines.close();
+      configSocket.destroy();
+
+      assert.equal(cancelled, 1);
+      assert.deepEqual(modes, ["plan"]);
+      assert.deepEqual(configRequests, [{ id: "thinking_level", value: "high" }]);
+    } finally {
+      await owner.close();
+      await releaseQueueOwnerLease(lease);
+    }
+  });
+});
+
+test("SessionQueueOwner enqueues fire-and-forget prompts and rejects invalid owner generations", async () => {
+  await withTempHome(async () => {
+    const lease = await tryAcquireQueueOwnerLease("owner-prompt-success");
+    assert(lease);
+
+    const queueDepths: number[] = [];
+    const owner = await SessionQueueOwner.start(
+      lease,
+      {
+        cancelPrompt: async () => false,
+        setSessionMode: async () => {
+          // no-op
+        },
+        setSessionConfigOption: async () =>
+          ({
+            configOptions: [],
+          }) as SetSessionConfigOptionResponse,
+      },
+      {
+        maxQueueDepth: 4,
+        onQueueDepthChanged: (depth) => {
+          queueDepths.push(depth);
+        },
+      },
+    );
+
+    try {
+      const promptSocket = await connectSocket(lease.socketPath);
+      const promptLines = readline.createInterface({ input: promptSocket });
+      const promptIterator = promptLines[Symbol.asyncIterator]();
+      promptSocket.write(
+        `${JSON.stringify({
+          type: "submit_prompt",
+          requestId: "req-submit",
+          ownerGeneration: lease.ownerGeneration,
+          message: "hello from queue",
+          permissionMode: "approve-reads",
+          waitForCompletion: false,
+        })}\n`,
+      );
+
+      const accepted = (await nextJsonLine(promptIterator)) as {
+        type: string;
+        ownerGeneration?: number;
+      };
+      assert.equal(accepted.type, "accepted");
+      assert.equal(accepted.ownerGeneration, lease.ownerGeneration);
+
+      const task = await owner.nextTask();
+      assert(task);
+      assert.equal(task.requestId, "req-submit");
+      assert.equal(task.message, "hello from queue");
+      assert.deepEqual(task.prompt, [{ type: "text", text: "hello from queue" }]);
+      assert.equal(owner.queueDepth(), 0);
+      assert.deepEqual(queueDepths, [1, 0]);
+      promptLines.close();
+      promptSocket.destroy();
+
+      const badSocket = await connectSocket(lease.socketPath);
+      const badLines = readline.createInterface({ input: badSocket });
+      const badIterator = badLines[Symbol.asyncIterator]();
+      badSocket.write(
+        `${JSON.stringify({
+          type: "submit_prompt",
+          requestId: "req-bad-generation",
+          ownerGeneration: lease.ownerGeneration + 1,
+          message: "stale",
+          permissionMode: "approve-reads",
+          waitForCompletion: true,
+        })}\n`,
+      );
+
+      const mismatch = (await nextJsonLine(badIterator)) as {
+        type: string;
+        detailCode?: string;
+      };
+      assert.equal(mismatch.type, "error");
+      assert.equal(mismatch.detailCode, "QUEUE_OWNER_GENERATION_MISMATCH");
+      badLines.close();
+      badSocket.destroy();
+
+      const invalidSocket = await connectSocket(lease.socketPath);
+      const invalidLines = readline.createInterface({ input: invalidSocket });
+      const invalidIterator = invalidLines[Symbol.asyncIterator]();
+      invalidSocket.write(
+        `${JSON.stringify({
+          type: "set_mode",
+          requestId: "req-invalid",
+          modeId: "",
+        })}\n`,
+      );
+
+      const invalid = (await nextJsonLine(invalidIterator)) as {
+        type: string;
+        detailCode?: string;
+      };
+      assert.equal(invalid.type, "error");
+      assert.equal(invalid.detailCode, "QUEUE_REQUEST_INVALID");
+      invalidLines.close();
+      invalidSocket.destroy();
+    } finally {
+      await owner.close();
+      await releaseQueueOwnerLease(lease);
+    }
+  });
+});
diff --git a/test/queue-lease-store.test.ts b/test/queue-lease-store.test.ts
new file mode 100644
index 0000000..65f5926
--- /dev/null
+++ b/test/queue-lease-store.test.ts
@@ -0,0 +1,167 @@
+import assert from "node:assert/strict";
+import fs from "node:fs/promises";
+import path from "node:path";
+import test from "node:test";
+import {
+  ensureOwnerIsUsable,
+  isProcessAlive,
+  readQueueOwnerRecord,
+  readQueueOwnerStatus,
+  refreshQueueOwnerLease,
+  releaseQueueOwnerLease,
+  terminateProcess,
+  terminateQueueOwnerForSession,
+  tryAcquireQueueOwnerLease,
+} from "../src/queue-lease-store.js";
+import { queueLockFilePath } from "../src/queue-paths.js";
+import {
+  queuePaths,
+  startKeeperProcess,
+  stopProcess,
+  withTempHome,
+  writeQueueOwnerLock,
+} from "./queue-test-helpers.js";
+
+test("readQueueOwnerRecord returns undefined for missing and malformed lock files", async () => {
+  await withTempHome(async (homeDir) => {
+    const sessionId = "missing-record";
+    assert.equal(await readQueueOwnerRecord(sessionId), undefined);
+
+    const lockPath = queueLockFilePath(sessionId, homeDir);
+    await fs.mkdir(path.dirname(lockPath), { recursive: true });
+    await fs.writeFile(lockPath, "{not-json\n", "utf8");
+    assert.equal(await readQueueOwnerRecord(sessionId), undefined);
+
+    await fs.writeFile(lockPath, `${JSON.stringify({ pid: "bad" })}\n`, "utf8");
+    assert.equal(await readQueueOwnerRecord(sessionId), undefined);
+  });
+});
+
+test("tryAcquireQueueOwnerLease creates a lease that can be refreshed and released", async () => {
+  await withTempHome(async () => {
+    const lease = await tryAcquireQueueOwnerLease("lease-create");
+    assert(lease);
+    assert.equal(lease.sessionId, "lease-create");
+
+    await refreshQueueOwnerLease(
+      lease,
+      {
+        queueDepth: 1.7,
+      },
+      () => "2026-03-26T00:00:00.000Z",
+    );
+
+    const record = await readQueueOwnerRecord("lease-create");
+    assert(record);
+    assert.equal(record.queueDepth, 2);
+    assert.equal(record.heartbeatAt, "2026-03-26T00:00:00.000Z");
+
+    await releaseQueueOwnerLease(lease);
+    assert.equal(await readQueueOwnerRecord("lease-create"), undefined);
+  });
+});
+
+test("tryAcquireQueueOwnerLease clears stale dead owners and can acquire on retry", async () => {
+  await withTempHome(async (homeDir) => {
+    const sessionId = "stale-dead-owner";
+    const { lockPath, socketPath } = queuePaths(homeDir, sessionId);
+
+    await writeQueueOwnerLock({
+      lockPath,
+      pid: 999_999,
+      sessionId,
+      socketPath,
+      heartbeatAt: "2000-01-01T00:00:00.000Z",
+    });
+
+    assert.equal(await tryAcquireQueueOwnerLease(sessionId), undefined);
+    assert.equal(await readQueueOwnerRecord(sessionId), undefined);
+
+    const lease = await tryAcquireQueueOwnerLease(sessionId);
+    assert(lease);
+    await releaseQueueOwnerLease(lease);
+  });
+});
+
+test("readQueueOwnerStatus returns live owner details for a healthy owner", async () => {
+  await withTempHome(async (homeDir) => {
+    const sessionId = "healthy-owner";
+    const keeper = await startKeeperProcess();
+    const { lockPath, socketPath } = queuePaths(homeDir, sessionId);
+
+    try {
+      await writeQueueOwnerLock({
+        lockPath,
+        pid: keeper.pid,
+        sessionId,
+        socketPath,
+        queueDepth: 3,
+      });
+
+      const status = await readQueueOwnerStatus(sessionId);
+      assert(status);
+      assert.equal(status.pid, keeper.pid);
+      assert.equal(status.alive, true);
+      assert.equal(status.stale, false);
+      assert.equal(status.queueDepth, 3);
+    } finally {
+      stopProcess(keeper);
+      await fs.rm(lockPath, { force: true });
+      if (process.platform !== "win32") {
+        await fs.rm(socketPath, { force: true });
+      }
+    }
+  });
+});
+
+test("ensureOwnerIsUsable cleans up stale live owners", async () => {
+  await withTempHome(async (homeDir) => {
+    const sessionId = "stale-live-owner";
+    const keeper = await startKeeperProcess();
+    const { lockPath, socketPath } = queuePaths(homeDir, sessionId);
+
+    try {
+      await writeQueueOwnerLock({
+        lockPath,
+        pid: keeper.pid,
+        sessionId,
+        socketPath,
+        heartbeatAt: "2000-01-01T00:00:00.000Z",
+      });
+
+      const owner = await readQueueOwnerRecord(sessionId);
+      assert(owner);
+      assert.equal(await ensureOwnerIsUsable(sessionId, owner), false);
+      assert.equal(await readQueueOwnerRecord(sessionId), undefined);
+    } finally {
+      stopProcess(keeper);
+    }
+  });
+});
+
+test("terminateProcess and terminateQueueOwnerForSession handle live and missing owners", async () => {
+  await withTempHome(async (homeDir) => {
+    assert.equal(isProcessAlive(undefined), false);
+    assert.equal(isProcessAlive(process.pid), false);
+    assert.equal(await terminateProcess(999_999), false);
+
+    const sessionId = "terminate-owner";
+    const keeper = await startKeeperProcess();
+    const { lockPath, socketPath } = queuePaths(homeDir, sessionId);
+
+    try {
+      assert.equal(isProcessAlive(keeper.pid), true);
+      await writeQueueOwnerLock({
+        lockPath,
+        pid: keeper.pid,
+        sessionId,
+        socketPath,
+      });
+
+      await terminateQueueOwnerForSession(sessionId);
+      assert.equal(await readQueueOwnerRecord(sessionId), undefined);
+    } finally {
+      stopProcess(keeper);
+    }
+  });
+});
diff --git a/test/queue-messages.test.ts b/test/queue-messages.test.ts
index 467615d..0c15cc8 100644
--- a/test/queue-messages.test.ts
+++ b/test/queue-messages.test.ts
@@ -97,3 +97,359 @@ test("parseQueueRequest rejects invalid owner generation", () => {
   assert.equal(parsed, null);
 });
+
+test("parseQueueRequest accepts control requests and explicit prompt blocks", () => {
+  assert.deepEqual(
+    parseQueueRequest({
+      type: "cancel_prompt",
+      requestId: "req-cancel",
+      ownerGeneration: 5,
+    }),
+    {
+      type: "cancel_prompt",
+      requestId: "req-cancel",
+      ownerGeneration: 5,
+    },
+  );
+
+  assert.deepEqual(
+    parseQueueRequest({
+      type: "set_mode",
+      requestId: "req-mode",
+      modeId: "plan",
+      timeoutMs: 2_000,
+    }),
+    {
+      type: "set_mode",
+      requestId: "req-mode",
+      ownerGeneration: undefined,
+      modeId: "plan",
+      timeoutMs: 2_000,
+    },
+  );
+
+  assert.deepEqual(
+    parseQueueRequest({
+      type: "set_config_option",
+      requestId: "req-config",
+      configId: "thinking_level",
+      value: "high",
+    }),
+    {
+      type: "set_config_option",
+      requestId: "req-config",
+      ownerGeneration: undefined,
+      configId: "thinking_level",
+      value: "high",
+      timeoutMs: undefined,
+    },
+  );
+
+  assert.deepEqual(
+    parseQueueRequest({
+      type: "submit_prompt",
+      requestId: "req-prompt",
+      message: "ignored text fallback",
+      prompt: [{ type: "text", text: "structured" }],
+      permissionMode: "approve-all",
+      suppressSdkConsoleErrors: false,
+      waitForCompletion: false,
+    }),
+    {
+      type: "submit_prompt",
+      requestId: "req-prompt",
+      ownerGeneration: undefined,
+      message: "ignored text fallback",
+      prompt: [{ type: "text", text: "structured" }],
+      permissionMode: "approve-all",
+      nonInteractivePermissions: undefined,
+      timeoutMs: undefined,
+      suppressSdkConsoleErrors: false,
+      waitForCompletion: false,
+    },
+  );
+});
+
+test("parseQueueRequest rejects invalid control and prompt payload shapes", () => {
+  assert.equal(parseQueueRequest(null), null);
+  assert.equal(
+    parseQueueRequest({
+      type: "set_mode",
+      requestId: "req-mode",
+      modeId: " ",
+    }),
+    null,
+  );
+  assert.equal(
+    parseQueueRequest({
+      type: "set_config_option",
+      requestId: "req-config",
+      configId: "thinking_level",
+      value: " ",
+    }),
+    null,
+  );
+  assert.equal(
+    parseQueueRequest({
+      type: "submit_prompt",
+      requestId: "req-prompt",
+      message: "hello",
+      permissionMode: "approve-reads",
+      prompt: [{ type: "image", mimeType: "text/plain", data: "bad" }],
+      waitForCompletion: true,
+    }),
+    null,
+  );
+  assert.equal(
+    parseQueueRequest({
+      type: "submit_prompt",
+      requestId: "req-prompt",
+      message: "hello",
+      permissionMode: "approve-reads",
+      suppressSdkConsoleErrors: "nope",
+      waitForCompletion: true,
+    }),
+    null,
+  );
+});
+
+test("parseQueueOwnerMessage accepts structured non-error owner messages", () => {
+  assert.deepEqual(
+    parseQueueOwnerMessage({
+      type: "accepted",
+      requestId: "req-accepted",
+      ownerGeneration: 9,
+    }),
+    {
+      type: "accepted",
+      requestId: "req-accepted",
+      ownerGeneration: 9,
+    },
+  );
+
+  assert.deepEqual(
+    parseQueueOwnerMessage({
+      type: "event",
+      requestId: "req-event",
+      message: {
+        jsonrpc: "2.0",
+        method: "session/update",
+        params: {
+          sessionId: "session-1",
+        },
+      },
+    }),
+    {
+      type: "event",
+      requestId: "req-event",
+      ownerGeneration: undefined,
+      message: {
+        jsonrpc: "2.0",
+        method: "session/update",
+        params: {
+          sessionId: "session-1",
+        },
+      },
+    },
+  );
+
+  assert.deepEqual(
+    parseQueueOwnerMessage({
+      type: "cancel_result",
+      requestId: "req-cancel",
+      cancelled: true,
+    }),
+    {
+      type: "cancel_result",
+      requestId: "req-cancel",
+      ownerGeneration: undefined,
+      cancelled: true,
+    },
+  );
+
+  assert.deepEqual(
+    parseQueueOwnerMessage({
+      type: "set_mode_result",
+      requestId: "req-mode",
+      modeId: "plan",
+    }),
+    {
+      type: "set_mode_result",
+      requestId: "req-mode",
+      ownerGeneration: undefined,
+      modeId: "plan",
+    },
+  );
+
+  assert.deepEqual(
+    parseQueueOwnerMessage({
+      type: "set_config_option_result",
+      requestId: "req-config",
+      response: {
+        configOptions: [
+          {
+            id: "thinking_level",
+            value: "high",
+          },
+        ],
+      },
+    }),
+    {
+      type: "set_config_option_result",
+      requestId: "req-config",
+      ownerGeneration: undefined,
+      response: {
+        configOptions: [
+          {
+            id: "thinking_level",
+            value: "high",
+          },
+        ],
+      },
+    },
+  );
+});
+
+test("parseQueueOwnerMessage accepts result payloads and optional emitted-error flag", () => {
+  assert.deepEqual(
+    parseQueueOwnerMessage({
+      type: "result",
+      requestId: "req-result",
+      result: {
+        stopReason: "end_turn",
+        sessionId: "session-1",
+        resumed: true,
+        permissionStats: {
+          requested: 1,
+          approved: 1,
+          denied: 0,
+          cancelled: 0,
+        },
+        record: {
+          acpxRecordId: "record-1",
+          acpSessionId: "session-1",
+          agentCommand: "codex",
+          cwd: "/tmp/work",
+          createdAt: "2026-03-26T00:00:00.000Z",
+          lastUsedAt: "2026-03-26T00:00:00.000Z",
+          messages: [],
+          updated_at: "2026-03-26T00:00:00.000Z",
+          lastSeq: 0,
+          eventLog: {
+            stream_count: 0,
+            segment_count: 0,
+          },
+        },
+      },
+    }),
+    {
+      type: "result",
+      requestId: "req-result",
+      ownerGeneration: undefined,
+      result: {
+        stopReason: "end_turn",
+        sessionId: "session-1",
+        resumed: true,
+        permissionStats: {
+          requested: 1,
+          approved: 1,
+          denied: 0,
+          cancelled: 0,
+        },
+        record: {
+          acpxRecordId: "record-1",
+          acpSessionId: "session-1",
+          agentCommand: "codex",
+          cwd: "/tmp/work",
+          createdAt: "2026-03-26T00:00:00.000Z",
+          lastUsedAt: "2026-03-26T00:00:00.000Z",
+          messages: [],
+          updated_at: "2026-03-26T00:00:00.000Z",
+          lastSeq: 0,
+          eventLog: {
+            stream_count: 0,
+            segment_count: 0,
+          },
+        },
+      },
+    },
+  );
+
+  assert.deepEqual(
+    parseQueueOwnerMessage({
+      type: "error",
+      requestId: "req-err-emitted",
+      code: "RUNTIME",
+      origin: "queue",
+      message: "already emitted",
+      outputAlreadyEmitted: true,
+    }),
+    {
+      type: "error",
+      requestId: "req-err-emitted",
+      ownerGeneration: undefined,
+      code: "RUNTIME",
+      detailCode: undefined,
+      origin: "queue",
+      message: "already emitted",
+      retryable: undefined,
+      acp: undefined,
+      outputAlreadyEmitted: true,
+    },
+  );
+});
+
+test("parseQueueOwnerMessage rejects invalid structured owner message payloads", () => {
+  assert.equal(
+    parseQueueOwnerMessage({
+      type: "accepted",
+      requestId: "req-bad-owner-generation",
+      ownerGeneration: 0,
+    }),
+    null,
+  );
+  assert.equal(
+    parseQueueOwnerMessage({
+      type: "event",
+      requestId: "req-event",
+      message: {
+        method: "session/update",
+      },
+    }),
+    null,
+  );
+  assert.equal(
+    parseQueueOwnerMessage({
+      type: "result",
+      requestId: "req-result",
+      result: {
+        stopReason: "end_turn",
+      },
+    }),
+    null,
+  );
+  assert.equal(
+    parseQueueOwnerMessage({
+      type: "cancel_result",
+      requestId: "req-cancel",
+      cancelled: "yes",
+    }),
+    null,
+  );
+  assert.equal(
+    parseQueueOwnerMessage({
+      type: "set_mode_result",
+      requestId: "req-mode",
+      modeId: 123,
+    }),
+    null,
+  );
+  assert.equal(
+    parseQueueOwnerMessage({
+      type: "set_config_option_result",
+      requestId: "req-config",
+      response: {},
+    }),
+    null,
+  );
+});
diff --git a/test/queue-owner-process.test.ts b/test/queue-owner-process.test.ts
index e1d57d7..8c839a9 100644
--- a/test/queue-owner-process.test.ts
+++ b/test/queue-owner-process.test.ts
@@ -5,6 +5,7 @@ import os from "node:os";
 import path from "node:path";
 import { describe, it } from "node:test";
 import {
+  buildQueueOwnerArgOverride,
   resolveQueueOwnerSpawnArgs,
   sanitizeQueueOwnerExecArgv,
 } from "../src/session-runtime/queue-owner-process.js";
@@ -75,3 +76,24 @@ describe("sanitizeQueueOwnerExecArgv", () => {
     );
   });
 });
+
+describe("buildQueueOwnerArgOverride", () => {
+  it("returns null when no loader args remain after sanitization", () => {
+    assert.equal(
+      buildQueueOwnerArgOverride("/tmp/cli.js", [
+        "--experimental-test-coverage",
+        "--test",
+        "--test-name-pattern",
+        "flow",
+      ]),
+      null,
+    );
+  });
+
+  it("returns a serialized override when loader args are required", () => {
+    assert.equal(
+      buildQueueOwnerArgOverride("/tmp/cli.js", ["--import", "tsx"]),
+      JSON.stringify(["--import", "tsx", "/tmp/cli.js", "__queue-owner"]),
+    );
+  });
+});
diff --git a/test/queue-paths.test.ts b/test/queue-paths.test.ts
index f3d6716..d8f6372 100644
--- a/test/queue-paths.test.ts
+++ b/test/queue-paths.test.ts
@@ -1,6 +1,30 @@
 import assert from "node:assert/strict";
 import test from "node:test";
-import { queueSocketPath } from "../src/queue-paths.js";
+import {
+  queueBaseDir,
+  queueKeyForSession,
+  queueLockFilePath,
+  queueSocketBaseDir,
+  queueSocketPath,
+} from "../src/queue-paths.js";
+
+test("queue path helpers derive stable lock and socket paths", () => {
+  const homeDir = "/tmp/example-home";
+  const key = queueKeyForSession("session-id");
+
+  assert.equal(key.length, 24);
+  assert.equal(queueBaseDir(homeDir), "/tmp/example-home/.acpx/queues");
+  assert.equal(queueLockFilePath("session-id", homeDir), `${queueBaseDir(homeDir)}/${key}.lock`);
+
+  if (process.platform === "win32") {
+    assert.equal(queueSocketBaseDir(homeDir), undefined);
+    assert.equal(queueSocketPath("session-id", homeDir), `\\\\.\\pipe\\acpx-${key}`);
+    return;
+  }
+
+  assert.equal(queueSocketBaseDir(homeDir)?.startsWith("/tmp/acpx-"), true);
+  assert.equal(queueSocketPath("session-id", homeDir).endsWith(`${key}.sock`), true);
+});
 
 test("queueSocketPath stays short on unix even for long home paths", () => {
   if (process.platform === "win32") {
diff --git a/test/version.test.ts b/test/version.test.ts
index 0717c38..834d068 100644
--- a/test/version.test.ts
+++ b/test/version.test.ts
@@ -64,3 +64,53 @@ test("resolveAcpxVersion falls back to unknown when version cannot be resolved",
   });
   assert.equal(version, "0.0.0-unknown");
 });
+
+test("resolveAcpxVersion ignores blank env versions and blank package versions", async () => {
+  const tmpDir = await fs.mkdtemp(path.join(os.tmpdir(), "acpx-version-test-"));
+  try {
+    const packagePath = path.join(tmpDir, "package.json");
+    await fs.writeFile(
+      packagePath,
+      `${JSON.stringify({ name: "acpx", version: " " }, null, 2)}\n`,
+      "utf8",
+    );
+    const version = resolveAcpxVersion({
+      env: {
+        npm_package_name: "acpx",
+        npm_package_version: " ",
+      },
+      packageJsonPath: packagePath,
+    });
+    assert.equal(version, "0.0.0-unknown");
+  } finally {
+    await fs.rm(tmpDir, { recursive: true, force: true });
+  }
+});
+
+test("getAcpxVersion caches the first resolved version", async () => {
+  const versionModuleUrl = new URL(`../src/version.js?cachebust=${Date.now()}`, import.meta.url);
+  const previousName = process.env.npm_package_name;
+  const previousVersion = process.env.npm_package_version;
+
+  process.env.npm_package_name = "acpx";
+  process.env.npm_package_version = "7.8.9";
+
+  try {
+    const freshModule = (await import(versionModuleUrl.href)) as typeof import("../src/version.js");
+    assert.equal(freshModule.getAcpxVersion(), "7.8.9");
+
+    process.env.npm_package_version = "9.9.9";
+    assert.equal(freshModule.getAcpxVersion(), "7.8.9");
+  } finally {
+    if (previousName === undefined) {
+      delete process.env.npm_package_name;
+    } else {
+      process.env.npm_package_name = previousName;
+    }
+    if (previousVersion === undefined) {
+      delete process.env.npm_package_version;
+    } else {
+      process.env.npm_package_version = previousVersion;
+    }
+  }
+});

From 5428cfd0685e1d80e7506ea96598469ac976845c Mon Sep 17 00:00:00 2001
From: Onur Solmaz <2453968+osolmaz@users.noreply.github.com>
Date: Thu, 26 Mar 2026 10:50:12 +0100
Subject: [PATCH 15/22] refactor: split flows queue and cli modules

---
 src/cli-core.ts            | 216 +---------------------
 src/cli/config-command.ts  |  73 ++++++++
 src/cli/json-output.ts     |   9 +
 src/cli/output-render.ts   |   9 +-
 src/cli/status-command.ts  | 137 ++++++++++++++
 src/flows.ts               |  50 +++--
 src/flows/definition.ts    |  60 ++++++
 src/flows/graph.ts         |  69 +++++++
 src/flows/runtime.ts       | 370 ++++++-------------------------------
 src/flows/store.ts         |   2 +-
 src/flows/types.ts         | 186 +++++++++++++++
 src/queue-ipc-health.ts    |  64 +++++++
 src/queue-ipc-transport.ts |  58 ++++++
 src/queue-ipc.ts           | 131 +------------
 14 files changed, 747 insertions(+), 687 deletions(-)
 create mode 100644 src/cli/config-command.ts
 create mode 100644 src/cli/json-output.ts
 create mode 100644 src/cli/status-command.ts
 create mode 100644 src/flows/definition.ts
 create mode 100644 src/flows/graph.ts
 create mode 100644 src/flows/types.ts
 create mode 100644 src/queue-ipc-health.ts
 create mode 100644 src/queue-ipc-transport.ts

diff --git a/src/cli-core.ts b/src/cli-core.ts
index 59ddae5..227201f 100644
--- a/src/cli-core.ts
+++ b/src/cli-core.ts
@@ -4,6 +4,7 @@ import fs from "node:fs/promises";
 import path from "node:path";
 import { Command, CommanderError, InvalidArgumentError } from "commander";
 import { listBuiltInAgents } from "./agent-registry.js";
+import { registerConfigCommand } from "./cli/config-command.js";
 import {
   addGlobalFlags,
   addPromptInputOption,
@@ -26,12 +27,9 @@ import {
   type SessionsNewFlags,
   type StatusFlags,
 } from "./cli/flags.js";
-import {
-  initGlobalConfigFile,
-  loadResolvedConfig,
-  toConfigDisplay,
-  type ResolvedAcpxConfig,
-} from "./config.js";
+import { emitJsonResult } from "./cli/json-output.js";
+import { registerStatusCommand } from "./cli/status-command.js";
+import { loadResolvedConfig, type ResolvedAcpxConfig } from "./config.js";
 import {
   exitCodeForOutputErrorCode,
   normalizeOutputError,
@@ -156,14 +154,6 @@ function applyPermissionExitCode(result: {
   }
 }
 
-function emitJsonResult(format: OutputFormat, payload: unknown): boolean {
-  if (format !== "json") {
-    return false;
-  }
-  process.stdout.write(`${JSON.stringify(payload)}\n`);
-  return true;
-}
-
 function isCodexAgentInvocation(agent: { agentName: string; agentCommand: string }): boolean {
   if (agent.agentName === "codex") {
     return true;
@@ -187,13 +177,11 @@ export { formatPromptSessionBannerLine } from "./cli/output-render.js";
 type SessionModule = typeof import("./session.js");
 type OutputModule = typeof import("./output.js");
 type OutputRenderModule = typeof import("./cli/output-render.js");
-type QueueIpcModule = typeof import("./queue-ipc.js");
 type SkillflagModule = typeof import("skillflag");
 
 let sessionModulePromise: Promise<SessionModule> | undefined;
 let outputModulePromise: Promise<OutputModule> | undefined;
 let outputRenderModulePromise: Promise<OutputRenderModule> | undefined;
-let queueIpcModulePromise: Promise<QueueIpcModule> | undefined;
 let skillflagModulePromise: Promise<SkillflagModule> | undefined;
 
 function loadSessionModule(): Promise<SessionModule> {
@@ -211,11 +199,6 @@ function loadOutputRenderModule(): Promise<OutputRenderModule> {
   return outputRenderModulePromise;
 }
 
-function loadQueueIpcModule(): Promise<QueueIpcModule> {
-  queueIpcModulePromise ??= import("./queue-ipc.js");
-  return queueIpcModulePromise;
-}
-
 function loadSkillflagModule(): Promise<SkillflagModule> {
   skillflagModulePromise ??= import("skillflag");
   return skillflagModulePromise;
@@ -924,167 +907,6 @@ async function handleSessionsHistory(
   printSessionHistoryByFormat(record, flags.limit, globalFlags.format);
 }
 
-function formatUptime(startedAt: string | undefined): string | undefined {
-  if (!startedAt) {
-    return undefined;
-  }
-
-  const startedMs = Date.parse(startedAt);
-  if (!Number.isFinite(startedMs)) {
-    return undefined;
-  }
-
-  const elapsedMs = Math.max(0, Date.now() - startedMs);
-  const seconds = Math.floor(elapsedMs / 1_000);
-  const hours = Math.floor(seconds / 3_600);
-  const minutes = Math.floor((seconds % 3_600) / 60);
-  const remSeconds = seconds % 60;
-  return `${hours.toString().padStart(2, "0")}:${minutes
-    .toString()
-    .padStart(2, "0")}:${remSeconds.toString().padStart(2, "0")}`;
-}
-
-async function handleStatus(
-  explicitAgentName: string | undefined,
-  flags: StatusFlags,
-  command: Command,
-  config: ResolvedAcpxConfig,
-): Promise<void> {
-  const globalFlags = resolveGlobalFlags(command, config);
-  const agent = resolveAgentInvocation(explicitAgentName, globalFlags, config);
-  const [{ probeQueueOwnerHealth }, { agentSessionIdPayload, emitJsonResult }] = await Promise.all([
-    loadQueueIpcModule(),
-    loadOutputRenderModule(),
-  ]);
-  const record = await findSession({
-    agentCommand: agent.agentCommand,
-    cwd: agent.cwd,
-    name: resolveSessionNameFromFlags(flags, command),
-  });
-
-  if (!record) {
-    if (
-      emitJsonResult(globalFlags.format, {
-        action: "status_snapshot",
-        status: "no-session",
-        summary: "no active session",
-      })
-    ) {
-      return;
-    }
-
-    if (globalFlags.format === "quiet") {
-      process.stdout.write("no-session\n");
-      return;
-    }
-
-    process.stdout.write(`session: -\n`);
-    process.stdout.write(`agent: ${agent.agentCommand}\n`);
-    process.stdout.write(`pid: -\n`);
-    process.stdout.write(`status: no-session\n`);
-    process.stdout.write(`uptime: -\n`);
-    process.stdout.write(`lastPromptTime: -\n`);
-    return;
-  }
-
-  const health = await probeQueueOwnerHealth(record.acpxRecordId);
-  const running = health.healthy;
-  const payload = {
-    sessionId: record.acpxRecordId,
-    agentCommand: record.agentCommand,
-    pid: health.pid ?? record.pid ?? null,
-    status: running ? "running" : "dead",
-    uptime: running ? (formatUptime(record.agentStartedAt) ?? null) : null,
-    lastPromptTime: record.lastPromptAt ?? null,
-    exitCode: running ? null : (record.lastAgentExitCode ?? null),
-    signal: running ? null : (record.lastAgentExitSignal ?? null),
-    ...agentSessionIdPayload(record.agentSessionId),
-  };
-
-  if (
-    emitJsonResult(globalFlags.format, {
-      action: "status_snapshot",
-      status: running ? "alive" : "dead",
-      pid: payload.pid ?? undefined,
-      summary: running ? "queue owner healthy" : "queue owner unavailable",
-      uptime: payload.uptime ?? undefined,
-      lastPromptTime: payload.lastPromptTime ?? undefined,
-      exitCode: payload.exitCode ?? undefined,
-      signal: payload.signal ?? undefined,
-      acpxRecordId: record.acpxRecordId,
-      acpxSessionId: record.acpSessionId,
-      agentSessionId: record.agentSessionId,
-    })
-  ) {
-    return;
-  }
-
-  if (globalFlags.format === "quiet") {
-    process.stdout.write(`${payload.status}\n`);
-    return;
-  }
-
-  process.stdout.write(`session: ${payload.sessionId}\n`);
-  if ("agentSessionId" in payload) {
-    process.stdout.write(`agentSessionId: ${payload.agentSessionId}\n`);
-  }
-  process.stdout.write(`agent: ${payload.agentCommand}\n`);
-  process.stdout.write(`pid: ${payload.pid ?? "-"}\n`);
-  process.stdout.write(`status: ${payload.status}\n`);
-  process.stdout.write(`uptime: ${payload.uptime ?? "-"}\n`);
-  process.stdout.write(`lastPromptTime: ${payload.lastPromptTime ?? "-"}\n`);
-  if (payload.status === "dead") {
-    process.stdout.write(`exitCode: ${payload.exitCode ?? "-"}\n`);
-    process.stdout.write(`signal: ${payload.signal ?? "-"}\n`);
-  }
-}
-
-async function handleConfigShow(command: Command, config: ResolvedAcpxConfig): Promise<void> {
-  const globalFlags = resolveGlobalFlags(command, config);
-  const payload = {
-    ...toConfigDisplay(config),
-    paths: {
-      global: config.globalPath,
-      project: config.projectPath,
-    },
-    loaded: {
-      global: config.hasGlobalConfig,
-      project: config.hasProjectConfig,
-    },
-  };
-
-  if (globalFlags.format === "json") {
-    process.stdout.write(`${JSON.stringify(payload)}\n`);
-    return;
-  }
-
-  process.stdout.write(`${JSON.stringify(payload, null, 2)}\n`);
-}
-
-async function handleConfigInit(command: Command, config: ResolvedAcpxConfig): Promise<void> {
-  const globalFlags = resolveGlobalFlags(command, config);
-  const result = await initGlobalConfigFile();
-  if (globalFlags.format === "json") {
-    process.stdout.write(
-      `${JSON.stringify({
-        path: result.path,
-        created: result.created,
-      })}\n`,
-    );
-    return;
-  }
-  if (globalFlags.format === "quiet") {
-    process.stdout.write(`${result.path}\n`);
-    return;
-  }
-
-  if (result.created) {
-    process.stdout.write(`Created ${result.path}\n`);
-    return;
-  }
-  process.stdout.write(`Config already exists: ${result.path}\n`);
-}
-
 function registerSessionsCommand(
   parent: Command,
   explicitAgentName: string | undefined,
@@ -1247,11 +1069,7 @@ function registerSharedAgentSubcommands(
     await handleSetConfigOption(explicitAgentName, key, value, flags, this, config);
   });
 
-  const statusCommand = parent.command("status").description(descriptions.status);
-  addSessionNameOption(statusCommand);
-  statusCommand.action(async function (this: Command, flags: StatusFlags) {
-    await handleStatus(explicitAgentName, flags, this, config);
-  });
+  registerStatusCommand(parent, explicitAgentName, config, descriptions.status);
 }
 
 function registerAgentCommand(
@@ -1285,30 +1103,6 @@ function registerAgentCommand(
   registerSessionsCommand(agentCommand, agentName, config);
 }
 
-function registerConfigCommand(program: Command, config: ResolvedAcpxConfig): void {
-  const configCommand = program
-    .command("config")
-    .description("Inspect and initialize acpx configuration");
-
-  configCommand
-    .command("show")
-    .description("Show resolved config")
-    .action(async function (this: Command) {
-      await handleConfigShow(this, config);
-    });
-
-  configCommand
-    .command("init")
-    .description("Create global config template")
-    .action(async function (this: Command) {
-      await handleConfigInit(this, config);
-    });
-
-  configCommand.action(async function (this: Command) {
-    await handleConfigShow(this, config);
-  });
-}
-
 function registerFlowCommand(program: Command, config: ResolvedAcpxConfig): void {
   const flowCommand = program
     .command("flow")
diff --git a/src/cli/config-command.ts b/src/cli/config-command.ts
new file mode 100644
index 0000000..6306af0
--- /dev/null
+++ b/src/cli/config-command.ts
@@ -0,0 +1,73 @@
+import { Command } from "commander";
+import { initGlobalConfigFile, toConfigDisplay, type ResolvedAcpxConfig } from "../config.js";
+import { resolveGlobalFlags } from "./flags.js";
+
+async function handleConfigShow(command: Command, config: ResolvedAcpxConfig): Promise<void> {
+  const globalFlags = resolveGlobalFlags(command, config);
+  const payload = {
+    ...toConfigDisplay(config),
+    paths: {
+      global: config.globalPath,
+      project: config.projectPath,
+    },
+    loaded: {
+      global: config.hasGlobalConfig,
+      project: config.hasProjectConfig,
+    },
+  };
+
+  if (globalFlags.format === "json") {
+    process.stdout.write(`${JSON.stringify(payload)}\n`);
+    return;
+  }
+
+  process.stdout.write(`${JSON.stringify(payload, null, 2)}\n`);
+}
+
+async function handleConfigInit(command: Command, config: ResolvedAcpxConfig): Promise<void> {
+  const globalFlags = resolveGlobalFlags(command, config);
+  const result = await initGlobalConfigFile();
+  if (globalFlags.format === "json") {
+    process.stdout.write(
+      `${JSON.stringify({
+        path: result.path,
+        created: result.created,
+      })}\n`,
+    );
+    return;
+  }
+  if
(globalFlags.format === "quiet") { + process.stdout.write(`${result.path}\n`); + return; + } + + if (result.created) { + process.stdout.write(`Created ${result.path}\n`); + return; + } + process.stdout.write(`Config already exists: ${result.path}\n`); +} + +export function registerConfigCommand(program: Command, config: ResolvedAcpxConfig): void { + const configCommand = program + .command("config") + .description("Inspect and initialize acpx configuration"); + + configCommand + .command("show") + .description("Show resolved config") + .action(async function (this: Command) { + await handleConfigShow(this, config); + }); + + configCommand + .command("init") + .description("Create global config template") + .action(async function (this: Command) { + await handleConfigInit(this, config); + }); + + configCommand.action(async function (this: Command) { + await handleConfigShow(this, config); + }); +} diff --git a/src/cli/json-output.ts b/src/cli/json-output.ts new file mode 100644 index 0000000..2d29294 --- /dev/null +++ b/src/cli/json-output.ts @@ -0,0 +1,9 @@ +import type { OutputFormat } from "../types.js"; + +export function emitJsonResult(format: OutputFormat, payload: unknown): boolean { + if (format !== "json") { + return false; + } + process.stdout.write(`${JSON.stringify(payload)}\n`); + return true; +} diff --git a/src/cli/output-render.ts b/src/cli/output-render.ts index d3951d1..8b8116d 100644 --- a/src/cli/output-render.ts +++ b/src/cli/output-render.ts @@ -2,6 +2,7 @@ import path from "node:path"; import { probeQueueOwnerHealth } from "../queue-ipc.js"; import { normalizeRuntimeSessionId } from "../runtime-session-id.js"; import type { OutputFormat, SessionRecord } from "../types.js"; +import { emitJsonResult } from "./json-output.js"; function formatSessionLabel(record: SessionRecord): string { return record.name ?? "cwd"; @@ -24,14 +25,6 @@ async function resolveSessionConnectionStatus( return health.healthy ? 
"connected" : "needs reconnect"; } -export function emitJsonResult(format: OutputFormat, payload: unknown): boolean { - if (format !== "json") { - return false; - } - process.stdout.write(`${JSON.stringify(payload)}\n`); - return true; -} - export function printSessionsByFormat(sessions: SessionRecord[], format: OutputFormat): void { if (format === "json") { process.stdout.write(`${JSON.stringify(sessions)}\n`); diff --git a/src/cli/status-command.ts b/src/cli/status-command.ts new file mode 100644 index 0000000..f5d2df3 --- /dev/null +++ b/src/cli/status-command.ts @@ -0,0 +1,137 @@ +import { Command } from "commander"; +import type { ResolvedAcpxConfig } from "../config.js"; +import { probeQueueOwnerHealth } from "../queue-ipc.js"; +import { findSession } from "../session-persistence.js"; +import { + addSessionNameOption, + resolveAgentInvocation, + resolveGlobalFlags, + resolveSessionNameFromFlags, + type StatusFlags, +} from "./flags.js"; +import { emitJsonResult } from "./json-output.js"; +import { agentSessionIdPayload } from "./output-render.js"; + +function formatUptime(startedAt: string | undefined): string | undefined { + if (!startedAt) { + return undefined; + } + + const startedMs = Date.parse(startedAt); + if (!Number.isFinite(startedMs)) { + return undefined; + } + + const elapsedMs = Math.max(0, Date.now() - startedMs); + const seconds = Math.floor(elapsedMs / 1_000); + const hours = Math.floor(seconds / 3_600); + const minutes = Math.floor((seconds % 3_600) / 60); + const remSeconds = seconds % 60; + return `${hours.toString().padStart(2, "0")}:${minutes + .toString() + .padStart(2, "0")}:${remSeconds.toString().padStart(2, "0")}`; +} + +export async function handleStatus( + explicitAgentName: string | undefined, + flags: StatusFlags, + command: Command, + config: ResolvedAcpxConfig, +): Promise { + const globalFlags = resolveGlobalFlags(command, config); + const agent = resolveAgentInvocation(explicitAgentName, globalFlags, config); + const record 
= await findSession({ + agentCommand: agent.agentCommand, + cwd: agent.cwd, + name: resolveSessionNameFromFlags(flags, command), + }); + + if (!record) { + if ( + emitJsonResult(globalFlags.format, { + action: "status_snapshot", + status: "no-session", + summary: "no active session", + }) + ) { + return; + } + + if (globalFlags.format === "quiet") { + process.stdout.write("no-session\n"); + return; + } + + process.stdout.write("session: -\n"); + process.stdout.write(`agent: ${agent.agentCommand}\n`); + process.stdout.write("pid: -\n"); + process.stdout.write("status: no-session\n"); + process.stdout.write("uptime: -\n"); + process.stdout.write("lastPromptTime: -\n"); + return; + } + + const health = await probeQueueOwnerHealth(record.acpxRecordId); + const running = health.healthy; + const payload = { + sessionId: record.acpxRecordId, + agentCommand: record.agentCommand, + pid: health.pid ?? record.pid ?? null, + status: running ? "running" : "dead", + uptime: running ? (formatUptime(record.agentStartedAt) ?? null) : null, + lastPromptTime: record.lastPromptAt ?? null, + exitCode: running ? null : (record.lastAgentExitCode ?? null), + signal: running ? null : (record.lastAgentExitSignal ?? null), + ...agentSessionIdPayload(record.agentSessionId), + }; + + if ( + emitJsonResult(globalFlags.format, { + action: "status_snapshot", + status: running ? "alive" : "dead", + pid: payload.pid ?? undefined, + summary: running ? "queue owner healthy" : "queue owner unavailable", + uptime: payload.uptime ?? undefined, + lastPromptTime: payload.lastPromptTime ?? undefined, + exitCode: payload.exitCode ?? undefined, + signal: payload.signal ?? 
undefined, + acpxRecordId: record.acpxRecordId, + acpxSessionId: record.acpSessionId, + agentSessionId: record.agentSessionId, + }) + ) { + return; + } + + if (globalFlags.format === "quiet") { + process.stdout.write(`${payload.status}\n`); + return; + } + + process.stdout.write(`session: ${payload.sessionId}\n`); + if ("agentSessionId" in payload) { + process.stdout.write(`agentSessionId: ${payload.agentSessionId}\n`); + } + process.stdout.write(`agent: ${payload.agentCommand}\n`); + process.stdout.write(`pid: ${payload.pid ?? "-"}\n`); + process.stdout.write(`status: ${payload.status}\n`); + process.stdout.write(`uptime: ${payload.uptime ?? "-"}\n`); + process.stdout.write(`lastPromptTime: ${payload.lastPromptTime ?? "-"}\n`); + if (payload.status === "dead") { + process.stdout.write(`exitCode: ${payload.exitCode ?? "-"}\n`); + process.stdout.write(`signal: ${payload.signal ?? "-"}\n`); + } +} + +export function registerStatusCommand( + parent: Command, + explicitAgentName: string | undefined, + config: ResolvedAcpxConfig, + description: string, +): void { + const statusCommand = parent.command("status").description(description); + addSessionNameOption(statusCommand); + statusCommand.action(async function (this: Command, flags: StatusFlags) { + await handleStatus(explicitAgentName, flags, this, config); + }); +} diff --git a/src/flows.ts b/src/flows.ts index ad7c63c..bc857f5 100644 --- a/src/flows.ts +++ b/src/flows.ts @@ -1,30 +1,26 @@ -export { - FlowRunner, - acp, - action, - checkpoint, - compute, - defineFlow, - shell, - type FlowNodeCommon, - type AcpNodeDefinition, - type ActionNodeDefinition, - type CheckpointNodeDefinition, - type ComputeNodeDefinition, - type FunctionActionNodeDefinition, - type FlowDefinition, - type FlowEdge, - type FlowNodeContext, - type FlowNodeDefinition, - type FlowRunResult, - type FlowRunState, - type FlowRunnerOptions, - type FlowSessionBinding, - type FlowStepRecord, - type ShellActionExecution, - type 
ShellActionNodeDefinition, - type ShellActionResult, -} from "./flows/runtime.js"; +export { FlowRunner } from "./flows/runtime.js"; +export { acp, action, checkpoint, compute, defineFlow, shell } from "./flows/definition.js"; +export type { + AcpNodeDefinition, + ActionNodeDefinition, + CheckpointNodeDefinition, + ComputeNodeDefinition, + FlowDefinition, + FlowEdge, + FlowNodeCommon, + FlowNodeContext, + FlowNodeDefinition, + FlowRunResult, + FlowRunState, + FlowRunnerOptions, + FlowSessionBinding, + FlowStepRecord, + FunctionActionNodeDefinition, + ResolvedFlowAgent, + ShellActionExecution, + ShellActionNodeDefinition, + ShellActionResult, +} from "./flows/types.js"; export { flowRunsBaseDir } from "./flows/store.js"; export { extractJsonObject, diff --git a/src/flows/definition.ts b/src/flows/definition.ts new file mode 100644 index 0000000..85dff3d --- /dev/null +++ b/src/flows/definition.ts @@ -0,0 +1,60 @@ +import type { + AcpNodeDefinition, + ActionNodeDefinition, + CheckpointNodeDefinition, + ComputeNodeDefinition, + FlowDefinition, + FunctionActionNodeDefinition, + ShellActionNodeDefinition, +} from "./types.js"; + +export function defineFlow(definition: TFlow): TFlow { + return definition; +} + +export function acp(definition: Omit): AcpNodeDefinition { + return { + kind: "acp", + ...definition, + }; +} + +export function compute(definition: Omit): ComputeNodeDefinition { + return { + kind: "compute", + ...definition, + }; +} + +export function action( + definition: Omit, +): FunctionActionNodeDefinition; +export function action( + definition: Omit, +): ShellActionNodeDefinition; +export function action( + definition: Omit | Omit, +): ActionNodeDefinition { + return { + kind: "action", + ...definition, + } as ActionNodeDefinition; +} + +export function shell( + definition: Omit, +): ShellActionNodeDefinition { + return { + kind: "action", + ...definition, + }; +} + +export function checkpoint( + definition: Omit = {}, +): CheckpointNodeDefinition { + 
return { + kind: "checkpoint", + ...definition, + }; +} diff --git a/src/flows/graph.ts b/src/flows/graph.ts new file mode 100644 index 0000000..4192576 --- /dev/null +++ b/src/flows/graph.ts @@ -0,0 +1,69 @@ +import type { FlowDefinition, FlowEdge } from "./types.js"; + +export function validateFlowDefinition(flow: FlowDefinition): void { + if (!flow.name.trim()) { + throw new Error("Flow name must not be empty"); + } + if (!flow.nodes[flow.startAt]) { + throw new Error(`Flow start node is missing: ${flow.startAt}`); + } + + const outgoingEdges = new Set(); + for (const edge of flow.edges) { + if (!flow.nodes[edge.from]) { + throw new Error(`Flow edge references unknown from-node: ${edge.from}`); + } + if (outgoingEdges.has(edge.from)) { + throw new Error(`Flow node must not declare multiple outgoing edges: ${edge.from}`); + } + outgoingEdges.add(edge.from); + if ("to" in edge) { + if (!flow.nodes[edge.to]) { + throw new Error(`Flow edge references unknown to-node: ${edge.to}`); + } + continue; + } + for (const target of Object.values(edge.switch.cases)) { + if (!flow.nodes[target]) { + throw new Error(`Flow switch references unknown to-node: ${target}`); + } + } + } +} + +export function resolveNext(edges: FlowEdge[], from: string, output: unknown): string | null { + const edge = edges.find((candidate) => candidate.from === from); + if (!edge) { + return null; + } + + if ("to" in edge) { + return edge.to; + } + + const value = getByPath(output, edge.switch.on); + if (typeof value !== "string" && typeof value !== "number" && typeof value !== "boolean") { + throw new Error(`Flow switch value must be scalar for ${edge.switch.on}`); + } + const next = edge.switch.cases[String(value)]; + if (!next) { + throw new Error(`No flow switch case for ${edge.switch.on}=${JSON.stringify(value)}`); + } + return next; +} + +function getByPath(value: unknown, jsonPath: string): unknown { + if (!jsonPath.startsWith("$.")) { + throw new Error(`Unsupported JSON path: ${jsonPath}`); + 
} + + return jsonPath + .slice(2) + .split(".") + .reduce((current, key) => { + if (current == null || typeof current !== "object") { + return undefined; + } + return (current as Record)[key]; + }, value); +} diff --git a/src/flows/runtime.ts b/src/flows/runtime.ts index 665f18f..5ed7f18 100644 --- a/src/flows/runtime.ts +++ b/src/flows/runtime.ts @@ -4,180 +4,60 @@ import { createOutputFormatter } from "../output.js"; import { promptToDisplayText, textPrompt } from "../prompt-content.js"; import { resolveSessionRecord } from "../session-persistence.js"; import { TimeoutError, withTimeout } from "../session-runtime-helpers.js"; -import { - cancelSessionPrompt, - createSession, - runOnce, - sendSession, - type SessionAgentOptions, -} from "../session.js"; -import type { - AuthPolicy, - McpServer, - NonInteractivePermissionPolicy, - PermissionMode, - PromptInput, -} from "../types.js"; +import { cancelSessionPrompt, createSession, runOnce, sendSession } from "../session.js"; +import type { PromptInput } from "../types.js"; +import { acp, action, checkpoint, compute, defineFlow, shell } from "./definition.js"; import { formatShellActionSummary, runShellAction } from "./executors/shell.js"; -import { FlowRunStore, flowRunsBaseDir } from "./store.js"; +import { resolveNext, validateFlowDefinition } from "./graph.js"; +import { FlowRunStore } from "./store.js"; +import type { + AcpNodeDefinition, + ActionNodeDefinition, + CheckpointNodeDefinition, + ComputeNodeDefinition, + FlowDefinition, + FlowNodeCommon, + FlowNodeContext, + FlowNodeDefinition, + FlowRunResult, + FlowRunState, + FlowRunnerOptions, + FlowSessionBinding, + FlowEdge, + FlowStepRecord, + FunctionActionNodeDefinition, + ResolvedFlowAgent, + ShellActionExecution, + ShellActionNodeDefinition, + ShellActionResult, +} from "./types.js"; + +export { acp, action, checkpoint, compute, defineFlow, shell }; +export type { + AcpNodeDefinition, + ActionNodeDefinition, + CheckpointNodeDefinition, + 
ComputeNodeDefinition, + FlowDefinition, + FlowEdge, + FlowNodeCommon, + FlowNodeContext, + FlowNodeDefinition, + FlowRunResult, + FlowRunState, + FlowRunnerOptions, + FlowSessionBinding, + FlowStepRecord, + FunctionActionNodeDefinition, + ResolvedFlowAgent, + ShellActionExecution, + ShellActionNodeDefinition, + ShellActionResult, +} from "./types.js"; -type MaybePromise = T | Promise; const DEFAULT_FLOW_HEARTBEAT_MS = 5_000; const DEFAULT_FLOW_STEP_TIMEOUT_MS = 15 * 60_000; -export type FlowNodeContext = { - input: TInput; - outputs: Record; - state: FlowRunState; - services: Record; -}; - -export type FlowNodeCommon = { - timeoutMs?: number; - heartbeatMs?: number; - statusDetail?: string; -}; - -export type FlowEdge = - | { - from: string; - to: string; - } - | { - from: string; - switch: { - on: string; - cases: Record; - }; - }; - -export type AcpNodeDefinition = FlowNodeCommon & { - kind: "acp"; - profile?: string; - cwd?: string | ((context: FlowNodeContext) => MaybePromise); - session?: { - handle?: string; - isolated?: boolean; - }; - prompt: (context: FlowNodeContext) => MaybePromise; - parse?: (text: string, context: FlowNodeContext) => MaybePromise; -}; - -export type ComputeNodeDefinition = FlowNodeCommon & { - kind: "compute"; - run: (context: FlowNodeContext) => MaybePromise; -}; - -export type FunctionActionNodeDefinition = FlowNodeCommon & { - kind: "action"; - run: (context: FlowNodeContext) => MaybePromise; -}; - -export type ShellActionExecution = { - command: string; - args?: string[]; - cwd?: string; - env?: Record; - stdin?: string; - shell?: boolean | string; - allowNonZeroExit?: boolean; - timeoutMs?: number; -}; - -export type ShellActionResult = { - command: string; - args: string[]; - cwd: string; - stdout: string; - stderr: string; - combinedOutput: string; - exitCode: number | null; - signal: NodeJS.Signals | null; - durationMs: number; -}; - -export type ShellActionNodeDefinition = FlowNodeCommon & { - kind: "action"; - exec: 
(context: FlowNodeContext) => MaybePromise; - parse?: (result: ShellActionResult, context: FlowNodeContext) => MaybePromise; -}; - -export type ActionNodeDefinition = FunctionActionNodeDefinition | ShellActionNodeDefinition; - -export type CheckpointNodeDefinition = FlowNodeCommon & { - kind: "checkpoint"; - summary?: string; - run?: (context: FlowNodeContext) => MaybePromise; -}; - -export type FlowNodeDefinition = - | AcpNodeDefinition - | ComputeNodeDefinition - | ActionNodeDefinition - | CheckpointNodeDefinition; - -export type FlowDefinition = { - name: string; - startAt: string; - nodes: Record; - edges: FlowEdge[]; -}; - -export type FlowStepRecord = { - nodeId: string; - kind: FlowNodeDefinition["kind"]; - startedAt: string; - finishedAt: string; - promptText: string | null; - rawText: string | null; - output: unknown; - session: FlowSessionBinding | null; - agent: { - agentName: string; - agentCommand: string; - cwd: string; - } | null; -}; - -export type FlowSessionBinding = { - key: string; - handle: string; - name: string; - profile?: string; - agentName: string; - agentCommand: string; - cwd: string; - acpxRecordId: string; - acpSessionId: string; - agentSessionId?: string; -}; - -export type FlowRunState = { - runId: string; - flowName: string; - flowPath?: string; - startedAt: string; - finishedAt?: string; - updatedAt: string; - status: "running" | "waiting" | "completed" | "failed" | "timed_out"; - input: unknown; - outputs: Record; - steps: FlowStepRecord[]; - sessionBindings: Record; - currentNode?: string; - currentNodeKind?: FlowNodeDefinition["kind"]; - currentNodeStartedAt?: string; - lastHeartbeatAt?: string; - statusDetail?: string; - waitingOn?: string; - error?: string; -}; - -export type FlowRunResult = { - runDir: string; - state: FlowRunState; -}; - type MemoryWritable = { write(chunk: string): void; }; @@ -187,81 +67,9 @@ type FlowNodeExecutionResult = { promptText: string | null; rawText: string | null; sessionInfo: 
FlowSessionBinding | null; - agentInfo: ReturnType | null; + agentInfo: ResolvedFlowAgent | null; }; -export type FlowRunnerOptions = { - resolveAgent: (profile?: string) => { - agentName: string; - agentCommand: string; - cwd: string; - }; - permissionMode: PermissionMode; - mcpServers?: McpServer[]; - nonInteractivePermissions?: NonInteractivePermissionPolicy; - authCredentials?: Record; - authPolicy?: AuthPolicy; - timeoutMs?: number; - defaultNodeTimeoutMs?: number; - ttlMs?: number; - verbose?: boolean; - suppressSdkConsoleErrors?: boolean; - sessionOptions?: SessionAgentOptions; - services?: Record; - outputRoot?: string; -}; - -export function defineFlow(definition: TFlow): TFlow { - return definition; -} - -export function acp(definition: Omit): AcpNodeDefinition { - return { - kind: "acp", - ...definition, - }; -} - -export function compute(definition: Omit): ComputeNodeDefinition { - return { - kind: "compute", - ...definition, - }; -} - -export function action( - definition: Omit, -): FunctionActionNodeDefinition; -export function action( - definition: Omit, -): ShellActionNodeDefinition; -export function action( - definition: Omit | Omit, -): ActionNodeDefinition { - return { - kind: "action", - ...definition, - } as ActionNodeDefinition; -} - -export function shell( - definition: Omit, -): ShellActionNodeDefinition { - return { - kind: "action", - ...definition, - }; -} - -export function checkpoint( - definition: Omit = {}, -): CheckpointNodeDefinition { - return { - kind: "checkpoint", - ...definition, - }; -} - export class FlowRunner { private readonly resolveAgent; private readonly permissionMode; @@ -293,7 +101,7 @@ export class FlowRunner { this.suppressSdkConsoleErrors = options.suppressSdkConsoleErrors; this.sessionOptions = options.sessionOptions; this.services = options.services ?? {}; - this.store = new FlowRunStore(options.outputRoot ?? 
flowRunsBaseDir()); + this.store = new FlowRunStore(options.outputRoot); } async run( @@ -339,7 +147,7 @@ export class FlowRunner { let promptText: string | null = null; let rawText: string | null = null; let sessionInfo: FlowSessionBinding | null = null; - let agentInfo: ReturnType | null = null; + let agentInfo: ResolvedFlowAgent | null = null; this.markNodeStarted(state, current, node.kind, startedAt, node.statusDetail); await this.store.writeSnapshot(runDir, state, { type: "node_started", @@ -720,7 +528,7 @@ export class FlowRunner { state: FlowRunState, flow: FlowDefinition, node: AcpNodeDefinition, - agent: ReturnType, + agent: ResolvedFlowAgent, ): Promise { const handle = node.session?.handle ?? "main"; const key = createSessionBindingKey(agent.agentCommand, agent.cwd, handle); @@ -794,7 +602,7 @@ export class FlowRunner { } private async runIsolatedPrompt( - agent: ReturnType, + agent: ResolvedFlowAgent, prompt: PromptInput, timeoutMs?: number, ): Promise { @@ -818,44 +626,13 @@ export class FlowRunner { } } -function validateFlowDefinition(flow: FlowDefinition): void { - if (!flow.name.trim()) { - throw new Error("Flow name must not be empty"); - } - if (!flow.nodes[flow.startAt]) { - throw new Error(`Flow start node is missing: ${flow.startAt}`); - } - - const outgoingEdges = new Set(); - for (const edge of flow.edges) { - if (!flow.nodes[edge.from]) { - throw new Error(`Flow edge references unknown from-node: ${edge.from}`); - } - if (outgoingEdges.has(edge.from)) { - throw new Error(`Flow node must not declare multiple outgoing edges: ${edge.from}`); - } - outgoingEdges.add(edge.from); - if ("to" in edge) { - if (!flow.nodes[edge.to]) { - throw new Error(`Flow edge references unknown to-node: ${edge.to}`); - } - continue; - } - for (const target of Object.values(edge.switch.cases)) { - if (!flow.nodes[target]) { - throw new Error(`Flow switch references unknown to-node: ${target}`); - } - } - } -} - function normalizePromptInput(prompt: PromptInput | 
string): PromptInput { return typeof prompt === "string" ? textPrompt(prompt) : prompt; } async function resolveNodeCwd( defaultCwd: string, - cwd: string | ((context: FlowNodeContext) => MaybePromise) | undefined, + cwd: AcpNodeDefinition["cwd"], context: FlowNodeContext, ): Promise { if (typeof cwd === "function") { @@ -865,43 +642,6 @@ async function resolveNodeCwd( return path.resolve(defaultCwd, cwd ?? defaultCwd); } -function resolveNext(edges: FlowEdge[], from: string, output: unknown): string | null { - const edge = edges.find((candidate) => candidate.from === from); - if (!edge) { - return null; - } - - if ("to" in edge) { - return edge.to; - } - - const value = getByPath(output, edge.switch.on); - if (typeof value !== "string" && typeof value !== "number" && typeof value !== "boolean") { - throw new Error(`Flow switch value must be scalar for ${edge.switch.on}`); - } - const next = edge.switch.cases[String(value)]; - if (!next) { - throw new Error(`No flow switch case for ${edge.switch.on}=${JSON.stringify(value)}`); - } - return next; -} - -function getByPath(value: unknown, jsonPath: string): unknown { - if (!jsonPath.startsWith("$.")) { - throw new Error(`Unsupported JSON path: ${jsonPath}`); - } - - return jsonPath - .slice(2) - .split(".") - .reduce((current, key) => { - if (current == null || typeof current !== "object") { - return undefined; - } - return (current as Record)[key]; - }, value); -} - function summarizePrompt(promptText: string, explicitDetail?: string): string { if (explicitDetail) { return explicitDetail; diff --git a/src/flows/store.ts b/src/flows/store.ts index 0967d0b..a03eff5 100644 --- a/src/flows/store.ts +++ b/src/flows/store.ts @@ -1,7 +1,7 @@ import fs from "node:fs/promises"; import os from "node:os"; import path from "node:path"; -import type { FlowRunState } from "./runtime.js"; +import type { FlowRunState } from "./types.js"; export type FlowStoreEvent = Record; diff --git a/src/flows/types.ts b/src/flows/types.ts new 
file mode 100644 index 0000000..7a622f6 --- /dev/null +++ b/src/flows/types.ts @@ -0,0 +1,186 @@ +import type { SessionAgentOptions } from "../session.js"; +import type { + AuthPolicy, + McpServer, + NonInteractivePermissionPolicy, + PermissionMode, + PromptInput, +} from "../types.js"; + +type MaybePromise = T | Promise; + +export type FlowNodeContext = { + input: TInput; + outputs: Record; + state: FlowRunState; + services: Record; +}; + +export type FlowNodeCommon = { + timeoutMs?: number; + heartbeatMs?: number; + statusDetail?: string; +}; + +export type FlowEdge = + | { + from: string; + to: string; + } + | { + from: string; + switch: { + on: string; + cases: Record; + }; + }; + +export type AcpNodeDefinition = FlowNodeCommon & { + kind: "acp"; + profile?: string; + cwd?: string | ((context: FlowNodeContext) => MaybePromise); + session?: { + handle?: string; + isolated?: boolean; + }; + prompt: (context: FlowNodeContext) => MaybePromise; + parse?: (text: string, context: FlowNodeContext) => MaybePromise; +}; + +export type ComputeNodeDefinition = FlowNodeCommon & { + kind: "compute"; + run: (context: FlowNodeContext) => MaybePromise; +}; + +export type FunctionActionNodeDefinition = FlowNodeCommon & { + kind: "action"; + run: (context: FlowNodeContext) => MaybePromise; +}; + +export type ShellActionExecution = { + command: string; + args?: string[]; + cwd?: string; + env?: Record; + stdin?: string; + shell?: boolean | string; + allowNonZeroExit?: boolean; + timeoutMs?: number; +}; + +export type ShellActionResult = { + command: string; + args: string[]; + cwd: string; + stdout: string; + stderr: string; + combinedOutput: string; + exitCode: number | null; + signal: NodeJS.Signals | null; + durationMs: number; +}; + +export type ShellActionNodeDefinition = FlowNodeCommon & { + kind: "action"; + exec: (context: FlowNodeContext) => MaybePromise; + parse?: (result: ShellActionResult, context: FlowNodeContext) => MaybePromise; +}; + +export type 
ActionNodeDefinition = FunctionActionNodeDefinition | ShellActionNodeDefinition;
+
+export type CheckpointNodeDefinition = FlowNodeCommon & {
+  kind: "checkpoint";
+  summary?: string;
+  run?: (context: FlowNodeContext) => MaybePromise<void>;
+};
+
+export type FlowNodeDefinition =
+  | AcpNodeDefinition
+  | ComputeNodeDefinition
+  | ActionNodeDefinition
+  | CheckpointNodeDefinition;
+
+export type FlowDefinition = {
+  name: string;
+  startAt: string;
+  nodes: Record<string, FlowNodeDefinition>;
+  edges: FlowEdge[];
+};
+
+export type FlowStepRecord = {
+  nodeId: string;
+  kind: FlowNodeDefinition["kind"];
+  startedAt: string;
+  finishedAt: string;
+  promptText: string | null;
+  rawText: string | null;
+  output: unknown;
+  session: FlowSessionBinding | null;
+  agent: {
+    agentName: string;
+    agentCommand: string;
+    cwd: string;
+  } | null;
+};
+
+export type FlowSessionBinding = {
+  key: string;
+  handle: string;
+  name: string;
+  profile?: string;
+  agentName: string;
+  agentCommand: string;
+  cwd: string;
+  acpxRecordId: string;
+  acpSessionId: string;
+  agentSessionId?: string;
+};
+
+export type FlowRunState = {
+  runId: string;
+  flowName: string;
+  flowPath?: string;
+  startedAt: string;
+  finishedAt?: string;
+  updatedAt: string;
+  status: "running" | "waiting" | "completed" | "failed" | "timed_out";
+  input: unknown;
+  outputs: Record<string, unknown>;
+  steps: FlowStepRecord[];
+  sessionBindings: Record<string, FlowSessionBinding>;
+  currentNode?: string;
+  currentNodeKind?: FlowNodeDefinition["kind"];
+  currentNodeStartedAt?: string;
+  lastHeartbeatAt?: string;
+  statusDetail?: string;
+  waitingOn?: string;
+  error?: string;
+};
+
+export type FlowRunResult = {
+  runDir: string;
+  state: FlowRunState;
+};
+
+export type ResolvedFlowAgent = {
+  agentName: string;
+  agentCommand: string;
+  cwd: string;
+};
+
+export type FlowRunnerOptions = {
+  resolveAgent: (profile?: string) => ResolvedFlowAgent;
+  permissionMode: PermissionMode;
+  mcpServers?: McpServer[];
+  nonInteractivePermissions?: NonInteractivePermissionPolicy;
+  authCredentials?: Record<string, unknown>;
+  authPolicy?: AuthPolicy;
+  timeoutMs?: number;
+  defaultNodeTimeoutMs?: number;
+  ttlMs?: number;
+  verbose?: boolean;
+  suppressSdkConsoleErrors?: boolean;
+  sessionOptions?: SessionAgentOptions;
+  services?: Record<string, unknown>;
+  outputRoot?: string;
+};
diff --git a/src/queue-ipc-health.ts b/src/queue-ipc-health.ts
new file mode 100644
index 0000000..97485fd
--- /dev/null
+++ b/src/queue-ipc-health.ts
@@ -0,0 +1,64 @@
+import { connectToQueueOwner } from "./queue-ipc-transport.js";
+import { readQueueOwnerRecord, readQueueOwnerStatus } from "./queue-lease-store.js";
+
+export type QueueOwnerHealth = {
+  sessionId: string;
+  hasLease: boolean;
+  healthy: boolean;
+  socketReachable: boolean;
+  pidAlive: boolean;
+  pid?: number;
+  socketPath?: string;
+  ownerGeneration?: number;
+  queueDepth?: number;
+};
+
+export async function probeQueueOwnerHealth(sessionId: string): Promise<QueueOwnerHealth> {
+  const ownerRecord = await readQueueOwnerRecord(sessionId);
+  if (!ownerRecord) {
+    return {
+      sessionId,
+      hasLease: false,
+      healthy: false,
+      socketReachable: false,
+      pidAlive: false,
+    };
+  }
+
+  const owner = await readQueueOwnerStatus(sessionId);
+  if (!owner) {
+    return {
+      sessionId,
+      hasLease: false,
+      healthy: false,
+      socketReachable: false,
+      pidAlive: false,
+    };
+  }
+
+  const pidAlive = owner.alive;
+  let socketReachable = false;
+  try {
+    const socket = await connectToQueueOwner(ownerRecord, 2);
+    if (socket) {
+      socketReachable = true;
+      if (!socket.destroyed) {
+        socket.end();
+      }
+    }
+  } catch {
+    socketReachable = false;
+  }
+
+  return {
+    sessionId,
+    hasLease: true,
+    healthy: socketReachable,
+    socketReachable,
+    pidAlive,
+    pid: owner.pid,
+    socketPath: owner.socketPath,
+    ownerGeneration: owner.ownerGeneration,
+    queueDepth: owner.queueDepth,
+  };
+}
diff --git a/src/queue-ipc-transport.ts b/src/queue-ipc-transport.ts
new file mode 100644
index 0000000..d545931
--- /dev/null
+++ b/src/queue-ipc-transport.ts
@@ -0,0 +1,58 @@
+import net from "node:net";
+import { measurePerf } from "./perf-metrics.js";
+import { type QueueOwnerRecord, waitMs } from "./queue-lease-store.js";
+
+const QUEUE_CONNECT_ATTEMPTS = 40;
+export const QUEUE_CONNECT_RETRY_MS = 50;
+
+function shouldRetryQueueConnect(error: unknown): boolean {
+  const code = (error as NodeJS.ErrnoException).code;
+  return code === "ENOENT" || code === "ECONNREFUSED";
+}
+
+async function connectToSocket(socketPath: string): Promise<net.Socket> {
+  return await new Promise<net.Socket>((resolve, reject) => {
+    const socket = net.createConnection(socketPath);
+
+    const onConnect = () => {
+      socket.off("error", onError);
+      resolve(socket);
+    };
+    const onError = (error: Error) => {
+      socket.off("connect", onConnect);
+      reject(error);
+    };
+
+    socket.once("connect", onConnect);
+    socket.once("error", onError);
+  });
+}
+
+export async function connectToQueueOwner(
+  owner: QueueOwnerRecord,
+  maxAttempts = QUEUE_CONNECT_ATTEMPTS,
+): Promise<net.Socket | undefined> {
+  let lastError: unknown;
+
+  const attempts = Math.max(1, Math.trunc(maxAttempts));
+  for (let attempt = 0; attempt < attempts; attempt += 1) {
+    try {
+      return await measurePerf(
+        "queue.connect",
+        async () => await connectToSocket(owner.socketPath),
+      );
+    } catch (error) {
+      lastError = error;
+      if (!shouldRetryQueueConnect(error)) {
+        throw error;
+      }
+      await waitMs(QUEUE_CONNECT_RETRY_MS);
+    }
+  }
+
+  if (lastError && !shouldRetryQueueConnect(lastError)) {
+    throw lastError;
+  }
+
+  return undefined;
+}
diff --git a/src/queue-ipc.ts b/src/queue-ipc.ts
index df69f0b..fdd0a21 100644
--- a/src/queue-ipc.ts
+++ b/src/queue-ipc.ts
@@ -1,19 +1,13 @@
 import { randomUUID } from "node:crypto";
-import net from "node:net";
 import type { SetSessionConfigOptionResponse } from "@agentclientprotocol/sdk";
 import { QueueConnectionError, QueueProtocolError } from "./errors.js";
-import { incrementPerfCounter, measurePerf } from "./perf-metrics.js";
+import { incrementPerfCounter } from "./perf-metrics.js";
+import { probeQueueOwnerHealth, type QueueOwnerHealth } from "./queue-ipc-health.js";
+import { QUEUE_CONNECT_RETRY_MS, connectToQueueOwner } from "./queue-ipc-transport.js";
 import {
-  type QueueOwnerLease,
   type QueueOwnerRecord,
-  isProcessAlive,
   readQueueOwnerRecord,
-  readQueueOwnerStatus,
-  releaseQueueOwnerLease,
-  terminateProcess,
   terminateQueueOwnerForSession,
-  tryAcquireQueueOwnerLease,
-  waitMs,
 } from "./queue-lease-store.js";
 import {
   parseQueueOwnerMessage,
@@ -37,8 +31,7 @@ import type {
   SessionSendOutcome,
 } from "./types.js";
 
-const QUEUE_CONNECT_ATTEMPTS = 40;
-export const QUEUE_CONNECT_RETRY_MS = 50;
+export { QUEUE_CONNECT_RETRY_MS } from "./queue-ipc-transport.js";
 export {
   isProcessAlive,
   releaseQueueOwnerLease,
@@ -82,124 +75,12 @@ async function maybeRecoverStaleOwnerAfterProtocolMismatch(params: {
   return true;
 }
 
-export type QueueOwnerHealth = {
-  sessionId: string;
-  hasLease: boolean;
-  healthy: boolean;
-  socketReachable: boolean;
-  pidAlive: boolean;
-  pid?: number;
-  socketPath?: string;
-  ownerGeneration?: number;
-  queueDepth?: number;
-};
-
+export { probeQueueOwnerHealth };
+export type { QueueOwnerHealth };
 export type { QueueOwnerMessage, QueueSubmitRequest } from "./queue-messages.js";
 export type { QueueOwnerControlHandlers, QueueTask } from "./queue-ipc-server.js";
 export { SessionQueueOwner } from "./queue-ipc-server.js";
 
-function shouldRetryQueueConnect(error: unknown): boolean {
-  const code = (error as NodeJS.ErrnoException).code;
-  return code === "ENOENT" || code === "ECONNREFUSED";
-}
-
-async function connectToSocket(socketPath: string): Promise<net.Socket> {
-  return await new Promise<net.Socket>((resolve, reject) => {
-    const socket = net.createConnection(socketPath);
-
-    const onConnect = () => {
-      socket.off("error", onError);
-      resolve(socket);
-    };
-    const onError = (error: Error) => {
-      socket.off("connect", onConnect);
-      reject(error);
-    };
-
-    socket.once("connect", onConnect);
-    socket.once("error", onError);
-  });
-}
-
-async function connectToQueueOwner(
-  owner: QueueOwnerRecord,
-  maxAttempts = QUEUE_CONNECT_ATTEMPTS,
-): Promise<net.Socket | undefined> {
-  let lastError: unknown;
-
-  const attempts = Math.max(1, Math.trunc(maxAttempts));
-  for (let attempt = 0; attempt < attempts; attempt += 1) {
-    try {
-      return await measurePerf(
-        "queue.connect",
-        async () => await connectToSocket(owner.socketPath),
-      );
-    } catch (error) {
-      lastError = error;
-      if (!shouldRetryQueueConnect(error)) {
-        throw error;
-      }
-      await waitMs(QUEUE_CONNECT_RETRY_MS);
-    }
-  }
-
-  if (lastError && !shouldRetryQueueConnect(lastError)) {
-    throw lastError;
-  }
-
-  return undefined;
-}
-
-export async function probeQueueOwnerHealth(sessionId: string): Promise<QueueOwnerHealth> {
-  const ownerRecord = await readQueueOwnerRecord(sessionId);
-  if (!ownerRecord) {
-    return {
-      sessionId,
-      hasLease: false,
-      healthy: false,
-      socketReachable: false,
-      pidAlive: false,
-    };
-  }
-
-  const owner = await readQueueOwnerStatus(sessionId);
-  if (!owner) {
-    return {
-      sessionId,
-      hasLease: false,
-      healthy: false,
-      socketReachable: false,
-      pidAlive: false,
-    };
-  }
-
-  const pidAlive = owner.alive;
-  let socketReachable = false;
-  try {
-    const socket = await connectToQueueOwner(ownerRecord, 2);
-    if (socket) {
-      socketReachable = true;
-      if (!socket.destroyed) {
-        socket.end();
-      }
-    }
-  } catch {
-    socketReachable = false;
-  }
-
-  return {
-    sessionId,
-    hasLease: true,
-    healthy: socketReachable,
-    socketReachable,
-    pidAlive,
-    pid: owner.pid,
-    socketPath: owner.socketPath,
-    ownerGeneration: owner.ownerGeneration,
-    queueDepth: owner.queueDepth,
-  };
-}
-
 function assertOwnerGeneration(
   owner: QueueOwnerRecord,
   message: QueueOwnerMessage,
From d63019b95301aea6fa982b410cbb0f7e9d27dcbf Mon Sep 17 00:00:00 2001
From: Onur Solmaz <2453968+osolmaz@users.noreply.github.com>
Date: Thu, 26 Mar 2026 11:46:19 +0100
Subject: [PATCH 16/22] docs: add pr triage flow example

---
 examples/flows/README.md | 9 +-
 examples/flows/pr-triage/README.md | 221 ++++
 examples/flows/pr-triage/pr-triage.flow.ts |
1247 ++++++++++++++++++
 3 files changed, 1475 insertions(+), 2 deletions(-)
 create mode 100644 examples/flows/pr-triage/README.md
 create mode 100644 examples/flows/pr-triage/pr-triage.flow.ts
diff --git a/examples/flows/README.md b/examples/flows/README.md
index 1e46367..49ea6c6 100644
--- a/examples/flows/README.md
+++ b/examples/flows/README.md
@@ -1,9 +1,10 @@
 # Flow Examples
 
-These are simple source-tree examples for `acpx flow run`.
+These are source-tree examples for `acpx flow run`.
 
 - `echo.flow.ts`: one ACP step that returns a JSON reply
 - `branch.flow.ts`: ACP classification followed by a deterministic branch into either `continue` or `checkpoint`
+- `pr-triage/pr-triage.flow.ts`: a larger single-PR workflow example with a colocated written spec in `pr-triage/README.md`
 - `shell.flow.ts`: one native runtime-owned shell action that returns structured JSON
 - `workdir.flow.ts`: native workspace prep followed by an ACP step that runs inside that isolated cwd
 - `two-turn.flow.ts`: two ACP prompts in the same implicit main session
@@ -17,6 +18,9 @@ acpx flow run examples/flows/echo.flow.ts \
   --input-json
 
 acpx flow run examples/flows/branch.flow.ts \
   --input-json '{"task":"FIX: add a regression test for the reconnect bug"}'
+
+acpx flow run examples/flows/pr-triage/pr-triage.flow.ts \
+  --input-json '{"repo":"openclaw/acpx","prNumber":150}'
 
 acpx flow run examples/flows/shell.flow.ts \
   --input-json '{"text":"hello from shell"}'
@@ -26,4 +30,5 @@ acpx flow run examples/flows/two-turn.flow.ts \
   --input-json '{"topic":"How should we validate a new ACP adapter?"}'
 ```
 
-These examples are generic. `acpx` does not ship workload-specific flows in core.
+These are examples only. They do not define `acpx` core product behavior.
+The PR-triage example can comment on or close real GitHub PRs if you run it against a live repository.
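The `branch.flow.ts` example above pairs an ACP classification step with routing that the runtime performs deterministically on structured output. As a minimal sketch of that idea in flow-module form (the `Edge` shape and `nextNode` helper here are illustrative stand-ins, not the actual `acpx` flow runtime API):

```typescript
// Object-shaped graph topology, in the spirit of branch.flow.ts: the
// topology is plain data, and only node-local logic lives in code.
type SwitchEdge = { on: string; cases: Record<string, string> };
type Edge = { from: string; to?: string; switch?: SwitchEdge };

const exampleFlow = {
  name: "branch-sketch",
  startAt: "classify",
  edges: [
    {
      from: "classify",
      // Route on a field of the step's parsed JSON output, never on prose.
      switch: { on: "$.route", cases: { continue: "continue", checkpoint: "checkpoint" } },
    },
  ] as Edge[],
};

// Deterministic router owned by the runtime, not the worker: given a node's
// structured output, resolve the next node from the edge table.
function nextNode(
  edges: Edge[],
  from: string,
  output: Record<string, unknown>,
): string | undefined {
  const edge = edges.find((candidate) => candidate.from === from);
  if (!edge) return undefined;
  if (edge.switch) {
    const field = edge.switch.on.replace(/^\$\./, ""); // "$.route" -> "route"
    return edge.switch.cases[String(output[field])];
  }
  return edge.to;
}
```

Given a parsed step output such as `{ route: "checkpoint" }`, `nextNode` resolves the next node purely from data, which is what keeps flows like these inspectable and replayable.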
diff --git a/examples/flows/pr-triage/README.md b/examples/flows/pr-triage/README.md
new file mode 100644
index 0000000..c54b38a
--- /dev/null
+++ b/examples/flows/pr-triage/README.md
@@ -0,0 +1,221 @@
+---
+description: Prompt for triaging PRs, issues, or issue descriptions by inferring the plain-language intent, judging whether the work actually solves the underlying problem, routing ambiguous cases to a human, and only landing changes after Codex review and CI are in an acceptable state
+---
+
+```mermaid
+flowchart TD
+    classDef hidden fill:none,stroke:none,color:none,stroke-width:0px;
+    A[Read item] --> B[Find intent]
+    B --> C{"Judge implementation<br/>or solution"}
+
+    C -->|"Bad, localized,<br/>or unclear"| D[Comment and close PR]
+    C -->|"Seems OK but needs a<br/>design decision/human call"| E[Comment and escalate to human]
+    C -->|Good enough| V{Bug or feature?}
+
+    V -->|Bug| P[If bug, reproduce it<br/>and test the fix]
+    V -->|Feature| T[If feature,<br/>test it directly]
+
+    P -->|Validated| F{Refactor?}
+    T -->|Validated| F
+    P -->|"Not validated"| X(( ))
+    T -->|"Not validated"| X
+    X --> E
+    X ~~~ F
+    class X hidden
+
+    F -->|Fundamental| E
+    F -->|Superficial| R[Do superficial refactor]
+    F -->|None| G[Trigger Codex review]
+    R --> G
+
+    G --> H{P0 or P1?}
+    H -->|No| J[Check CI]
+    H -->|Yes| I[Address review feedback]
+    I --> G
+    J --> K{CI failures?}
+    K -->|No| E
+    K -->|Yes| L[Fix CI failures]
+    L --> J
+```
+
+This prompt may process multiple items in one run. Use it for the triage lane, not the single-PR landing lane.
+
+1. **Process each item independently.** Take a list of items as input. Each item may be a PR, an issue, or a raw issue description. Process each item separately and do not let the framing of one item leak into another.
+
+2. **Figure out what the work is trying to do for a human.** For each item, first figure out the real intention behind it. Read the code, the diff, the issue text, the PR description, and any surrounding context needed to answer one question in plain language: what is this actually trying to do for a human? Write that intention like one human talking to another human. Do not hide behind technical jargon. Translate jargon into purpose. If the stated PR description sounds model-generated or overly technical, do not repeat it blindly; recover the plain-language goal underneath it.
+
+3. **Decide whether the implementation or solution actually solves the real problem.** Once you have the intention, judge the work against that intention. Do not stop at “does the code compile” or “does the diff match the ticket.” Ask whether the PR or proposed implementation is addressing the underlying problem in a real and durable way, or whether it is only treating a symptom locally. Be explicit about the difference between a fundamental fix and a shortcut, band-aid, or narrowly scoped patch that avoids the real issue.
+ - Treat an unclear PR the same as a bad or localized fix for closure purposes. If the PR is not even clear enough to evaluate confidently, it should be closed rather than routed to a human. + +4. **Close PRs that are wrong, too local, or too unclear.** If the item is a PR and your judgment is that the proposed solution is wrong-shaped for the problem, only treats a symptom, is just a localized fix that does not address the underlying issue, or the PR is not even clear enough to evaluate confidently, do not send it down the human-review lane by default. Instead, treat that as a rejection outcome for the PR: write a concise comment explaining the plain-language intention as best you can recover it, why the current implementation does not solve the right problem or is too unclear to keep moving, and what kind of reframing would be needed, then close the PR. Use the human-attention lane for cases that need a human product or architecture judgment before deciding whether the work should continue at all. + +5. **Classify how much refactoring is really needed.** As part of that judgment, explicitly decide whether the item needs a refactor, and if so what kind: + - no refactor needed: the current shape is acceptable for the intention + - superficial refactor: cleanup, reshaping, or local improvement is needed, but the work can still be completed autonomously without changing the core framing of the solution + - fundamental refactor: the current approach is wrong-shaped for the problem and needs a deeper restructuring, reframing, or architectural change in order to solve the intention properly + +6. **Choose between continue, close, or escalate.** Based on that judgment, decide whether the item is safe to keep moving autonomously, should be closed, or needs human attention before landing. 
Close a PR if any of the following are true: + - the intention is unclear, conflicting, or poorly framed + - the implementation is not actually serving the intention + - the solution is too localized, too reactive, or too narrow for the problem it claims to solve + - the PR is a shortcut, band-aid, or symptom fix rather than a real solution + - the current implementation should be rejected rather than iterated on + +7. **Only escalate when a real human judgment call is needed.** Route the item to a human if any of the following are true: + - the right answer may require reframing the problem, changing the product behavior, or making an architectural call rather than just fixing code + - a fundamental refactor is needed to solve the problem properly + - a human must decide what the correct product or architecture direction should be before any implementation can be judged + +8. **Choose the right validation path before polishing the PR.** After deciding that the solution is good enough to continue, explicitly decide whether the PR is primarily a bug-fix or a feature/behavior change. That choice determines how the work should be validated before it proceeds to refactor, review, or CI. + +9. **If it is a bug, reproduce it and then test the fix.** For a bug-fix, regression, or other failure claim, identify the smallest targeted repro or test that captures the issue. If needed, temporarily ablate the fix or the changed test setup so you can demonstrate failure on the refreshed base or ablated state. That temporary ablation must stay local only: do not commit it, do not push it, and do not leave the PR branch in the broken state. Restore the real PR fix in the working tree before continuing, then rerun the same repro or targeted test to prove that the fix changes the outcome. When feasible, also run relevant integration or end-to-end tests near that behavior. + +10. 
**If it is a feature, test the changed behavior directly.** For a feature or behavior change, validate the changed behavior directly on the PR branch with the smallest targeted test or check that shows the feature works as intended. When feasible, also run relevant integration or end-to-end tests near that behavior. Do not force an artificial “reproduce a prior failure” step for work that is not actually a bug fix. + +11. **Escalate if the claimed work cannot actually be validated.** If a bug cannot be reproduced, the fix does not change the outcome, or a feature change cannot be validated confidently with targeted testing, stop and escalate to a human rather than continuing into refactor, review, or CI as if the work were proven. + +12. **Do superficial refactors before continuing into review.** If the item only needs a superficial refactor, that does not require human attention by itself. Superficial refactors should be done on the autonomous lane before the item proceeds into Codex review. Only fundamental refactors trigger the human-attention path. + +13. **Keep moving when the work is good enough to continue.** If the item does not need human attention and is not a close outcome, continue autonomously. It is acceptable to proceed with automated review, local validation, CI/CD checking, and follow-up fixes as long as the intention is clear, the work has been validated on the correct bug or feature path, and the item does not require a human product or architecture judgment. If the implementation looks acceptable enough to continue, keep going rather than blocking on perfectionism. + +14. **Do not spend review effort on work that should stop early.** Only continue into Codex review if the item is safe to continue autonomously. If the item needs human attention or should be closed, stop the autonomous flow there. Do not spend time running Codex review, fixing code, or chasing CI on work that is not ready to merge anyway. 
Instead, write up the intention, the reason human attention is required or the reason the PR should be closed, whether a fundamental refactor is needed, whether the bug or feature claim could actually be validated, and the exact decision or reframing needed from a human.
+
+15. **Trigger Codex review in a fixed order on every PR that stays on the autonomous lane.** For items that are safe to continue autonomously, every PR must go through Codex review in this order. First, check whether the PR already has Codex review comments on GitHub for the current PR head and address the valid unresolved ones. Do not skip existing Codex feedback just because you plan to run another review. When reading GitHub review state, do not rely on `gh pr view --comments`; use stable REST-backed `gh api` calls such as `repos/{owner}/{repo}/pulls/{pr}/reviews`, `repos/{owner}/{repo}/pulls/{pr}/comments`, and `repos/{owner}/{repo}/issues/{pr}/comments` instead. After that, refresh the PR base branch from origin, determine the correct updated base ref or merge base from the checked-out repo, and run a fresh local `codex review --base <fresh-base-ref>` against that fresh base ref. Do not review against a stale local base branch, against the whole repository state, or against a stale local diff. If that local Codex review cannot be completed reliably, including timing out, stop pretending review is clear and escalate to a human. Treat P0 and P1 findings from either source as blockers that must be resolved before the PR can move forward. P2 and lower findings are not blockers by default; handle them with judgment and do not keep looping just to polish them unless they materially change the intention-first assessment.
+
+16. **Address blocking review feedback and rerun local review if needed.** After Codex review, make sure the review feedback is actually closed out.
That means:
+
+- valid Codex findings from GitHub reviews are fixed or otherwise resolved with a clear reason
+- the PR base branch has been refreshed from origin before local review
+- a fresh local `codex review --base <fresh-base-ref>` has been run against the current branch state relative to that fresh base ref
+- if you changed code while addressing review feedback, rerun the targeted validation from the earlier bug-or-feature validation step and any nearby integration or end-to-end tests when feasible before continuing
+- irrelevant findings are explicitly dismissed or explained, not silently ignored
+- stale comments from older commits are recognized as stale and not mistaken for current blockers
+- if P0 or P1 findings remain, address that review feedback and run local review again until the blocking findings are cleared
+
+17. **Check whether CI failures really belong to this PR, and approve workflow runs when that is the blocker.** Then evaluate CI/CD for items still on the autonomous lane. If CI is green, that part is satisfied. If CI is not fully green, determine whether the failures are actually caused by the PR. If a workflow run is blocked only because it needs maintainer approval to run, approve or enable that workflow run first if you have permission, for example with the workflow-run approval endpoint `POST /repos/{owner}/{repo}/actions/runs/{run_id}/approve`, then re-check CI before escalating. If failures are unrelated, pre-existing, or clearly due to external churn outside the diff, document that plainly and do not treat them as blockers. If the failures are plausibly related to the PR, they must be fixed before landing. After fixing related CI failures or approving the blocked workflow run, check CI again until the related failures are gone or clearly shown to be unrelated. If the only remaining blocker is a workflow approval gate that you cannot clear yourself, escalate to a human and say that explicitly.
+ +- if you changed code while fixing CI-related problems, rerun the targeted validation from the earlier bug-or-feature validation step and any nearby integration or end-to-end tests when feasible before checking CI again + +18. **Only land PRs that clear every gate.** A PR is ready to land only if all of the following are true: + +- the plain-language intention is clear +- the implementation serves that intention in a real way rather than merely covering symptoms +- the work has been validated on the correct path: a bug was reproduced and shown fixed, or a feature was tested directly +- any needed refactor is either unnecessary or superficial rather than fundamental +- there is no remaining need for human framing or architectural judgment +- Codex review has happened, existing GitHub Codex feedback has been handled, and there are no unresolved P0 or P1 findings +- CI/CD is green, or any remaining failures are clearly unrelated to the PR + +19. **Apply the same judgment to issues, but only close real PRs.** If the item is an issue or issue description rather than an existing PR, do the same intention-first analysis and decide whether it is ready for autonomous implementation or whether it needs human framing first. If the issue is already framed well enough to proceed, say so. If it is not, explain exactly what judgment call, fundamental refactor, or reframing a human still needs to provide. The explicit close action applies only to real PRs. + +20. 
**Write down one concise decision record for each item.** For every item, produce a concise but complete result with these sections: + +- Plain-language intention +- Is the intention valid +- Does the current PR or proposed solution actually solve the right problem +- Was the work validated on the correct path: bug reproduced and fixed, or feature tested directly +- Should this PR be closed immediately +- Refactor needed: none, superficial, or fundamental +- Human attention required, safe to continue autonomously, or close now +- Codex review status and any blocking findings +- CI/CD status and whether any failures are unrelated +- Final recommendation: close PR, land, continue autonomously, or escalate to a human + +21. **Post the result back, and close PRs when the outcome says to close them.** If the item is a real PR or issue, post the final result back onto that item as a comment. The comment should be written for a human reviewer or author, in plain language, and should include the intention, the judgment about whether the work really solves the right problem, whether the work was actually validated on the correct bug or feature path, whether a refactor is needed and what kind, whether the PR should be closed, any blocking Codex review or CI concerns, and the final recommendation. If the item needed human attention, the comment should clearly say that the autonomous review-and-land path was intentionally stopped early and that a fundamental refactor, failed validation step, or human reframing is still needed. All human-escalation outcomes should use the same basic note structure; do not invent separate note formats for different escalation branches. Instead, reuse one shared human note and make the reason explicit, such as `design decision/human call`, `validation not established`, or `ready for human landing decision`. 
If the item is a PR and the conclusion is that the current implementation is unclear, a bad fix, or merely a localized fix, close the PR after posting the comment. If the input item is only a raw issue description with no real item to comment on, skip the posting step and state that there was no concrete item to comment on. + +### Timeout assumptions in the executable flow + +These are the current operational timeout assumptions in the single-file executable workflow, and the markdown should stay in sync with the TypeScript file: + +- `prepare_workspace`: 20 minutes +- `reproduce_bug_and_test_fix`: 30 minutes +- `test_feature_directly`: 25 minutes +- `do_superficial_refactor`: 25 minutes +- `collect_review_state`: 60 minutes +- nested local `codex review` inside `collect_review_state`: 30 minutes +- `review_loop`: 30 minutes +- `collect_ci_state`: 15 minutes +- `fix_ci_failures`: 30 minutes +- `post_close_pr`: 15 minutes +- `post_escalation_comment`: 10 minutes +- `ensureProjectDependencies` (`pnpm install --frozen-lockfile` when needed): 20 minutes +- each targeted validation command in the bug/feature validation steps: 20 minutes + +ACP steps without an explicit timeout in the workflow currently rely on the `acpx` flow runtime default. At the moment that default is 15 minutes, so if a step such as `extract_intent`, `judge_solution`, `bug_or_feature`, `judge_refactor`, `comment_and_close_pr`, or `comment_and_escalate_to_human` should have a different budget, that must be stated explicitly in the workflow file. + +22. **Use a short, scannable comment template with explicit status signals.** Use an actual comment template when posting the result. Keep it short, plain, and scannable. Use helpful status emojis so a human can quickly tell whether this is safe to keep moving, needs intervention, or should be closed. When the outcome is `escalate to human`, always use the same note format and include a field or line that clearly states why human input is needed. 
This template is mandatory for posted comments. Do not invent a different layout.
+
+Emoji guide:
+
+- `✅` valid / good / safe
+- `⚠️` needs human attention
+- `🛑` close the PR
+- `🔧` superficial refactor
+- `🧱` fundamental refactor
+- `🟢` safe to continue autonomously
+- `🔴` blocked from autonomous landing
+- `➖` not applicable
+- `🧭` validation status
+- `🧪` Codex review or test status
+- `🚦` CI/CD status
+- `🏁` final recommendation
+
+Default comment template:
+
+```md
+## Triage result
+
+### Quick read
+
+- Intent valid: ✅ Yes / ❌ No
+- Solves the right problem: ✅ Yes / ⚠️ Partly / ❌ No / 🛑 Localized, bad, or unclear fix
+- Validation: ✅ Bug reproduced and fixed / ✅ Feature tested directly / ⚠️ Not validated / ➖ Not applicable / ⏸️ Not run
+- Close PR: 🛑 Yes / ✅ No
+- Refactor needed: ✅ None / 🔧 Superficial / 🧱 Fundamental
+- Human attention: ⚠️ Required / 🟢 Not required / 🛑 Not applicable because PR should close
+- Recommendation: 🏁 <final recommendation>
+
+### Intent
+
+> <plain-language intention in one or two sentences>
+
+### Why
+
+<2-5 plain-language bullets explaining the judgment>
+
+### Codex review
+
+- Status: 🧪 Not run / 🧪 Already present / ✅ Clear / 🔴 Blocking findings remain
+- Notes: <short notes>
+
+### CI/CD
+
+- Status: 🚦 Green / 🚦 Mixed but unrelated / 🔴 Related failures remain / ⏸️ Approval needed / ⏸️ Not checked
+- Notes: <short notes>
+
+### Recommendation
+
+🏁 <final recommendation>
+```
+
+If the item needs human attention, the template should make that obvious near the top:
+
+- `Human attention: ⚠️ Required`
+- `Human decision needed: <the exact decision or reframing needed>`
+- `Refactor needed: 🧱 Fundamental` if applicable
+- `Recommendation: 🏁 escalate to a human`
+
+Use this same human-escalation note shape for every human branch. Only change the explicit human decision needed line and the supporting explanation.
+ +If the item is a PR and the solution is bad, unclear, or merely localized, the template should make that obvious near the top: + +- `Solves the right problem: 🛑 Localized, bad, or unclear fix` +- `Close PR: 🛑 Yes` +- `Recommendation: 🏁 close PR` + +If the item is safe to keep moving: + +- `Human attention: 🟢 Not required` +- `Recommendation: 🏁 continue autonomously` or `🏁 land` + +23. **Be rigorous about protecting the project from wrong-shaped work.** Be extremely diligent. The point of this prompt is not just to do process. The point is to protect against technically polished PRs that sound right but are solving the wrong thing, solving too little, or avoiding the real problem behind the work. diff --git a/examples/flows/pr-triage/pr-triage.flow.ts b/examples/flows/pr-triage/pr-triage.flow.ts new file mode 100644 index 0000000..f577356 --- /dev/null +++ b/examples/flows/pr-triage/pr-triage.flow.ts @@ -0,0 +1,1247 @@ +import { spawn } from "node:child_process"; +import fs from "node:fs/promises"; +import os from "node:os"; +import path from "node:path"; + +const FLOW_DIR = ".acpx-flow"; +const MAIN_SESSION = { + handle: "main", +}; + +const flow = { + name: "pr-triage", + startAt: "load_pr", + nodes: { + load_pr: { + kind: "compute", + run: ({ input }) => loadPullRequestInput(input), + }, + + prepare_workspace: { + kind: "action", + timeoutMs: 20 * 60_000, + statusDetail: "Create isolated PR workspace and fetch GitHub context", + run: async ({ outputs }) => await prepareWorkspace(loadPrOutput(outputs)), + }, + + extract_intent: { + kind: "acp", + session: MAIN_SESSION, + cwd: ({ outputs }) => prepared(outputs).workdir, + async prompt({ outputs }) { + return promptExtractIntent(prepared(outputs)); + }, + parse: (text) => extractJson(text), + }, + + judge_solution: { + kind: "acp", + session: MAIN_SESSION, + cwd: ({ outputs }) => prepared(outputs).workdir, + async prompt({ outputs }) { + return promptJudgeSolution(prepared(outputs)); + }, + parse: (text) => 
extractJson(text), + }, + + bug_or_feature: { + kind: "acp", + session: MAIN_SESSION, + cwd: ({ outputs }) => prepared(outputs).workdir, + async prompt({ outputs }) { + return promptBugOrFeature(prepared(outputs)); + }, + parse: (text) => extractJson(text), + }, + + reproduce_bug_and_test_fix: { + kind: "action", + timeoutMs: 30 * 60_000, + statusDetail: "Reproduce the bug and validate the fix in the isolated workspace", + run: async ({ outputs }) => + await reproduceBugAndTestFix(prepared(outputs), outputs.bug_or_feature), + }, + + test_feature_directly: { + kind: "action", + timeoutMs: 25 * 60_000, + statusDetail: "Run direct feature validation in the isolated workspace", + run: async ({ outputs }) => + await testFeatureDirectly(prepared(outputs), outputs.bug_or_feature), + }, + + judge_refactor: { + kind: "acp", + session: MAIN_SESSION, + cwd: ({ outputs }) => prepared(outputs).workdir, + async prompt({ outputs }) { + return promptJudgeRefactor(prepared(outputs), outputs); + }, + parse: (text) => extractJson(text), + }, + + do_superficial_refactor: { + kind: "acp", + session: MAIN_SESSION, + cwd: ({ outputs }) => prepared(outputs).workdir, + timeoutMs: 25 * 60_000, + async prompt({ outputs }) { + return promptDoSuperficialRefactor(prepared(outputs), outputs); + }, + parse: (text) => extractJson(text), + }, + + collect_review_state: { + kind: "action", + timeoutMs: 60 * 60_000, + statusDetail: "Collect GitHub review state and run local Codex review", + run: async ({ outputs }) => await collectReviewState(prepared(outputs)), + }, + + review_loop: { + kind: "acp", + session: MAIN_SESSION, + cwd: ({ outputs }) => prepared(outputs).workdir, + timeoutMs: 30 * 60_000, + async prompt({ outputs }) { + return promptReviewLoop(prepared(outputs), outputs); + }, + parse: (text) => extractJson(text), + }, + + collect_ci_state: { + kind: "action", + timeoutMs: 15 * 60_000, + statusDetail: "Collect CI state and approve workflow runs when possible", + run: async ({ outputs }) => 
await collectCiState(prepared(outputs)), + }, + + fix_ci_failures: { + kind: "acp", + session: MAIN_SESSION, + cwd: ({ outputs }) => prepared(outputs).workdir, + timeoutMs: 30 * 60_000, + async prompt({ outputs }) { + return promptFixCiFailures(prepared(outputs), outputs); + }, + parse: (text) => extractJson(text), + }, + + comment_and_close_pr: { + kind: "acp", + session: MAIN_SESSION, + cwd: ({ outputs }) => prepared(outputs).workdir, + async prompt({ outputs }) { + return promptCommentAndClose(prepared(outputs), outputs); + }, + parse: (text) => extractJson(text), + }, + + post_close_pr: { + kind: "action", + timeoutMs: 15 * 60_000, + statusDetail: "Post close comment and close the PR", + run: async ({ outputs }) => + await postClosePr(prepared(outputs), outputs.comment_and_close_pr), + }, + + comment_and_escalate_to_human: { + kind: "acp", + session: MAIN_SESSION, + cwd: ({ outputs }) => prepared(outputs).workdir, + async prompt({ outputs }) { + return promptCommentAndEscalate(prepared(outputs), outputs); + }, + parse: (text) => extractJson(text), + }, + + post_escalation_comment: { + kind: "action", + timeoutMs: 10 * 60_000, + statusDetail: "Post human handoff comment", + run: async ({ outputs }) => + await postEscalationComment(prepared(outputs), outputs.comment_and_escalate_to_human), + }, + + finalize: { + kind: "compute", + run: ({ outputs, state }) => ({ + final: + outputs.post_close_pr ?? + outputs.post_escalation_comment ?? + outputs.comment_and_close_pr ?? + outputs.comment_and_escalate_to_human ?? + null, + intent: outputs.extract_intent ?? null, + solution: outputs.judge_solution ?? null, + validationPath: outputs.bug_or_feature ?? null, + validation: outputs.reproduce_bug_and_test_fix ?? outputs.test_feature_directly ?? null, + refactor: outputs.judge_refactor ?? null, + review: outputs.review_loop ?? null, + ci: outputs.fix_ci_failures ?? null, + workspace: outputs.prepare_workspace ?? 
null, + sessionBindings: state.sessionBindings, + }), + }, + }, + edges: [ + { from: "load_pr", to: "prepare_workspace" }, + { from: "prepare_workspace", to: "extract_intent" }, + { from: "extract_intent", to: "judge_solution" }, + { + from: "judge_solution", + switch: { + on: "$.route", + cases: { + close_pr: "comment_and_close_pr", + comment_and_escalate_to_human: "comment_and_escalate_to_human", + bug_or_feature: "bug_or_feature", + }, + }, + }, + { + from: "bug_or_feature", + switch: { + on: "$.route", + cases: { + reproduce_bug_and_test_fix: "reproduce_bug_and_test_fix", + test_feature_directly: "test_feature_directly", + comment_and_escalate_to_human: "comment_and_escalate_to_human", + }, + }, + }, + { + from: "reproduce_bug_and_test_fix", + switch: { + on: "$.route", + cases: { + judge_refactor: "judge_refactor", + comment_and_escalate_to_human: "comment_and_escalate_to_human", + }, + }, + }, + { + from: "test_feature_directly", + switch: { + on: "$.route", + cases: { + judge_refactor: "judge_refactor", + comment_and_escalate_to_human: "comment_and_escalate_to_human", + }, + }, + }, + { + from: "judge_refactor", + switch: { + on: "$.route", + cases: { + collect_review_state: "collect_review_state", + do_superficial_refactor: "do_superficial_refactor", + comment_and_escalate_to_human: "comment_and_escalate_to_human", + }, + }, + }, + { from: "do_superficial_refactor", to: "collect_review_state" }, + { from: "collect_review_state", to: "review_loop" }, + { + from: "review_loop", + switch: { + on: "$.route", + cases: { + collect_review_state: "collect_review_state", + collect_ci_state: "collect_ci_state", + comment_and_escalate_to_human: "comment_and_escalate_to_human", + }, + }, + }, + { from: "collect_ci_state", to: "fix_ci_failures" }, + { + from: "fix_ci_failures", + switch: { + on: "$.route", + cases: { + collect_ci_state: "collect_ci_state", + comment_and_escalate_to_human: "comment_and_escalate_to_human", + }, + }, + }, + { from: "comment_and_close_pr", 
to: "post_close_pr" }, + { from: "post_close_pr", to: "finalize" }, + { from: "comment_and_escalate_to_human", to: "post_escalation_comment" }, + { from: "post_escalation_comment", to: "finalize" }, + ], +}; + +export default flow; + +async function prepareWorkspace(pr) { + const prData = await ghApiJson(`repos/${pr.repo}/pulls/${pr.prNumber}`); + const files = await ghApiJson(`repos/${pr.repo}/pulls/${pr.prNumber}/files?per_page=100`); + const linkedIssueNumber = extractLinkedIssueNumber(String(prData.body ?? "")); + const issue = + linkedIssueNumber !== null + ? await ghApiJson(`repos/${pr.repo}/issues/${linkedIssueNumber}`) + : null; + + const workdir = await fs.mkdtemp(path.join(os.tmpdir(), `acpx-pr${pr.prNumber}-`)); + const baseCloneUrl = String(prData.base.repo.clone_url); + const headCloneUrl = String(prData.head.repo.clone_url); + const baseRef = String(prData.base.ref); + const headRef = String(prData.head.ref); + const headSha = String(prData.head.sha); + const localBranch = `pr-${pr.prNumber}-head`; + const pushRemote = headCloneUrl === baseCloneUrl ? 
"origin" : "head"; + + await runCommand("git", ["clone", "--origin", "origin", baseCloneUrl, workdir]); + + if (pushRemote === "head") { + await runCommand("git", ["-C", workdir, "remote", "add", "head", headCloneUrl]); + } + + await runCommand("git", [ + "-C", + workdir, + "fetch", + "origin", + `refs/heads/${baseRef}:refs/remotes/origin/${baseRef}`, + ]); + await runCommand("git", [ + "-C", + workdir, + "fetch", + pushRemote, + `refs/heads/${headRef}:refs/remotes/${pushRemote}/${headRef}`, + ]); + await runCommand("git", ["-C", workdir, "checkout", "-B", localBranch, headSha]); + await runCommand("git", [ + "-C", + workdir, + "branch", + "--set-upstream-to", + `${pushRemote}/${headRef}`, + localBranch, + ]); + + const metaDir = path.join(workdir, FLOW_DIR); + await fs.mkdir(metaDir, { recursive: true }); + await writeJson(path.join(metaDir, "pr.json"), prData); + await writeJson(path.join(metaDir, "files.json"), files); + await writeJson(path.join(metaDir, "issue.json"), issue); + await writeJson(path.join(metaDir, "workspace.json"), { + repo: pr.repo, + prNumber: pr.prNumber, + prUrl: pr.prUrl, + workdir, + baseRef, + headRef, + headSha, + localBranch, + pushRemote, + pushRef: headRef, + isCrossRepository: Boolean(prData.head.repo.full_name !== prData.base.repo.full_name), + }); + + return { + ...pr, + title: String(prData.title ?? ""), + body: String(prData.body ?? ""), + baseRef, + headRef, + headSha, + localBranch, + pushRemote, + pushRef: headRef, + workdir, + flowDir: metaDir, + linkedIssueNumber, + changedFiles: Array.isArray(files) ? 
files : [], + isCrossRepository: Boolean(prData.head.repo.full_name !== prData.base.repo.full_name), + }; +} + +async function reproduceBugAndTestFix(pr, validationPath) { + if (validationPath?.kind !== "bug") { + throw new Error("Bug validation action requires bug validation path"); + } + + await ensureProjectDependencies(pr.workdir); + const testPlan = buildTargetedTestPlan(pr.changedFiles); + if (testPlan.commands.length === 0) { + return { + validation_status: "fix_not_proven", + route: "comment_and_escalate_to_human", + summary: "No targeted test command could be derived from the PR changes.", + repro_steps: [], + targeted_tests: [], + integration_tests: [], + e2e_tests: [], + restored_branch_state: true, + }; + } + + const codeFiles = pr.changedFiles + .map((file) => String(file.filename ?? "")) + .filter((filename) => filename && !isTestFile(filename)); + if (codeFiles.length === 0) { + return { + validation_status: "fix_not_proven", + route: "comment_and_escalate_to_human", + summary: + "Could not isolate a non-test code change to ablate while keeping the new validation intact.", + repro_steps: [], + targeted_tests: testPlan.commands, + integration_tests: [], + e2e_tests: [], + restored_branch_state: true, + }; + } + + const baseRef = `origin/${pr.baseRef}`; + await runCommand("git", ["-C", pr.workdir, "fetch", "origin", pr.baseRef]); + const mergeBase = ( + await runCommand("git", ["-C", pr.workdir, "merge-base", "HEAD", baseRef]) + ).stdout.trim(); + const patch = ( + await runCommand("git", [ + "-C", + pr.workdir, + "diff", + "--binary", + `${mergeBase}..HEAD`, + "--", + ...codeFiles, + ]) + ).stdout; + if (!patch.trim()) { + return { + validation_status: "fix_not_proven", + route: "comment_and_escalate_to_human", + summary: "Could not derive an ablation patch for the non-test code changes in this PR.", + repro_steps: [], + targeted_tests: testPlan.commands, + integration_tests: [], + e2e_tests: [], + restored_branch_state: true, + }; + } + + const 
initial = await runValidationPlan(pr.workdir, testPlan.commands); + if (!initial.ok) { + return { + validation_status: "fix_not_proven", + route: "comment_and_escalate_to_human", + summary: + "The targeted validation did not pass on the PR head before ablation, so the fix could not be proven.", + repro_steps: [], + targeted_tests: testPlan.commands, + integration_tests: [], + e2e_tests: [], + restored_branch_state: true, + }; + } + + const patchPath = path.join(pr.flowDir, "ablation.patch"); + await fs.writeFile(patchPath, patch, "utf8"); + await runCommand("git", ["-C", pr.workdir, "apply", "-R", patchPath]); + const ablated = await runValidationPlan(pr.workdir, testPlan.commands, { + allowFailure: true, + }); + await runCommand("git", ["-C", pr.workdir, "reset", "--hard", "HEAD"]); + + const restored = await runValidationPlan(pr.workdir, testPlan.commands); + const reproduced = !ablated.ok; + + return { + validation_status: reproduced && restored.ok ? "reproduced_and_fixed" : "fix_not_proven", + route: reproduced && restored.ok ? "judge_refactor" : "comment_and_escalate_to_human", + summary: + reproduced && restored.ok + ? "The targeted regression test passed on the PR head, failed after local-only ablation of the code change, and passed again after restoring the PR branch state." 
+ : "The bug could not be shown to fail on the local-only ablated state and pass again on the restored PR head.", + repro_steps: [ + `Ran targeted validation on PR head in ${path.basename(pr.workdir)}`, + "Reverse-applied the non-test code patch locally without committing or pushing it", + "Reran the same targeted validation on the ablated state", + "Restored the tracked PR branch state with git reset --hard HEAD", + "Reran the same targeted validation on the restored PR head", + ], + targeted_tests: testPlan.commands, + integration_tests: [], + e2e_tests: [], + restored_branch_state: true, + }; +} + +async function testFeatureDirectly(pr, validationPath) { + if (validationPath?.kind !== "feature") { + throw new Error("Feature validation action requires feature validation path"); + } + + await ensureProjectDependencies(pr.workdir); + const testPlan = buildTargetedTestPlan(pr.changedFiles); + if (testPlan.commands.length === 0) { + return { + validation_status: "feature_not_validated", + route: "comment_and_escalate_to_human", + summary: "No targeted test command could be derived for direct feature validation.", + targeted_tests: [], + integration_tests: [], + e2e_tests: [], + }; + } + + const result = await runValidationPlan(pr.workdir, testPlan.commands, { + allowFailure: true, + }); + return { + validation_status: result.ok ? "feature_validated" : "feature_not_validated", + route: result.ok ? "judge_refactor" : "comment_and_escalate_to_human", + summary: result.ok + ? "The targeted feature validation passed on the PR branch." 
+ : "The targeted feature validation did not complete cleanly on the PR branch.", + targeted_tests: testPlan.commands, + integration_tests: [], + e2e_tests: [], + }; +} + +async function collectReviewState(pr) { + const reviews = await ghApiJson(`repos/${pr.repo}/pulls/${pr.prNumber}/reviews?per_page=100`); + const reviewComments = await ghApiJson( + `repos/${pr.repo}/pulls/${pr.prNumber}/comments?per_page=100`, + ); + const issueComments = await ghApiJson( + `repos/${pr.repo}/issues/${pr.prNumber}/comments?per_page=100`, + ); + + await runCommand("git", ["-C", pr.workdir, "fetch", "origin", pr.baseRef]); + const baseRef = `origin/${pr.baseRef}`; + const mergeBase = ( + await runCommand("git", ["-C", pr.workdir, "merge-base", "HEAD", baseRef]) + ).stdout.trim(); + + const localReviewCommand = ["review", "--base", baseRef]; + const localReviewRun = await runCommand("codex", localReviewCommand, { + cwd: pr.workdir, + allowFailure: true, + timeoutMs: 30 * 60_000, + }); + const localReviewParsed = tryExtractJson(localReviewRun.stdout); + + const reviewState = { + baseRef, + mergeBase, + githubReviews: Array.isArray(reviews) ? reviews.map(normalizeGitHubReview) : [], + githubReviewComments: Array.isArray(reviewComments) + ? reviewComments.map(normalizeGitHubReviewComment) + : [], + githubIssueComments: Array.isArray(issueComments) + ? issueComments.map(normalizeGitHubIssueComment) + : [], + localCodexReview: localReviewParsed, + localCodexReviewRaw: trimTextTail(localReviewRun.stdout, 16_000), + localCodexReviewError: trimTextTail(localReviewRun.stderr, 8_000), + localCodexReviewExitCode: localReviewRun.exitCode, + localCodexReviewTimedOut: localReviewRun.timedOut, + }; + await writeJson(path.join(pr.flowDir, "review-state.json"), reviewState); + + return { + review_state_path: path.join(pr.flowDir, "review-state.json"), + review_status: localReviewParsed?.review_status ?? 
null,
+    local_codex_review_ran: true,
+    local_codex_review_exit_code: localReviewRun.exitCode,
+  };
+}
+
+async function collectCiState(pr) {
+  const prView = await ghPrView(pr.repo, pr.prNumber, [
+    "statusCheckRollup",
+    "commits",
+    "isCrossRepository",
+  ]);
+  // `gh pr view --json commits` lists commits oldest-first, so the head commit is the last entry.
+  const headSha = String(prView?.commits?.at(-1)?.oid ?? pr.headSha) || pr.headSha;
+  const workflowRuns = await ghApiJson(
+    `repos/${pr.repo}/actions/runs?head_sha=${encodeURIComponent(headSha)}&per_page=20`,
+  );
+
+  let workflowApprovalAttempted = false;
+  let workflowApproved = false;
+
+  const runs = Array.isArray(workflowRuns?.workflow_runs) ? workflowRuns.workflow_runs : [];
+  for (const run of runs) {
+    if (String(run.status ?? "") === "action_required") {
+      workflowApprovalAttempted = true;
+      const approval = await runCommand(
+        "gh",
+        ["api", "-X", "POST", `repos/${pr.repo}/actions/runs/${run.id}/approve`],
+        { allowFailure: true },
+      );
+      if (approval.exitCode === 0) {
+        workflowApproved = true;
+      }
+    }
+  }
+
+  const ciState = {
+    statusCheckRollup: Array.isArray(prView?.statusCheckRollup) ? prView.statusCheckRollup : [],
+    workflowRuns: runs,
+    workflowApprovalAttempted,
+    workflowApproved,
+  };
+  await writeJson(path.join(pr.flowDir, "ci-state.json"), ciState);
+
+  return {
+    ci_state_path: path.join(pr.flowDir, "ci-state.json"),
+    workflow_approval_attempted: workflowApprovalAttempted,
+    workflow_approved: workflowApproved,
+  };
+}
+
+async function postClosePr(pr, commentStep) {
+  const comment = String(commentStep?.comment ??
"").trim(); + if (!comment) { + throw new Error("Close-path comment step did not return a comment body"); + } + + const commentFile = path.join(pr.flowDir, "close-comment.md"); + await fs.writeFile(commentFile, comment, "utf8"); + await runCommand("gh", [ + "pr", + "comment", + String(pr.prNumber), + "--repo", + pr.repo, + "--body-file", + commentFile, + ]); + await runCommand("gh", [ + "pr", + "close", + String(pr.prNumber), + "--repo", + pr.repo, + "--comment", + "Closed by automated triage.", + ]); + return { + route: "close_pr", + summary: "Posted the close-path comment and closed the PR.", + comment_posted: true, + pr_closed: true, + }; +} + +async function postEscalationComment(pr, commentStep) { + const comment = String(commentStep?.comment ?? "").trim(); + if (!comment) { + throw new Error("Escalation comment step did not return a comment body"); + } + + const commentFile = path.join(pr.flowDir, "escalation-comment.md"); + await fs.writeFile(commentFile, comment, "utf8"); + await runCommand("gh", [ + "pr", + "comment", + String(pr.prNumber), + "--repo", + pr.repo, + "--body-file", + commentFile, + ]); + return { + route: "escalate_to_human", + summary: "Posted the human handoff comment.", + comment_posted: true, + }; +} + +function promptExtractIntent(pr) { + return [ + "You are processing one pull request at a time in an isolated workspace already prepared by the flow runtime.", + `Target PR: ${prRef(pr)}`, + `Working directory: ${pr.workdir}`, + `Read local context from ${FLOW_DIR}/pr.json, ${FLOW_DIR}/issue.json, ${FLOW_DIR}/files.json, and ${FLOW_DIR}/workspace.json.`, + "Inspect the checked-out repo and current diff yourself when needed. Do not ask the runtime to fetch more context.", + "This is a read-only judgment step. 
Do not run installs, tests, CI checks, Codex review, or GitHub API commands here.", + "Extract the plain-language human intent and the underlying problem.", + "Return exactly one JSON object with this shape and nothing else:", + "{", + ' "intent": "plain-language human goal",', + ' "problem": "short description of the underlying issue",', + ' "confidence": 0.0,', + ' "reason": "short explanation"', + "}", + ].join("\n"); +} + +function promptJudgeSolution(pr) { + return [ + "You are still in the same PR session inside the isolated workspace.", + `Target PR: ${prRef(pr)}`, + `Use the checked-out repo and the local context files under ${FLOW_DIR}/.`, + "This is a read-only judgment step. Do not run installs, tests, CI checks, Codex review, or GitHub API commands here.", + "The validation and review mechanics happen in later flow steps; do not start them now.", + "Judge whether this PR is a good solution to the underlying problem.", + "Use these verdicts:", + '- "good_enough" if the solution is right-shaped and can continue.', + '- "localized_fix" if it only treats a symptom or is too local for the real problem.', + '- "bad_fix" if it is solving the wrong problem or is the wrong approach.', + '- "unclear" if the PR is too unclear to evaluate confidently.', + '- "needs_human_call" if it seems plausible but needs a design decision or human call before continuing.', + "Route `close_pr` for localized_fix, bad_fix, or unclear.", + "Route `comment_and_escalate_to_human` for needs_human_call.", + "Route `bug_or_feature` for good_enough.", + "Return exactly one JSON object and nothing else:", + "{", + ' "verdict": "good_enough" | "localized_fix" | "bad_fix" | "unclear" | "needs_human_call",', + ' "route": "close_pr" | "comment_and_escalate_to_human" | "bug_or_feature",', + ' "reason": "short explanation",', + ' "evidence": ["brief evidence item"]', + "}", + ].join("\n"); +} + +function promptBugOrFeature(pr) { + return [ + "You are still in the same PR session inside the 
isolated workspace.", + `Target PR: ${prRef(pr)}`, + `Use the checked-out repo plus ${FLOW_DIR}/pr.json and ${FLOW_DIR}/issue.json.`, + "This is a read-only classification step. Do not run installs, tests, CI checks, Codex review, or GitHub API commands here.", + "Decide which validation path this PR should take before refactor or review.", + "Use `bug` if this PR primarily claims to fix a bug, regression, broken behavior, or other issue that should first be reproduced and then proven fixed.", + "Use `feature` if this PR primarily adds or changes behavior that should be validated directly without first reproducing a prior failure.", + "If you cannot classify it confidently, route to `comment_and_escalate_to_human`.", + "Return exactly one JSON object and nothing else:", + "{", + ' "kind": "bug" | "feature" | "unclear",', + ' "route": "reproduce_bug_and_test_fix" | "test_feature_directly" | "comment_and_escalate_to_human",', + ' "reason": "short explanation"', + "}", + ].join("\n"); +} + +function promptJudgeRefactor(pr, outputs) { + const validation = outputs.reproduce_bug_and_test_fix ?? outputs.test_feature_directly ?? null; + return [ + "You are still in the same PR session inside the isolated workspace.", + `Target PR: ${prRef(pr)}`, + "The validation step has already been run by the flow runtime.", + `Validation summary: ${validation?.summary ?? "none"}`, + "This is a read-only judgment step. 
Do not rerun validation, CI checks, Codex review, or GitHub API commands here.", + "Judge whether this PR needs no refactor, a superficial refactor, or a fundamental refactor.", + "Route `collect_review_state` for none.", + "Route `do_superficial_refactor` for superficial.", + "Route `comment_and_escalate_to_human` for fundamental.", + "Return exactly one JSON object and nothing else:", + "{", + ' "refactor_needed": "none" | "superficial" | "fundamental",', + ' "route": "collect_review_state" | "do_superficial_refactor" | "comment_and_escalate_to_human",', + ' "reason": "short explanation"', + "}", + ].join("\n"); +} + +function promptDoSuperficialRefactor(pr) { + return [ + "You are still in the same PR session inside the isolated workspace.", + `Target PR: ${prRef(pr)}`, + `Use the local branch ${pr.localBranch}. If you need to push, use remote ${pr.pushRemote} branch ${pr.pushRef}.`, + "Perform only the superficial refactor directly in the checked-out repo.", + "Keep it minor and maintainability-focused. Do not reframe the problem or turn this into a fundamental rewrite.", + "If you change files, run focused checks when feasible, rerun the earlier targeted validation before returning, then commit and push the branch yourself.", + "Return exactly one JSON object with this shape and nothing else:", + "{", + ' "route": "collect_review_state",', + ' "summary": "short explanation",', + ' "files_touched": ["path/to/file"],', + ' "committed": true | false', + "}", + ].join("\n"); +} + +function promptReviewLoop(pr, outputs) { + const reviewStatePath = + outputs.collect_review_state?.review_state_path ?? `${FLOW_DIR}/review-state.json`; + const validation = outputs.reproduce_bug_and_test_fix ?? outputs.test_feature_directly ?? 
null; + return [ + "Stay on the autonomous review lane for this single PR.", + `Target PR: ${prRef(pr)}`, + `The review mechanics have already been collected by the flow runtime in ${reviewStatePath}.`, + "Read that local JSON file and the local repo state instead of rerunning `gh api` or `codex review` yourself.", + "Use only the normalized GitHub review data and the stored local Codex review result from that file as review evidence.", + "Top-level GitHub issue comments count only if they clearly contain Codex-authored review feedback for the current head. Ignore plain handoff or status comments.", + `Use the local branch ${pr.localBranch}. If you need to push, use remote ${pr.pushRemote} branch ${pr.pushRef}.`, + "First, inspect the existing GitHub Codex review data already collected for the current PR head.", + "Then inspect the fresh local Codex review result that was already run against the refreshed base ref.", + "If valid P0 or P1 issues remain from either source, fix them directly in the repo, run focused checks when feasible, commit and push the branch yourself, and then route back to `collect_review_state` so the flow runtime can rerun the review mechanics.", + "Do not keep looping just because only P2 or lower findings remain. Treat P2 and lower as non-blocking unless they materially change your judgment about whether the PR is safe to continue.", + `If you change code in this loop, rerun the earlier targeted validation before returning. Latest validation summary: ${validation?.summary ?? 
"none"}.`, + "If `localCodexReviewExitCode` is non-zero, if `localCodexReviewTimedOut` is true, or if the local Codex review could not be established reliably, route to `comment_and_escalate_to_human` instead of pretending review is clear.", + "If blocking review findings are cleared, route to `collect_ci_state`.", + "Return exactly one JSON object and nothing else:", + "{", + ' "route": "collect_review_state" | "collect_ci_state" | "comment_and_escalate_to_human",', + ' "review_status": "blocking_findings_remain" | "clear" | "could_not_establish",', + ' "summary": "short explanation",', + ' "github_codex_reviews_handled": true | false,', + ' "local_codex_review_ran": true | false,', + ' "blocking_findings": ["brief finding"],', + ' "committed": true | false', + "}", + ].join("\n"); +} + +function promptFixCiFailures(pr, outputs) { + const ciStatePath = outputs.collect_ci_state?.ci_state_path ?? `${FLOW_DIR}/ci-state.json`; + const validation = outputs.reproduce_bug_and_test_fix ?? outputs.test_feature_directly ?? null; + return [ + "Stay on the autonomous CI lane for this single PR.", + `Target PR: ${prRef(pr)}`, + `The CI mechanics have already been collected by the flow runtime in ${ciStatePath}.`, + "Read that local JSON file and the checked-out repo state instead of rerunning broad CI discovery yourself.", + `Use the local branch ${pr.localBranch}. If you need to push, use remote ${pr.pushRemote} branch ${pr.pushRef}.`, + "If the runtime already approved or attempted to approve workflow runs, treat that as the current ground truth and focus on the remaining CI result.", + "If related failures remain and you can fix them, fix them directly in the repo, run focused checks when feasible, rerun the earlier targeted validation, commit and push the branch yourself, and then route back to `collect_ci_state` so the flow runtime can re-check CI.", + `Latest validation summary: ${validation?.summary ?? 
"none"}.`, + "If CI is green or the remaining failures are clearly unrelated, route to `comment_and_escalate_to_human` for the final human handoff.", + "If the only remaining blocker is workflow approval that the runtime could not clear, route to `comment_and_escalate_to_human` and make that the explicit human action needed next.", + "Return exactly one JSON object and nothing else:", + "{", + ' "route": "collect_ci_state" | "comment_and_escalate_to_human",', + ' "ci_status": "related_failures_remain" | "green_or_unrelated" | "approval_blocked",', + ' "summary": "short explanation",', + ' "related_failures": ["brief failure"],', + ' "unrelated_failures": ["brief failure"],', + ' "workflow_approval_attempted": true | false,', + ' "workflow_approved": true | false,', + ' "committed": true | false', + "}", + ].join("\n"); +} + +function promptCommentAndClose(pr, outputs) { + const summary = finalCommentSummary(outputs); + return [ + "You are on the close path for this PR.", + `Target PR: ${prRef(pr)}`, + "Write the exact comment to post. 
Do not post it yourself; the flow runtime will do that after this step.", + "Use these exact headings in this order: `## Triage result`, `### Quick read`, `### Intent`, `### Why`, `### Codex review`, `### CI/CD`, `### Recommendation`.", + "For this close path, the comment must make these top-line outcomes explicit:", + "- `Solves the right problem: 🛑 Localized, bad, or unclear fix`", + "- `Close PR: 🛑 Yes`", + "- `Recommendation: 🏁 close PR`", + "Use the current run state below as the source of truth:", + JSON.stringify(summary, null, 2), + "Return exactly one JSON object and nothing else:", + "{", + ' "route": "close_pr",', + ' "summary": "short explanation",', + ' "comment_format_followed": true | false,', + ' "comment": "markdown comment to post"', + "}", + ].join("\n"); +} + +function promptCommentAndEscalate(pr, outputs) { + const summary = finalCommentSummary(outputs); + return [ + "You are on the human handoff path for this PR.", + `Target PR: ${prRef(pr)}`, + "Write the exact comment to post. 
Do not post it yourself; the flow runtime will do that after this step.", + "Use these exact headings in this order: `## Triage result`, `### Quick read`, `### Intent`, `### Why`, `### Codex review`, `### CI/CD`, `### Recommendation`.", + "For this human handoff path, the comment must make these top-line outcomes explicit:", + "- `Human attention: ⚠️ Required`", + "- `Recommendation: 🏁 escalate to a human`", + "- `Human decision needed: ` near the top of the comment", + "If the remaining blocker is workflow approval, say that plainly.", + "Use the current run state below as the source of truth:", + JSON.stringify(summary, null, 2), + "Return exactly one JSON object and nothing else:", + "{", + ' "route": "escalate_to_human",', + ' "summary": "short explanation",', + ' "human_decision_needed": "short explanation",', + ' "comment_format_followed": true | false,', + ' "comment": "markdown comment to post"', + "}", + ].join("\n"); +} + +function loadPullRequestInput(input) { + const repo = String(input?.repo ?? "").trim(); + const prNumber = Number(input?.prNumber); + + if (!repo) { + throw new Error('Flow input must include a non-empty "repo" string'); + } + if (!Number.isInteger(prNumber) || prNumber <= 0) { + throw new Error('Flow input must include a positive integer "prNumber"'); + } + + return { + repo, + prNumber, + prUrl: `https://github.com/${repo}/pull/${prNumber}`, + }; +} + +function loadPrOutput(outputs) { + return outputs.load_pr; +} + +function prepared(outputs) { + return outputs.prepare_workspace; +} + +function prRef(pr) { + return `${pr.repo}#${pr.prNumber} (${pr.prUrl})`; +} + +function finalCommentSummary(outputs) { + return { + intent: outputs.extract_intent ?? null, + solution: outputs.judge_solution ?? null, + validationPath: outputs.bug_or_feature ?? null, + validation: outputs.reproduce_bug_and_test_fix ?? outputs.test_feature_directly ?? null, + refactor: outputs.judge_refactor ?? null, + review: outputs.review_loop ?? 
null,
+    ci: outputs.fix_ci_failures ?? null,
+  };
+}
+
+async function ensureProjectDependencies(workdir) {
+  const packageJson = path.join(workdir, "package.json");
+  const lockfile = path.join(workdir, "pnpm-lock.yaml");
+  const nodeModules = path.join(workdir, "node_modules");
+
+  // Skip non-pnpm projects and workspaces that already have node_modules installed.
+  if (!(await exists(packageJson)) || !(await exists(lockfile)) || (await exists(nodeModules))) {
+    return;
+  }
+
+  await runCommand("pnpm", ["install", "--frozen-lockfile"], {
+    cwd: workdir,
+    timeoutMs: 20 * 60_000,
+  });
+}
+
+function buildTargetedTestPlan(changedFiles) {
+  const changedTestFiles = changedFiles
+    .map((file) => String(file.filename ?? ""))
+    .filter((filename) => /^test\/.+\.test\.ts$/.test(filename));
+
+  if (changedTestFiles.length === 0) {
+    return {
+      commands: [],
+    };
+  }
+
+  return {
+    commands: [
+      "pnpm run build:test",
+      `node --test ${changedTestFiles.map((file) => `dist-test/${file.replace(/\.ts$/, ".js")}`).join(" ")}`,
+    ],
+  };
+}
+
+async function runValidationPlan(workdir, commands, options = {}) {
+  const results = [];
+  for (const command of commands) {
+    const result = await runShellLine(command, {
+      cwd: workdir,
+      allowFailure: options.allowFailure === true,
+      timeoutMs: 20 * 60_000,
+    });
+    results.push(result);
+    // Stop at the first failing command. `allowFailure` only controls whether
+    // runShellLine throws on failure instead of returning the failed result.
+    if (!result.ok) {
+      return {
+        ok: false,
+        results,
+      };
+    }
+  }
+
+  return {
+    ok: true,
+    results,
+  };
+}
+
+async function ghApiJson(endpoint) {
+  const result = await runCommand("gh", ["api", endpoint]);
+  return JSON.parse(result.stdout);
+}
+
+async function ghPrView(repo, prNumber, fields) {
+  const result = await runCommand("gh", [
+    "pr",
+    "view",
+    String(prNumber),
+    "--repo",
+    repo,
+    "--json",
+    fields.join(","),
+  ]);
+  return JSON.parse(result.stdout);
+}
+
+function normalizeGitHubReview(review) {
+  return {
+    id: review?.id ?? null,
+    user: review?.user?.login ??
null, + state: review?.state ?? null, + body: limitText(typeof review?.body === "string" ? review.body : "", 1_500), + submitted_at: review?.submitted_at ?? null, + commit_id: review?.commit_id ?? null, + html_url: review?.html_url ?? null, + }; +} + +function normalizeGitHubReviewComment(comment) { + return { + id: comment?.id ?? null, + user: comment?.user?.login ?? null, + path: comment?.path ?? null, + line: comment?.line ?? comment?.original_line ?? null, + side: comment?.side ?? null, + body: limitText(typeof comment?.body === "string" ? comment.body : "", 1_500), + commit_id: comment?.commit_id ?? null, + html_url: comment?.html_url ?? null, + }; +} + +function normalizeGitHubIssueComment(comment) { + return { + id: comment?.id ?? null, + user: comment?.user?.login ?? null, + body: limitText(typeof comment?.body === "string" ? comment.body : "", 1_500), + created_at: comment?.created_at ?? null, + updated_at: comment?.updated_at ?? null, + html_url: comment?.html_url ?? null, + }; +} + +function extractLinkedIssueNumber(body) { + const match = body.match(/\b(?:close[sd]?|fix(?:e[sd])?|resolve[sd]?)\s+#(\d+)\b/i); + return match ? Number(match[1]) : null; +} + +function isTestFile(filename) { + return /(^|\/)(test|tests|__tests__)\/|\.test\.[jt]sx?$|\.spec\.[jt]sx?$/.test(filename); +} + +async function writeJson(filename, value) { + await fs.writeFile(filename, `${JSON.stringify(value, null, 2)}\n`, "utf8"); +} + +function trimTextTail(text, maxChars) { + const value = String(text ?? "").trim(); + if (!value || value.length <= maxChars) { + return value; + } + return value.slice(value.length - maxChars); +} + +function limitText(text, maxChars) { + const value = String(text ?? 
""); + if (value.length <= maxChars) { + return value; + } + return `${value.slice(0, maxChars)}...`; +} + +async function exists(filename) { + try { + await fs.access(filename); + return true; + } catch { + return false; + } +} + +async function runShellLine(command, options = {}) { + return await runCommand("zsh", ["-lc", command], options); +} + +async function runCommand(command, args, options = {}) { + const child = spawn(command, args, { + cwd: options.cwd, + env: { + ...process.env, + ...options.env, + }, + stdio: ["ignore", "pipe", "pipe"], + }); + + let stdout = ""; + let stderr = ""; + let timedOut = false; + let timeoutId; + + if (options.timeoutMs) { + timeoutId = setTimeout(() => { + timedOut = true; + child.kill("SIGTERM"); + }, options.timeoutMs); + } + + child.stdout.on("data", (chunk) => { + stdout += String(chunk); + }); + child.stderr.on("data", (chunk) => { + stderr += String(chunk); + }); + + const exit = await new Promise((resolve, reject) => { + child.on("error", reject); + child.on("close", (exitCode, signal) => { + resolve({ + exitCode, + signal, + }); + }); + }); + + if (timeoutId) { + clearTimeout(timeoutId); + } + + const ok = !timedOut && exit.exitCode === 0; + if (!ok && options.allowFailure !== true) { + throw new Error( + [ + `Command failed: ${command} ${args.join(" ")}`, + `exitCode: ${String(exit.exitCode)}`, + timedOut ? "timedOut: true" : null, + stdout ? `stdout:\n${stdout}` : null, + stderr ? `stderr:\n${stderr}` : null, + ] + .filter(Boolean) + .join("\n"), + ); + } + + return { + ok, + command, + args, + stdout, + stderr, + exitCode: exit.exitCode, + signal: exit.signal, + timedOut, + }; +} + +function extractJson(text) { + const trimmed = String(text ?? 
"").trim(); + if (!trimmed) { + throw new Error("Expected JSON output, got empty text"); + } + + const direct = tryParse(trimmed); + if (direct.ok) { + return direct.value; + } + + const fencedMatch = trimmed.match(/```(?:json)?\s*([\s\S]*?)```/i); + if (fencedMatch) { + const fenced = tryParse(fencedMatch[1].trim()); + if (fenced.ok) { + return fenced.value; + } + } + + for (const candidate of extractBalancedJsonCandidates(trimmed)) { + const parsed = tryParse(candidate); + if (parsed.ok) { + return parsed.value; + } + } + + throw new Error(`Could not parse JSON from assistant output:\n${trimmed}`); +} + +function tryExtractJson(text) { + try { + return extractJson(text); + } catch { + return null; + } +} + +function tryParse(text) { + try { + return { ok: true, value: JSON.parse(text) }; + } catch { + return { ok: false }; + } +} + +function extractBalancedJsonCandidates(text) { + const candidates = []; + const starts = new Set(["{", "["]); + for (let i = 0; i < text.length; i += 1) { + if (!starts.has(text[i] ?? 
"")) { + continue; + } + + const result = scanBalanced(text, i); + if (result) { + candidates.push(result); + } + } + + return candidates; +} + +function scanBalanced(text, startIndex) { + const stack = []; + let inString = false; + let escaped = false; + + for (let i = startIndex; i < text.length; i += 1) { + const char = text[i]; + + if (inString) { + if (escaped) { + escaped = false; + } else if (char === "\\") { + escaped = true; + } else if (char === '"') { + inString = false; + } + continue; + } + + if (char === '"') { + inString = true; + continue; + } + + if (char === "{" || char === "[") { + stack.push(char); + continue; + } + + if (char === "}" || char === "]") { + const last = stack.at(-1); + if ((last === "{" && char !== "}") || (last === "[" && char !== "]")) { + return null; + } + + stack.pop(); + if (stack.length === 0) { + return text.slice(startIndex, i + 1); + } + } + } + + return null; +} From b671fce3ec3531f617fa633958ceb13fdf166c03 Mon Sep 17 00:00:00 2001 From: Onur Solmaz <2453968+osolmaz@users.noreply.github.com> Date: Thu, 26 Mar 2026 11:51:05 +0100 Subject: [PATCH 17/22] docs: tighten flow documentation --- README.md | 34 +++++++++++++++++++++- docs/2026-03-25-acpx-flows-architecture.md | 16 +++++----- docs/CLI.md | 19 ++++++++++-- examples/flows/README.md | 9 ++++-- skills/acpx/SKILL.md | 7 +++++ 5 files changed, 73 insertions(+), 12 deletions(-) diff --git a/README.md b/README.md index 6e3c058..9e741c3 100644 --- a/README.md +++ b/README.md @@ -39,7 +39,7 @@ One command surface for Pi, OpenClaw ACP, Codex, Claude, and other ACP-compatibl - **Structured output**: typed ACP messages (thinking, tool calls, diffs) instead of ANSI scraping - **Any ACP agent**: built-in registry + `--agent` escape hatch for custom servers - **One-shot mode**: `exec` for stateless fire-and-forget tasks -- **Experimental flows**: `flow run ` for user-authored ACP workflows over multiple prompts +- **Experimental flows**: `flow run ` for TypeScript workflow 
modules over multiple prompts - **Runtime-owned flow actions**: shell-backed action steps can prepare workspaces and other deterministic mechanics outside the agent turn - **Flow workspace isolation**: `acp` nodes can target an explicit per-step cwd, so flows can keep agent work inside disposable worktrees @@ -216,6 +216,38 @@ acpx --ttl 30 codex 'keep queue owner alive for quick follow-ups' acpx --verbose codex 'debug why adapter startup is failing' ``` +## Flows + +`acpx flow run ` executes a TypeScript flow module through the `acpx/flows` +runtime and persists run state under `~/.acpx/flows/runs/`. + +Flows are for multi-step ACP work where one prompt is not enough: + +- `acp` steps keep model-shaped work in ACP +- `action` steps handle deterministic mechanics like shell commands or GitHub calls +- `compute` steps do local routing or shaping +- `checkpoint` steps pause for something outside the runtime + +The source tree includes flow examples under [examples/flows/README.md](examples/flows/README.md): + +- small examples such as `echo`, `branch`, `shell`, `workdir`, and `two-turn` +- a larger PR-triage example under [examples/flows/pr-triage/README.md](examples/flows/pr-triage/README.md) + +Example runs: + +```bash +acpx flow run ./my-flow.ts --input-file ./flow-input.json + +acpx flow run examples/flows/branch.flow.ts \ + --input-json '{"task":"FIX: add a regression test for the reconnect bug"}' + +acpx flow run examples/flows/pr-triage/pr-triage.flow.ts \ + --input-json '{"repo":"openclaw/acpx","prNumber":150}' +``` + +The PR-triage example is only an example workflow. It can comment on or close +real GitHub PRs if you run it against a live repository. 
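A minimal flow module can be sketched as plain data plus small node callbacks. The shape below is illustrative only: the node kinds mirror the `acp`/`compute`/`checkpoint` steps described above, but the exact field names are assumptions, not the published `acpx/flows` API.

```typescript
// Illustrative sketch of a flow graph as plain data. The real helpers
// (defineFlow, acp, compute, checkpoint) live in the acpx/flows runtime;
// the field names here are assumptions for illustration only.
const triageSketch = {
  name: "classify-task",
  startAt: "classify",
  nodes: {
    // acp step: model-shaped work that returns JSON such as { next: "continue" }
    classify: { kind: "acp" },
    // compute step: local, deterministic shaping
    continue: { kind: "compute" },
    // checkpoint step: pause for something outside the runtime
    escalate: { kind: "checkpoint" },
  },
  edges: [
    {
      from: "classify",
      // Deterministic routing on the step's JSON output, never on prose.
      switch: { on: "$.next", cases: { continue: "continue", escalate: "escalate" } },
    },
  ],
};
```

The runtime, not the worker, resolves `$.next` against the step output and picks the edge, which keeps routing deterministic.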
+ ## Configuration files `acpx` reads config in this order (later wins): diff --git a/docs/2026-03-25-acpx-flows-architecture.md b/docs/2026-03-25-acpx-flows-architecture.md index 8465339..2c56461 100644 --- a/docs/2026-03-25-acpx-flows-architecture.md +++ b/docs/2026-03-25-acpx-flows-architecture.md @@ -287,7 +287,8 @@ Do not turn output parsing into a large framework. - Prefer clear runtime boundaries over specialized built-ins - Add fewer conventions, not more - Use one main session by default -- Keep workload-specific logic in user flow files, not in `acpx` core +- Keep workload-specific logic in user flow files or example files, not in + `acpx` core product behavior - Use compatibility JSON by default and strict JSON only when it pays for itself ## PR triage example shape @@ -311,18 +312,19 @@ bounded. ## CLI shape -The main user-facing entrypoint is: +The current user-facing entrypoint is: ```bash acpx flow run [--input-json | --input-file ] ``` -Related commands: +Run state is persisted under `~/.acpx/flows/runs/`. -- `acpx flow resume ` -- `acpx flow show ` -- `acpx flow graph ` -- `acpx flow validate ` +The source tree includes example flows under `examples/flows/`, including: + +- small focused examples such as `echo`, `branch`, `shell`, `workdir`, and + `two-turn` +- a larger PR-triage example under `examples/flows/pr-triage/` ## What belongs in core diff --git a/docs/CLI.md b/docs/CLI.md index 79f9a63..cf7fce2 100644 --- a/docs/CLI.md +++ b/docs/CLI.md @@ -79,8 +79,23 @@ acpx [global_options] flow run [--input-json | --input-file - `--input-json` passes flow input inline as JSON. - `--input-file` reads flow input JSON from disk. - `--default-agent` supplies the default agent profile for `acp` nodes that do not pin one. -- `acpx` does not ship built-in workload-specific flows; the file is provided by the caller. -- The source repo includes small generic examples under `examples/flows/`. +- The file is always provided by the caller at runtime. 
`acpx` does not require any built-in flow registry. +- The source repo includes example flow files under `examples/flows/`, including a larger PR-triage example under `examples/flows/pr-triage/`. + +Example invocations: + +```bash +acpx flow run ./my-flow.ts --input-file ./flow-input.json + +acpx flow run examples/flows/branch.flow.ts \ + --input-json '{"task":"FIX: add a regression test for the reconnect bug"}' + +acpx flow run examples/flows/pr-triage/pr-triage.flow.ts \ + --input-json '{"repo":"openclaw/acpx","prNumber":150}' +``` + +The PR-triage example is only an example workflow. It can post GitHub comments +or close a PR if you run it against a live repository. ## Global options diff --git a/examples/flows/README.md b/examples/flows/README.md index 49ea6c6..72db5a7 100644 --- a/examples/flows/README.md +++ b/examples/flows/README.md @@ -2,6 +2,8 @@ These are source-tree examples for `acpx flow run`. +They range from small primitives to one larger end-to-end example. + - `echo.flow.ts`: one ACP step that returns a JSON reply - `branch.flow.ts`: ACP classification followed by a deterministic branch into either `continue` or `checkpoint` - `pr-triage/pr-triage.flow.ts`: a larger single-PR workflow example with a colocated written spec in `pr-triage/README.md` @@ -30,5 +32,8 @@ acpx flow run examples/flows/two-turn.flow.ts \ --input-json '{"topic":"How should we validate a new ACP adapter?"}' ``` -These examples are examples only. They do not define `acpx` core product behavior. -The PR-triage example can comment on or close real GitHub PRs if you run it against a live repository. +These examples are examples only. They do not define `acpx` core product +behavior. + +The PR-triage example can comment on or close real GitHub PRs if you run it +against a live repository. 
diff --git a/skills/acpx/SKILL.md b/skills/acpx/SKILL.md index 704eff2..5ab7f43 100644 --- a/skills/acpx/SKILL.md +++ b/skills/acpx/SKILL.md @@ -323,6 +323,13 @@ Raw custom adapter command: acpx --agent './bin/custom-acp-server --profile ci' 'run validation checks' ``` +Flow run: + +```bash +acpx flow run ./my-flow.ts --input-file ./flow-input.json +acpx flow run examples/flows/branch.flow.ts --input-json '{"task":"FIX: add a regression test"}' +``` + Repo-scoped review with permissive mode: ```bash From 588cae120ad5bbc5c6c3330694b4e8ab7bc394ad Mon Sep 17 00:00:00 2001 From: Onur Solmaz <2453968+osolmaz@users.noreply.github.com> Date: Thu, 26 Mar 2026 14:28:30 +0100 Subject: [PATCH 18/22] feat: route flow node timeouts as outcomes --- docs/2026-03-25-acpx-flows-architecture.md | 58 ++++++++++ src/flows/graph.ts | 48 ++++++++- src/flows/runtime.ts | 119 +++++++++++++++++---- src/flows/types.ts | 17 +++ test/flows-store.test.ts | 1 + test/flows.test.ts | 100 ++++++++++++++++- 6 files changed, 321 insertions(+), 22 deletions(-) diff --git a/docs/2026-03-25-acpx-flows-architecture.md b/docs/2026-03-25-acpx-flows-architecture.md index 2c56461..b5ac5f6 100644 --- a/docs/2026-03-25-acpx-flows-architecture.md +++ b/docs/2026-03-25-acpx-flows-architecture.md @@ -163,6 +163,63 @@ Prefer: - declarative `switch` edges - `compute` nodes for custom routing logic +## Node outcomes + +Timeouts should be treated as routable node outcomes, not only as fatal run +errors. + +The clean model is small: + +- `ok` +- `timed_out` +- `failed` +- `cancelled` + +That outcome is control-plane state, separate from the business output of the +step. + +In practice, that means a flow should be able to say things like: + +- `review_loop` timed out -> escalate to human +- `collect_review_state` failed -> escalate to human +- `fix_ci_failures` cancelled -> pause or escalate + +This should not become a large event system. 
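As a concrete sketch of that control-plane routing, assuming the small outcome model above: the `routeOutcome` helper and its edge shape are hypothetical illustrations of the idea, not the runtime's actual API.

```typescript
// Hypothetical sketch of routing on control-plane node outcomes.
// The outcome set matches the model above; routeOutcome and the edge
// shape are illustrative, not the actual runtime API.
type Outcome = "ok" | "timed_out" | "failed" | "cancelled";

type SwitchEdge = {
  from: string;
  switch: { on: string; cases: Record<string, string> };
};

function routeOutcome(edges: SwitchEdge[], from: string, outcome: Outcome): string | null {
  // Only consult edges that explicitly switch on the control-plane outcome.
  const edge = edges.find((e) => e.from === from && e.switch.on === "$result.outcome");
  // No matching case: return null so the caller can fail the run (the default).
  return edge?.switch.cases[outcome] ?? null;
}

const outcomeEdges: SwitchEdge[] = [
  {
    from: "review_loop",
    switch: { on: "$result.outcome", cases: { timed_out: "escalate_to_human" } },
  },
];

routeOutcome(outcomeEdges, "review_loop", "timed_out"); // routes to "escalate_to_human"
routeOutcome(outcomeEdges, "review_loop", "failed"); // null: no case defined, so fail the run
```

Keeping the unmatched case as a hard failure preserves the default described below: a non-`ok` outcome without an explicit route still fails the run.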
+ +The runtime should persist: + +- step output +- step outcome +- error text when present +- timestamps and duration + +Then the graph can route on those outcomes when needed. + +For example, a switch edge may branch on: + +- `$.next` for normal business output +- `$result.outcome` for control-plane routing +- `$output.route` when a flow wants the output path to be explicit + +If a flow does not define a route for a non-`ok` outcome, failing the run is +still the right default. + +## Events and history + +Flow event logs are for observability, not for driving the graph directly. + +For example, the runtime may record events such as: + +- node started +- node heartbeat +- node finished +- run failed + +That append-only history belongs in the run log. + +Routing should still use a small structured result model rather than treating +the event stream itself as the workflow API. + ## Session model Each flow run gets one main ACP session by default. @@ -233,6 +290,7 @@ The flow store keeps orchestration state such as: - run status - current node - outputs +- latest node results and outcomes - step history - session bindings - errors diff --git a/src/flows/graph.ts b/src/flows/graph.ts index 4192576..5c6c39e 100644 --- a/src/flows/graph.ts +++ b/src/flows/graph.ts @@ -1,4 +1,4 @@ -import type { FlowDefinition, FlowEdge } from "./types.js"; +import type { FlowDefinition, FlowEdge, FlowNodeResult } from "./types.js"; export function validateFlowDefinition(flow: FlowDefinition): void { if (!flow.name.trim()) { @@ -31,7 +31,12 @@ export function validateFlowDefinition(flow: FlowDefinition): void { } } -export function resolveNext(edges: FlowEdge[], from: string, output: unknown): string | null { +export function resolveNext( + edges: FlowEdge[], + from: string, + output: unknown, + result?: FlowNodeResult, +): string | null { const edge = edges.find((candidate) => candidate.from === from); if (!edge) { return null; @@ -41,7 +46,7 @@ export function resolveNext(edges: 
FlowEdge[], from: string, output: unknown): s return edge.to; } - const value = getByPath(output, edge.switch.on); + const value = getBySwitchPath(output, result, edge.switch.on); if (typeof value !== "string" && typeof value !== "number" && typeof value !== "boolean") { throw new Error(`Flow switch value must be scalar for ${edge.switch.on}`); } @@ -52,6 +57,43 @@ export function resolveNext(edges: FlowEdge[], from: string, output: unknown): s return next; } +export function resolveNextForOutcome( + edges: FlowEdge[], + from: string, + result: FlowNodeResult, +): string | null { + const edge = edges.find((candidate) => candidate.from === from); + if (!edge || "to" in edge) { + return null; + } + if (!edge.switch.on.startsWith("$result.")) { + return null; + } + const value = getBySwitchPath(undefined, result, edge.switch.on); + if (typeof value !== "string" && typeof value !== "number" && typeof value !== "boolean") { + throw new Error(`Flow switch value must be scalar for ${edge.switch.on}`); + } + const next = edge.switch.cases[String(value)]; + if (!next) { + throw new Error(`No flow switch case for ${edge.switch.on}=${JSON.stringify(value)}`); + } + return next; +} + +function getBySwitchPath( + output: unknown, + result: FlowNodeResult | undefined, + jsonPath: string, +): unknown { + if (jsonPath.startsWith("$result.")) { + return getByPath(result, `$.${jsonPath.slice("$result.".length)}`); + } + if (jsonPath.startsWith("$output.")) { + return getByPath(output, `$.${jsonPath.slice("$output.".length)}`); + } + return getByPath(output, jsonPath); +} + function getByPath(value: unknown, jsonPath: string): unknown { if (!jsonPath.startsWith("$.")) { throw new Error(`Unsupported JSON path: ${jsonPath}`); diff --git a/src/flows/runtime.ts b/src/flows/runtime.ts index 5ed7f18..5e0ef97 100644 --- a/src/flows/runtime.ts +++ b/src/flows/runtime.ts @@ -3,12 +3,12 @@ import path from "node:path"; import { createOutputFormatter } from "../output.js"; import { 
promptToDisplayText, textPrompt } from "../prompt-content.js"; import { resolveSessionRecord } from "../session-persistence.js"; -import { TimeoutError, withTimeout } from "../session-runtime-helpers.js"; +import { InterruptedError, TimeoutError, withTimeout } from "../session-runtime-helpers.js"; import { cancelSessionPrompt, createSession, runOnce, sendSession } from "../session.js"; import type { PromptInput } from "../types.js"; import { acp, action, checkpoint, compute, defineFlow, shell } from "./definition.js"; import { formatShellActionSummary, runShellAction } from "./executors/shell.js"; -import { resolveNext, validateFlowDefinition } from "./graph.js"; +import { resolveNext, resolveNextForOutcome, validateFlowDefinition } from "./graph.js"; import { FlowRunStore } from "./store.js"; import type { AcpNodeDefinition, @@ -25,6 +25,8 @@ import type { FlowSessionBinding, FlowEdge, FlowStepRecord, + FlowNodeOutcome, + FlowNodeResult, FunctionActionNodeDefinition, ResolvedFlowAgent, ShellActionExecution, @@ -43,6 +45,8 @@ export type { FlowNodeCommon, FlowNodeContext, FlowNodeDefinition, + FlowNodeOutcome, + FlowNodeResult, FlowRunResult, FlowRunState, FlowRunnerOptions, @@ -122,6 +126,7 @@ export class FlowRunner { status: "running", input, outputs: {}, + results: {}, steps: [], sessionBindings: {}, }; @@ -154,16 +159,40 @@ export class FlowRunner { nodeId: current, kind: node.kind, }); - ({ output, promptText, rawText, sessionInfo, agentInfo } = await this.executeNode( - runDir, - state, - flow, - current, - node, - context, - )); - - if (node.kind === "checkpoint") { + let nodeResult: FlowNodeResult | undefined; + let executionError: unknown; + try { + ({ output, promptText, rawText, sessionInfo, agentInfo } = await this.executeNode( + runDir, + state, + flow, + current, + node, + context, + )); + nodeResult = createNodeResult({ + nodeId: current, + kind: node.kind, + outcome: "ok", + startedAt, + finishedAt: isoNow(), + output, + }); + } catch (error) { + 
executionError = error; + nodeResult = createNodeResult({ + nodeId: current, + kind: node.kind, + outcome: outcomeForError(error), + startedAt, + finishedAt: isoNow(), + error: error instanceof Error ? error.message : String(error), + }); + } + + state.results[current] = nodeResult; + + if (nodeResult.outcome === "ok" && node.kind === "checkpoint") { state.outputs[current] = output; state.waitingOn = current; state.updatedAt = isoNow(); @@ -172,8 +201,9 @@ export class FlowRunner { state.steps.push({ nodeId: current, kind: node.kind, + outcome: nodeResult.outcome, startedAt, - finishedAt: isoNow(), + finishedAt: nodeResult.finishedAt, promptText, rawText, output, @@ -191,28 +221,49 @@ export class FlowRunner { }; } - state.outputs[current] = output; + if (nodeResult.outcome === "ok") { + state.outputs[current] = output; + } state.updatedAt = isoNow(); this.clearActiveNode(state); state.steps.push({ nodeId: current, kind: node.kind, + outcome: nodeResult.outcome, startedAt, - finishedAt: isoNow(), + finishedAt: nodeResult.finishedAt, promptText, rawText, output, + error: nodeResult.error, session: sessionInfo, agent: agentInfo, }); + if (nodeResult.outcome === "ok") { + await this.store.writeSnapshot(runDir, state, { + type: "node_completed", + nodeId: current, + output, + }); + current = resolveNext(flow.edges, current, output, nodeResult); + continue; + } + await this.store.writeSnapshot(runDir, state, { - type: "node_completed", + type: "node_outcome", nodeId: current, - output, + outcome: nodeResult.outcome, + error: nodeResult.error, }); - current = resolveNext(flow.edges, current, output); + const next = resolveNextForOutcome(flow.edges, current, nodeResult); + if (next) { + current = next; + continue; + } + + throw executionError; } state.status = "completed"; @@ -244,6 +295,7 @@ export class FlowRunner { return { input, outputs: state.outputs, + results: state.results, state, services: this.services, }; @@ -697,6 +749,37 @@ function 
createSessionName(flowName: string, handle: string, cwd: string, runId: return `${flowName}-${handle}-${stamp}-${runId.slice(-8)}`; } +function createNodeResult(options: { + nodeId: string; + kind: FlowNodeDefinition["kind"]; + outcome: FlowNodeOutcome; + startedAt: string; + finishedAt: string; + output?: unknown; + error?: string; +}): FlowNodeResult { + return { + nodeId: options.nodeId, + kind: options.kind, + outcome: options.outcome, + startedAt: options.startedAt, + finishedAt: options.finishedAt, + durationMs: new Date(options.finishedAt).getTime() - new Date(options.startedAt).getTime(), + output: options.output, + error: options.error, + }; +} + +function outcomeForError(error: unknown): FlowNodeOutcome { + if (error instanceof TimeoutError) { + return "timed_out"; + } + if (error instanceof InterruptedError) { + return "cancelled"; + } + return "failed"; +} + function stableShortHash(value: string): string { return createHash("sha1").update(value).digest("hex").slice(0, 8); } diff --git a/src/flows/types.ts b/src/flows/types.ts index 7a622f6..026e6db 100644 --- a/src/flows/types.ts +++ b/src/flows/types.ts @@ -12,6 +12,7 @@ type MaybePromise = T | Promise; export type FlowNodeContext = { input: TInput; outputs: Record; + results: Record; state: FlowRunState; services: Record; }; @@ -107,14 +108,29 @@ export type FlowDefinition = { edges: FlowEdge[]; }; +export type FlowNodeOutcome = "ok" | "timed_out" | "failed" | "cancelled"; + +export type FlowNodeResult = { + nodeId: string; + kind: FlowNodeDefinition["kind"]; + outcome: FlowNodeOutcome; + startedAt: string; + finishedAt: string; + durationMs: number; + output?: unknown; + error?: string; +}; + export type FlowStepRecord = { nodeId: string; kind: FlowNodeDefinition["kind"]; + outcome: FlowNodeOutcome; startedAt: string; finishedAt: string; promptText: string | null; rawText: string | null; output: unknown; + error?: string; session: FlowSessionBinding | null; agent: { agentName: string; @@ -146,6 
+162,7 @@ export type FlowRunState = { status: "running" | "waiting" | "completed" | "failed" | "timed_out"; input: unknown; outputs: Record; + results: Record; steps: FlowStepRecord[]; sessionBindings: Record; currentNode?: string; diff --git a/test/flows-store.test.ts b/test/flows-store.test.ts index e87c242..e5c3b3a 100644 --- a/test/flows-store.test.ts +++ b/test/flows-store.test.ts @@ -25,6 +25,7 @@ test("FlowRunStore writes snapshots, live state, and events", async () => { status: "running", input: { ok: true }, outputs: {}, + results: {}, steps: [], sessionBindings: {}, currentNode: "prepare", diff --git a/test/flows.test.ts b/test/flows.test.ts index f1b03b5..c9b2328 100644 --- a/test/flows.test.ts +++ b/test/flows.test.ts @@ -422,8 +422,106 @@ test("FlowRunner marks timed out shell steps explicitly", async () => { const runDir = await waitForRunDir(outputRoot, "timeout-test"); const state = await readRunJson(runDir); assert.equal(state.status, "timed_out"); - assert.equal(state.currentNode, "slow"); assert.match(String(state.error), /Timed out after 50ms/); + const slowResult = (state.results as Record>).slow; + assert.equal(slowResult.nodeId, "slow"); + assert.equal(slowResult.kind, "action"); + assert.equal(slowResult.outcome, "timed_out"); + assert.equal(slowResult.error, "Timed out after 50ms"); + assert.equal(typeof slowResult.startedAt, "string"); + assert.equal(typeof slowResult.finishedAt, "string"); + assert.equal(typeof slowResult.durationMs, "number"); + }); +}); + +test("FlowRunner can route timed out nodes by outcome", async () => { + await withTempHome(async () => { + const outputRoot = await fs.mkdtemp(path.join(os.tmpdir(), "acpx-flow-store-")); + const runner = new FlowRunner({ + resolveAgent: () => ({ + agentName: "unused", + agentCommand: "unused", + cwd: process.cwd(), + }), + permissionMode: "approve-all", + outputRoot, + }); + + const flow = defineFlow({ + name: "timeout-route-test", + startAt: "slow", + nodes: { + slow: shell({ + 
exec: () => ({ + command: process.execPath, + args: ["-e", "setTimeout(() => {}, 1000)"], + timeoutMs: 50, + }), + }), + after_timeout: action({ + run: ({ results }) => ({ + routed: true, + outcome: results.slow?.outcome, + }), + }), + }, + edges: [ + { + from: "slow", + switch: { + on: "$result.outcome", + cases: { + timed_out: "after_timeout", + }, + }, + }, + ], + }); + + const result = await runner.run(flow, {}); + assert.equal(result.state.status, "completed"); + assert.equal(result.state.results.slow?.outcome, "timed_out"); + assert.deepEqual(result.state.outputs.after_timeout, { + routed: true, + outcome: "timed_out", + }); + }); +}); + +test("FlowRunner stores successful node results separately from outputs", async () => { + await withTempHome(async () => { + const outputRoot = await fs.mkdtemp(path.join(os.tmpdir(), "acpx-flow-store-")); + const runner = new FlowRunner({ + resolveAgent: () => ({ + agentName: "unused", + agentCommand: "unused", + cwd: process.cwd(), + }), + permissionMode: "approve-all", + outputRoot, + }); + + const flow = defineFlow({ + name: "result-state-test", + startAt: "first", + nodes: { + first: compute({ + run: () => ({ next: "done" }), + }), + done: action({ + run: ({ results }) => ({ + firstOutcome: results.first?.outcome, + }), + }), + }, + edges: [{ from: "first", to: "done" }], + }); + + const result = await runner.run(flow, {}); + assert.equal(result.state.status, "completed"); + assert.equal(result.state.results.first?.outcome, "ok"); + assert.deepEqual(result.state.outputs.first, { next: "done" }); + assert.deepEqual(result.state.outputs.done, { firstOutcome: "ok" }); }); }); From bd1f3341298073cfe9e55019fb42de7ed9d706b1 Mon Sep 17 00:00:00 2001 From: Onur Solmaz <2453968+osolmaz@users.noreply.github.com> Date: Thu, 26 Mar 2026 15:31:59 +0100 Subject: [PATCH 19/22] fix: tighten flow timeouts and cwd handling --- src/flows/runtime.ts | 142 ++++++++++++++++++++++----------------- test/flows.test.ts | 104 
++++++++++++++++++++++++++++ test/integration.test.ts | 5 +- 3 files changed, 188 insertions(+), 63 deletions(-) diff --git a/src/flows/runtime.ts b/src/flows/runtime.ts index 5e0ef97..a25264d 100644 --- a/src/flows/runtime.ts +++ b/src/flows/runtime.ts @@ -76,6 +76,7 @@ type FlowNodeExecutionResult = { export class FlowRunner { private readonly resolveAgent; + private readonly defaultCwd; private readonly permissionMode; private readonly mcpServers?; private readonly nonInteractivePermissions?; @@ -92,6 +93,7 @@ export class FlowRunner { constructor(options: FlowRunnerOptions) { this.resolveAgent = options.resolveAgent; + this.defaultCwd = options.resolveAgent(undefined).cwd; this.permissionMode = options.permissionMode; this.mcpServers = options.mcpServers; this.nonInteractivePermissions = options.nonInteractivePermissions; @@ -315,7 +317,7 @@ export class FlowRunner { case "action": return await this.executeActionNode(runDir, state, node, context); case "checkpoint": - return await this.executeCheckpointNode(nodeId, node, context); + return await this.executeCheckpointNode(runDir, state, nodeId, node, context); case "acp": return await this.executeAcpNode(runDir, state, flow, node, context); default: { @@ -357,17 +359,14 @@ export class FlowRunner { node: ActionNodeDefinition, context: FlowNodeContext, ): Promise { + const nodeTimeoutMs = node.timeoutMs ?? this.defaultNodeTimeoutMs; if ("run" in node) { const output = await this.runWithHeartbeat( runDir, state, state.currentNode ?? "", node, - async () => - await withTimeout( - Promise.resolve(node.run(context)), - node.timeoutMs ?? this.defaultNodeTimeoutMs, - ), + async () => await withTimeout(Promise.resolve(node.run(context)), nodeTimeoutMs), ); return { output, @@ -378,10 +377,17 @@ export class FlowRunner { }; } - const execution = await Promise.resolve(node.exec(context)); + const execution = await this.runWithHeartbeat( + runDir, + state, + state.currentNode ?? 
"", + node, + async () => await withTimeout(Promise.resolve(node.exec(context)), nodeTimeoutMs), + ); const effectiveExecution: ShellActionExecution = { ...execution, - timeoutMs: execution.timeoutMs ?? node.timeoutMs ?? this.defaultNodeTimeoutMs, + cwd: resolveShellActionCwd(this.defaultCwd, execution.cwd), + timeoutMs: execution.timeoutMs ?? nodeTimeoutMs, }; this.updateStatusDetail(state, formatShellActionSummary(effectiveExecution)); await this.store.writeLive(runDir, state, { @@ -407,13 +413,25 @@ export class FlowRunner { } private async executeCheckpointNode( + runDir: string, + state: FlowRunState, nodeId: string, node: CheckpointNodeDefinition, context: FlowNodeContext, ): Promise { const output = typeof node.run === "function" - ? await node.run(context) + ? await this.runWithHeartbeat( + runDir, + state, + state.currentNode ?? "", + node, + async () => + await withTimeout( + Promise.resolve(node.run?.(context)), + node.timeoutMs ?? this.defaultNodeTimeoutMs, + ), + ) : { checkpoint: nodeId, summary: node.summary ?? nodeId, @@ -434,69 +452,65 @@ export class FlowRunner { node: AcpNodeDefinition, context: FlowNodeContext, ): Promise { - const resolvedAgent = this.resolveAgent(node.profile); - const agentInfo = { - ...resolvedAgent, - cwd: await resolveNodeCwd(resolvedAgent.cwd, node.cwd, context), - }; - const prompt = normalizePromptInput(await node.prompt(context)); - const promptText = promptToDisplayText(prompt); - this.updateStatusDetail(state, summarizePrompt(promptText, node.statusDetail)); - await this.store.writeLive(runDir, state, { - type: "node_detail", - nodeId: state.currentNode, - detail: state.statusDetail, - }); - - if (node.session?.isolated) { - const rawText = await this.runWithHeartbeat( - runDir, - state, - state.currentNode ?? "", - node, - async () => - await this.runIsolatedPrompt( - agentInfo, - prompt, - node.timeoutMs ?? this.defaultNodeTimeoutMs, - ), - ); - return { - output: node.parse ? 
await node.parse(rawText, context) : rawText, - promptText, - rawText, - sessionInfo: null, - agentInfo, - }; - } - - const boundSession = await this.ensureSessionBinding(state, flow, node, agentInfo); - const rawText = await this.runWithHeartbeat( + const nodeTimeoutMs = node.timeoutMs ?? this.defaultNodeTimeoutMs; + let boundSession: FlowSessionBinding | null = null; + return await this.runWithHeartbeat( runDir, state, state.currentNode ?? "", node, - async () => - await this.runPersistentPrompt( - boundSession, - prompt, - node.timeoutMs ?? this.defaultNodeTimeoutMs, - ), async () => { + const resolvedAgent = this.resolveAgent(node.profile); + const agentInfo = { + ...resolvedAgent, + cwd: await withTimeout( + resolveNodeCwd(resolvedAgent.cwd, node.cwd, context), + nodeTimeoutMs, + ), + }; + const prompt = normalizePromptInput( + await withTimeout(Promise.resolve(node.prompt(context)), nodeTimeoutMs), + ); + const promptText = promptToDisplayText(prompt); + this.updateStatusDetail(state, summarizePrompt(promptText, node.statusDetail)); + await this.store.writeLive(runDir, state, { + type: "node_detail", + nodeId: state.currentNode, + detail: state.statusDetail, + }); + + if (node.session?.isolated) { + const rawText = await this.runIsolatedPrompt(agentInfo, prompt, nodeTimeoutMs); + return { + output: node.parse ? await node.parse(rawText, context) : rawText, + promptText, + rawText, + sessionInfo: null, + agentInfo, + }; + } + + boundSession = await this.ensureSessionBinding(state, flow, node, agentInfo); + const rawText = await this.runPersistentPrompt(boundSession, prompt, nodeTimeoutMs); + const sessionInfo = await this.refreshSessionBinding(boundSession); + state.sessionBindings[sessionInfo.key] = sessionInfo; + return { + output: node.parse ? 
await node.parse(rawText, context) : rawText, + promptText, + rawText, + sessionInfo, + agentInfo, + }; + }, + async () => { + if (!boundSession) { + return; + } await cancelSessionPrompt({ sessionId: boundSession.acpxRecordId, }); }, ); - const sessionInfo = await this.refreshSessionBinding(boundSession); - state.sessionBindings[sessionInfo.key] = sessionInfo; - return { - output: node.parse ? await node.parse(rawText, context) : rawText, - promptText, - rawText, - sessionInfo, - agentInfo, - }; } private markNodeStarted( @@ -694,6 +708,10 @@ async function resolveNodeCwd( return path.resolve(defaultCwd, cwd ?? defaultCwd); } +function resolveShellActionCwd(defaultCwd: string, cwd: string | undefined): string { + return path.resolve(defaultCwd, cwd ?? defaultCwd); +} + function summarizePrompt(promptText: string, explicitDetail?: string): string { if (explicitDetail) { return explicitDetail; diff --git a/test/flows.test.ts b/test/flows.test.ts index c9b2328..a7f23c6 100644 --- a/test/flows.test.ts +++ b/test/flows.test.ts @@ -14,8 +14,10 @@ import { defineFlow, shell, } from "../src/flows/runtime.js"; +import type { ShellActionExecution } from "../src/flows/runtime.js"; import { flowRunsBaseDir } from "../src/flows/store.js"; import { TimeoutError } from "../src/session-runtime-helpers.js"; +import type { PromptInput } from "../src/types.js"; const MOCK_AGENT_PATH = fileURLToPath(new URL("./mock-agent.js", import.meta.url)); const MOCK_AGENT_COMMAND = `node ${JSON.stringify(MOCK_AGENT_PATH)}`; @@ -488,6 +490,108 @@ test("FlowRunner can route timed out nodes by outcome", async () => { }); }); +test("FlowRunner times out async shell exec callbacks", async () => { + await withTempHome(async () => { + const outputRoot = await fs.mkdtemp(path.join(os.tmpdir(), "acpx-flow-store-")); + const runner = new FlowRunner({ + resolveAgent: () => ({ + agentName: "unused", + agentCommand: "unused", + cwd: process.cwd(), + }), + permissionMode: "approve-all", + outputRoot, + }); + 
+ const flow = defineFlow({ + name: "shell-exec-timeout-test", + startAt: "slow", + nodes: { + slow: shell({ + timeoutMs: 50, + exec: async () => await new Promise(() => {}), + }), + }, + edges: [], + }); + + await assert.rejects(async () => await runner.run(flow, {}), TimeoutError); + const runDir = await waitForRunDir(outputRoot, "shell-exec-timeout-test"); + const state = await readRunJson(runDir); + const slowResult = (state.results as Record>).slow; + assert.equal(slowResult.outcome, "timed_out"); + }); +}); + +test("FlowRunner times out async ACP prompt callbacks", async () => { + await withTempHome(async () => { + const outputRoot = await fs.mkdtemp(path.join(os.tmpdir(), "acpx-flow-store-")); + const runner = new FlowRunner({ + resolveAgent: () => ({ + agentName: "mock", + agentCommand: MOCK_AGENT_COMMAND, + cwd: process.cwd(), + }), + permissionMode: "approve-all", + outputRoot, + }); + + const flow = defineFlow({ + name: "acp-prompt-timeout-test", + startAt: "slow", + nodes: { + slow: acp({ + session: { + isolated: true, + }, + timeoutMs: 50, + prompt: async () => await new Promise(() => {}), + }), + }, + edges: [], + }); + + await assert.rejects(async () => await runner.run(flow, {}), TimeoutError); + const runDir = await waitForRunDir(outputRoot, "acp-prompt-timeout-test"); + const state = await readRunJson(runDir); + const slowResult = (state.results as Record>).slow; + assert.equal(slowResult.outcome, "timed_out"); + }); +}); + +test("FlowRunner times out async checkpoint callbacks", async () => { + await withTempHome(async () => { + const outputRoot = await fs.mkdtemp(path.join(os.tmpdir(), "acpx-flow-store-")); + const runner = new FlowRunner({ + resolveAgent: () => ({ + agentName: "unused", + agentCommand: "unused", + cwd: process.cwd(), + }), + permissionMode: "approve-all", + outputRoot, + }); + + const flow = defineFlow({ + name: "checkpoint-timeout-test", + startAt: "wait", + nodes: { + wait: checkpoint({ + timeoutMs: 50, + run: async () => 
await new Promise(() => {}), + }), + }, + edges: [], + }); + + await assert.rejects(async () => await runner.run(flow, {}), TimeoutError); + const runDir = await waitForRunDir(outputRoot, "checkpoint-timeout-test"); + const state = await readRunJson(runDir); + const waitResult = (state.results as Record<string, Record<string, unknown>>).wait; + assert.equal(waitResult.outcome, "timed_out"); + }); +}); + test("FlowRunner stores successful node results separately from outputs", async () => { await withTempHome(async () => { const outputRoot = await fs.mkdtemp(path.join(os.tmpdir(), "acpx-flow-store-")); diff --git a/test/integration.test.ts b/test/integration.test.ts index ce8c16c..093f090 100644 --- a/test/integration.test.ts +++ b/test/integration.test.ts @@ -210,7 +210,10 @@ test("integration: flow run executes function and shell actions from --input-fil assert.equal(payload.status, "completed"); assert.equal(payload.outputs?.prepare?.text, "SMOKE"); assert.equal(payload.outputs?.finalize?.value, "SMOKE"); - assert.equal(typeof payload.outputs?.finalize?.cwd, "string"); + assert.equal( + await fs.realpath(String(payload.outputs?.finalize?.cwd ?? 
"")), + await fs.realpath(cwd), + ); } finally { await fs.rm(cwd, { recursive: true, force: true }); } From 8015456350b873fe4d0291018b64d65d7393f6b6 Mon Sep 17 00:00:00 2001 From: Onur Solmaz <2453968+osolmaz@users.noreply.github.com> Date: Thu, 26 Mar 2026 15:50:19 +0100 Subject: [PATCH 20/22] fix: harden flow timeout handling --- src/flows/executors/shell.ts | 4 +- src/flows/runtime.ts | 81 ++++++++++++------------ src/flows/store.ts | 3 +- test/flows-shell.test.ts | 11 ++++ test/flows-store.test.ts | 59 ++++++++++++++++++ test/flows.test.ts | 115 +++++++++++++++++++++++++++++++++++ 6 files changed, 227 insertions(+), 46 deletions(-) diff --git a/src/flows/executors/shell.ts b/src/flows/executors/shell.ts index 3400a9d..8aebe47 100644 --- a/src/flows/executors/shell.ts +++ b/src/flows/executors/shell.ts @@ -60,10 +60,10 @@ export async function runShellAction(spec: ShellActionExecution): Promise 0 ? `\n${stderr.trim()}` : ""}`, + `Shell action failed (${renderShellCommand(spec.command, args)}): ${signal ? `signal ${signal}` : `exit ${String(exitCode)}`}${stderr.length > 0 ? `\n${stderr.trim()}` : ""}`, ), ); return; diff --git a/src/flows/runtime.ts b/src/flows/runtime.ts index a25264d..16fb746 100644 --- a/src/flows/runtime.ts +++ b/src/flows/runtime.ts @@ -333,16 +333,14 @@ export class FlowRunner { node: ComputeNodeDefinition, context: FlowNodeContext, ): Promise { + const nodeTimeoutMs = node.timeoutMs ?? this.defaultNodeTimeoutMs; const output = await this.runWithHeartbeat( runDir, state, state.currentNode ?? "", node, - async () => - await withTimeout( - Promise.resolve(node.run(context)), - node.timeoutMs ?? this.defaultNodeTimeoutMs, - ), + nodeTimeoutMs, + async () => await Promise.resolve(node.run(context)), ); return { output, @@ -366,7 +364,8 @@ export class FlowRunner { state, state.currentNode ?? 
"", node, - async () => await withTimeout(Promise.resolve(node.run(context)), nodeTimeoutMs), + nodeTimeoutMs, + async () => await Promise.resolve(node.run(context)), ); return { output, @@ -377,36 +376,36 @@ export class FlowRunner { }; } - const execution = await this.runWithHeartbeat( + const { output, rawText } = await this.runWithHeartbeat( runDir, state, state.currentNode ?? "", node, - async () => await withTimeout(Promise.resolve(node.exec(context)), nodeTimeoutMs), - ); - const effectiveExecution: ShellActionExecution = { - ...execution, - cwd: resolveShellActionCwd(this.defaultCwd, execution.cwd), - timeoutMs: execution.timeoutMs ?? nodeTimeoutMs, - }; - this.updateStatusDetail(state, formatShellActionSummary(effectiveExecution)); - await this.store.writeLive(runDir, state, { - type: "node_detail", - nodeId: state.currentNode, - detail: state.statusDetail, - }); - const result = await this.runWithHeartbeat( - runDir, - state, - state.currentNode ?? "", - node, - async () => await runShellAction(effectiveExecution), + nodeTimeoutMs, + async () => { + const execution = await Promise.resolve(node.exec(context)); + const effectiveExecution: ShellActionExecution = { + ...execution, + cwd: resolveShellActionCwd(this.defaultCwd, execution.cwd), + timeoutMs: execution.timeoutMs ?? nodeTimeoutMs, + }; + this.updateStatusDetail(state, formatShellActionSummary(effectiveExecution)); + await this.store.writeLive(runDir, state, { + type: "node_detail", + nodeId: state.currentNode, + detail: state.statusDetail, + }); + const result = await runShellAction(effectiveExecution); + return { + output: node.parse ? await node.parse(result, context) : result, + rawText: result.combinedOutput, + }; + }, ); - const output = node.parse ? 
await node.parse(result, context) : result; return { output, promptText: null, - rawText: result.combinedOutput, + rawText, sessionInfo: null, agentInfo: null, }; @@ -419,6 +418,7 @@ export class FlowRunner { node: CheckpointNodeDefinition, context: FlowNodeContext, ): Promise { + const nodeTimeoutMs = node.timeoutMs ?? this.defaultNodeTimeoutMs; const output = typeof node.run === "function" ? await this.runWithHeartbeat( @@ -426,11 +426,8 @@ export class FlowRunner { state, state.currentNode ?? "", node, - async () => - await withTimeout( - Promise.resolve(node.run?.(context)), - node.timeoutMs ?? this.defaultNodeTimeoutMs, - ), + nodeTimeoutMs, + async () => await Promise.resolve(node.run?.(context)), ) : { checkpoint: nodeId, @@ -459,18 +456,14 @@ export class FlowRunner { state, state.currentNode ?? "", node, + nodeTimeoutMs, async () => { const resolvedAgent = this.resolveAgent(node.profile); const agentInfo = { ...resolvedAgent, - cwd: await withTimeout( - resolveNodeCwd(resolvedAgent.cwd, node.cwd, context), - nodeTimeoutMs, - ), + cwd: await resolveNodeCwd(resolvedAgent.cwd, node.cwd, context), }; - const prompt = normalizePromptInput( - await withTimeout(Promise.resolve(node.prompt(context)), nodeTimeoutMs), - ); + const prompt = normalizePromptInput(await Promise.resolve(node.prompt(context))); const promptText = promptToDisplayText(prompt); this.updateStatusDetail(state, summarizePrompt(promptText, node.statusDetail)); await this.store.writeLive(runDir, state, { @@ -490,7 +483,7 @@ export class FlowRunner { }; } - boundSession = await this.ensureSessionBinding(state, flow, node, agentInfo); + boundSession = await this.ensureSessionBinding(state, flow, node, agentInfo, nodeTimeoutMs); const rawText = await this.runPersistentPrompt(boundSession, prompt, nodeTimeoutMs); const sessionInfo = await this.refreshSessionBinding(boundSession); state.sessionBindings[sessionInfo.key] = sessionInfo; @@ -549,6 +542,7 @@ export class FlowRunner { state: FlowRunState, 
nodeId: string, node: FlowNodeCommon, + timeoutMs: number | undefined, run: () => Promise<T>, onTimeout?: () => Promise<void>, ): Promise<T> { @@ -574,7 +568,7 @@ } try { - return await run(); + return await withTimeout(run(), timeoutMs); } catch (error) { if (error instanceof TimeoutError && onTimeout) { await onTimeout().catch(() => { @@ -595,6 +589,7 @@ flow: FlowDefinition, node: AcpNodeDefinition, agent: ResolvedFlowAgent, + timeoutMs: number | undefined, ): Promise { const handle = node.session?.handle ?? "main"; const key = createSessionBindingKey(agent.agentCommand, agent.cwd, handle); @@ -613,7 +608,7 @@ nonInteractivePermissions: this.nonInteractivePermissions, authCredentials: this.authCredentials, authPolicy: this.authPolicy, - timeoutMs: this.defaultNodeTimeoutMs, + timeoutMs, verbose: this.verbose, sessionOptions: this.sessionOptions, }); diff --git a/src/flows/store.ts b/src/flows/store.ts index a03eff5..da3387b 100644 --- a/src/flows/store.ts +++ b/src/flows/store.ts @@ -1,3 +1,4 @@ +import { randomUUID } from "node:crypto"; import fs from "node:fs/promises"; import os from "node:os"; import path from "node:path"; @@ -75,7 +76,7 @@ function createLiveState(state: FlowRunState): FlowLiveState { } async function writeJsonAtomic(filePath: string, value: unknown): Promise<void> { - const tempPath = `${filePath}.${process.pid}.${Date.now()}.tmp`; + const tempPath = `${filePath}.${process.pid}.${Date.now()}.${randomUUID()}.tmp`; const payload = JSON.stringify(value, null, 2); await fs.writeFile(tempPath, `${payload}\n`, "utf8"); await fs.rename(tempPath, filePath); diff --git a/test/flows-shell.test.ts b/test/flows-shell.test.ts index a0b395b..d296890 100644 --- a/test/flows-shell.test.ts +++ b/test/flows-shell.test.ts @@ -66,3 +66,14 @@ test("runShellAction times out long-running commands", async () => { (error: unknown) => error instanceof TimeoutError, ); }); + +test("runShellAction rejects 
commands terminated by signal", async () => { + await assert.rejects( + async () => + await runShellAction({ + command: "/bin/sh", + args: ["-c", 'kill -TERM "$$"'], + }), + /signal SIGTERM/, + ); +}); diff --git a/test/flows-store.test.ts b/test/flows-store.test.ts index e5c3b3a..5edf203 100644 --- a/test/flows-store.test.ts +++ b/test/flows-store.test.ts @@ -89,3 +89,62 @@ test("FlowRunStore writes snapshots, live state, and events", async () => { await fs.rm(outputRoot, { recursive: true, force: true }); } }); + +test("FlowRunStore uses unique temp paths for concurrent live writes", async () => { + const outputRoot = await fs.mkdtemp(path.join(os.tmpdir(), "acpx-flow-store-race-")); + const originalDateNow = Date.now; + Date.now = () => 1_700_000_000_000; + + try { + const store = new FlowRunStore(outputRoot); + const runDir = await store.createRunDir("run-race"); + const baseState: FlowRunState = { + runId: "run-race", + flowName: "race", + startedAt: "2026-03-26T00:00:00.000Z", + updatedAt: "2026-03-26T00:00:00.000Z", + status: "running", + input: {}, + outputs: {}, + results: {}, + steps: [], + sessionBindings: {}, + currentNode: "step", + currentNodeKind: "action", + currentNodeStartedAt: "2026-03-26T00:00:00.000Z", + lastHeartbeatAt: "2026-03-26T00:00:00.000Z", + }; + + await Promise.all([ + store.writeLive(runDir, structuredClone(baseState), { + type: "node_heartbeat", + nodeId: "step", + }), + store.writeLive( + runDir, + { + ...structuredClone(baseState), + statusDetail: "updated", + }, + { + type: "node_detail", + nodeId: "step", + }, + ), + ]); + + const live = JSON.parse(await fs.readFile(path.join(runDir, "live.json"), "utf8")) as { + runId?: string; + }; + const events = (await fs.readFile(path.join(runDir, "events.ndjson"), "utf8")) + .trim() + .split("\n") + .map((line) => JSON.parse(line) as { type?: string }); + + assert.equal(live.runId, "run-race"); + assert.equal(events.length, 2); + } finally { + Date.now = originalDateNow; + await 
fs.rm(outputRoot, { recursive: true, force: true }); + } +}); diff --git a/test/flows.test.ts b/test/flows.test.ts index a7f23c6..8fa3d91 100644 --- a/test/flows.test.ts +++ b/test/flows.test.ts @@ -523,6 +523,43 @@ test("FlowRunner times out async shell exec callbacks", async () => { }); }); +test("FlowRunner times out async shell parse callbacks", async () => { + await withTempHome(async () => { + const outputRoot = await fs.mkdtemp(path.join(os.tmpdir(), "acpx-flow-store-")); + const runner = new FlowRunner({ + resolveAgent: () => ({ + agentName: "unused", + agentCommand: "unused", + cwd: process.cwd(), + }), + permissionMode: "approve-all", + outputRoot, + }); + + const flow = defineFlow({ + name: "shell-parse-timeout-test", + startAt: "slow", + nodes: { + slow: shell({ + timeoutMs: 50, + exec: () => ({ + command: process.execPath, + args: ["-e", 'process.stdout.write("ok")'], + }), + parse: async () => await new Promise(() => {}), + }), + }, + edges: [], + }); + + await assert.rejects(async () => await runner.run(flow, {}), TimeoutError); + const runDir = await waitForRunDir(outputRoot, "shell-parse-timeout-test"); + const state = await readRunJson(runDir); + const slowResult = (state.results as Record>).slow; + assert.equal(slowResult.outcome, "timed_out"); + }); +}); + test("FlowRunner times out async ACP prompt callbacks", async () => { await withTempHome(async () => { const outputRoot = await fs.mkdtemp(path.join(os.tmpdir(), "acpx-flow-store-")); @@ -559,6 +596,84 @@ test("FlowRunner times out async ACP prompt callbacks", async () => { }); }); +test("FlowRunner times out async ACP parse callbacks", async () => { + await withTempHome(async () => { + const outputRoot = await fs.mkdtemp(path.join(os.tmpdir(), "acpx-flow-store-")); + const runner = new FlowRunner({ + resolveAgent: () => ({ + agentName: "unused", + agentCommand: "unused", + cwd: process.cwd(), + }), + permissionMode: "approve-all", + outputRoot, + }); + const runnerHarness = runner as unknown 
as { + runIsolatedPrompt: () => Promise; + }; + runnerHarness.runIsolatedPrompt = async () => "hello"; + + const flow = defineFlow({ + name: "acp-parse-timeout-test", + startAt: "slow", + nodes: { + slow: acp({ + session: { + isolated: true, + }, + timeoutMs: 50, + prompt: () => "hello", + parse: async () => await new Promise(() => {}), + }), + }, + edges: [], + }); + + await assert.rejects(async () => await runner.run(flow, {}), TimeoutError); + const runDir = await waitForRunDir(outputRoot, "acp-parse-timeout-test"); + const state = await readRunJson(runDir); + const slowResult = (state.results as Record>).slow; + assert.equal(slowResult.outcome, "timed_out"); + }); +}); + +test("FlowRunner respects per-node timeouts while creating persistent ACP sessions", async () => { + await withTempHome(async () => { + const outputRoot = await fs.mkdtemp(path.join(os.tmpdir(), "acpx-flow-store-")); + const runner = new FlowRunner({ + resolveAgent: () => ({ + agentName: "unused", + agentCommand: "unused", + cwd: process.cwd(), + }), + permissionMode: "approve-all", + outputRoot, + }); + const runnerHarness = runner as unknown as { + ensureSessionBinding: () => Promise; + }; + runnerHarness.ensureSessionBinding = async () => await new Promise(() => {}); + + const flow = defineFlow({ + name: "acp-session-create-timeout-test", + startAt: "slow", + nodes: { + slow: acp({ + timeoutMs: 50, + prompt: () => "hello", + }), + }, + edges: [], + }); + + await assert.rejects(async () => await runner.run(flow, {}), TimeoutError); + const runDir = await waitForRunDir(outputRoot, "acp-session-create-timeout-test"); + const state = await readRunJson(runDir); + const slowResult = (state.results as Record>).slow; + assert.equal(slowResult.outcome, "timed_out"); + }); +}); + test("FlowRunner times out async checkpoint callbacks", async () => { await withTempHome(async () => { const outputRoot = await fs.mkdtemp(path.join(os.tmpdir(), "acpx-flow-store-")); From 
8bc3af0eeebbd6bb7b030d9d3146480d67b78a3b Mon Sep 17 00:00:00 2001 From: Onur Solmaz <2453968+osolmaz@users.noreply.github.com> Date: Thu, 26 Mar 2026 16:05:57 +0100 Subject: [PATCH 21/22] fix: support external flow imports --- src/flows/cli.ts | 70 +++++++++++++++++++--- src/session-runtime/queue-owner-process.ts | 23 +++++++ test/integration.test.ts | 68 ++++++++++++++++++++- test/queue-owner-process.test.ts | 14 +++++ 4 files changed, 167 insertions(+), 8 deletions(-) diff --git a/src/flows/cli.ts b/src/flows/cli.ts index b9df3cd..b5ff55a 100644 --- a/src/flows/cli.ts +++ b/src/flows/cli.ts @@ -1,6 +1,7 @@ +import { randomUUID } from "node:crypto"; import fs from "node:fs/promises"; import path from "node:path"; -import { pathToFileURL } from "node:url"; +import { fileURLToPath, pathToFileURL } from "node:url"; import { InvalidArgumentError, type Command } from "commander"; import { resolveAgentInvocation, @@ -18,6 +19,9 @@ type FlowRunFlags = { defaultAgent?: string; }; +const FLOW_RUNTIME_SPECIFIER = "acpx/flows"; +const TEXT_MODULE_EXTENSIONS = new Set([".js", ".mjs", ".cjs", ".ts", ".tsx", ".mts", ".cts"]); + export async function handleFlowRun( flowFile: string, flags: FlowRunFlags, @@ -77,15 +81,67 @@ async function readFlowInput(flags: FlowRunFlags): Promise { } async function loadFlowModule(flowPath: string): Promise { - const flowUrl = pathToFileURL(flowPath).href; const extension = path.extname(flowPath).toLowerCase(); - const module = await loadFlowRuntimeModule(flowUrl, extension); + const prepared = await prepareFlowModuleImport(flowPath, extension); + try { + const module = await loadFlowRuntimeModule(prepared.flowUrl, extension); + + const candidate = findFlowDefinition(module); + if (!candidate) { + throw new Error(`Flow module must export a flow object: ${flowPath}`); + } + return candidate; + } finally { + await prepared.cleanup?.(); + } +} + +async function prepareFlowModuleImport( + flowPath: string, + extension: string, +): Promise<{ + 
flowUrl: string; + cleanup?: () => Promise<void>; +}> { + const flowUrl = pathToFileURL(flowPath).href; + if (!TEXT_MODULE_EXTENSIONS.has(extension)) { + return { flowUrl }; + } + + const source = await fs.readFile(flowPath, "utf8"); + if (!source.includes(FLOW_RUNTIME_SPECIFIER)) { + return { flowUrl }; + } + + const runtimeSpecifier = resolveFlowRuntimeImportSpecifier(); + const rewritten = source.replaceAll( + /(["'])acpx\/flows\1/g, + (_match, quote: string) => `${quote}${runtimeSpecifier}${quote}`, + ); + if (rewritten === source) { + return { flowUrl }; + } - const candidate = findFlowDefinition(module); - if (!candidate) { - throw new Error(`Flow module must export a flow object: ${flowPath}`); + const tempPath = path.join(path.dirname(flowPath), `.acpx-flow-load-${randomUUID()}${extension}`); + await fs.writeFile(tempPath, rewritten, "utf8"); + return { + flowUrl: pathToFileURL(tempPath).href, + cleanup: async () => { + await fs.rm(tempPath, { force: true }); + }, + }; +} + +function resolveFlowRuntimeImportSpecifier(): string { + const selfPath = fileURLToPath(import.meta.url); + + if (selfPath.endsWith(`${path.sep}src${path.sep}flows${path.sep}cli.ts`)) { + return new URL("../flows.ts", import.meta.url).href; + } + if (selfPath.endsWith(`${path.sep}src${path.sep}flows${path.sep}cli.js`)) { + return new URL("../flows.js", import.meta.url).href; } - return candidate; + return new URL("./flows.js", import.meta.url).href; } async function loadFlowRuntimeModule( diff --git a/src/session-runtime/queue-owner-process.ts b/src/session-runtime/queue-owner-process.ts index 6b3cde7..15f5852 100644 --- a/src/session-runtime/queue-owner-process.ts +++ b/src/session-runtime/queue-owner-process.ts @@ -53,6 +53,29 @@ export function sanitizeQueueOwnerExecArgv( if (value.startsWith("--test-")) { continue; } + if ( + value === "--inspect" || + value === "--inspect-brk" || + value === "--inspect-port" || + value === "--inspect-publish-uid" || + value.startsWith("--inspect=") || + 
value.startsWith("--inspect-brk=") || + value.startsWith("--inspect-port=") || + value.startsWith("--inspect-publish-uid=") || + value === "--debug-port" || + value.startsWith("--debug-port=") + ) { + if ( + value === "--inspect" || + value === "--inspect-brk" || + value === "--inspect-port" || + value === "--inspect-publish-uid" || + value === "--debug-port" + ) { + index += 1; + } + continue; + } sanitized.push(value); } return sanitized; diff --git a/test/integration.test.ts b/test/integration.test.ts index 093f090..82e298f 100644 --- a/test/integration.test.ts +++ b/test/integration.test.ts @@ -220,6 +220,63 @@ test("integration: flow run executes function and shell actions from --input-fil }); }); +test('integration: flow run resolves "acpx/flows" imports for external flow files', async () => { + await withTempHome(async (homeDir) => { + const cwd = await fs.mkdtemp(path.join(os.tmpdir(), "acpx-integration-cwd-")); + const flowDir = await fs.mkdtemp(path.join(os.tmpdir(), "acpx-flow-import-")); + const flowPath = path.join(flowDir, "external.flow.ts"); + + try { + await fs.writeFile( + flowPath, + [ + 'import { compute, defineFlow } from "acpx/flows";', + "", + "export default defineFlow({", + ' name: "external-flow-import",', + ' startAt: "done",', + " nodes: {", + " done: compute({", + ' run: () => ({ ok: true, source: "external" }),', + " }),", + " },", + " edges: [],", + "});", + "", + ].join("\n"), + "utf8", + ); + + const result = await runCli( + ["--approve-all", "--cwd", cwd, "--format", "json", "flow", "run", flowPath], + homeDir, + ); + + assert.equal(result.code, 0, result.stderr); + const payload = JSON.parse(result.stdout.trim()) as { + action?: string; + status?: string; + outputs?: { + done?: { + ok?: boolean; + source?: string; + }; + }; + }; + + assert.equal(payload.action, "flow_run_result"); + assert.equal(payload.status, "completed"); + assert.deepEqual(payload.outputs?.done, { + ok: true, + source: "external", + }); + } finally { + await 
fs.rm(flowDir, { recursive: true, force: true }); + await fs.rm(cwd, { recursive: true, force: true }); + } + }); +}); + test("integration: flow run reports waiting checkpoints in json mode", async () => { await withTempHome(async (homeDir) => { const cwd = await fs.mkdtemp(path.join(os.tmpdir(), "acpx-integration-cwd-")); @@ -1961,9 +2018,18 @@ async function runCli( args: string[], homeDir: string, options: CliRunOptions = {}, +): Promise { + return await runCliWithEntry(CLI_PATH, args, homeDir, options); +} + +async function runCliWithEntry( + entryPath: string, + args: string[], + homeDir: string, + options: CliRunOptions = {}, ): Promise { return await new Promise((resolve, reject) => { - const child = spawn(process.execPath, [CLI_PATH, ...args], { + const child = spawn(process.execPath, [entryPath, ...args], { env: { ...process.env, HOME: homeDir, diff --git a/test/queue-owner-process.test.ts b/test/queue-owner-process.test.ts index 8c839a9..4337b4a 100644 --- a/test/queue-owner-process.test.ts +++ b/test/queue-owner-process.test.ts @@ -75,6 +75,20 @@ describe("sanitizeQueueOwnerExecArgv", () => { ["--import", "tsx", "--loader", "custom-loader"], ); }); + + it("drops debugger flags from queue-owner exec args", () => { + assert.deepEqual( + sanitizeQueueOwnerExecArgv([ + "--inspect-brk=9229", + "--inspect-port", + "9230", + "--debug-port=9231", + "--import", + "tsx", + ]), + ["--import", "tsx"], + ); + }); }); describe("buildQueueOwnerArgOverride", () => { From 7e154e921e74d05b821a7ef2a3bcdca857568657 Mon Sep 17 00:00:00 2001 From: Onur Solmaz <2453968+osolmaz@users.noreply.github.com> Date: Thu, 26 Mar 2026 16:16:31 +0100 Subject: [PATCH 22/22] chore: raise pr triage review timeout --- examples/flows/pr-triage/README.md | 2 +- examples/flows/pr-triage/pr-triage.flow.ts | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/examples/flows/pr-triage/README.md b/examples/flows/pr-triage/README.md index c54b38a..1ba3ae4 100644 --- 
a/examples/flows/pr-triage/README.md +++ b/examples/flows/pr-triage/README.md @@ -133,7 +133,7 @@ These are the current operational timeout assumptions in the single-file executa - `do_superficial_refactor`: 25 minutes - `collect_review_state`: 60 minutes - nested local `codex review` inside `collect_review_state`: 30 minutes -- `review_loop`: 30 minutes +- `review_loop`: 90 minutes - `collect_ci_state`: 15 minutes - `fix_ci_failures`: 30 minutes - `post_close_pr`: 15 minutes diff --git a/examples/flows/pr-triage/pr-triage.flow.ts b/examples/flows/pr-triage/pr-triage.flow.ts index f577356..0fe5769 100644 --- a/examples/flows/pr-triage/pr-triage.flow.ts +++ b/examples/flows/pr-triage/pr-triage.flow.ts @@ -102,7 +102,7 @@ const flow = { kind: "acp", session: MAIN_SESSION, cwd: ({ outputs }) => prepared(outputs).workdir, - timeoutMs: 30 * 60_000, + timeoutMs: 90 * 60_000, async prompt({ outputs }) { return promptReviewLoop(prepared(outputs), outputs); },