126 changes: 68 additions & 58 deletions .claude/skills/entire-external-agent/SKILL.md
@@ -2,101 +2,111 @@
name: entire-external-agent
description: >
Run all three external agent binary phases sequentially: research, write-tests,
and implement using E2E-first TDD (unit tests written last).
Accepts an optional argument to run a single phase: research, write-tests, or implement.
Usage: /entire-external-agent [phase] — omit phase to run full pipeline.
Use when the user says "build external agent", "create agent binary",
"external agent plugin", or wants to run the full pipeline end-to-end.
and implement using black-box-first TDD across protocol compliance, lifecycle
integration, and unit tests. Accepts an optional argument to run a single phase:
research, write-tests, or implement.
---

# External Agent Binary — Full Pipeline

Build a standalone external agent binary that implements the Entire CLI's external agent protocol using E2E-first TDD. Parameters are collected once and reused across all phases.
Build a standalone external agent binary that implements the Entire CLI external agent protocol.

The current test split is:

1. **Protocol compliance** lives in `external-agents-tests`.
2. **Lifecycle integration** lives in this repo's `e2e/` harness.
3. **Unit tests** live in each agent module.

Do not add new generic protocol tests under this repo's `e2e/` directory.

## Parameters

Collect these before starting (ask the user if not provided):
Collect these before starting if the user did not provide them:

| Parameter | Description | How to derive |
|-----------|-------------|---------------|
| `AGENT_NAME` | Human-readable name (e.g., "Windsurf") | User provides |
| `AGENT_SLUG` | Binary suffix: `entire-agent-<AGENT_SLUG>` (kebab-case) | Kebab-case of agent name |
| `LANGUAGE` | Implementation language (Go, Python, TypeScript, Rust) | User provides; default Go |
| `PROJECT_DIR` | Where to create the project | Default: `./entire-agent-<AGENT_SLUG>` |
| `CAPABILITIES` | Which optional capabilities to implement | Derived from research phase |
| `ENTIRE_BIN` | Path to the Entire CLI binary | Default: `entire` from PATH, or `E2E_ENTIRE_BIN` env |
| Parameter | Description | Default |
|-----------|-------------|---------|
| `AGENT_NAME` | Human-readable name (for example, `Windsurf`) | User-provided |
| `AGENT_SLUG` | Binary suffix for `entire-agent-<slug>` | Kebab-case of `AGENT_NAME` |
| `LANGUAGE` | Implementation language | `Go` |
| `PROJECT_DIR` | Agent directory to create or edit | `./agents/entire-agent-<slug>` |
| `ENTIRE_BIN` | Path to the Entire CLI binary for lifecycle testing | `entire` from `PATH` or `E2E_ENTIRE_BIN` |

## Phase Selection

This skill accepts an optional argument to run a single phase:
- `/entire-external-agent research` runs only Phase 1.
- `/entire-external-agent write-tests` runs only Phase 2.
- `/entire-external-agent implement` runs only Phase 3.
- `/entire-external-agent` runs all three phases in order.

- `/entire-external-agent research` — Run only Phase 1 (research)
- `/entire-external-agent write-tests` — Run only Phase 2 (scaffold + E2E tests)
- `/entire-external-agent implement` — Run only Phase 3 (E2E-first TDD implementation)
- `/entire-external-agent` (no argument) — Run all three phases sequentially

If an argument is provided, skip directly to that phase's procedure. Parameters and prerequisites still apply — collect them before starting.
If a single phase is requested, still collect the shared parameters first.
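
The phase selection rule can be illustrated with a short Go sketch. The skill itself is a prompt-driven workflow, not a Go program, so `selectPhases` is a hypothetical helper, and rejecting unknown arguments with an error is an assumption; the document only defines the three valid phase names and the no-argument case.

```go
package main

import (
	"fmt"
	"os"
)

// allPhases lists the pipeline phases in the order they run.
var allPhases = []string{"research", "write-tests", "implement"}

// selectPhases maps the optional argument to the phases to run.
// An empty argument selects the full pipeline; a known phase name
// selects only that phase; anything else is treated as an error
// (an assumption, since the skill does not define this case).
func selectPhases(arg string) ([]string, error) {
	if arg == "" {
		// Return a copy so callers cannot mutate the shared slice.
		return append([]string(nil), allPhases...), nil
	}
	for _, p := range allPhases {
		if p == arg {
			return []string{p}, nil
		}
	}
	return nil, fmt.Errorf("unknown phase %q (want research, write-tests, or implement)", arg)
}

func main() {
	arg := ""
	if len(os.Args) > 1 {
		arg = os.Args[1]
	}
	phases, err := selectPhases(arg)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println("phases to run:", phases)
}
```

In either case the shared parameters are collected before the first selected phase starts.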

## Protocol Spec

Use the protocol specification at:
`https://github.com/entireio/cli/blob/main/docs/architecture/external-agent-protocol.md`

If a user provides a different protocol spec location explicitly, use that instead and pass it to each phase as `PROTOCOL_SPEC_LOCATION`.

## Core Rule: E2E-First TDD
If the user gives a different spec location explicitly, use that instead.

This skill enforces strict E2E-first test-driven development. The rules:
## Core Rule: Black-Box-First TDD

1. **E2E tests are the spec.** The `e2e/` test harness defines what "working" means. The agent binary must pass all E2E tests to be considered complete.
2. **Run E2E tests at every step.** Each implementation tier starts by running the E2E test and watching it fail. You implement until it passes. No exceptions.
3. **Unit tests are written last.** After all E2E tiers pass, you write unit tests using real data collected from E2E runs as golden fixtures.
4. **If you didn't watch it fail, you don't know if it tests the right thing.** Never write a test you haven't seen fail first.
5. **Minimum viable fix.** At each E2E failure, implement only the code needed to fix that failure. Don't anticipate future tiers.
1. **Protocol compliance is the contract.** The binary must pass the shared `external-agents-tests` suite.
2. **Lifecycle tests prove real integration.** The repo-local `e2e/` harness covers the Entire + real-agent workflow and stays separate from generic protocol checks.
3. **Unit tests are written last.** After protocol and lifecycle behavior are working, add unit tests to lock down parsing, hooks, and file handling.
4. **Watch failures before fixing them.** Run the failing test first so you know what behavior the code must satisfy.
5. **Keep the fix scoped.** Implement only the behavior needed for the current failure, then rerun.

## Pipeline

Run these three phases in order. Each phase builds on the previous phase's output.

### Phase 1: Research

Discover the target agent's hook mechanism, transcript format, session management, and configuration. Map native concepts to protocol subcommands. Produces `<PROJECT_DIR>/AGENT.md` with protocol mapping and E2E prerequisites.

Use the Read tool to read the file `.claude/skills/entire-external-agent/research.md` and follow the procedure it contains.
Discover the target agent's hook mechanism, transcript format, session layout, CLI entrypoints, and lifecycle prerequisites. Produce `<PROJECT_DIR>/AGENT.md` with the protocol mapping and any real-CLI requirements needed for lifecycle tests.

**Expected output:** `<PROJECT_DIR>/AGENT.md` — agent research one-pager with protocol mapping and E2E test prerequisites.
Use `.claude/skills/entire-external-agent/research.md`.

**Commit gate:** After the research phase completes, create a git commit for the resulting files.
Expected output:
- `<PROJECT_DIR>/AGENT.md`

**Gate:** If the agent lacks any mechanism for lifecycle hooks or session management, discuss with the user before proceeding. Some agents may only support a subset of the protocol.
### Phase 2: Write Tests

### Phase 2: Write-Tests
Scaffold the binary and the test surfaces you will need:

Scaffold the binary with compilable stubs and create a self-contained `e2e/` test harness in the project directory. The harness exercises the full human workflow: `entire enable`, real agent invocation, hook firing, checkpoint validation. Tests are expected to fail at this stage — they define the spec.
- agent module structure under `<PROJECT_DIR>`
- protocol compliance expectations compatible with `external-agents-tests`
- lifecycle adapter wiring in this repo's `e2e/` harness
- optional compliance fixtures if the agent benefits from stronger black-box detect or transcript assertions

Use the Read tool to read the file `.claude/skills/entire-external-agent/write-tests.md` and follow the procedure it contains.
Use `.claude/skills/entire-external-agent/write-tests.md`.

**Expected output:** Complete project directory at `<PROJECT_DIR>` with compiled binary stubs and `e2e/` test harness that compiles but fails.
Expected output:
- compiling binary scaffold
- any needed lifecycle adapter files under `e2e/agents/`
- optional fixture file paths documented in `<PROJECT_DIR>/AGENT.md` or `README.md`

**Commit gate:** After the scaffold compiles and the e2e harness compiles (`cd e2e && go test -c -tags=e2e`), create a git commit.
### Phase 3: Implement

### Phase 3: Implement (E2E-First, Unit Tests Last)
Implement until:

Build the real agent binary using strict E2E-first TDD. E2E tests drive development at every step — run each tier, watch it fail, implement the minimum fix, repeat. Unit tests are written only after all E2E tiers pass, using real data from E2E runs as golden fixtures.
- the binary passes protocol compliance
- lifecycle tests pass when the required CLIs are available
- unit tests cover the important internal behaviors

Use the Read tool to read the file `.claude/skills/entire-external-agent/implement.md` and follow the procedure it contains.
Use `.claude/skills/entire-external-agent/implement.md`.

**Expected output:** Fully implemented binary where all E2E tests pass and unit tests lock in behavior.

**Note:** `AGENT.md` is a living document — Phases 2 and 3 update it when they discover new information during testing or implementation.
Expected output:
- fully working binary
- passing unit tests
- passing protocol compliance
- passing lifecycle integration where dependencies are available

## Final Summary

After all three phases, summarize:
- Agent name and binary name
- Language used
- Capabilities declared
- E2E test results (all tiers passing)
- Unit test coverage
- Installation instructions (`go install`, `pip install`, etc.)
- Any remaining gaps or TODOs
At the end, summarize:

- agent name and binary name
- implementation language
- declared capabilities
- protocol compliance status
- lifecycle test status
- unit test coverage
- installation instructions
- any remaining gaps