Commit 1756da8

fix: update package structure and validation rules for clarity
1 parent 3703406 commit 1756da8

3 files changed, +17 -9 lines changed

docs/spec/01-package-format.md

Lines changed: 15 additions & 7 deletions
@@ -9,7 +9,11 @@ package-name/
 │       ├── SKILL.md  # Skill instructions (REQUIRED per skill)
 │       ├── scripts/  # Executable scripts
 │       ├── references/  # Documentation files
-│       └── assets/  # Templates, data, images
+│       ├── assets/  # Templates, data, images
+│       └── tests/  # Deterministic skill tests (optional)
+│           ├── test-config.json  # Test runner config
+│           ├── fixtures/  # Test input files
+│           └── cases/  # Test cases (YAML)
 ├── commands/  # Slash commands (optional)
 │   └── command-name.md
 ├── agents/  # Sub-agent definitions (optional)
@@ -20,16 +24,20 @@ package-name/
 │   └── rule-name/
 │       └── RULE.md
 ├── hooks/  # Lifecycle hooks (optional)
-│   └── hooks.json
+│   ├── hooks.json  # Hook definitions
+│   ├── scripts/  # Shell scripts referenced by hooks
+│   └── tests/  # Deterministic hook tests (optional)
+│       ├── test-config.json  # Test runner config
+│       ├── fixtures/  # Simulated event payloads (JSON)
+│       └── cases/  # Test cases (YAML)
 ├── mcp/  # MCP server configs (optional)
 │   └── servers.json
 ├── evals/  # LLM-judged evaluations (optional)
 │   ├── eval-config.json  # Eval runner configuration
 │   ├── fixtures/  # Shared test fixtures
-│   │   └── sample.pdf
-│   └── cases/  # Eval cases
-│       ├── 01-skill-e2e.yaml  # Skill end-to-end eval
-│       └── 02-hook-integration.yaml  # Hook integration eval
+│   ├── cases/  # Eval cases (YAML)
+│   └── reports/  # Eval run reports (auto-generated)
+│       └── <timestamp>.json  # Full provenance per run
 ├── AGENTS.md  # Universal agent instructions (optional)
 ├── README.md  # Human documentation
 ├── CHANGELOG.md  # Version history
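
Review note: to see what the new `tests/` layout implies on disk, here is a minimal Python sketch that reports which of the three entries shown in the tree (`test-config.json`, `fixtures/`, `cases/`) are absent from a given `tests/` directory. The function name `missing_test_entries` is hypothetical, and the diff does not say whether all three entries are mandatory once `tests/` exists, so the check is illustrative only.

```python
from pathlib import Path

# Entries shown under tests/ in the updated tree. Whether each one is
# mandatory is an assumption; the diff only marks tests/ itself as optional.
EXPECTED_TEST_ENTRIES = {
    "test-config.json": "file",
    "fixtures": "dir",
    "cases": "dir",
}


def missing_test_entries(tests_dir: Path) -> list[str]:
    """Return the expected entries that are absent from tests_dir."""
    missing = []
    for name, kind in EXPECTED_TEST_ENTRIES.items():
        path = tests_dir / name
        present = path.is_file() if kind == "file" else path.is_dir()
        if not present:
            missing.append(name)
    return missing
```

The same function would apply to both `skills/<skill>/tests/` and `hooks/tests/`, since the tree gives them the same three entries.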
@@ -73,7 +81,7 @@ Evals are **LLM-judged integration tests** that verify skills and hooks work cor
 | Field | Type | Required | Description |
 |-------|------|----------|-------------|
 | `version` | `number` | **Yes** | Eval config format version. Currently `1`. |
-| `engine` | `string` | **Yes** | Agent runtime to use: `"claude-code"`, `"copilot"`, `"codex"`, `"cursor"`. |
+| `engine` | `string` | **Yes** | Agent runtime to use. Supported values: `"claude-code"`, `"copilot"`, `"codex"`, `"cursor"`. Current headless eval support is shown in [Platform Eval Entry Points](#platform-eval-entry-points). |
 | `timeout` | `number` | No | Max seconds per eval case. Default `120`. |
 | `judge` | `string` | No | Model used for LLM-as-judge assessment. Default: same as engine model. |
 | `sandbox.network` | `bool` | No | Allow network access in sandbox. Default `false`. |
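
Review note: the field table above could be applied to a parsed `eval-config.json` roughly as follows. This is a sketch under assumptions, not the runner's actual loader: the function name `normalize_eval_config` is hypothetical, `sandbox.network` is assumed to be a nested `sandbox` object with a `network` key, and a missing `judge` is represented as `None` to stand for "same as engine model".

```python
SUPPORTED_ENGINES = {"claude-code", "copilot", "codex", "cursor"}


def normalize_eval_config(raw: dict) -> dict:
    """Validate the required fields and fill in the documented defaults."""
    if raw.get("version") != 1:
        raise ValueError("version is required and is currently 1")
    engine = raw.get("engine")
    if engine not in SUPPORTED_ENGINES:
        raise ValueError(f"unsupported engine: {engine!r}")
    # Assumption: the dotted field name maps to a nested "sandbox" object.
    sandbox = raw.get("sandbox", {})
    return {
        "version": 1,
        "engine": engine,
        "timeout": raw.get("timeout", 120),  # max seconds per eval case
        # None stands for "same as engine model" (hypothetical encoding).
        "judge": raw.get("judge"),
        "sandbox": {"network": bool(sandbox.get("network", False))},
    }
```

For example, `normalize_eval_config({"version": 1, "engine": "codex"})` fills in a 120-second timeout and a sandbox with network access disabled.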

docs/spec/02-manifest.md

Lines changed: 1 addition & 1 deletion
@@ -87,7 +87,7 @@ The manifest is the **single required file**. It identifies the package and decl
   "evals": [
     {
       "name": "accuracy-eval",
-      "path": "evals/accuracy.yaml",
+      "path": "evals/cases/accuracy-eval.yaml",
       "description": "Measures accuracy against benchmark",
       "metrics": [
         { "name": "accuracy", "type": "percentage" }
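
Review note: the updated example moves the eval case under `evals/cases/`, matching the directory tree in `01-package-format.md`. The diff suggests this by example but does not state it as a rule, so the following Python sketch treats "every eval `path` lives under `evals/cases/`" as an assumption. The function name `evals_outside_cases` is hypothetical.

```python
from pathlib import PurePosixPath


def evals_outside_cases(manifest: dict) -> list[str]:
    """Return names of eval entries whose path is not under evals/cases/."""
    outside = []
    for entry in manifest.get("evals", []):
        # Manifest paths are assumed to be POSIX-style and package-relative.
        parts = PurePosixPath(entry["path"]).parts
        if parts[:2] != ("evals", "cases"):
            outside.append(entry["name"])
    return outside
```

Run against the old example (`evals/accuracy.yaml`) this would flag `accuracy-eval`; against the new path it would return an empty list.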

docs/spec/15-validation.md

Lines changed: 1 addition & 1 deletion
@@ -2,7 +2,7 @@
 
 | Rule | Constraint |
 |------|-----------|
-| Package `name` | `[a-z0-9-]`, max 64 chars |
+| Package `name` | `[a-z0-9-]` max 64 chars, or scoped `@scope/name` max 130 chars |
 | Package `version` | Valid SemVer 2.0 |
 | Skill `description` | Max 1024 chars, must describe WHAT + WHEN |
 | SKILL.md body | Recommended < 5,000 tokens / 500 lines |
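
Review note: the new `name` constraint could be checked along these lines. This is a minimal sketch, assuming the character class `[a-z0-9-]` applies to every character of a plain name and to both the scope and the name part of a scoped one. The table only gives the overall 130-character limit for scoped names (which happens to equal `@` + 64 + `/` + 64), so no per-part limit is enforced here. The function name `is_valid_package_name` is hypothetical.

```python
import re

# Assumption: one or more characters, all drawn from the class in the table.
_PLAIN = re.compile(r"[a-z0-9-]+")


def is_valid_package_name(name: str) -> bool:
    """Check a package name against the plain or scoped form in the table."""
    if name.startswith("@"):
        # Scoped form "@scope/name": at most 130 characters overall.
        scope, sep, rest = name[1:].partition("/")
        return (
            len(name) <= 130
            and sep == "/"
            and _PLAIN.fullmatch(scope) is not None
            and _PLAIN.fullmatch(rest) is not None
        )
    # Plain form: at most 64 characters.
    return len(name) <= 64 and _PLAIN.fullmatch(name) is not None
```

Under these assumptions `pdf-tools` and `@acme/pdf-tools` pass, while `PDF_Tools`, `@acme`, and a 65-character plain name do not.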
