@@ -73,7 +81,7 @@ Evals are **LLM-judged integration tests** that verify skills and hooks work cor
 | Field | Type | Required | Description |
 |-------|------|----------|-------------|
 |`version`|`number`|**Yes**| Eval config format version. Currently `1`. |
-|`engine`|`string`|**Yes**| Agent runtime to use: `"claude-code"`, `"copilot"`, `"codex"`, `"cursor"`. |
+|`engine`|`string`|**Yes**| Agent runtime to use. Supported values: `"claude-code"`, `"copilot"`, `"codex"`, `"cursor"`. Current headless eval support is shown in [Platform Eval Entry Points](#platform-eval-entry-points). |
 |`timeout`|`number`| No | Max seconds per eval case. Default `120`. |
 |`judge`|`string`| No | Model used for LLM-as-judge assessment. Default: same as engine model. |
 |`sandbox.network`|`bool`| No | Allow network access in sandbox. Default `false`. |
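
Putting the fields above together, a minimal eval config might look like the following sketch. The file name and the YAML layout (including nesting `sandbox.network` as a `network` key under `sandbox`) are assumptions; only the field names, types, and defaults come from the table:

```yaml
# evals.yaml — hypothetical file name, for illustration only
version: 1              # eval config format version (required; currently 1)
engine: "claude-code"   # agent runtime (required)
timeout: 180            # optional; overrides the 120-second per-case default
judge: "claude-code"    # optional; defaults to the engine's model
sandbox:
  network: false        # optional; network access is off by default
```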