Releases: enkronos/agentevalops
Public Alpha — v0.1.0-alpha.1
AgentEvalOps — Public Alpha
Failure-mode evaluation harness for agent systems.
AgentEvalOps provides structured scenarios and evaluation tooling
to test how agent systems behave under failure conditions.
Key capabilities in this release:
• Scenario specification for agent failure modes
• Scenario validation via JSON schema
• Evaluation runner and scorecard generation
• Pack workflows for reproducible evaluation suites
• CLI interface for validation, inspection, and evaluation
• Schema-validated reports for evaluation results
• CI, tests, and release infrastructure
Example CLI commands:
agentevalops scenario validate
agentevalops scenario inspect
agentevalops evaluate
agentevalops pack validate
agentevalops pack inspect
agentevalops pack evaluate
Project status:
This is an early public alpha release.
APIs, schemas, and CLI behavior may evolve
as the project matures.
Relationship to other projects:
AgentEvalOps is part of a collection of small open infrastructure
modules for reliable and governable agent systems, including:
• SkillPulse — detect and heal broken agent skills
• Agent Contracts — structured task delegation
• CapToken — capability tokens for authorization
• TraceForge — portable execution traces
• GuardMesh — runtime policy enforcement (in development)
Feedback and contributions are welcome.