Skip to content

v0.4.0 — Eval

Choose a tag to compare

@TheStack-ai TheStack-ai released this 26 Mar 10:59
· 9 commits to main since this release

What's New

pulser eval — Skill Testing

Test your skills against real inputs. Write eval.yaml next to your SKILL.md:

tests:
  - name: "catches bugs"
    input: "Review: function add(a,b) { return a - b }"
    assert:
      - contains: "subtract"
      - min-length: 30

Run pulser eval — it executes each test through claude -p, checks assertions, and tracks regressions automatically.

Assertions

contains, not-contains, min-length, max-length, matches (regex)

Baseline & Regression

  • First run saves results as baseline
  • Subsequent runs compare against baseline
  • Exit code 3 on regression (previously passing test now fails)

Other Changes

  • Updated landing page, README (EN/KO), SKILL.md
  • Version bump to 0.4.0

Full Changelog: v0.3.1...v0.4.0