Releases: JohnODowdAI/replaykit
v0.1.0-beta.1
ReplayKit v0.1.0-beta.1
ReplayKit turns failed agent traces into replayable regression cases.
Core loop
- replaykit scaffold: turn a failed trace into a case directory with auto-inferred failure signatures
- replaykit lint: validate the case
- replaykit replay: run against a candidate and verify whether the original failure still reproduces
- replaykit render: re-render saved results
Quick start
pip install git+https://github.com/JohnODowdAI/replaykit.git
replaykit scaffold examples/traces/search-loop.json --out cases/search-loop
replaykit replay cases/search-loop/case.yaml --runner-cmd "python examples/runners/reproduce_failure.py"
replaykit replay cases/search-loop/case.yaml --runner-cmd "python examples/runners/fixed_runner.py"
Best first use cases
- Repeated tool loop / tool call storm
- Missing final answer
- Hard error at end of run
- Known-bad output phrase in final response
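To make the four failure shapes above concrete, here is a minimal Python sketch of how each one might be detected in a trace. It is an illustration only, not ReplayKit's implementation: the trace schema (a list of step dicts with a `type` of `tool_call`, `final`, or `error`), the `detect_failures` function, its `loop_threshold` and `bad_phrases` parameters, and the label names are all hypothetical.

```python
from collections import Counter

# Hypothetical trace shape, NOT ReplayKit's actual schema: a list of step
# dicts, each with a "type" of "tool_call", "final", or "error".


def detect_failures(steps, loop_threshold=3, bad_phrases=()):
    """Return a sorted list of failure labels found in a trace."""
    found = set()

    # Repeated tool loop: the same (tool, args) pair called many times.
    calls = Counter(
        (s["tool"], repr(s.get("args")))
        for s in steps
        if s["type"] == "tool_call"
    )
    if any(count >= loop_threshold for count in calls.values()):
        found.add("tool_loop")

    # Missing final answer: the trace contains no "final" step at all.
    finals = [s for s in steps if s["type"] == "final"]
    if not finals:
        found.add("missing_final_answer")

    # Hard error at end of run: the last step is an error.
    if steps and steps[-1]["type"] == "error":
        found.add("hard_error_at_end")

    # Known-bad output phrase: case-insensitive match in the last final step.
    if finals:
        text = finals[-1].get("text", "").lower()
        if any(phrase.lower() in text for phrase in bad_phrases):
            found.add("bad_output_phrase")

    return sorted(found)


looping_trace = [
    {"type": "tool_call", "tool": "search", "args": {"q": "weather"}},
    {"type": "tool_call", "tool": "search", "args": {"q": "weather"}},
    {"type": "tool_call", "tool": "search", "args": {"q": "weather"}},
    {"type": "error", "message": "step limit exceeded"},
]

print(detect_failures(looping_trace))
# ['hard_error_at_end', 'missing_final_answer', 'tool_loop']
```

All four checks are purely structural or string-based, which is consistent with LLM semantic grading being out of scope for this beta.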
What is intentionally not built yet
- Vendor trace importers (LangSmith, OpenAI, Anthropic)
- LLM semantic grading
- Dashboard / hosted service
- GitHub PR integration
This is a local-first beta. Feedback welcome via issues.