
Releases: JohnODowdAI/replaykit

v0.1.0-beta.1

23 Mar 22:34


v0.1.0-beta.1 Pre-release

ReplayKit v0.1.0-beta.1

ReplayKit turns failed agent traces into replayable regression cases.

Core loop

  1. replaykit scaffold — turn a failed trace into a case directory with auto-inferred failure signatures
  2. replaykit lint — validate the case
  3. replaykit replay — run the case against a candidate runner and check whether the original failure still reproduces
  4. replaykit render — re-render saved results

Quick start

pip install git+https://github.com/JohnODowdAI/replaykit.git
replaykit scaffold examples/traces/search-loop.json --out cases/search-loop
replaykit replay cases/search-loop/case.yaml --runner-cmd "python examples/runners/reproduce_failure.py"
replaykit replay cases/search-loop/case.yaml --runner-cmd "python examples/runners/fixed_runner.py"
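
The two replay commands above differ only in the runner, so the comparison is easy to script. Below is a minimal sketch that shells out to the CLI, using only the `replay` subcommand and `--runner-cmd` flag shown in the quick start. The helper names `replay_argv` and `run_replays` are hypothetical. These notes do not document what the exit code means, so the sketch records it without interpreting it.

```python
import subprocess


def replay_argv(case_path: str, runner_cmd: str) -> list[str]:
    """Build the argv for `replaykit replay` from the flags shown in the quick start."""
    return ["replaykit", "replay", case_path, "--runner-cmd", runner_cmd]


def run_replays(case_path: str, runner_cmds: list[str]) -> list[tuple[str, int]]:
    """Replay one case against each runner and collect (runner_cmd, returncode).

    Assumption: the meaning of the exit code is not stated in the release
    notes, so it is only recorded here, not treated as pass/fail.
    """
    results = []
    for runner_cmd in runner_cmds:
        proc = subprocess.run(replay_argv(case_path, runner_cmd), check=False)
        results.append((runner_cmd, proc.returncode))
    return results
```

With ReplayKit installed, `run_replays("cases/search-loop/case.yaml", ["python examples/runners/reproduce_failure.py", "python examples/runners/fixed_runner.py"])` would run both quick-start replays in order.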

Best first use cases

  • Repeated tool loop / tool call storm
  • Missing final answer
  • Hard error at end of run
  • Known-bad output phrase in final response
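
To make the four failure shapes above concrete, here is a rough sketch of how each could be detected in a trace. It is illustrative only and is not ReplayKit's implementation. The trace layout (a list of dicts with `type`, `name`, `args`, and `text` keys), the function `infer_signatures`, and the signature labels are all assumptions made for this example.

```python
import json
from collections import Counter


def infer_signatures(events, loop_threshold=3, bad_phrases=()):
    """Return labels for the failure shapes found in a hypothetical trace.

    `events` is assumed to be a list of dicts such as
    {"type": "tool_call", "name": "search", "args": {...}},
    {"type": "error", "text": "..."} or {"type": "final_answer", "text": "..."}.
    """
    signatures = []

    # Repeated tool loop: the same tool with identical arguments,
    # called at least `loop_threshold` times.
    calls = Counter(
        (e.get("name"), json.dumps(e.get("args", {}), sort_keys=True))
        for e in events
        if e.get("type") == "tool_call"
    )
    if any(count >= loop_threshold for count in calls.values()):
        signatures.append("repeated_tool_loop")

    # Missing final answer: no final_answer event anywhere in the trace.
    finals = [e for e in events if e.get("type") == "final_answer"]
    if not finals:
        signatures.append("missing_final_answer")

    # Hard error at end of run: the last event is an error.
    if events and events[-1].get("type") == "error":
        signatures.append("hard_error_at_end")

    # Known-bad phrase: case-insensitive substring match on the last final answer.
    final_text = finals[-1].get("text", "").lower() if finals else ""
    if any(phrase.lower() in final_text for phrase in bad_phrases):
        signatures.append("bad_output_phrase")

    return signatures


# A trace that calls the same search three times and then errors out.
looping_trace = [
    {"type": "tool_call", "name": "search", "args": {"q": "x"}},
    {"type": "tool_call", "name": "search", "args": {"q": "x"}},
    {"type": "tool_call", "name": "search", "args": {"q": "x"}},
    {"type": "error", "text": "max steps exceeded"},
]
print(infer_signatures(looping_trace))
# ['repeated_tool_loop', 'missing_final_answer', 'hard_error_at_end']
```

All four checks are structural or plain string matches, in keeping with the notes listing LLM semantic grading as not built yet.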

What is intentionally not built yet

  • Vendor trace importers (LangSmith, OpenAI, Anthropic)
  • LLM semantic grading
  • Dashboard / hosted service
  • GitHub PR integration

This is a local-first beta. Feedback welcome via issues.