Releases: mulkatz/anvil

v2.1.0

20 Feb 19:52

Anvil v2.1.0

HTML export, hardened text escaping, and a comprehensive test suite (370 tests).

Highlights

HTML export

  • --output flag — Save debate results to a custom path (--output ./reports/my-analysis.md)
  • Automatic HTML export — Use a .html extension and get a self-contained HTML report with embedded CSS
  • XSS-safe rendering — HTML entities escaped, javascript: URLs sanitized, code blocks protected
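
As a rough illustration of the escaping described above, the sketch below shows one way to do it in shell. The function names `html_escape` and `sanitize_url` are hypothetical and not taken from Anvil's source, and the URL check covers only the `javascript:` case named in the bullet.

```shell
# Hypothetical helper: escape the five HTML-significant characters.
# The ampersand rule runs first so entities added by later rules
# are not escaped a second time.
html_escape() {
  printf '%s' "$1" | sed \
    -e 's/&/\&amp;/g' \
    -e 's/</\&lt;/g' \
    -e 's/>/\&gt;/g' \
    -e 's/"/\&quot;/g' \
    -e "s/'/\&#39;/g"
}

# Hypothetical helper: replace javascript: URLs with a harmless "#".
# Whitespace and control characters are stripped and the text lowercased
# before the scheme check, so "Java<TAB>Script:" is caught as well.
sanitize_url() {
  case $(printf '%s' "$1" | tr -d '[:space:][:cntrl:]' | tr '[:upper:]' '[:lower:]') in
    javascript:*) printf '#' ;;
    *)            printf '%s' "$1" ;;
  esac
}
```

With these definitions, `html_escape '<script>alert("x")</script>'` prints `&lt;script&gt;alert(&quot;x&quot;)&lt;/script&gt;`, and `sanitize_url 'JavaScript:alert(1)'` prints `#`.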

Security & robustness

  • Hardened YAML escaping — Newlines, tabs, carriage returns, backslashes, and double quotes all survive the full write → store → read round-trip
  • Robust unescape pipeline — Replaced fragile sed-based unescaping with an awk placeholder approach that correctly disambiguates \\n vs \n
  • Persona name validation — Rejects pipe delimiters (|), HTML comment markers (<!--/-->), and embedded newlines that would corrupt state parsing
  • Pipefail-safe parsing — Missing frontmatter fields no longer crash the stop-hook
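
The placeholder idea can be sketched as follows. This is a minimal illustration, not Anvil's actual code: the names `yaml_escape` and `yaml_unescape` are hypothetical, and the choice of the byte `\001` as placeholder assumes that byte never appears in real input. Escaped backslashes are parked on the placeholder first, so a stored `\\n` (backslash followed by `n`) can no longer be mistaken for `\n` (newline).

```shell
# Hypothetical helper: escape a value so it fits on one double-quoted line.
yaml_escape() {
  printf '%s\n' "$1" | awk '
    {
      out = ""
      n = length($0)
      for (i = 1; i <= n; i++) {
        c = substr($0, i, 1)
        if (c == "\\")      out = out "\\\\"
        else if (c == "\"") out = out "\\\""
        else if (c == "\t") out = out "\\t"
        else if (c == "\r") out = out "\\r"
        else                out = out c
      }
      # Records are split on newlines, so rejoin them with a literal \n
      printf "%s%s", (NR > 1 ? "\\n" : ""), out
    }'
}

# Hypothetical helper: reverse yaml_escape using a placeholder byte.
yaml_unescape() {
  printf '%s\n' "$1" | awk '
    {
      s = $0
      gsub(/\\\\/, "\001", s)   # park every escaped backslash first
      gsub(/\\n/, "\n", s)      # the remaining \n sequences are real newlines
      gsub(/\\t/, "\t", s)
      gsub(/\\r/, "\r", s)
      gsub(/\\"/, "\"", s)
      # Turn placeholders back into single backslashes. split and join
      # sidestep backslash rules in gsub replacements, which vary by awk.
      n = split(s, parts, "\001")
      s = parts[1]
      for (i = 2; i <= n; i++) s = s "\\" parts[i]
      printf "%s", s
    }'
}
```

A naive chain of replacements that handled `\n` before `\\` would turn the stored text `\\n` into a backslash followed by a newline. With the placeholder pass, `yaml_unescape '\\n'` yields the two characters `\n`, as intended.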

Test suite

  • 370 bats tests covering setup, stop-hook, and integration scenarios
  • Adversarial input tests — Special characters, escaping edge cases, injection attempts
  • Combinatorial tests — Every mode × framework × focus combination verified
  • Round-trip tests — End-to-end through setup → hook → result pipeline
  • HTML report tests — Rendering correctness and XSS protection
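
To give a sense of what the combinatorial coverage involves, here is a plain-shell sketch (not bats, and not taken from the actual suite) that enumerates every combination of three option lists. The helper name `combinations` is hypothetical, and the option values `security` and `cost` are assumed spellings of the focus values.

```shell
# Hypothetical helper: print one "mode framework focus" line per combination.
# Each argument is a space-separated list of option values.
combinations() {
  for mode in $1; do
    for framework in $2; do
      for focus in $3; do
        printf '%s %s %s\n' "$mode" "$framework" "$focus"
      done
    done
  done
}

# 2 modes x 1 framework x 3 focus values = 6 combinations
combinations "analyst stakeholders" "adr" "performance security cost"
```

Each output line could then drive one test case, for example by passing the three values to the setup script and asserting on the generated state file.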

Documentation

  • ADRs for v2 architecture decisions (005–009)
  • Test coverage requirements and conventions documented in CLAUDE.md

v2.0.0

19 Feb 17:06

Anvil v2.0.0

Structured debates become a full decision-making toolkit: eight new features make Anvil context-aware, framework-driven, and interactive.

Highlights

New capabilities

  • Decision frameworks (--framework) — Structure synthesis as ADR, pre-mortem, red-team, RFC, or risk register
  • Focus lens (--focus) — Narrow the debate to a single dimension: security, performance, cost, DX, maintainability, or custom
  • Code-aware debates (--context, --pr, --diff) — Inject real code, PRs, or diffs as debate context
  • Interactive mode (--interactive) — Steer the debate between rounds with user input
  • Debate chains (--follow-up, --versus) — Build on prior results or pit two analyses against each other

New modes

  • Stakeholder simulation (--mode stakeholders) — Each round is a different stakeholder perspective instead of adversarial sides
  • Custom personas (--persona) — Replace Advocate/Critic with named personas, chosen from 4 presets or given as free-text descriptions

Improved synthesis

  • Confidence calibration — Synthesizer derives confidence from debate dynamics (survived arguments, convergence, dismantled claims) rather than gut feeling

Combinability

All options compose freely:

/anvil:anvil "Should we adopt gRPC?" \
  --mode analyst \
  --framework adr \
  --focus performance \
  --context src/api/ \
  --research \
  --interactive \
  --rounds 2

v1.0.0

19 Feb 15:33

Anvil v1.0.0

Adversarial thinking plugin for Claude Code. Stress-test ideas through structured debates.

Highlights

Three debate modes

  • Analyst — Evidence-based technical analysis with data and benchmarks
  • Philosopher — Socratic exploration using first-principles reasoning
  • Devil's Advocate — Reversed roles that attack YOUR stated position

Core features

  • Multi-round debates — Configurable 1-5 rounds of Advocate vs. Critic argumentation
  • Synthesizer phase — Balanced final analysis with clear recommendation and confidence rating
  • Web research (--research) — Ground arguments in real-time evidence via WebSearch
  • Human-readable state — Debate transcript in Markdown with YAML frontmatter, inspectable at any time
  • Result file — Final analysis saved to .claude/anvil-result.local.md
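
Because the state is Markdown with YAML frontmatter, a script can read a field with a few lines of awk. The sketch below is an assumption-laden illustration: the helper name `frontmatter_get` and the field names `round` and `phase` are hypothetical, and only flat `key: value` lines are handled.

```shell
# Hypothetical helper: print the value of one frontmatter field, or nothing
# if the field is absent. awk exits 0 either way, whereas grep returns 1 on
# no match, which aborts a script running with errexit and pipefail enabled.
frontmatter_get() {
  awk -v key="$1" '
    /^---[ \t]*$/ { fence++; next }   # count the --- delimiter lines
    fence >= 2 { exit }               # past the closing delimiter: stop
    fence == 1 {
      sep = index($0, ":")
      if (sep > 0 && substr($0, 1, sep - 1) == key) {
        value = substr($0, sep + 1)
        sub(/^[ \t]+/, "", value)
        print value
        exit
      }
    }
  ' "$2"
}
```

Called as `frontmatter_get round state.md` (the file name is a placeholder), it prints the field's value, and it prints nothing for a field that does not exist. Lines in the transcript body that happen to look like `round: 99` are ignored because scanning stops at the closing `---`.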

Commands

  • /anvil:anvil "question" — Start a debate
  • /anvil:anvil-status — Check current round and phase
  • /anvil:anvil-cancel — Cancel an active debate

Installation

/plugin marketplace add Franjoo/anvil
/plugin install anvil@franjoo

Architecture

No TypeScript, no build step. Shell scripts orchestrate; Markdown prompts instruct. Anvil uses Claude Code's stop hook mechanism to drive a state machine through the debate phases within a single session.
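
A state machine of this kind can be sketched as a single transition function. The phase names (`advocate`, `critic`, `synthesizer`, `done`) and the function `next_state` are assumptions made for illustration. The real hook's phases, state format, and wiring into Claude Code are not shown here.

```shell
# Hypothetical transition function.
# Arguments: current phase, current round, maximum rounds.
# Prints the next "phase round" pair.
next_state() {
  case $1 in
    advocate)
      echo "critic $2" ;;
    critic)
      if [ "$2" -lt "$3" ]; then
        echo "advocate $(($2 + 1))"   # more rounds remain: start the next one
      else
        echo "synthesizer $2"         # last round finished: move to synthesis
      fi ;;
    *)
      echo "done $2" ;;               # synthesizer or unknown phase ends the run
  esac
}
```

For a two-round debate this walks through `advocate 1`, `critic 1`, `advocate 2`, `critic 2`, `synthesizer 2`, and finally `done 2`. A hook script built this way would read the current pair from the state file, call the function, and write the result back before handing control to the next prompt.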