autospec

autospec started as a practical experiment inside a project that leaned hard into writing specs before writing code.

We deliberately built up a large set of product, architecture, UI, and design docs before implementation because we wanted coding agents to have a solid documentation base to work from. That helped, but it also exposed a new bottleneck: the docs themselves were often close to usable without being quite precise enough, consistent enough, or explicit enough to drive implementation without follow-up questions.

The idea behind autospec was simple: if we can run an agent in a loop against a spec, compare the text before and after each pass, and stop when the changes flatten out, then we can try to make the specification itself converge before we ask another agent to write code from it.

This project is the result of repeatedly trying that idea on a real repository, watching it fail, and then tightening both the code and the prompt until the results became useful.

Who this is for

autospec is for teams that already keep meaningful project knowledge in markdown.

It is most useful if you have:

product specs that drive engineering work
UI or API docs that need to stay consistent with canonical docs
a docs-heavy repo where coding agents read markdown before writing code
a willingness to tune prompts and workflow for your own domain

It is probably not a good fit if your project has no written specs at all, or if you want a one-shot generator that invents an entire product definition from nothing.

What it does

autospec runs an AI coding agent in a convergence loop.

For each iteration it:

builds a prompt around the current target doc or scope
lets the agent edit the docs directly
snapshots the resulting text
compares the new text to the previous iteration
stops when the doc stabilizes, oscillates, times out, or hits the iteration limit

It supports three working modes:

strict: only the target doc is in scope
ripple: focus on one doc, but allow limited cross-doc fixes in the surrounding folder
sweep: review every doc in a folder and edit the smallest set needed to resolve real issues

Current capabilities

The current implementation already covers a fair amount of the workflow:

convergence detection from text snapshots rather than prompt-string matching
repeated-state detection to stop obvious oscillation
strict, ripple, and sweep execution modes
multi-agent execution via Copilot, Codex, Claude, Gemini, or a custom command
optional goal injection for targeted runs
optional reasoning-effort passthrough for agents that support it
configurable per-iteration timeout, including disabled timeout for long runs
optional scope-size cap for users who want a safety rail on broad runs
no-commit mode for experimentation without git writes
dirty-doc guardrails with explicit opt-out
per-iteration logs and tabular results tracking
optional --no-artifacts mode to avoid creating a repo-local .autospec/ directory
sparse-seed warnings for repos that are too thin to produce project-specific output reliably

Current status

autospec is still an experiment.

It works well enough to be useful, but it is not a finished product.

What works well today:

single-doc improvement runs
bounded multi-file runs such as a UI folder or design folder
repo-wide ripple runs from a canonical doc

What still needs care:

full-repo sweep runs across a large docs tree
very sparse or nearly empty seed docs
prompt tuning for repositories that use different document structure or terminology than the project that shaped the initial prompt

What we learned by dogfooding it

autospec improved because we used it against a real project's docs and then treated each failure as a product bug.

Some of the major changes came directly from those experiments:

convergence detection moved from prompt-string matching to text snapshots
repeated-state detection was added to catch obvious oscillation
iteration-aware prompting was added so later passes stop polishing and start saying CONVERGED
broad sweeps were tightened to prefer canonical docs first and shrink the touched-file set over time
timeout handling became configurable instead of hard-coded
large-scope runs now expose enough logging to see which files actually moved and by how much

That matters because the tool is not just a script. The prompt and the loop design are part of the product.

What the experiments looked like

The early version did not converge reliably.

Initial live runs on a real docs-heavy project produced patterns like:

repeated 10 to 30 line churn on already-good docs
cross-doc edits that were broader than intended
oscillation between competing wordings
time spent on style changes after the real issues were already fixed

After tightening the prompt and loop logic, the behavior changed materially:

small docs converged in 2 to 4 iterations
medium docs that previously failed began converging consistently
docs/ui folder sweeps became practical
repo-wide ripple runs became reliable
full sweep docs/ runs became far more disciplined, though still not the strongest mode

This is the main reason the README emphasizes experimentation rather than claiming the tool is universally solved.

Bootstrapping and sparse starting points

autospec improves existing docs. It is not currently a true scaffold generator.

Here is the current behavior:

missing target file: hard error
empty docs directory: hard error
one small seed doc: works, but the output is usually generic
one almost-empty seed doc: usually converges to a conservative statement that requirements are unspecified

In practice, autospec needs at least one seed markdown doc with a little real content.

If you want a useful first pass, give it some anchor points such as:

the core nouns in the system
one or two important workflows
explicit constraints or out-of-scope lines
a canonical README that explains what the project is

Without that, the safest thing for the agent to do is stay generic or refuse to invent specifics.

Requirements

an AI coding agent CLI installed locally
markdown docs in the target repository

If you are building from source instead of using a release binary, you also need:

Rust stable

Tested agent CLIs:

GitHub Copilot CLI
OpenAI Codex CLI
Claude Code CLI
Gemini CLI
arbitrary custom commands via --agent-cmd

Git is strongly recommended, but not strictly required if you run with --no-commit --allow-dirty.

Installation

Homebrew on macOS:

brew tap localtaskrepo/lotar
brew install autospec
autospec --version

Scoop on Windows:

scoop bucket add lotar https://github.com/localtaskrepo/scoop-lotar
scoop install autospec
autospec --version

From source:

cargo install --path .
autospec --version

Without installing globally:

cargo build --release
./target/release/autospec --version

Tagged releases publish:

GitHub release archives
a Homebrew formula in localtaskrepo/homebrew-lotar
a Scoop manifest in localtaskrepo/scoop-lotar

See CONTRIBUTING.md if you want to work on autospec itself or maintain a fork.

Git hooks

This repo includes optional git hooks under .githooks/ to catch formatting and clippy issues before they reach CI.

Enable them once per clone:

git config core.hooksPath .githooks

The pre-commit hook auto-runs cargo fmt --all on staged Rust files while preserving partial staging. The pre-push hook runs the main Rust validation checks used by CI:

cargo fmt --all --check
cargo clippy --all-targets --all-features -- -D warnings
cargo test --all-targets
cargo build --profile ci

See .githooks/SETUP.md for the full behavior and bypass options.

Quick start

See the full CLI reference:

autospec --help

Single doc:

autospec docs/product.md

Single doc, no commits:

autospec --no-commit docs/product.md

Keep the repo clean and skip .autospec/ output files:

autospec --no-commit --no-artifacts docs/product.md

Folder sweep:

autospec --scope sweep docs/ui

Repo-wide ripple from a canonical doc:

autospec --scope ripple docs/entity-dictionary.md

Disable per-iteration timeout for large runs:

autospec --scope sweep --agent-timeout 0 docs/ui

Important flags

--scope strict|ripple|sweep
--goal "..."
--max-iters N
--threshold N
--stable-iters N
--agent copilot|codex|claude|gemini|custom
--agent-timeout N with 0 meaning disabled
--no-commit
--no-branch
--allow-dirty
--no-artifacts
--max-scope-files N
--dry-run

How to tune it for your own project

The default prompt is biased toward the repository that created it. That is not a bug. It is a reminder that you will probably get better results if you adapt the prompt to your own document structure and quality bar.

The highest leverage tuning points are:

templates/default_prompt.md Add the language your project actually uses. Name the canonical docs that own cross-cutting rules. Make the agent's idea of a good spec match your repo.
Scope choice Start with strict. Move to ripple when you trust the canonical docs. Use large sweep runs deliberately rather than by default.
Seed quality If the starting docs are vague, autospec can only sharpen them so far. Add a few concrete nouns, states, or constraints first.
Iteration budget Small docs often converge in 2 to 4 iterations. Large section sweeps may need more time or a disabled timeout.
Canonical-first structure If your repo has authoritative docs, make them obvious. autospec performs better when it can distinguish source-of-truth docs from downstream docs.

Recommended workflow

Start on one canonical doc in strict mode
Move to ripple mode once the canonical doc is stable
Use sweep on bounded folders such as docs/ui or docs/design
Treat full-repo sweep as a deep cleanup run, not the default mode
Review the logs and results before trusting the output blindly

Output and logs

autospec writes:

per-iteration logs in .autospec/logs/
run summaries in .autospec/results.tsv

These are worth reading. The change pattern across iterations usually tells you more than the final status line.

Examples:

+37/-26 -> +22/-20 -> +18/-14 -> +2/-2 usually means the run is healthy and narrowing
a sudden large jump late in a broad sweep usually means the agent found a new consistency layer rather than simply polishing text

Limits

It does not create a missing target doc from scratch
It does not scaffold a project from an empty docs directory
It can converge on output that is technically safe but too generic to be useful if the seed docs are too sparse
Very large full-repo sweeps are still the weakest mode

Why publish it at all

Because this pattern seems broadly useful.

Many teams are spending time on prompts, evals, and agent workflows while ignoring the quality of the docs those agents consume. autospec is an attempt to treat the spec itself as something that can be iteratively tightened and measured.

It may not end up as a standalone tool in its current form. But the experiment has already been useful, and this README is meant to be a solid starting point if and when the code moves into its own repository.

Name		Name	Last commit message	Last commit date
Latest commit History 6 Commits
.githooks		.githooks
.github/workflows		.github/workflows
docs		docs
legacy		legacy
src		src
templates		templates
tests		tests
.gitignore		.gitignore
CONTRIBUTING.md		CONTRIBUTING.md
Cargo.lock		Cargo.lock
Cargo.toml		Cargo.toml
IDEAS.md		IDEAS.md
LICENSE		LICENSE
README.md		README.md
rust-toolchain.toml		rust-toolchain.toml

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

autospec

Who this is for

What it does

Current capabilities

Current status

What we learned by dogfooding it

What the experiments looked like

Bootstrapping and sparse starting points

Requirements

Installation

Git hooks

Quick start

Important flags

How to tune it for your own project

Recommended workflow

Output and logs

Limits

Why publish it at all

About

Uh oh!

Releases 2

Packages

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

autospec

Who this is for

What it does

Current capabilities

Current status

What we learned by dogfooding it

What the experiments looked like

Bootstrapping and sparse starting points

Requirements

Installation

Git hooks

Quick start

Important flags

How to tune it for your own project

Recommended workflow

Output and logs

Limits

Why publish it at all

About

Resources

License

Contributing

Uh oh!

Stars

Watchers

Forks

Releases 2

Packages 0

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Packages