TL;DR: Test skill docs against multiple AI models. Run
bun run src/cli.ts test <skill-file.md> "<task>" -m qwen/qwen3-coder:free
and see what the models found confusing.
Before using Focus Group, you need:
- Bun runtime: install from https://bun.sh (curl -fsSL https://bun.sh/install | bash)
- OpenRouter API key: free signup at https://openrouter.ai/keys
- A skill file: Markdown doc describing your tool (see "What's a Skill File?" below)
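The three prerequisites above can be checked before a first run. The following is a minimal sketch, not part of Focus Group itself: `check_prereqs` is a hypothetical helper name, and it only verifies that `bun` is on the PATH, that `OPENROUTER_API_KEY` is non-empty, and that the skill file exists.

```shell
# check_prereqs is a hypothetical helper (not shipped with Focus Group).
# It prints one line per missing prerequisite and returns 1 if any is missing.
check_prereqs() {
  skill_file="$1"
  missing=0
  # Bun runtime must be installed and on the PATH
  if ! command -v bun >/dev/null 2>&1; then
    echo "missing: bun (install from https://bun.sh)"
    missing=1
  fi
  # API key must be set to a non-empty value
  if [ -z "${OPENROUTER_API_KEY:-}" ]; then
    echo "missing: OPENROUTER_API_KEY (get one at https://openrouter.ai/keys)"
    missing=1
  fi
  # Skill file must exist as a regular file
  if [ ! -f "$skill_file" ]; then
    echo "missing: skill file '$skill_file'"
    missing=1
  fi
  return "$missing"
}
```

Calling `check_prereqs ./skill.md` prints nothing and returns 0 when everything is in place.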
Focus Group tests AI tool documentation by sending it to multiple AI models along with a task. The models respond as QA testers — reporting what confused them and suggesting improvements. This helps you find documentation gaps before shipping.
A skill file is a markdown document containing instructions for AI agents. It describes a tool, API, or capability that an AI should be able to use. See examples/sample-skill.md for a complete example.
# Podcast Generator
Generate podcasts with multiple voices.
## Usage
podcast-gen --voices <count> --duration <minutes> --output <file>
## Parameters
- `--voices`: Number of distinct voices (1-5)
- `--duration`: Length in minutes
- `--output`: Output file path (.mp3)
## Example
podcast-gen --voices 3 --duration 30 --output episode.mp3
# Clone and install
git clone https://github.com/EmZod/Agent-Focus-Group
cd Agent-Focus-Group
bun install
Important: All commands must be run from inside the Agent-Focus-Group directory.
# 1. Set your API key (get from https://openrouter.ai/keys)
export OPENROUTER_API_KEY="sk-or-v1-..."
# 2. Create a test skill file
cat > test-skill.md << 'EOF'
# Test Tool
Does something useful.
## Usage
test-tool --input <file>
## Example
test-tool --input data.txt
EOF
# 3. Run test with FREE model (zero cost)
bun run src/cli.ts test test-skill.md "Process the data in data.txt" -m qwen/qwen3-coder:free
# 4. Results appear in ~30-60 seconds
# View saved results anytime:
bun run src/cli.ts show latest
Tip: To avoid re-entering your API key each session, add it to your shell config:
echo 'export OPENROUTER_API_KEY="sk-or-v1-..."' >> ~/.zshrc # or ~/.bashrc
COMMANDS WITHOUT -m FLAG COST MONEY (~$0.005 per test)
[BAD] bun run src/cli.ts test skill.md "task" # COSTS MONEY
[GOOD] bun run src/cli.ts test skill.md "task" -m qwen/qwen3-coder:free # FREE
When no -m flag is provided, Focus Group uses paid models. Always include -m qwen/qwen3-coder:free while learning.
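One way to avoid accidental paid runs while learning is a small wrapper that refuses to start a test when `-m` is absent. This is a sketch under assumptions: `has_model_flag` and `fg_test` are hypothetical names, not part of the CLI, and the check only looks for a literal `-m` argument. It does not verify that the chosen model is actually free.

```shell
# has_model_flag: hypothetical helper; succeeds if any argument is exactly -m.
has_model_flag() {
  for arg in "$@"; do
    if [ "$arg" = "-m" ]; then
      return 0
    fi
  done
  return 1
}

# fg_test: hypothetical wrapper around the test command shown above.
# It refuses to run when no -m flag is given, since that would use paid models.
fg_test() {
  if ! has_model_flag "$@"; then
    echo "refusing to run without -m (paid models would be used)" >&2
    return 1
  fi
  bun run src/cli.ts test "$@"
}
```

With this in your shell config, `fg_test skill.md "task"` stops with an error, while `fg_test skill.md "task" -m qwen/qwen3-coder:free` runs as usual.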
All model IDs use the format provider/model-name. Free models add a :free suffix.
| Format | Example | Cost |
|---|---|---|
| provider/model | openai/gpt-5-mini | Paid |
| provider/model:free | qwen/qwen3-coder:free | Free |
Without :free, the model uses the paid version. For example:
- qwen/qwen3-coder = paid
- qwen/qwen3-coder:free = free
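The naming rule can be expressed as a simple pattern match. The sketch below assumes only what is stated above (IDs look like provider/model-name, and free variants end in `:free`); `is_free_model` is a hypothetical helper name.

```shell
# is_free_model: hypothetical helper.
# Succeeds when the ID has a provider prefix and ends with the :free suffix.
is_free_model() {
  case "$1" in
    */*:free) return 0 ;;
    *)        return 1 ;;
  esac
}

# Example: warn before using a model ID that is not marked free.
model="qwen/qwen3-coder"
if ! is_free_model "$model"; then
  echo "warning: $model is a paid model"
fi
```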
Tasks should be specific and actionable — something a user would actually ask an AI to do.
| Good Tasks | Bad Tasks |
|---|---|
| "Generate a 30-minute podcast with 3 voices" | "Complete this task" |
| "Create a user with email jay@example.com" | "Test the tool" |
| "Convert the PDF at ./report.pdf to markdown" | "Use this" |
Quoting: Tasks must be in quotes. For tasks containing quotes, escape them:
bun run src/cli.ts test skill.md "Create user named \"admin\""
# Run tests (from Agent-Focus-Group directory)
bun run src/cli.ts test <skill.md> "<task>" -m qwen/qwen3-coder:free # FREE
bun run src/cli.ts test <skill.md> "<task>" # Paid (~$0.005)
bun run src/cli.ts test <skill.md> "<task>" -m model1,model2 # Specific models
bun run src/cli.ts test <skill.md> "<task>" -p expensive # Use preset
bun run src/cli.ts test <skill.md> "<task>" --no-save # Don't save to DB
bun run src/cli.ts test <skill.md> "<task>" --timeout 120 # Longer timeout
# View results
bun run src/cli.ts show latest # View last run
bun run src/cli.ts show <run-id> # View specific run
bun run src/cli.ts history # List all runs
bun run src/cli.ts diff <run1> <run2> # Compare runs
bun run src/cli.ts cost # Show API costs
bun run src/cli.ts config # Show configuration
File paths: Can be relative (./skill.md) or absolute (/home/user/skill.md).
| Model Type | Response Time |
|---|---|
| Free models (:free) | 30-60 seconds |
| Paid models (cheap preset) | 10-20 seconds |
| Frontier models (opus, o3) | 20-40 seconds |
If nothing happens after 2 minutes, check Troubleshooting below.
Focus Group Test
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Skill: test-skill.md
Task: Process the data in data.txt
Models: 1
qwen/qwen3-coder:free [OK] 45.2s $0.000
Summary
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Completed: 1/1 models
Common Confusions:
• "input file format unclear" (1/1 models)
Suggested Improvements:
• Specify supported file formats (txt, csv, json)
How to read this:
- [OK] or checkmark = model responded successfully (not that your docs are perfect!)
- (1/1 models) = all tested models identified this issue
- Higher counts = more serious documentation gaps
- Results are saved to a local database automatically
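To pull out only the lines that carry a model count, the report text can be filtered with `grep`. This is a sketch that assumes the report format matches the sample above, where each flagged issue ends with "(N/M models)". `list_flagged_issues` is a hypothetical helper name, and real output may differ (for example, if it contains color codes).

```shell
# list_flagged_issues: hypothetical helper.
# Reads a report on stdin and prints only lines containing "(N/M models)".
list_flagged_issues() {
  grep -E '\([0-9]+/[0-9]+ models\)'
}
```

Assuming `show latest` prints the same report format, it could be used as `bun run src/cli.ts show latest | list_flagged_issues`.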
# Single free model
bun run src/cli.ts test ./skill.md "task" -m qwen/qwen3-coder:free
# Multiple free models (comma-separated, no spaces)
bun run src/cli.ts test ./skill.md "task" -m qwen/qwen3-coder:free,meta-llama/llama-3.2-3b-instruct:free
Free models require OPENROUTER_API_KEY but have zero cost. They're slower than paid models.
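Because the model list must be comma-separated with no spaces, it can help to build it from separate arguments rather than typing it by hand. A minimal sketch, where `join_models` is a hypothetical helper name:

```shell
# join_models: hypothetical helper.
# Joins its arguments with commas (no spaces), as the -m flag expects.
join_models() {
  # IFS is changed inside a subshell so the caller's IFS is untouched;
  # "$*" joins all arguments using the first character of IFS.
  ( IFS=,; printf '%s\n' "$*" )
}

models="$(join_models qwen/qwen3-coder:free meta-llama/llama-3.2-3b-instruct:free)"
echo "$models"
```

The resulting string can be passed directly, for example `bun run src/cli.ts test ./skill.md "task" -m "$models"`.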
| Preset | Models | Cost | Use When |
|---|---|---|---|
| cheap (default) | gpt-5-mini, claude-haiku-4.5, gemini-2.5-flash | ~$0.005 | Quick iteration |
| expensive | gpt-5, claude-sonnet-4.5, gemini-2.5-pro | ~$0.05 | Before shipping |
| frontier | claude-opus-4.5, gpt-5.2-pro, o3-pro | ~$0.20 | Critical docs |
| comprehensive | 8 models across all families | ~$0.10 | Full coverage |
Usage: bun run src/cli.ts test ./skill.md "task" -p expensive
Results are automatically saved to a local SQLite database (no setup required):
- macOS: ~/Library/Application Support/focus-group/
- Linux: ~/.local/share/focus-group/
Use bun run src/cli.ts config to see exact paths.
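For scripts that need the data directory without invoking the CLI, the two locations above can be derived from the platform name. This is a sketch based only on the paths listed here; `fg_data_dir` is a hypothetical helper name, and the `config` command remains the authoritative source if the tool resolves paths differently on your system.

```shell
# fg_data_dir: hypothetical helper.
# Prints the documented data directory for the given platform
# (defaults to the output of `uname -s`).
fg_data_dir() {
  os="${1:-$(uname -s)}"
  case "$os" in
    Darwin) printf '%s\n' "$HOME/Library/Application Support/focus-group" ;;
    Linux)  printf '%s\n' "$HOME/.local/share/focus-group" ;;
    *)      echo "unknown platform: $os" >&2; return 1 ;;
  esac
}
```

For example, `ls "$(fg_data_dir)"` lists the stored files on macOS or Linux.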
"command not found: bun"
curl -fsSL https://bun.sh/install | bash
source ~/.zshrc # or restart terminal
"Cannot find module" or "Cannot find package"
You're likely running from the wrong directory. All commands must be run from inside the Agent-Focus-Group folder:
cd Agent-Focus-Group && bun install
"OPENROUTER_API_KEY not set"
export OPENROUTER_API_KEY="sk-or-v1-..." # Get from https://openrouter.ai/keys
"Model not found"
- Use the full ID with provider prefix: openai/gpt-5-mini, not gpt-5-mini
- Check that the model exists at https://openrouter.ai/models
"Request timed out"
- Free models are slower. Add --timeout 120 for a 2-minute timeout.
- Check your internet connection.
"File not found"
- Ensure skill file exists at the specified path
- Try an absolute path: /full/path/to/skill.md
No output after 2+ minutes
- Free models can be slow during high traffic
- Try a different free model or use --timeout 180
- Check if OpenRouter is having issues at https://status.openrouter.ai
- Writing new skill documentation
- Updating existing skill docs
- Before shipping docs to production
- Debugging why models misunderstand instructions
bun install # Install deps
bun test # Run tests (45 tests)
bun run typecheck # Type check
bun run src/cli.ts # Run CLI directly
Key source files:
- src/core/runner.ts: Test execution
- src/core/prompts.ts: System prompts sent to models
- src/config/defaults.ts: Model presets