Agent Instructions

TL;DR: Test skill docs against multiple AI models. Run bun run src/cli.ts test <skill-file.md> "<task>" -m qwen/qwen3-coder:free and see what models found confusing.

Prerequisites

Before using Focus Group, you need:

Bun runtime — Install from https://bun.sh (curl -fsSL https://bun.sh/install | bash)
OpenRouter API key — Free signup at https://openrouter.ai/keys
A skill file — Markdown doc describing your tool (see "What's a Skill File?" below)
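The three prerequisites above can be checked before a first run. The sketch below is one way to do it in plain shell; `missing_prereqs` is a hypothetical helper name, not something Focus Group ships.

```sh
# Preflight sketch: print one line per missing prerequisite.
# missing_prereqs is a hypothetical helper, not part of Focus Group.
missing_prereqs() {
  skill_file=$1
  # Bun must be on PATH for "bun run" to work.
  command -v bun >/dev/null 2>&1 || echo "bun not found on PATH (install from https://bun.sh)"
  # The API key must be exported and non-empty.
  [ -n "${OPENROUTER_API_KEY:-}" ] || echo "OPENROUTER_API_KEY is not set (https://openrouter.ai/keys)"
  # The skill file must exist as a regular file.
  [ -f "$skill_file" ] || echo "skill file not found: $skill_file"
}

# Usage (prints nothing when everything is in place):
#   missing_prereqs ./skill.md
```

An empty result means all three checks passed; any output names what still needs to be set up.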

What This Is

Focus Group tests AI tool documentation by sending it to multiple AI models along with a task. The models respond as QA testers — reporting what confused them and suggesting improvements. This helps you find documentation gaps before shipping.

What's a Skill File?

A skill file is a markdown document containing instructions for AI agents. It describes a tool, API, or capability that an AI should be able to use. See examples/sample-skill.md for a complete example; a minimal one looks like this:

# Podcast Generator

Generate podcasts with multiple voices.

## Usage
podcast-gen --voices <count> --duration <minutes> --output <file>

## Parameters
- `--voices`: Number of distinct voices (1-5)
- `--duration`: Length in minutes
- `--output`: Output file path (.mp3)

## Example
podcast-gen --voices 3 --duration 30 --output episode.mp3

Installation

# Clone and install
git clone https://github.com/EmZod/Agent-Focus-Group
cd Agent-Focus-Group
bun install

Important: All commands must be run from inside the Agent-Focus-Group directory.
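Since every command refers to src/cli.ts by a relative path, a quick check for that file shows whether the current directory is the right one. This is a minimal sketch; `in_focus_group_repo` is a hypothetical name used only for illustration.

```sh
# Hypothetical check: succeed only if the given directory (default: the
# current directory) contains src/cli.ts, the entry point used by every
# command in this document.
in_focus_group_repo() {
  [ -f "${1:-.}/src/cli.ts" ]
}

# Usage:
#   in_focus_group_repo || echo "cd into Agent-Focus-Group first" >&2
```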

Quick Start (Free, Zero Cost)

# 1. Set your API key (get from https://openrouter.ai/keys)
export OPENROUTER_API_KEY="sk-or-v1-..."

# 2. Create a test skill file
cat > test-skill.md << 'EOF'
# Test Tool
Does something useful.

## Usage
test-tool --input <file>

## Example
test-tool --input data.txt
EOF

# 3. Run test with FREE model (zero cost)
bun run src/cli.ts test test-skill.md "Process the data in data.txt" -m qwen/qwen3-coder:free

# 4. Results appear in ~30-60 seconds
# View saved results anytime:
bun run src/cli.ts show latest

Tip: To avoid re-entering your API key each session, add it to your shell config:

echo 'export OPENROUTER_API_KEY="sk-or-v1-..."' >> ~/.zshrc  # or ~/.bashrc

Cost Warning

COMMANDS WITHOUT -m FLAG COST MONEY (~$0.005 per test)

[BAD]  bun run src/cli.ts test skill.md "task"                        # COSTS MONEY
[GOOD] bun run src/cli.ts test skill.md "task" -m qwen/qwen3-coder:free  # FREE

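When no -m flag is provided, Focus Group falls back to the default cheap preset, which uses paid models (about $0.005 per test). The other presets selected with -p are paid as well. Always include -m qwen/qwen3-coder:free while learning.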
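One way to avoid accidental charges is a small shell wrapper that refuses to run unless -m is present and lists only :free IDs. The sketch below assumes the -m value is passed as a separate argument, as in the examples here; `uses_only_free_models` and `fg_free_test` are hypothetical names.

```sh
# Hypothetical helper: succeeds only when the arguments contain "-m <ids>"
# and every comma-separated ID in <ids> ends with :free.
uses_only_free_models() {
  expect_ids=no
  for arg in "$@"; do
    if [ "$expect_ids" = yes ]; then
      # grep -v selects entries WITHOUT the :free suffix; any hit means
      # a paid (or empty) entry, so the list is rejected.
      if printf '%s\n' "$arg" | tr ',' '\n' | grep -qv ':free$'; then
        return 1
      fi
      return 0
    fi
    [ "$arg" = "-m" ] && expect_ids=yes
  done
  return 1  # no -m flag at all: the paid default preset would be used
}

# Hypothetical wrapper around the documented test command.
fg_free_test() {
  if ! uses_only_free_models "$@"; then
    echo "refusing to run: pass -m with only :free model IDs" >&2
    return 1
  fi
  bun run src/cli.ts test "$@"
}
```

With this in place, `fg_free_test skill.md "task"` stops with an error, while `fg_free_test skill.md "task" -m qwen/qwen3-coder:free` runs the normal test command.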

Model IDs

All model IDs use the format provider/model-name. Free models add a :free suffix.

| Format | Example | Cost |
| --- | --- | --- |
| `provider/model` | `openai/gpt-5-mini` | Paid |
| `provider/model:free` | `qwen/qwen3-coder:free` | Free |

Without the :free suffix, the ID refers to the paid version of the model. For example:

qwen/qwen3-coder = paid
qwen/qwen3-coder:free = free
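The ID rules above are simple enough to check mechanically before sending a request. The following is a sketch only: `classify_model_id` is a hypothetical helper, and it checks the shape of the ID, not whether the model actually exists on OpenRouter.

```sh
# Hypothetical helper: classify an ID by shape only.
#   provider/model:free -> free
#   provider/model      -> paid
#   anything else       -> invalid (for example, a missing provider prefix)
classify_model_id() {
  case $1 in
    ?*/?*:free) echo free ;;
    ?*/?*)      echo paid ;;
    *)          echo invalid ;;
  esac
}

# Usage:
#   classify_model_id qwen/qwen3-coder:free
#   classify_model_id gpt-5-mini
```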

Writing Good Tasks

Tasks should be specific and actionable — something a user would actually ask an AI to do.

| Good Tasks | Bad Tasks |
| --- | --- |
| "Generate a 30-minute podcast with 3 voices" | "Complete this task" |
| "Create a user with email jay@example.com" | "Test the tool" |
| "Convert the PDF at ./report.pdf to markdown" | "Use this" |

Quoting: Wrap the task in double quotes. If the task itself contains double quotes, escape each one with a backslash:

bun run src/cli.ts test skill.md "Create user named \"admin\""
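Single quotes are a standard shell alternative when a task contains double quotes or a dollar sign, because nothing inside single quotes is expanded. The variable names below are illustrative only.

```sh
# Both assignments produce the same string: Create user named "admin"
escaped="Create user named \"admin\""
single='Create user named "admin"'
printf '%s\n' "$escaped"

# Inside single quotes, $5 stays literal instead of expanding as a parameter.
literal='Refund $5 to the customer'
printf '%s\n' "$literal"

# Usage with the documented command:
#   bun run src/cli.ts test skill.md "$single" -m qwen/qwen3-coder:free
```

A task that itself contains a single quote still needs the double-quoted form.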

Commands

# Run tests (from Agent-Focus-Group directory)
bun run src/cli.ts test <skill.md> "<task>" -m qwen/qwen3-coder:free  # FREE
bun run src/cli.ts test <skill.md> "<task>"                    # Paid (~$0.005)
bun run src/cli.ts test <skill.md> "<task>" -m model1,model2   # Specific models
bun run src/cli.ts test <skill.md> "<task>" -p expensive       # Use preset
bun run src/cli.ts test <skill.md> "<task>" --no-save          # Don't save to DB
bun run src/cli.ts test <skill.md> "<task>" --timeout 120      # Longer timeout

# View results
bun run src/cli.ts show latest              # View last run
bun run src/cli.ts show <run-id>            # View specific run
bun run src/cli.ts history                  # List all runs
bun run src/cli.ts diff <run1> <run2>       # Compare runs
bun run src/cli.ts cost                     # Show API costs
bun run src/cli.ts config                   # Show configuration

File paths: Can be relative (./skill.md) or absolute (/home/user/skill.md).
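When several skill files need the same check, the test command can be wrapped in a loop. This is a rough sketch: `fg_batch` and the `FG_DRY_RUN` variable are hypothetical and exist only in this example, not in Focus Group itself.

```sh
# Hypothetical batch runner: tests every .md file in a directory with one
# task, using the free model. Set FG_DRY_RUN=1 (an illustrative variable,
# not a Focus Group option) to print the commands instead of running them.
fg_batch() {
  dir=$1
  task=$2
  for skill in "$dir"/*.md; do
    [ -f "$skill" ] || continue  # the directory has no .md files
    if [ "${FG_DRY_RUN:-0}" = 1 ]; then
      printf 'bun run src/cli.ts test %s "%s" -m qwen/qwen3-coder:free\n' "$skill" "$task"
    else
      bun run src/cli.ts test "$skill" "$task" -m qwen/qwen3-coder:free
    fi
  done
}

# Usage (from inside the Agent-Focus-Group directory):
#   FG_DRY_RUN=1 fg_batch ./skills "Process the data in data.txt"
```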

Expected Timing

| Model Type | Response Time |
| --- | --- |
| Free models (`:free`) | 30-60 seconds |
| Paid models (`cheap` preset) | 10-20 seconds |
| Frontier models (`opus`, `o3`) | 20-40 seconds |

If nothing happens after 2 minutes, check Troubleshooting below.

Example Output

Focus Group Test
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Skill: test-skill.md
Task:  Process the data in data.txt
Models: 1

  qwen/qwen3-coder:free               [OK]  45.2s  $0.000

Summary
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Completed: 1/1 models

Common Confusions:
  • "input file format unclear" (1/1 models)

Suggested Improvements:
  • Specify supported file formats (txt, csv, json)

How to read this:

[OK] or checkmark = the model responded successfully (it does not mean your docs are perfect)
(1/1 models) = how many of the tested models reported this issue; here, all of them
Higher counts = more serious documentation gaps
Results are saved to the local database automatically unless you pass --no-save
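With several models, the counts are easier to compare when the reported issues are sorted by how many models raised them. The sketch below assumes the output lines keep the shape shown in the example above (a quoted issue followed by a count such as 1/1 models); `rank_confusions` is a hypothetical helper.

```sh
# Hypothetical helper: reads Focus Group output on stdin and prints
# "count/total issue text", most-reported issues first.
# Assumes lines shaped like:  • "issue text" (N/M models)
rank_confusions() {
  sed -n -E 's/^[^"]*"([^"]+)" \(([0-9]+)\/([0-9]+) models\).*$/\2\/\3 \1/p' |
    sort -t/ -k1,1nr
}

# Usage, assuming "show latest" prints the same layout as a live run:
#   bun run src/cli.ts show latest | rank_confusions
```

Lines without a quoted issue and a count, such as the Completed line, are ignored.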

Free Models

# Single free model
bun run src/cli.ts test ./skill.md "task" -m qwen/qwen3-coder:free

# Multiple free models (comma-separated, no spaces)
bun run src/cli.ts test ./skill.md "task" -m qwen/qwen3-coder:free,meta-llama/llama-3.2-3b-instruct:free

Free models require OPENROUTER_API_KEY but have zero cost. They're slower than paid models.
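Because the -m list must be comma-separated with no spaces, it can be safer to build it from separate arguments than to type it by hand. A minimal sketch, with `join_models` as a hypothetical helper name:

```sh
# Hypothetical helper: join all arguments with commas and no spaces.
# The body runs in a subshell, so the IFS change does not leak out.
join_models() (
  IFS=,
  printf '%s\n' "$*"
)

# Usage:
#   models=$(join_models qwen/qwen3-coder:free meta-llama/llama-3.2-3b-instruct:free)
#   bun run src/cli.ts test ./skill.md "task" -m "$models"
```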

Model Presets

| Preset | Models | Cost per test | Use When |
| --- | --- | --- | --- |
| `cheap` (default) | gpt-5-mini, claude-haiku-4.5, gemini-2.5-flash | ~$0.005 | Quick iteration |
| `expensive` | gpt-5, claude-sonnet-4.5, gemini-2.5-pro | ~$0.05 | Before shipping |
| `frontier` | claude-opus-4.5, gpt-5.2-pro, o3-pro | ~$0.20 | Critical docs |
| `comprehensive` | 8 models across all families | ~$0.10 | Full coverage |

Usage: bun run src/cli.ts test ./skill.md "task" -p expensive
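The approximate costs in the presets table make it easy to budget an editing session in advance. The sketch below multiplies a preset's approximate per-test cost by a number of runs; the figures are copied from the table and are estimates, and `preset_cost` and `estimate_cost` are hypothetical helpers.

```sh
# Approximate per-test costs in USD, copied from the presets table.
preset_cost() {
  case $1 in
    cheap)         echo 0.005 ;;
    expensive)     echo 0.05 ;;
    frontier)      echo 0.20 ;;
    comprehensive) echo 0.10 ;;
    *)             echo "unknown preset: $1" >&2; return 1 ;;
  esac
}

# Hypothetical helper: estimate_cost <preset> <runs>
# prints runs multiplied by the per-test cost, to three decimals.
estimate_cost() {
  per_test=$(preset_cost "$1") || return 1
  awk -v c="$per_test" -v n="$2" 'BEGIN { printf "%.3f\n", c * n }'
}

# Usage:
#   estimate_cost cheap 20
#   estimate_cost frontier 3
```

For example, twenty runs on the cheap preset come to roughly $0.10, about the cost of one comprehensive run.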

Data Storage

Results are automatically saved to a local SQLite database (no setup required):

macOS: ~/Library/Application Support/focus-group/
Linux: ~/.local/share/focus-group/

Use bun run src/cli.ts config to see exact paths.
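For scripts that back up or inspect the stored results, the documented locations can be selected by operating system. This sketch only mirrors the two paths listed above; `focus_group_data_dir` is a hypothetical helper, and the config command remains the authoritative source for the real paths.

```sh
# Hypothetical helper: print the documented data directory for an OS name
# as reported by "uname -s"; defaults to the current OS.
focus_group_data_dir() {
  os=${1:-$(uname -s)}
  case $os in
    Darwin) echo "$HOME/Library/Application Support/focus-group" ;;
    Linux)  echo "$HOME/.local/share/focus-group" ;;
    *)      echo "no documented location for $os" >&2; return 1 ;;
  esac
}

# Usage:
#   ls -lh "$(focus_group_data_dir)"
```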

Troubleshooting

"command not found: bun"

curl -fsSL https://bun.sh/install | bash
source ~/.zshrc  # or restart terminal

"Cannot find module" or "Cannot find package"

You're likely running from the wrong directory. All commands must be run from inside the Agent-Focus-Group folder:

cd Agent-Focus-Group && bun install

"OPENROUTER_API_KEY not set"

export OPENROUTER_API_KEY="sk-or-v1-..."  # Get from https://openrouter.ai/keys

"Model not found"

Use full ID with provider prefix: openai/gpt-5-mini not gpt-5-mini
Check model exists at https://openrouter.ai/models

"Request timed out"

Free models are slower. Add --timeout 120 for a 2-minute timeout.
Check your internet connection.

"File not found"

Ensure skill file exists at the specified path
Try absolute path: /full/path/to/skill.md

No output after 2+ minutes

Free models can be slow during high traffic
Try a different free model or use --timeout 180
Check if OpenRouter is having issues at https://status.openrouter.ai

When to Use

Writing new skill documentation
Updating existing skill docs
Before shipping docs to production
Debugging why models misunderstand instructions

Development

bun install              # Install deps
bun test                 # Run tests (45 tests)
bun run typecheck        # Type check
bun run src/cli.ts       # Run CLI directly

Key source files:

src/core/runner.ts — Test execution
src/core/prompts.ts — System prompts sent to models
src/config/defaults.ts — Model presets
