PromptReboot
Multi-agent prompt diagnostics for identifying reliability issues in LLM prompts before they are used in production workflows.
This tool analyzes prompts for structural failure modes such as ambiguity, missing constraints, conflicting instructions, and evaluation gaps. It returns diagnostic findings instead of rewriting prompts.
The goal is to surface problems early so prompt design decisions remain explicit and under developer control.
What this tool does
PromptReboot runs multiple diagnostic agents in parallel, each responsible for detecting a specific class of prompt failure.
The system:
- analyzes prompts using targeted diagnostic passes
- detects Medium and High severity prompt design issues
- validates agent outputs against a strict schema
- combines findings deterministically in code
- avoids rewriting or optimizing prompts
- skips alignment analysis when goal and example are absent
This is a diagnostic system, not a prompt optimizer.
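A finding returned by a diagnostic pass might look like the following sketch. The field names (`category`, `severity`, `evidence`, `rationale`) are illustrative assumptions, not the real schema:

```python
from dataclasses import dataclass

# Hypothetical finding record; the actual schema lives in the backend.
@dataclass(frozen=True)
class Finding:
    category: str   # e.g. "Vague success criteria"
    severity: str   # only "High" or "Medium" are allowed
    evidence: str   # the prompt text the finding cites
    rationale: str  # why this issue affects correctness or reliability

    def __post_init__(self):
        # Enforce the severity contract at construction time.
        if self.severity not in ("High", "Medium"):
            raise ValueError(f"disallowed severity: {self.severity}")
```

Making the record immutable and self-validating keeps malformed agent output from entering the combined results.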
Failure modes detected
The current agents detect issues in categories including:
- Goal–Prompt–Example misalignment
- Hard instruction contradictions
- Soft instruction contradictions
- Role overload
- Audience–voice mismatch
- Vague success criteria
- Missing priority ordering
- Implicit domain assumptions
- Over-constraint (brittleness)
- Under-constraint (hallucination risk)
- Ambiguous constraint scope
- No self-check or validation step
- Example output misuse
- Single example overfitting
- Unscaffolded complex reasoning
- Hidden intermediate requirements
Agents only report issues when they are likely to materially affect correctness, consistency, or reliability.
If no issues are detected, the system returns an empty findings list.
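The "report only material issues, otherwise stay silent" contract can be sketched as a runtime validation step. The function and field names below are assumptions for illustration, not the project's actual API:

```python
ALLOWED_SEVERITIES = {"High", "Medium"}
REQUIRED_FIELDS = {"category", "severity", "evidence"}

def validate_findings(raw: list) -> list:
    """Reject malformed agent output instead of silently accepting it."""
    for finding in raw:
        missing = REQUIRED_FIELDS - finding.keys()
        if missing:
            raise ValueError(f"missing fields: {missing}")
        if finding["severity"] not in ALLOWED_SEVERITIES:
            raise ValueError(f"disallowed severity: {finding['severity']}")
    return raw  # an empty list is a valid, successful result
```

Note that an empty input passes through unchanged: silence is a legitimate outcome, not an error.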
Architecture overview
Execution is parallel, deterministic, and schema-validated.
Flow:
Prompt → Input serialization → Parallel agent execution → Validation → Combined findings
Key properties:
- agents run concurrently using asyncio
- each agent uses a strict diagnostic system prompt
- no aggregator LLM is used
- findings are validated at runtime
- only "High" and "Medium" severities are allowed
- per-agent ordering is enforced (High before Medium)
- alignment agent is skipped deterministically when not applicable
The execution model lives in backend/graphy.py.
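The flow above can be sketched with asyncio. The agent functions here are hypothetical stand-ins with toy detection logic, not the real agents in graphy.py:

```python
import asyncio

# Illustrative stand-in agents; each returns (severity, category) tuples.
async def contradiction_agent(prompt):
    return [("Medium", "Soft instruction contradiction")] if " but " in prompt else []

async def criteria_agent(prompt):
    return [("High", "Vague success criteria")] if "good" in prompt else []

async def alignment_agent(prompt, goal, example):
    # In the real system this agent checks goal/prompt/example alignment.
    return []

SEVERITY_RANK = {"High": 0, "Medium": 1}

async def run_diagnostics(prompt, goal=None, example=None):
    tasks = [contradiction_agent(prompt), criteria_agent(prompt)]
    # Deterministic skip: alignment analysis runs only when both inputs exist.
    if goal is not None and example is not None:
        tasks.append(alignment_agent(prompt, goal, example))
    per_agent = await asyncio.gather(*tasks)  # agents run concurrently
    combined = []
    for findings in per_agent:
        # Only "High" and "Medium" severities are allowed.
        assert all(sev in SEVERITY_RANK for sev, _ in findings)
        # Per-agent ordering: High before Medium.
        combined.extend(sorted(findings, key=lambda f: SEVERITY_RANK[f[0]]))
    return combined  # combined deterministically in code, no aggregator LLM

results = asyncio.run(run_diagnostics("Write a good summary, but keep it vague."))
```

The key property this sketch preserves is that the merge step is plain code: agent outputs are validated, ordered, and concatenated without another model in the loop.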
Example
Prompt:
"Summarize this email thread and decide whether the customer should get a refund."
Typical findings:
- Vague success criteria
- Under-constraint (hallucination risk)
- No self-check or validation step
The tool explains why each issue affects output reliability and cites the relevant prompt text.
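For the email-thread prompt above, the combined output might look like this hypothetical payload. The categories come from the list earlier in this document; the severities, evidence strings, and rationales are purely illustrative:

```python
# Illustrative findings payload; not actual tool output.
example_findings = [
    {
        "category": "Vague success criteria",
        "severity": "High",
        "evidence": "decide whether the customer should get a refund",
        "rationale": "No refund policy or decision criteria are given.",
    },
    {
        "category": "Under-constraint (hallucination risk)",
        "severity": "Medium",
        "evidence": "Summarize this email thread",
        "rationale": "No length, format, or fidelity constraints on the summary.",
    },
    {
        "category": "No self-check or validation step",
        "severity": "Medium",
        "evidence": "(entire prompt)",
        "rationale": "The prompt never asks the model to verify its decision.",
    },
]
```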
Design philosophy
Diagnostics over rewriting
Rewriting a prompt requires making decisions the original prompt did not specify. This tool surfaces problems instead of silently resolving them.
Separation of analysis and design
Prompt analysis identifies failure modes. Prompt design remains the user's responsibility.
Precision-first reporting
Agents report only Medium and High severity issues that are supported by concrete evidence from the prompt.
Silence is success
If no issues are detected, the correct output is an empty findings list.
Running locally
pip install -r requirements.txt
cd backend
uvicorn api.asgi:app
Repository structure
backend/graphy.py Parallel diagnostic execution engine.
backend/llm/agents/ Agent configurations and category definitions.
backend/llm/prompts/ Diagnostic system prompts used by each agent.
backend/infra/ Logging and shared utilities.
backend/api/ ASGI application for running diagnostics.
Notes
This project is intentionally small and focused. It is designed to be used as a prompt analysis step before prompts are deployed into LLM workflows.
The system prioritizes deterministic execution, schema validation, and strict agent scope boundaries over flexibility or prompt optimization.