Perseus is a documentation extraction and generation system that converts structured comment blocks and narrative .pdoc templates into complete Markdown documentation. It centralizes context, vocabulary, and narrative explanation across codebases, keeping documentation maintainable, contextual, and consistent.
Perseus extracts structured @pdoc blocks from code comments, merges them with additional narrative documents, and compiles them into Markdown or JSON outputs using Jinja-style templating.
Demo Video: https://www.youtube.com/watch?v=JB6BY2yXOlQ
Sample Generated Documentation:
View a sample project here
Key design goals:
- Co-locate documentation context directly with code where it is created.
- Provide richer metadata: intentions, tags, vocabulary definitions, tickets, business requirements, and more.
- Enable narrative documentation that links to code-derived context.
- Support reusable glossary terms across the entire repository.
- Build docs with one command:
  - Run perseus build to scan your codebase and .pdoc files, extract context, and generate Markdown docs.
- Single config file:
  - Use perseus.yaml to set source paths, output location, templates, and format (Markdown, JSON, HTML).
- Narrative .pdoc templates:
  - Write reusable documentation using Jinja-style syntax, referencing code context and glossary terms.
- Glossary merging:
  - All vocabulary from code is merged and available in every doc.
Typical workflow:
- Add @pdoc blocks to your code.
- Write .pdoc templates for narrative docs.
- Run the build command to generate output.
@pdoc blocks are embedded in code comments and define structured metadata.
Intended supported fields (only some are currently implemented):
- intention – high-level purpose of the unit
- tickets – JIRA or other ticket links
- tags – searchable identifiers
- vocabulary – domain terminology definitions
- feature_flags
- experiments
- business_requirement
- high_level_intention
- prerequisite
- approvers
- to_do
- legacy/migration
- maintainers
- cross_functional_teams
- watch – track file changes (optional)
- code – embed relevant code regions (optional)
These fields form the underlying data accessible to .pdoc templates.
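As a rough illustration of how such a block could be pulled out of a source file, the sketch below finds every @pdoc block with a regular expression and returns its raw YAML text. It assumes a block opens with a line containing only `@pdoc` and closes with `@endp`, as in the ticket example later in this README. The function name `extract_pdoc_blocks` and the sample function are hypothetical, not Perseus internals.

```python
import re
import textwrap

# Assumption: "@pdoc" opens a block and "@endp" closes it, each on its own line.
PDOC_BLOCK = re.compile(
    r"^[ \t]*@pdoc[ \t]*\n(.*?)^[ \t]*@endp[ \t]*$",
    re.DOTALL | re.MULTILINE,
)


def extract_pdoc_blocks(source: str) -> list[str]:
    """Return the dedented raw YAML text of every @pdoc block in `source`."""
    return [
        textwrap.dedent(match.group(1)).strip()
        for match in PDOC_BLOCK.finditer(source)
    ]


# Hypothetical source file content using a few of the fields listed above.
SAMPLE = '''
def charge(order):
    """
    @pdoc
    id: charge_order
    intention: Charge the customer for a confirmed order.
    tags:
      - billing
    tickets:
      - KAN-1
    @endp
    """
    return order
'''

blocks = extract_pdoc_blocks(SAMPLE)
print(blocks[0])
```

The extracted text would then be handed to a YAML parser; that step is left out here so the sketch stays free of third-party dependencies.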
.pdoc templates are Markdown-like documents that use Jinja syntax to assemble the final output.
Capabilities:
- Link to context from any file
- Use global glossary
- Render code blocks
- Combine many contexts into a single final document
The pers build compiler performs:
- Extraction
- Parsing (YAML)
- Context storage
- Template rendering
- Export to Markdown/JSON/HTML
Extraction traverses the source paths and locates:
- Code files with @pdoc blocks
- .pdoc narrative files
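The discovery step described above could look roughly like the following. This is a minimal sketch, not the Perseus implementation: the function name `discover`, the `code_suffixes` parameter, and the choice to detect blocks with a plain substring check are all assumptions made for illustration.

```python
from pathlib import Path


def discover(root: Path, code_suffixes=(".py",)):
    """Split files under `root` into code files containing @pdoc and .pdoc narratives.

    `code_suffixes` is a hypothetical knob; the real tool may support other languages.
    """
    code_files, narrative_files = [], []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        if path.suffix == ".pdoc":
            narrative_files.append(path)
        elif path.suffix in code_suffixes:
            # Cheap pre-filter: only keep code files that mention the marker at all.
            text = path.read_text(encoding="utf-8", errors="ignore")
            if "@pdoc" in text:
                code_files.append(path)
    return code_files, narrative_files
```

Calling `discover(Path("test_projects/simple_python"))` would return two lists of paths, one per file kind, ready for the parsing step.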
Parsing uses a YAML parser to read the blocks found within comments.
Context storage is a layer that collects the parsed context and exposes it to the templating engine.
Template rendering uses a Jinja2-compatible engine to produce Markdown, JSON, or HTML.
The pers binary provides:
- pers build
- Future commands: pers watch, pers list, pers glossary, etc.
The tickets field supports structured ticket metadata beyond simple strings.
The tags field allows tagging functions, methods, classes, or files for search, indexing, and pattern detection.
Teams may choose between:
- Co-located documentation (context embedded next to code)
- Centralized .pdoc narrative documents
- A hybrid model
Perseus supports the following output types:
- Markdown (primary)
- JSON (intermediate representation)
Markdown is typically generated from .pdoc templates.
All vocabulary from every @pdoc block is merged into a global glossary.
Every documentation file can reference this glossary.
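The merge can be pictured as folding each block's vocabulary into one shared mapping. The sketch below assumes that a parsed block is a dict and that `vocabulary` maps a term to its definition; the README does not specify the exact shape or how conflicting definitions are resolved, so the "first definition wins" rule here is only an illustrative choice.

```python
def merge_glossary(blocks):
    """Merge the `vocabulary` mappings of parsed blocks into one sorted glossary."""
    glossary = {}
    for block in blocks:
        for term, definition in block.get("vocabulary", {}).items():
            # Assumption: keep the first definition seen for a term.
            glossary.setdefault(term, definition)
    return dict(sorted(glossary.items()))


# Hypothetical parsed blocks from two different files.
parsed_blocks = [
    {"id": "charge_order", "vocabulary": {"SKU": "Stock keeping unit"}},
    {"id": "refund_order", "vocabulary": {"RMA": "Return merchandise authorization",
                                          "SKU": "A duplicate definition"}},
    {"id": "no_vocab"},
]

glossary = merge_glossary(parsed_blocks)
print(glossary)
```

A template could then look up any term in `glossary` regardless of which file defined it.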
Install dependencies (recommended: uv):
uv pip install -r requirements.txt

Run the demo:

uv run perseus/main.py --config test_projects/simple_python/perseus.yaml build --format md

or

uv run perseus/main.py --root test_projects/simple_python build --format md

This will produce Markdown in test_projects/simple_python/docs/build.
Adjust the .env file to change debugging and log-level settings. For log output, use the logger library rather than print statements.
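A minimal sketch of environment-driven logging is shown below. The README does not say which logger library or variable names the project uses, so this example falls back on Python's standard `logging` module and a hypothetical `LOG_LEVEL` variable, and it assumes the values from .env have already been loaded into the process environment.

```python
import logging
import os

# Explicit mapping so an unexpected value cannot resolve to something that is not a level.
LEVELS = {
    "DEBUG": logging.DEBUG,
    "INFO": logging.INFO,
    "WARNING": logging.WARNING,
    "ERROR": logging.ERROR,
    "CRITICAL": logging.CRITICAL,
}


def configure_logging(default="INFO"):
    """Return a logger whose level comes from the (assumed) LOG_LEVEL variable."""
    level_name = os.getenv("LOG_LEVEL", default).upper()
    level = LEVELS.get(level_name, logging.INFO)
    logging.basicConfig(format="%(asctime)s %(levelname)s %(name)s: %(message)s")
    logger = logging.getLogger("perseus")
    logger.setLevel(level)
    return logger


log = configure_logging()
log.info("build started")      # preferred over print("build started")
log.debug("verbose details")   # only emitted when LOG_LEVEL=DEBUG
```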
Perseus can automatically enrich ticket references with live data from Jira. To enable this feature:
- Configure your .env file:

JIRA_BASE_URL=https://your-company.atlassian.net
JIRA_EMAIL=your-email@company.com
JIRA_API_TOKEN=your_api_token_here

- Generate a Jira API token:
  - Go to https://id.atlassian.com/manage-profile/security/api-tokens
  - Create a new API token
  - Copy the token to your .env file

- Reference tickets in your code:
"""
@pdoc
id: example_function
tickets:
- KAN-1
- KAN-4
@endp
"""

- Access enriched data in templates:
{% if block.tickets_enriched %}
| Key | Title | Status | Assignee | Priority |
|-----|-------|--------|----------|----------|
{% for t in block.tickets_enriched %}
| [{{ t.key }}]({{ t.url }}) | {{ t.title }} | {{ t.status }} | {{ t.assignee }} | {{ t.priority }} |
{% endfor %}
{% endif %}

The system automatically fetches ticket details (title, status, assignee, priority, URL) during the build process. Results are cached to minimize API calls. If credentials are not configured, tickets are displayed as plain text.
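The enrichment flow described above can be sketched as follows. This is an assumption-laden illustration rather than the project's code: the function names are hypothetical, the cache is a simple in-process `lru_cache` (the README does not say how results are cached), and the request uses the Jira Cloud REST endpoint `/rest/api/3/issue/{key}` with basic authentication built from the email and API token.

```python
import base64
import json
import os
import urllib.request
from functools import lru_cache


def ticket_from_issue(base_url, issue):
    """Map a Jira issue payload to the fields used by the template above."""
    fields = issue.get("fields", {})
    return {
        "key": issue["key"],
        "url": f"{base_url.rstrip('/')}/browse/{issue['key']}",
        "title": fields.get("summary"),
        "status": (fields.get("status") or {}).get("name"),
        "assignee": (fields.get("assignee") or {}).get("displayName"),
        "priority": (fields.get("priority") or {}).get("name"),
    }


@lru_cache(maxsize=None)  # assumption: per-process cache, one request per ticket key
def fetch_issue(base_url, email, token, key):
    """Fetch one issue from Jira Cloud using basic auth (email + API token)."""
    auth = base64.b64encode(f"{email}:{token}".encode()).decode()
    request = urllib.request.Request(
        f"{base_url.rstrip('/')}/rest/api/3/issue/{key}",
        headers={"Authorization": f"Basic {auth}", "Accept": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


def enrich(keys, env=os.environ):
    """Return enriched tickets, or None when credentials are missing."""
    base, email, token = (
        env.get(name) for name in ("JIRA_BASE_URL", "JIRA_EMAIL", "JIRA_API_TOKEN")
    )
    if not (base and email and token):
        return None  # caller falls back to rendering the plain ticket keys
    return [ticket_from_issue(base, fetch_issue(base, email, token, k)) for k in keys]
```

Returning `None` when credentials are absent lets a template guard such as `{% if block.tickets_enriched %}` skip the table and show the raw keys instead.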