A Hankweave pipeline that migrates real SEC financial data from messy Excel workbooks into structured JSON and a polished PDF report.
Built for demonstration purposes: a corruption script introduces realistic errors, sentinels catch reconciliation mistakes, and different models run across multiple harnesses.
```sh
# Validate (free — no API calls)
./run-validate.sh

# Quick test with Haiku (~$3, ~30 min)
./run-cheap.sh

# Full run on Gladstone Investment (~$10, ~75 min)
./run-gladstone.sh

# Full run on OFS Capital (~$10, ~75 min)
./run-ofs.sh
```
- Bun (rig scripts are TypeScript)
- `ANTHROPIC_API_KEY` set (for Claude codons)
- `OPENAI_API_KEY` set (for GPT-5.4 via Pi codon)
- `typst` installed (for PDF report generation): `brew install typst`
```
explore ──▶ map-companies ──▶ extract ──▶ reconcile (×2) ──▶ validate ──▶ report ──▶ polish
(Sonnet)    (GPT-5.4/Pi)      (Sonnet)   (Sonnet loop)      (Haiku)     (Sonnet)    (Sonnet)
```
Model: Claude Sonnet | Cost: ~$2 | Duration: ~8 min
The rig script scans all Excel workbooks and extracts metadata (sheet names, column headers, row counts). The agent then opens representative files across different time periods and writes a comprehensive data profile covering format eras, column mappings, data quality issues, and processing recommendations.
Demonstrates:
- Rigs — deterministic setup before the agent starts
- Fresh context — the agent sees only its prompt and the files
Output: notes/data-profile.md
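As a rough sketch of what the metadata pass produces, here is a hypothetical shape for the per-workbook records a rig like `inventory-files.ts` might emit (field names are illustrative, not the rig's actual output format):

```typescript
// Hypothetical metadata record for one scanned workbook.
interface SheetMeta {
  name: string;       // sheet name
  headers: string[];  // column headers found in the header row
  rowCount: number;   // data rows in the sheet
}
interface WorkbookMeta {
  file: string;       // path relative to the dataset directory
  sheets: SheetMeta[];
}

// Small helper: total data rows across a workbook's sheets.
function totalRows(wb: WorkbookMeta): number {
  return wb.sheets.reduce((n, s) => n + s.rowCount, 0);
}

const sample: WorkbookMeta = {
  file: "gladstone-investment/2023-Q1.xlsx",
  sheets: [
    { name: "Schedule of Investments", headers: ["Company", "Cost", "Fair Value"], rowCount: 42 },
  ],
};
console.log(totalRows(sample)); // 42
```

The agent then reads these summaries instead of opening every workbook, which is what keeps the explore stage cheap.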
Model: GPT-5.4 via Pi harness | Cost: ~$1-2 | Duration: ~10 min
A rig script extracts all raw company names from every workbook. The agent then identifies naming variations (commas, spacing, abbreviations, renames) and builds a canonical mapping.
Demonstrates:
- Multi-harness — switching from Claude Code to Pi mid-pipeline
- Agentic judgment — deciding which name variants are the same company
Output: notes/companies/canonical.json, notes/companies/mapping-notes.md
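The judgment calls (renames, abbreviations) need an agent, but a deterministic first pass over the mechanical variants might look like this minimal sketch (the regex and suffix list are illustrative assumptions, not the actual mapping logic):

```typescript
// Collapse mechanical name variants: comma-before-suffix and extra spacing.
function normalizeName(raw: string): string {
  return raw
    .replace(/,\s*(Inc|LLC|LP|Corp)\.?$/i, " $1") // "Acme, Inc." -> "Acme Inc"
    .replace(/\s+/g, " ")                          // collapse runs of whitespace
    .trim();
}

const variants = ["Acme Holdings, Inc.", "Acme  Holdings Inc"];
const canonical = new Set(variants.map(normalizeName));
console.log(canonical.size); // 1 — both variants map to "Acme Holdings Inc"
```

Anything this pass cannot collapse (e.g. a company that renamed itself) is exactly the residue the agent is there to resolve.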
Model: Claude Sonnet | Cost: ~$0.10-0.70 | Duration: ~5-15 min
A rig script reads ALL rows from ALL workbooks into a single JSON file. The agent then matches rows to canonical companies, handles duplicate entries (two fund entities per investment), and builds per-company investment timelines.
Demonstrates:
- `exhaustWithPrompt` — auto-extending the codon until all companies are processed
- Sentinels — quality observer + reconciliation watchdog running in parallel
- Rig-driven extraction — the heavy lifting is deterministic; the agent does judgment
Output: output/companies/*.json, notes/extraction-progress.md
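The duplicate-entry handling is worth spelling out, since it is the mistake the watchdog sentinel exists to catch. Two fund entities can each hold a slice of the same investment; the correct treatment is to keep both rows and sum them, not deduplicate. A minimal sketch (row shape is illustrative):

```typescript
type Row = { company: string; fund: string; fairValue: number };

const rows: Row[] = [
  { company: "Acme", fund: "Fund A", fairValue: 600 },
  { company: "Acme", fund: "Fund B", fairValue: 400 },
];

// Aggregate per company by summing — never by dropping "duplicates".
const byCompany = new Map<string, number>();
for (const r of rows) {
  byCompany.set(r.company, (byCompany.get(r.company) ?? 0) + r.fairValue);
}
console.log(byCompany.get("Acme")); // 1000 — dropping either row would lose value
```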
Model: Claude Sonnet | Cost: ~$1-2 per pass | Duration: ~8-15 min per pass
A loop that runs twice with fresh context each iteration.
Pass 1: Sums all extracted data per quarter, compares against workbook totals, checks per-company continuity, flags issues by severity.
Pass 2 (fresh eyes): A different agent session reads the first pass's report and independently verifies claims. Spot-checks specific companies, looks for interpretive errors, catches what pass 1 normalized away.
Demonstrates:
- Loops — iterating with fresh context
- The .8 + .8 = .96 principle — two independent checks catch more than one thorough check
- Checkpointing — previous report archived before each iteration
Output: output/reconciliation-report.md, output/fund-summary.json
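Pass 1's core arithmetic check can be sketched deterministically (function and field names are illustrative; the real pass is agent-driven and also checks continuity and severity):

```typescript
// Compare extracted per-quarter sums against the workbook's stated totals.
function reconcile(
  extracted: Record<string, number[]>, // quarter -> extracted fair values
  stated: Record<string, number>,      // quarter -> workbook total
  tol = 0.01,
): string[] {
  const issues: string[] = [];
  for (const [q, values] of Object.entries(extracted)) {
    const sum = values.reduce((a, b) => a + b, 0);
    const total = stated[q];
    // A missing stated total is itself an issue, not a silent pass.
    if (total === undefined || Math.abs(sum - total) > tol) {
      issues.push(`${q}: extracted ${sum} vs stated ${total}`);
    }
  }
  return issues;
}

console.log(reconcile({ "2023-Q1": [600, 400] }, { "2023-Q1": 1000 })); // []
```

Pass 2 then re-derives a sample of these sums from the raw workbooks rather than trusting pass 1's report.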
Model: Claude Haiku | Cost: ~$0.40 | Duration: ~5 min
The cheapest codon in the pipeline — and one of the most valuable. A Haiku agent writes and runs 5 self-contained validation scripts that test the reconciliation claims with code, not prose.
Scripts check: quarterly sum accuracy, canonical mapping quality (no leaked section headers), timeline continuity (gaps and swings), report claim verification, and duplicate/zero detection.
Demonstrates:
- Cheap model for deterministic work — Haiku writes validation scripts at 1/10th the cost of Sonnet
- "Receipts not hopes" — deterministic proof that the data is correct
- Catching errors upstream codons missed (section headers in canonical list, false claims about companies)
Output: notes/validation/results.md, notes/validation/*.ts
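To illustrate the "receipts not hopes" idea, here is a hedged sketch of one of the five checks — timeline continuity — as a self-contained script might express it (the actual scripts are written by the Haiku agent at run time):

```typescript
// Given the quarters a company appears in and the full expected range,
// return the quarters it is missing.
function missingQuarters(present: string[], all: string[]): string[] {
  const have = new Set(present);
  return all.filter((q) => !have.has(q));
}

const expected = ["2022-Q4", "2023-Q1", "2023-Q2"];
console.log(missingQuarters(["2022-Q4", "2023-Q2"], expected)); // ["2023-Q1"]
```

A gap is not automatically an error (the investment may have been exited and re-entered), but it is exactly the kind of claim the reconciliation report must have explained.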
Model: Claude Sonnet | Cost: ~$2-3 | Duration: ~15-20 min
The agent reads the reconciliation findings and produces a professional PDF report using the Typst typesetting system. The report includes stat boxes, key findings, tables, and section-by-section analysis.
The polish codon uses `continue-previous` to review and fix the report in the same session, ensuring compilation succeeds.
Demonstrates:
- `continue-previous` — resuming a conversation for iterative refinement
- Tangible output — the pipeline produces a shareable artifact, not just data files
- Rig-driven setup — the Typst template, fonts, and compile script are all copied in via rigs
Output: template/main.pdf
Two parallel observers run during the extraction and reconciliation phases:
Quality Observer — Tracks how many companies have been processed, detects stuck loops (agent reading the same file 3+ times), and reports progress milestones. Triggers every 40 events. Uses Haiku (~$0.05 total).
Reconciliation Watchdog — Watches specifically for the deduplication mistake (averaging or dropping duplicate rows instead of keeping them). If the agent mentions "deduplicating" or "removing duplicates", it flags CRITICAL immediately. Silent when everything's fine. Triggers every 10 file updates.
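The watchdog's trigger condition amounts to a keyword scan over the agent's output. An illustrative version (the real condition lives in `sentinels/reconciliation-watchdog.json`, whose schema is not shown here):

```typescript
// Flag CRITICAL the moment the agent talks about deduplicating rows.
function watchdog(transcript: string): "CRITICAL" | "ok" {
  return /\bde-?duplicat|removing duplicates/i.test(transcript)
    ? "CRITICAL"
    : "ok";
}

console.log(watchdog("I am deduplicating the rows")); // "CRITICAL"
console.log(watchdog("summing rows per quarter"));    // "ok"
```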
Learn more: Sentinel documentation
The hank uses a $25 shared budget ceiling. Codons run sequentially, each taking what it needs. Typical total cost: $8-12 on Sonnet, $2-4 on Haiku.
Learn more: Budget documentation
Both datasets are real SEC filing data from publicly traded Business Development Companies (BDCs), transformed into Excel workbooks that simulate three format "eras."
- Gladstone Investment — CIK: 1321741 | 37 portfolio companies | 9 quarters (Q4 2022 – Q1 2025)
- Three format eras with genuine column/structure differences
- Real company name inconsistencies from SEC XBRL data
- Draft vs. finalized workbook for Q1 2025
- Source: SEC BDC Data Sets
- OFS Capital — CIK: 1487918 | 122 portfolio companies | 9 quarters (Q4 2022 – Q1 2025)
- Larger portfolio with CLO tranches alongside direct loans
- Different naming conventions (comma-separated vs dash-separated)
- Source: Same SEC BDC Data Sets
`datasets/corrupt-data.ts` injects realistic data corruption into both datasets: misspellings, missing quarters, wrong totals, conflicting duplicates, wrong fund mixed in, #REF! errors, .xls format, rogue sheets. Back up first, then run:

```sh
cd datasets
cp -r gladstone-investment gladstone-investment-backup
cp -r ofs-capital ofs-capital-backup
bun run corrupt-data.ts
```

The pipeline handles all of it.
```
finance-demo-hank/
├── hank.json                         # Pipeline configuration
├── prompts/system.md                 # Global system prompt (fund admin domain context)
│
├── codons/
│   ├── explore/prompt.md             # Stage 1: Data profiling
│   ├── map-companies/prompt.md       # Stage 2: Company normalization
│   ├── extract/prompt.md             # Stage 3: Timeline extraction
│   ├── reconcile/prompt.md           # Stage 4: Reconciliation (smart: detects pass 1 vs 2)
│   ├── validate/prompt.md            # Stage 5: Script-based validation (Haiku)
│   └── write-report/
│       ├── prompt.md                 # Stage 6: PDF report writing
│       └── polish.md                 # Stage 7: Report polishing
│
├── rigs/
│   ├── inventory-files.ts            # Rig: scan workbooks, extract metadata
│   ├── extract-company-names.ts      # Rig: pull raw names from all workbooks
│   ├── extract-raw-investments.ts    # Rig: read ALL rows into structured JSON
│   ├── extract-totals.ts             # Rig: pull per-quarter totals
│   ├── package.json                  # xlsx (SheetJS) dependency
│   ├── typst-template/               # Typst template with components and docs
│   ├── fonts/                        # Brand fonts (HK Grotesk, Cardo, Departure Mono)
│   └── main-override.typ             # Typography overrides for the template
│
├── sentinels/
│   ├── quality-observer.json         # Tracks progress, detects stuck behavior
│   └── reconciliation-watchdog.json  # Catches deduplication mistakes
│
├── datasets/
│   ├── gladstone-investment/         # Dataset 1: 37 companies, 9 quarters
│   ├── ofs-capital/                  # Dataset 2: 122 companies, 9 quarters
│   └── corrupt-data.ts               # Optional: inject realistic data corruption
│
├── run-gladstone.sh                  # Full run on Gladstone (Sonnet + Pi)
├── run-ofs.sh                        # Full run on OFS Capital (Sonnet + Pi)
├── run-cheap.sh                      # Quick test run (Haiku)
└── run-validate.sh                   # Validate hank against both datasets (free)
```
- Hankweave Documentation
- Getting Started Guide
- Configuration Reference
- CLI Reference
- Debugging Guide
- GitHub Repository
Built by Southbridge AI and hankhelp