Generated: 2026-03-08
Source: manus_agencies + manus_protocol_chunks tables (Supabase)
| Metric | Count |
|---|---|
| Total agencies (manus_agencies) | 22,839 |
| Agencies WITH protocol chunks | 2,037 (8.9%) |
| Agencies WITHOUT chunks | 20,802 (91.1%) |
| Total protocol chunks | 64,397 |
| States with any coverage | 51 (all states + DC) |
| Counties in counties table | 3,090 |
| Counties with uses_state_protocols=true | 0 |
The "681 agencies with zero chunks" number from earlier analysis was based on the Drizzle-managed agencies table (admin/subscription system). The actual protocol data lives in manus_agencies (22,839 entries) and manus_protocol_chunks (64,397 chunks across 2,037 agencies).
About 91% of agencies have zero protocol chunks, which is expected: most are placeholder entries from the NASEMSO national agency seed. Only agencies with actively ingested protocols have chunks.
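The headline percentages can be re-derived from the raw counts. The following is a minimal sketch that hard-codes the figures from the summary table; the variable and function names are illustrative and not taken from the codebase.

```typescript
// Counts copied from the summary table (not queried live).
const totalAgencies = 22839;
const agenciesWithChunks = 2037;
const totalChunks = 64397;

/** Share of `part` in `whole` as a percentage, rounded to one decimal place. */
function pct(part: number, whole: number): number {
  return Math.round((part / whole) * 1000) / 10;
}

const agenciesWithoutChunks = totalAgencies - agenciesWithChunks;
const withChunksPct = pct(agenciesWithChunks, totalAgencies);
const withoutChunksPct = pct(agenciesWithoutChunks, totalAgencies);

// Average chunks per agency that has any chunks, rounded to one decimal place.
const avgChunksPerCoveredAgency =
  Math.round((totalChunks / agenciesWithChunks) * 10) / 10;

console.log(`${agenciesWithChunks} with chunks (${withChunksPct}%)`);
console.log(`${agenciesWithoutChunks} without chunks (${withoutChunksPct}%)`);
console.log(`~${avgChunksPerCoveredAgency} chunks per covered agency`);
```

The derived average of roughly 32 chunks per covered agency is not stated in the table; it follows from dividing the two reported totals.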
| State | Agencies w/ Chunks | Total Agencies | Coverage |
|---|---|---|---|
| GA | 159 | 695 | 22.9% |
| KY | 121 | 464 | 26.1% |
| MO | 115 | 624 | 18.4% |
| KS | 105 | 596 | 17.6% |
| IL | 102 | 1,083 | 9.4% |
| IA | 98 | 917 | 10.7% |
| NE | 93 | 524 | 17.7% |
| IN | 92 | 633 | 14.5% |
| MN | 87 | 533 | 16.3% |
| MI | 83 | 659 | 12.6% |
| MS | 82 | 442 | 18.6% |
| NC | 78 | 430 | 18.1% |
| AR | 75 | 419 | 17.9% |
| AL | 68 | 320 | 21.3% |
| FL | 68 | 772 | 8.8% |
| CO | 66 | 451 | 14.6% |
| LA | 64 | 338 | 18.9% |
| NY | 64 | 1,346 | 4.8% |
| MT | 56 | 238 | 23.5% |
| ID | 44 | 218 | 20.2% |
| State | Agencies w/ Chunks | Total Agencies | Coverage |
|---|---|---|---|
| TX | 8 | 1,771 | 0.5% |
| PA | 7 | 1,388 | 0.5% |
| OH | 7 | 1,169 | 0.6% |
| NY | 64 | 1,346 | 4.8% |
| WI | 3 | 593 | 0.5% |
| NJ | 22 | 539 | 4.1% |
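To compare these states by both relative coverage and absolute gap, the rows above can be ranked in a few lines. This is a sketch over the six rows of this table only; the `StateCoverage` shape and helper names are assumptions made for illustration.

```typescript
interface StateCoverage {
  state: string;
  withChunks: number; // agencies with at least one protocol chunk
  total: number;      // all agencies listed for the state
}

// Rows copied from the table above.
const lowCoverageRows: StateCoverage[] = [
  { state: "TX", withChunks: 8, total: 1771 },
  { state: "PA", withChunks: 7, total: 1388 },
  { state: "OH", withChunks: 7, total: 1169 },
  { state: "NY", withChunks: 64, total: 1346 },
  { state: "WI", withChunks: 3, total: 593 },
  { state: "NJ", withChunks: 22, total: 539 },
];

/** Coverage as a percentage, rounded to one decimal place. */
function coveragePct(r: StateCoverage): number {
  return Math.round((r.withChunks / r.total) * 1000) / 10;
}

/** Number of agencies in the state that still have no chunks. */
function ingestionGap(r: StateCoverage): number {
  return r.total - r.withChunks;
}

// States below 1% coverage, in table order.
const underOnePercent = lowCoverageRows
  .filter((r) => coveragePct(r) < 1)
  .map((r) => r.state);

// States ordered by absolute gap, largest first.
const byGap = [...lowCoverageRows]
  .sort((a, b) => ingestionGap(b) - ingestionGap(a))
  .map((r) => r.state);

console.log(underOnePercent, byGap);
```

Ranking by absolute gap rather than percentage moves NY ahead of OH, since NY has more agencies without chunks despite its higher coverage rate.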
Only 1 agency has protocol_count > 0 but zero chunks:
- Lake County EMS (CA): protocol_count=10
This suggests the ingestion pipeline is consistent: with that one exception, every agency recorded as having protocols also has chunks.
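The consistency check described here (protocol_count > 0 but zero chunks) can be expressed as a small pure function. This is a sketch only: the `AgencyRow` shape, the `id` key, and the in-memory `Map` of chunk counts are assumptions, and the sample rows other than Lake County's protocol_count=10 are invented for illustration.

```typescript
interface AgencyRow {
  id: number;            // hypothetical primary key
  name: string;
  state: string;
  protocolCount: number; // mirrors the protocol_count column
}

/**
 * Returns agencies that claim to have protocols but have no chunks.
 * `chunkCounts` maps agency id to its number of chunks; a missing key means zero.
 */
function findIncompleteIngestions(
  agencies: AgencyRow[],
  chunkCounts: Map<number, number>,
): AgencyRow[] {
  return agencies.filter(
    (a) => a.protocolCount > 0 && (chunkCounts.get(a.id) ?? 0) === 0,
  );
}

// Illustrative data: ids, the second and third agencies, and the chunk count are made up.
const sampleAgencies: AgencyRow[] = [
  { id: 1, name: "Lake County EMS", state: "CA", protocolCount: 10 },
  { id: 2, name: "Example Ingested Agency", state: "GA", protocolCount: 4 },
  { id: 3, name: "Example Placeholder Agency", state: "TX", protocolCount: 0 },
];
const sampleChunkCounts = new Map<number, number>([[2, 37]]);

const incomplete = findIncompleteIngestions(sampleAgencies, sampleChunkCounts);
console.log(incomplete.map((a) => `${a.name} (${a.state})`));
```

Placeholder agencies with protocol_count=0 are deliberately excluded, so the check flags only half-finished ingestions rather than the expected empty entries.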
The counties.uses_state_protocols field is universally false/null (0 out of 3,090 counties). This field was designed to flag counties that defer to state-level protocols instead of having LEMSA-specific ones, but it was never populated.
- `manus_agencies`: NASEMSO-seeded national directory (22,839 entries). Most are placeholders.
- `manus_protocol_chunks`: Actual protocol text chunks with embeddings. Only populated for actively ingested agencies.
- `agencies`: Drizzle-managed table for the subscription/admin system. Separate from protocol data.
- `county_agency_mapping`: Maps counties to their LEMSA for jurisdiction-scoped search.
- Ingestion scripts: `ingest-state.ts` (multi-state CLI), `ingest-ca-protocols.ts` (CA-specific), `ingest-local-pdfs.ts` (manual PDF upload)
| Category | Est. Count | Description |
|---|---|---|
| NASEMSO placeholder entries | ~19,800 | Seeded from national directory. No protocol source identified. |
| State-protocol agencies | ~500-800 | Follow state-level protocols (no unique LEMSA protocols). Should use uses_state_protocols flag. |
| Genuinely missing (protocols exist online) | ~200-400 | Have published protocols but haven't been ingested yet. |
| Lake County EMS (CA) | 1 | Has protocol_count=10 but zero chunks — ingestion incomplete. |
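As a sanity check, the estimated category ranges should bracket the 20,802 agencies without chunks. The sketch below sums the low and high ends of each estimate from the table; the `CategoryEstimate` shape and names are illustrative.

```typescript
interface CategoryEstimate {
  label: string;
  low: number;
  high: number;
}

// Estimates copied from the table above; point estimates use low === high.
const categoryEstimates: CategoryEstimate[] = [
  { label: "NASEMSO placeholder entries", low: 19800, high: 19800 },
  { label: "State-protocol agencies", low: 500, high: 800 },
  { label: "Genuinely missing", low: 200, high: 400 },
  { label: "Lake County EMS (CA)", low: 1, high: 1 },
];

// Observed number of agencies without chunks, from the summary table.
const observedWithoutChunks = 20802;

const estimateLow = categoryEstimates.reduce((sum, c) => sum + c.low, 0);
const estimateHigh = categoryEstimates.reduce((sum, c) => sum + c.high, 0);

// True when the observed count falls inside the estimated range.
const estimatesBracketObserved =
  estimateLow <= observedWithoutChunks && observedWithoutChunks <= estimateHigh;

console.log({ estimateLow, estimateHigh, estimatesBracketObserved });
```

The summed range is 20,501 to 21,001, so the estimates are at least internally consistent with the observed count.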
- Fix Lake County EMS: Only agency with protocol_count>0 but no chunks. Run: `npx tsx scripts/ingest-ca-protocols.ts --lemsa "Lake County"`
- Populate `uses_state_protocols`: Audit which counties/agencies defer to state protocols. Start with CA (well-documented LEMSA structure).
- Prioritize TX, PA, OH: Largest states with <1% coverage. Research state EMS protocol structures.
- Audit NASEMSO seed data: Determine which of the 22K agencies are actual protocol-publishing LEMSAs vs. individual fire departments/ambulance services that follow a parent LEMSA.
- Add `is_lemsa` flag to `manus_agencies`: Distinguish protocol-publishing authorities from individual agencies. Most of the 22K are individual services that follow a LEMSA's protocols.
- State-level protocol ingestion: For states with centralized protocols (e.g., state EMS offices), ingest once and map all agencies in that state.
- Coverage dashboard: Add a `/admin/coverage` page showing ingestion status by state/county.
- Automated gap detection: Cron job to identify new agencies added without chunks.
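One way the proposed is_lemsa flag and state-level ingestion could fit together is a resolver that decides whose protocols an agency follows. This is a sketch under stated assumptions: `parentLemsaId` is a hypothetical link that does not exist in the schema described here, the set of centralized states is a placeholder, and the resolution order (own, then parent LEMSA, then state) is one plausible choice rather than a decided design.

```typescript
interface Agency {
  id: number;
  state: string;
  isLemsa: boolean;       // the proposed is_lemsa flag
  parentLemsaId?: number; // hypothetical link to a parent LEMSA
}

type ProtocolSource =
  | { kind: "own"; agencyId: number }
  | { kind: "parent_lemsa"; agencyId: number }
  | { kind: "state"; state: string }
  | { kind: "unresolved" };

/** Decides which protocol set applies to an agency. */
function resolveProtocolSource(
  agency: Agency,
  centralizedStates: ReadonlySet<string>,
): ProtocolSource {
  if (agency.isLemsa) return { kind: "own", agencyId: agency.id };
  if (agency.parentLemsaId !== undefined) {
    return { kind: "parent_lemsa", agencyId: agency.parentLemsaId };
  }
  if (centralizedStates.has(agency.state)) {
    return { kind: "state", state: agency.state };
  }
  return { kind: "unresolved" };
}

// "XX" is a placeholder state code, not a claim about any real state.
const centralized = new Set<string>(["XX"]);

const lemsaSource = resolveProtocolSource({ id: 10, state: "CA", isLemsa: true }, centralized);
const childSource = resolveProtocolSource(
  { id: 11, state: "CA", isLemsa: false, parentLemsaId: 10 },
  centralized,
);
const stateSource = resolveProtocolSource({ id: 12, state: "XX", isLemsa: false }, centralized);
const unknownSource = resolveProtocolSource({ id: 13, state: "YY", isLemsa: false }, centralized);

console.log(lemsaSource.kind, childSource.kind, stateSource.kind, unknownSource.kind);
```

Agencies that come back as `unresolved` would be the natural input for the automated gap detection job, since they have neither their own protocols nor a known source to inherit from.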
The 20,802 "missing" agencies are mostly not a data gap — they're individual fire departments and ambulance services that follow their regional LEMSA's protocols. The real metric is LEMSA coverage, not individual agency coverage. There are roughly 200-300 LEMSAs nationally; Protocol Guide covers ~50 of them well (primarily through CA's 33 LEMSAs and scattered coverage in other states).
True coverage: ~50/250 LEMSAs nationally (20%). Expanding to the remaining ~200 LEMSAs would effectively cover all 22,839 agencies.
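Because the national LEMSA count is itself an estimate (roughly 200-300), the 20% figure has a range around it. The sketch below recomputes it for the low, midpoint, and high estimates given above; the names are illustrative.

```typescript
// Figures taken from the text: ~50 LEMSAs covered, 200-300 nationally (midpoint 250).
const lemsasCovered = 50;
const lemsaEstimates = { low: 200, mid: 250, high: 300 };

/** Coverage as a percentage, rounded to one decimal place. */
function lemsaCoveragePct(covered: number, total: number): number {
  return Math.round((covered / total) * 1000) / 10;
}

const coverageAtMid = lemsaCoveragePct(lemsasCovered, lemsaEstimates.mid);
const coverageBestCase = lemsaCoveragePct(lemsasCovered, lemsaEstimates.low);
const coverageWorstCase = lemsaCoveragePct(lemsasCovered, lemsaEstimates.high);
const remainingAtMid = lemsaEstimates.mid - lemsasCovered;

console.log(
  `${coverageAtMid}% at midpoint (range ${coverageWorstCase}%-${coverageBestCase}%), ` +
    `${remainingAtMid} LEMSAs remaining`,
);
```

Depending on the true LEMSA count, coverage falls somewhere between about 16.7% and 25%, with about 200 LEMSAs remaining at the midpoint estimate.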