feat(transcription): add offline voice transcription via Whisper #336

Open
Fadhili5 wants to merge 3 commits into fireform-core:main from Fadhili5:feat/voice-transcription
Conversation

@Fadhili5

Summary

Stacked on #335 and #332

  • Add src/transcriber.py: wraps OpenAI Whisper for fully local, offline audio transcription. The model is lazy-loaded on first use; its size is configurable via the WHISPER_MODEL env var (tiny/base/small/medium/large, default: base). Supports WAV, MP3, M4A, MP4, OGG, FLAC. No audio data leaves the machine.
  • Add POST /transcribe endpoint: accepts multipart audio file upload, returns {text, model_used, audio_filename}. Returns 415 for unsupported formats, 500 for transcription errors
  • Add api/schemas/transcribe.py: TranscribeResponse schema
  • Register /transcribe router in api/main.py
  • Add 17 tests: model size validation, whitespace stripping, missing file, unsupported format, temp file cleanup, endpoint success/error, all 6 supported formats (parametrized)
  • Add openai-whisper and python-multipart to requirements.txt

Test plan

  • python -m pytest tests/test_transcribe.py -v — all 17 tests pass
  • python -m pytest tests/ -v — full suite (19 tests) passes
  • Confirm POST /transcribe returns 415 for .txt upload
  • Confirm temp files are cleaned up after transcribe_bytes()
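The last check hinges on transcribe_bytes() deleting its temp file even when transcription raises. A stdlib-only sketch of that pattern (the signature and the injected `transcribe_path` callable are assumptions for illustration):

```python
import os
import tempfile


def transcribe_bytes(data: bytes, suffix: str, transcribe_path) -> str:
    """Write uploaded bytes to a temp file, transcribe it, always clean up.

    `transcribe_path` stands in for the real Whisper call so the cleanup
    behavior can be exercised without loading a model.
    """
    fd, path = tempfile.mkstemp(suffix=suffix)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
        return transcribe_path(path)
    finally:
        # Runs on success and on error alike, so no temp files leak.
        if os.path.exists(path):
            os.unlink(path)
```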

Fadhili5 and others added 3 commits March 24, 2026 14:36
- Add src/schemas/incident_report.py: canonical Pydantic model covering
  all fields needed across Cal Fire FIRESCOPE, EMS, and law enforcement
  forms (identity, location, timestamps, personnel, casualties, wildfire,
  narrative, law enforcement sections)
- Add model_validator that auto-populates requires_review for any core
  field left null after extraction, so responders can spot gaps before
  PDF submission
- Add llm_schema_hint() classmethod that returns the JSON schema minus
  requires_review, used to build the structured Ollama system prompt
- Refactor LLM class: replace per-field prompt loop with a single
  structured request using Ollama format="json" and Mistral instruction
  format ([INST]...[/INST])
- LLM now returns IncidentReport via get_report() in addition to the
  existing get_data() dict accessor for backward compatibility
- Fix test_submit_form: replace stub with a working integration test
  that creates a template then mocks Controller to assert the full
  POST /forms/fill response shape
- Add src/template_mapper.py: TemplateMapper loads a YAML agency mapping
  file and resolves IncidentReport field values to PDF form field names.
  Supports optional per-field condition expressions
- Add safe AST-based condition evaluator: permits only Compare, BoolOp,
  UnaryOp, Name, Constant nodes — rejects function calls and arbitrary code
- Refactor src/filler.py: replace positional answers_list[i] with explicit
  {pdf_field_name: value} dict so values land in the correct field
  regardless of page layout
- Update src/file_manipulator.py: new _fill_with_mapper() path uses
  LLM -> IncidentReport -> TemplateMapper -> Filler; legacy positional
  path preserved for backward compatibility
- Add templates/employee_form.yaml: sample mapping for src/inputs/file.pdf
- Add pyyaml to requirements.txt

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
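An evaluator restricted to the node types listed above can be sketched as follows (function name and the exact whitelist are assumptions; operator and context nodes such as ast.Gt and ast.Load must also be permitted, since ast.walk visits them):

```python
import ast

# Whitelisted node types: comparisons, boolean logic, unary ops, names,
# and constants — no Call, Attribute, or Subscript, so function calls and
# arbitrary code are rejected before evaluation.
ALLOWED = (
    ast.Expression, ast.Compare, ast.BoolOp, ast.UnaryOp,
    ast.Name, ast.Constant,
    ast.And, ast.Or, ast.Not, ast.USub,
    ast.Eq, ast.NotEq, ast.Lt, ast.LtE, ast.Gt, ast.GtE,
    ast.Load,
)


def eval_condition(expr: str, context: dict) -> bool:
    tree = ast.parse(expr, mode="eval")
    for node in ast.walk(tree):
        if not isinstance(node, ALLOWED):
            raise ValueError(f"disallowed syntax: {type(node).__name__}")
    # Safe to evaluate: builtins are stripped and only whitelisted
    # nodes remain, so names can only resolve against `context`.
    code = compile(tree, "<condition>", "eval")
    return bool(eval(code, {"__builtins__": {}}, context))
```

Under this scheme a per-field mapping condition like `casualties > 0 and not mutual_aid` evaluates against the report's field values, while anything containing a call is rejected with ValueError.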
- Add src/transcriber.py: wraps OpenAI Whisper for fully local, offline
  audio transcription. Model is lazy-loaded on first use. Size is
  configurable via WHISPER_MODEL env var (tiny/base/small/medium/large,
  default: base). Supports WAV, MP3, M4A, MP4, OGG, FLAC
- Add POST /transcribe endpoint: accepts multipart audio file upload,
  returns {text, model_used, audio_filename}. Returns 415 for unsupported
  formats, 500 for transcription errors
- Add api/schemas/transcribe.py: TranscribeResponse schema
- Register /transcribe router in api/main.py
- Add 17 tests covering: model size validation, whitespace stripping,
  missing file, unsupported format, temp file cleanup, endpoint success,
  endpoint error handling, all 6 supported formats (parametrized)
- Add openai-whisper and python-multipart to requirements.txt
