feat(transcription): add offline voice transcription via Whisper #336
Open
Fadhili5 wants to merge 3 commits into fireform-core:main from
Conversation
- Add src/schemas/incident_report.py: canonical Pydantic model covering all fields needed across Cal Fire FIRESCOPE, EMS, and law enforcement forms (identity, location, timestamps, personnel, casualties, wildfire, narrative, law enforcement sections)
- Add model_validator that auto-populates requires_review for any core field left null after extraction, so responders can spot gaps before PDF submission
- Add llm_schema_hint() classmethod that returns the JSON schema minus requires_review, used to build the structured Ollama system prompt
- Refactor LLM class: replace per-field prompt loop with a single structured request using Ollama format="json" and the Mistral instruction format ([INST]...[/INST])
- LLM now returns IncidentReport via get_report() in addition to the existing get_data() dict accessor for backward compatibility
- Fix test_submit_form: replace stub with a working integration test that creates a template, then mocks Controller to assert the full POST /forms/fill response shape
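A minimal sketch of the validator and schema-hint pattern described above. The field names here (incident_name, location, narrative) are illustrative placeholders, not the actual IncidentReport fields:

```python
# Hypothetical sketch of the auto-review pattern; field names are
# illustrative, not the real IncidentReport schema.
from typing import Optional
from pydantic import BaseModel, model_validator


class IncidentReport(BaseModel):
    incident_name: Optional[str] = None
    location: Optional[str] = None
    narrative: Optional[str] = None
    requires_review: list[str] = []

    @model_validator(mode="after")
    def flag_missing_core_fields(self) -> "IncidentReport":
        # Any core field left null after LLM extraction is flagged,
        # so responders can spot gaps before PDF submission.
        core = ("incident_name", "location", "narrative")
        self.requires_review = [f for f in core if getattr(self, f) is None]
        return self

    @classmethod
    def llm_schema_hint(cls) -> dict:
        # JSON schema minus requires_review, used to build the
        # structured Ollama system prompt.
        schema = cls.model_json_schema()
        schema.get("properties", {}).pop("requires_review", None)
        return schema
```

Keeping requires_review out of the LLM-facing schema means the model is never asked to fill a bookkeeping field; it is derived purely from what extraction left null.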
- Add src/template_mapper.py: TemplateMapper loads a YAML agency mapping
file and resolves IncidentReport field values to PDF form field names.
Supports optional per-field condition expressions
- Add safe AST-based condition evaluator: permits only Compare, BoolOp,
UnaryOp, Name, Constant nodes — rejects function calls and arbitrary code
- Refactor src/filler.py: replace positional answers_list[i] with explicit
{pdf_field_name: value} dict so values land in the correct field
regardless of page layout
- Update src/file_manipulator.py: new _fill_with_mapper() path uses
LLM -> IncidentReport -> TemplateMapper -> Filler; legacy positional
path preserved for backward compatibility
- Add templates/employee_form.yaml: sample mapping for src/inputs/file.pdf
- Add pyyaml to requirements.txt
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
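The AST-based condition evaluator above can be sketched as a node whitelist walk. This is an illustrative version, not the exact src/template_mapper.py code; it permits only the node types the commit lists (plus their operator sub-nodes) and evaluates with builtins stripped:

```python
# Sketch of a whitelist-based condition evaluator: only Compare, BoolOp,
# UnaryOp, Name, and Constant nodes (and their operator sub-nodes) are
# permitted, so function calls and attribute access are rejected outright.
import ast

_ALLOWED = (
    ast.Expression, ast.Compare, ast.BoolOp, ast.UnaryOp,
    ast.Name, ast.Constant, ast.Load,
    ast.And, ast.Or, ast.Not,
    ast.Eq, ast.NotEq, ast.Lt, ast.LtE, ast.Gt, ast.GtE,
)


def eval_condition(expr: str, context: dict) -> bool:
    """Evaluate a per-field mapping condition against report values."""
    tree = ast.parse(expr, mode="eval")
    for node in ast.walk(tree):
        if not isinstance(node, _ALLOWED):
            raise ValueError(f"disallowed node: {type(node).__name__}")
    # Names resolve only against the supplied context; no builtins leak in.
    code = compile(tree, "<condition>", "eval")
    return bool(eval(code, {"__builtins__": {}}, dict(context)))
```

Because the walk runs before compile, anything like `__import__('os')` is rejected at parse-analysis time rather than merely failing at runtime.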
- Add src/transcriber.py: wraps OpenAI Whisper for fully local, offline
audio transcription. Model is lazy-loaded on first use. Size is
configurable via WHISPER_MODEL env var (tiny/base/small/medium/large,
default: base). Supports WAV, MP3, M4A, MP4, OGG, FLAC
- Add POST /transcribe endpoint: accepts multipart audio file upload,
returns {text, model_used, audio_filename}. Returns 415 for unsupported
formats, 500 for transcription errors
- Add api/schemas/transcribe.py: TranscribeResponse schema
- Register /transcribe router in api/main.py
- Add 17 tests covering: model size validation, whitespace stripping,
missing file, unsupported format, temp file cleanup, endpoint success,
endpoint error handling, all 6 supported formats (parametrized)
- Add openai-whisper and python-multipart to requirements.txt
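The lazy-loading and format-validation behaviour above can be sketched as follows. This is a hedged approximation of src/transcriber.py, not its actual code; `whisper.load_model()` and `model.transcribe()` are the real openai-whisper entry points, while the class shape is assumed:

```python
# Illustrative sketch of the lazy-loaded transcriber; the real
# src/transcriber.py may differ in structure.
import os

_SUPPORTED = {".wav", ".mp3", ".m4a", ".mp4", ".ogg", ".flac"}
_VALID_SIZES = {"tiny", "base", "small", "medium", "large"}


class Transcriber:
    def __init__(self) -> None:
        self._model = None  # loaded on first transcribe() call, not at import

    @property
    def model_size(self) -> str:
        size = os.environ.get("WHISPER_MODEL", "base")
        if size not in _VALID_SIZES:
            raise ValueError(f"unknown WHISPER_MODEL size: {size}")
        return size

    def transcribe(self, path: str) -> str:
        ext = os.path.splitext(path)[1].lower()
        if ext not in _SUPPORTED:
            raise ValueError(f"unsupported audio format: {ext}")
        if self._model is None:
            import whisper  # deferred so importing this module stays cheap
            self._model = whisper.load_model(self.model_size)
        result = self._model.transcribe(path)
        return result["text"].strip()  # whitespace stripping, per the tests
```

Deferring both the `whisper` import and the model load keeps API startup fast and means an invalid format is rejected before any model weights are touched.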
Summary

- src/transcriber.py: wraps OpenAI Whisper for fully local, offline audio transcription. Model is lazy-loaded on first use. Size configurable via WHISPER_MODEL env var (tiny/base/small/medium/large, default: base). Supports WAV, MP3, M4A, MP4, OGG, FLAC. No audio data leaves the machine
- POST /transcribe endpoint: accepts multipart audio file upload, returns {text, model_used, audio_filename}. Returns 415 for unsupported formats, 500 for transcription errors
- api/schemas/transcribe.py: TranscribeResponse schema
- /transcribe router registered in api/main.py
- openai-whisper and python-multipart added to requirements.txt

Test plan

- python -m pytest tests/test_transcribe.py -v — all 17 tests pass
- python -m pytest tests/ -v — full suite (19 tests) passes
- POST /transcribe returns 415 for .txt upload
- transcribe_bytes()