Test coverage: Document processing jobs (AnalyzePdf, OCR) #55

## Summary

The document processing jobs (`Documents::AnalyzePdfJob` and `Documents::OcrJob`) have no test coverage. These jobs handle PDF text extraction and OCR, which all downstream AI analysis depends on.

## Jobs Needing Tests

### `Documents::AnalyzePdfJob` (`app/jobs/documents/analyze_pdf_job.rb`)
- Extracts text from PDF documents using `pdftotext`
- Performs page-level analysis (creates `Extraction` rows)
- Classifies `text_quality` (good, poor, no_text)
- Triggers `OcrJob` for scanned/image-based documents

**Test scenarios:**
- Text-based PDF → extracts text, sets `text_quality: good`, creates `Extraction` rows per page
- Image-based/scanned PDF → detects low text quality, enqueues `OcrJob`
- Already-processed document (idempotent re-run) → clears and rebuilds extractions
- Corrupt/unreadable PDF → fails gracefully (no unhandled exception)
- Updates `MeetingDocument` fields: `extracted_text`, `text_chars`, `avg_chars_per_page`, `page_count`
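
To make the expected values in these scenarios concrete, here is a minimal sketch of how the page statistics and `text_quality` could be derived from `pdftotext` output. It relies on `pdftotext` separating pages with a form feed (`\f`) by default. The cutoff `GOOD_MIN_AVG_CHARS`, the `PdfTextSummary` struct, and the `summarize_pdf_text` helper are all hypothetical names for illustration; the real job's thresholds and counting rules may differ and should be read from `analyze_pdf_job.rb`.

```ruby
# Hypothetical cutoff: the real threshold lives in AnalyzePdfJob and may differ.
GOOD_MIN_AVG_CHARS = 200

# Hypothetical container mirroring the MeetingDocument fields named above.
PdfTextSummary = Struct.new(
  :page_count, :text_chars, :avg_chars_per_page, :text_quality,
  keyword_init: true
)

def summarize_pdf_text(raw_text)
  # pdftotext puts a form feed between pages (unless -nopgbrk is passed).
  pages = raw_text.split("\f", -1)
  # A form feed after the final page leaves one empty trailing chunk; drop it.
  pages.pop if pages.length > 1 && pages.last.strip.empty?

  # Assumption: characters are counted after trimming surrounding whitespace.
  text_chars = pages.sum { |page| page.strip.length }
  avg = pages.empty? ? 0.0 : (text_chars.to_f / pages.length).round(1)

  quality =
    if text_chars.zero?
      "no_text"
    elsif avg >= GOOD_MIN_AVG_CHARS
      "good"
    else
      "poor"
    end

  PdfTextSummary.new(
    page_count: pages.length,
    text_chars: text_chars,
    avg_chars_per_page: avg,
    text_quality: quality
  )
end
```

A test for the text-based scenario could feed the job a fixture with known page lengths and assert on each of these four fields, while the scanned-PDF scenario would assert `no_text` (or `poor`) and that `OcrJob` was enqueued.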

### `Documents::OcrJob` (`app/jobs/documents/ocr_job.rb`)
- Runs Tesseract OCR on image-based PDFs
- Updates `MeetingDocument.ocr_status` and extracted text

**Test scenarios:**
- Scanned PDF → OCR produces text, updates `extracted_text` and `ocr_status`
- PDF with mixed text/image pages → handles correctly
- Tesseract unavailable → graceful failure with appropriate status
- Idempotent re-run → repeated runs leave the same `extracted_text` and `ocr_status`
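
For the "Tesseract unavailable" scenario, one possible shape for graceful failure is sketched below. It assumes the job shells out through `Open3.capture3`, which raises `Errno::ENOENT` when the binary cannot be found. The `run_ocr` helper and the `"completed"` / `"failed"` status strings are hypothetical; the actual `ocr_status` values should be taken from `ocr_job.rb`. The sketch also assumes the input is an already-rasterized page image rather than the PDF itself.

```ruby
require "open3"

# Hypothetical helper; status strings are placeholders for the real ocr_status values.
def run_ocr(image_path, binary: "tesseract")
  # `tesseract <image> stdout` writes the recognized text to standard output.
  stdout, stderr, status = Open3.capture3(binary, image_path, "stdout")

  if status.success?
    { ocr_status: "completed", text: stdout }
  else
    { ocr_status: "failed", error: stderr.strip }
  end
rescue Errno::ENOENT
  # Raised when the binary is missing, so the job can record a status instead of crashing.
  { ocr_status: "failed", error: "#{binary} is not installed or not on PATH" }
end
```

A test can exercise the failure path without touching the real Tesseract install by passing a binary name that does not exist and asserting on the returned status.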

## Approach

- Create small test PDF fixtures (`test/fixtures/files/`): one text-based, one image-based
- Stub system calls to `pdftotext` and `tesseract` where appropriate
- Test the full flow: download → analyze → OCR → extraction rows
- Verify `text_quality` classification logic
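
As an illustration of stubbing the system calls, the sketch below injects the text extractor and the OCR enqueue step as callables, so a test can swap in fakes. `PdfAnalyzer` and its keyword arguments are hypothetical stand-ins for the job's core logic, not the existing implementation, and the return symbols are placeholders.

```ruby
# Hypothetical stand-in for the analysis step; the real code is an ActiveJob class.
class PdfAnalyzer
  def initialize(extract_text:, enqueue_ocr:)
    @extract_text = extract_text # callable wrapping the pdftotext call
    @enqueue_ocr = enqueue_ocr   # callable wrapping the OcrJob enqueue
  end

  def call(path)
    text = @extract_text.call(path)

    # String#strip also removes form feeds, so a page-break-only result counts as empty.
    if text.strip.empty?
      @enqueue_ocr.call(path)
      :ocr_enqueued
    else
      :text_extracted
    end
  rescue StandardError
    # Corrupt or unreadable input should be reported, not crash the worker.
    :failed
  end
end

# Fakes that replace the real system calls in a test.
enqueued = []
text_pdf = PdfAnalyzer.new(
  extract_text: ->(_path) { "Agenda item 1\f" },
  enqueue_ocr: ->(path) { enqueued << path }
)
scanned_pdf = PdfAnalyzer.new(
  extract_text: ->(_path) { "\f\f" },
  enqueue_ocr: ->(path) { enqueued << path }
)
corrupt_pdf = PdfAnalyzer.new(
  extract_text: ->(_path) { raise IOError, "unreadable PDF" },
  enqueue_ocr: ->(path) { enqueued << path }
)

text_result = text_pdf.call("text.pdf")
scanned_result = scanned_pdf.call("scanned.pdf")
corrupt_result = corrupt_pdf.call("corrupt.pdf")
```

If the jobs call `Open3` or backticks directly rather than through an injected collaborator, `stub` from `minitest/mock` (for example `Open3.stub(:capture3, ...)`) could achieve a similar effect inside the Minitest suite.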

## Dependencies

- `Documents::DownloadJob` already has tests (`test/jobs/documents/download_job_test.rb`); use them as a pattern
