Merged
73 commits
e71423d
Improve meter OCR and add test runner
AndreaPi Feb 5, 2026
589e4cd
Update .gitignore to include node_modules, output, and Playwright CLI…
AndreaPi Feb 10, 2026
84f5ff2
Refactor app into modular src architecture
AndreaPi Feb 10, 2026
9392414
Update AGENTS with OCR handoff and module guidance
AndreaPi Feb 10, 2026
1d0dabe
Use scaled-MSE debug score in test set results
AndreaPi Feb 10, 2026
a9844cc
Update AGENTS with neural ROI status and next steps
AndreaPi Feb 10, 2026
355751b
Add neural ROI backend service and training tooling
AndreaPi Feb 13, 2026
27b65f7
Integrate neural ROI and improve OCR candidate ranking
AndreaPi Feb 13, 2026
c1f8b31
Add corrected ROI annotations and expand meter test set
AndreaPi Feb 13, 2026
9f5768c
Add ROI label QA renderer and ignore local manifest artifacts
AndreaPi Feb 13, 2026
a9e7048
Add new ROI val dataset samples
AndreaPi Feb 14, 2026
ce24cea
Improve backend CPU/device handling for ROI service
AndreaPi Feb 14, 2026
1f2f3b3
Add neural ROI sanity gating and ROI-specific OCR candidates
AndreaPi Feb 14, 2026
1e20296
Harden test-set runner and update metric labels
AndreaPi Feb 14, 2026
7dcb45a
Refresh AGENTS OCR notes and remove obsolete handoffs
AndreaPi Feb 14, 2026
b0b842a
Refine OCR branch flow and candidate gating
AndreaPi Feb 16, 2026
f1854a2
Add rotation augmentation options for ROI training
AndreaPi Feb 16, 2026
073b719
Add digit dataset export tooling and baseline artifacts
AndreaPi Feb 16, 2026
4e750f6
Ignore root-level YOLO weight artifacts
AndreaPi Feb 16, 2026
00e3a73
Add digit classifier pipeline and optional OCR integration
AndreaPi Feb 16, 2026
59922df
Add digit capture planning and dataset QA validation tooling
AndreaPi Feb 16, 2026
1db8301
Pin ROI backend default model and remove fallback
AndreaPi Feb 16, 2026
d74ff83
Require neural ROI and add E2E merge checks
AndreaPi Feb 16, 2026
7d154dc
Align E2E workflow with master default branch
AndreaPi Feb 16, 2026
decd7e4
Harden E2E reliability and add neural ROI success coverage
AndreaPi Feb 16, 2026
ef77822
Show failure reasons in test-set results
AndreaPi Feb 16, 2026
f9229d6
Refresh AGENTS with current OCR state and tomorrow plan
AndreaPi Feb 16, 2026
3f1e79f
Add test-set reject histograms and clearer mismatch labeling
AndreaPi Feb 17, 2026
b0688a5
Soften ROI geometry rejects to avoid hard drops on borderline strips
AndreaPi Feb 17, 2026
10d6315
Enable neural digit classifier by default in OCR flow
AndreaPi Feb 17, 2026
df69394
Extract shared normalizeAngle to canvas-utils to remove duplication
AndreaPi Feb 21, 2026
5a1d7a4
Clarify digit classifier default as false in CLAUDE.md
AndreaPi Feb 21, 2026
1826026
Comment the early-stop \!roiMode guard in pipeline.js
AndreaPi Feb 21, 2026
bb583eb
Comment deskew scoring formula in recognition.js
AndreaPi Feb 21, 2026
221be24
Remove unused offsetPx plumbing from splitIntoCells
AndreaPi Feb 21, 2026
68dabca
Improve OCR crop bounds and ROI decoding defaults
AndreaPi Feb 23, 2026
945af8e
Refresh OCR test readings and digit capture plan
AndreaPi Feb 23, 2026
e8f69b9
Update OCR workflow docs and local env guidance
AndreaPi Feb 23, 2026
9245943
Add new ROI dataset captures and labels
AndreaPi Feb 25, 2026
36efdd5
Make OCR neural-ROI-only and separate ROI debug stages
AndreaPi Feb 25, 2026
5746208
Update meter_readings.csv for renamed and new captures
AndreaPi Feb 25, 2026
f454455
Harden ROI OCR selection and candidate refinement
AndreaPi Feb 25, 2026
70d9dc7
Add OCR guardrail regression coverage in e2e tests
AndreaPi Feb 25, 2026
5471a97
Update OCR docs, working state, and next steps
AndreaPi Feb 25, 2026
7c60d41
Refresh digit capture plan manifests
AndreaPi Feb 25, 2026
56b2597
Switch OCR pipeline to strip-only recognition
AndreaPi Feb 26, 2026
3111614
Update AGENTS OCR lessons and next steps
AndreaPi Feb 26, 2026
f127ea1
Remove duplicate meter entries from asset readings
AndreaPi Feb 26, 2026
9f10ddd
Prune duplicate samples from OCR training datasets
AndreaPi Feb 26, 2026
98df3e2
Instrument OCR reject reasons and update working-state notes
AndreaPi Feb 28, 2026
71f71fe
Ingest Feb 27 meter reading into assets and ROI dataset
AndreaPi Feb 28, 2026
d65f347
Add docs for app logic, backend API, and OCR tuning
AndreaPi Feb 28, 2026
51b09ee
Ignore AOB parking folder
AndreaPi Mar 1, 2026
bf116c4
Enforce ROI augmentation policy in training
AndreaPi Mar 1, 2026
db486f3
Refresh AGENTS OCR state and next TODOs
AndreaPi Mar 1, 2026
6e4a457
Remove dormant refined OCR guardrails
AndreaPi Mar 1, 2026
8e4a702
Update OCR docs and re-render app logic diagram
AndreaPi Mar 1, 2026
cc5330b
Add ROI checkpoint diff benchmark tooling
AndreaPi Mar 2, 2026
5efd23b
Add gated digit-classifier fallback path
AndreaPi Mar 2, 2026
8676b4e
Refresh OCR docs and benchmark status
AndreaPi Mar 2, 2026
7586660
Switch OCR evaluation to MAE with guardrails
AndreaPi Mar 3, 2026
14e0315
chore(dataset): add meter_03032026 reading and ROI label
AndreaPi Mar 3, 2026
639286c
feat(ocr): harden edge-candidate selection
AndreaPi Mar 3, 2026
81cfe74
feat(ocr): add toggle for edge-derived candidates
AndreaPi Mar 3, 2026
41c3b3a
test(ocr): reject isolated edge single-hit in e2e
AndreaPi Mar 3, 2026
7e4b958
docs(ocr): add medium-term OBB evaluation notes
AndreaPi Mar 3, 2026
b7bfcd7
docs(ocr): sync guides and roi-diff reporting details
AndreaPi Mar 3, 2026
86d8552
feat(dataset): rebuild digit dataset with canonical windows and secti…
AndreaPi Mar 4, 2026
3d6f392
feat(training): add split-jitter augmentation for digit classifier
AndreaPi Mar 4, 2026
4e48bf0
feat(ocr): canonicalize ROI major axis before cell splitting
AndreaPi Mar 4, 2026
c022096
docs(ocr): record digit dataset orientation labeling blocker
AndreaPi Mar 4, 2026
3c14b6b
feat(training): add synthetic digit dataset generation and mixed trai…
AndreaPi Mar 4, 2026
28b83e9
feat(ocr): rank classifier fallback candidates and relax edge gate
AndreaPi Mar 4, 2026
38 changes: 38 additions & 0 deletions .github/workflows/e2e.yml
@@ -0,0 +1,38 @@
name: E2E

on:
  pull_request:
  push:
    branches:
      - master

jobs:
  playwright:
    runs-on: ubuntu-latest
    timeout-minutes: 20
    steps:
      - name: Checkout
        uses: actions/checkout@v4

      - name: Setup Node.js
        uses: actions/setup-node@v4
        with:
          node-version: 20
          cache: npm

      - name: Install dependencies
        run: npm ci

      - name: Install Playwright browsers
        run: npx playwright install --with-deps chromium

      - name: Run e2e tests
        run: npm run test:e2e -- --project=chromium

      - name: Upload Playwright report
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: playwright-report
          path: playwright-report
          retention-days: 7
47 changes: 29 additions & 18 deletions .gitignore
@@ -1,21 +1,32 @@
# Node
*:Zone.Identifier
node_modules/

# Logs
npm-debug.log*
yarn-debug.log*
yarn-error.log*
pnpm-debug.log*

# OS files
.DS_Store
Thumbs.db

# Env files
output/
assets/*.jpg
assets/*.jpeg
assets/*.png
assets/*.JPG
assets/*.JPEG
assets/*.PNG
.venv/
backend/.venv/
__pycache__/
*.pyc
backend/__pycache__/
backend/runs/
backend/models/*.pt
backend/models/*.onnx
backend/yolov8*.pt
/yolov8*.pt
backend/data/roi_dataset/labels/*.cache
backend/data/roi_dataset/previews/
backend/data/roi_dataset/qa_previews/
backend/data/roi_dataset/roi_boxes.json
.env
.env.local
.env.*.local

# Build output
coverage/
dist/
# Playwright CLI generated artifacts
.playwright-cli/
playwright-report/
test-results/

# Local parking area for unrelated files
AOB/
96 changes: 90 additions & 6 deletions AGENTS.md
@@ -3,27 +3,53 @@
## Project Structure & Module Organization
- `index.html`: Single-page UI layout and content.
- `styles.css`: Global styles and visual system.
- `app.js`: Client-side logic (OCR flow, email draft generation).
- `app.js`: Thin module entrypoint that imports `src/main.js`.
- `src/main.js`: UI orchestration and event wiring.
- `src/ocr/`: Neural-ROI-first OCR pipeline with strip-first decoding and selection safeguards.
- `src/email/`: Email draft generation and link helpers.
- `src/testset/`: Manual test-set runner logic.
- `src/debug/`: Debug overlay rendering helpers.
- `backend/`: Optional FastAPI service for neural ROI + digit classifier inference and training scripts.
- `backend/build_digit_dataset.py`: Export strip/cell OCR datasets + QA previews from ROI labels.
- `backend/generate_synthetic_digit_dataset.py`: Build synthetic train-only digit sections (direct cell augmentation, plus optional composed windows re-split into equally spaced cells).
- `backend/plan_digit_expansion.py`: Generate prioritized capture plan for underrepresented digits.
- `backend/validate_digit_dataset.py`: Validate manifest consistency and QA preview coverage.
- `backend/train_digit_classifier.py`: Train per-cell digit classifier checkpoint.
- `package.json`: Local dev scripts.
- `README.md`: Project overview and setup notes.
- `assets/`: Static assets and example uploads.
- `assets/meter_13012026.jpg`: Example upload asset.

## Build, Test, and Development Commands
- `npm run serve`: Start a simple local web server on port 8000.
- `npm run dev`: Alias of `npm run serve`.
- `cd backend && python3 -m venv .venv && source .venv/bin/activate && pip install -r requirements.txt`: Backend setup.
- `cd backend && source .venv/bin/activate && python train_roi.py --data data/roi_dataset.yaml --base-model yolov8n.pt --rotation-angles 90,180,270,360 --heavy-augment`: Fine-tune pretrained ROI detector with enforced augmentation policy.
- `cd backend && source .venv/bin/activate && python build_digit_dataset.py --clean`: Rebuild digit strip/cell exports and QA previews.
- `cd backend && source .venv/bin/activate && python validate_digit_dataset.py`: Validate dataset/manifests before training.
- `cd backend && source .venv/bin/activate && python generate_synthetic_digit_dataset.py --clean --direct-per-real 6 --compose-window-count 180`: Generate synthetic train-only digit sections from real train labels.
- `cd backend && source .venv/bin/activate && python plan_digit_expansion.py --target-train-per-digit 12 --priority-digits 4,5,6,9`: Refresh targeted capture checklist.
- `cd backend && source .venv/bin/activate && python train_digit_classifier.py --device cpu`: Train per-cell digit classifier model (real-only).
- `cd backend && source .venv/bin/activate && python train_digit_classifier.py --device cpu --synthetic-root data/digit_dataset/sections_synthetic --synthetic-target-ratio 2.0`: Train on mixed real + synthetic train split while keeping val/test real-only.
- `cd backend && source .venv/bin/activate && uvicorn app:app --host 127.0.0.1 --port 8001 --reload`: Run neural ROI API.

Open `http://localhost:8000` after running a serve command.
Open `http://localhost:8000` after running a serve command. Backend endpoints default to `http://127.0.0.1:8001/roi/detect` and `http://127.0.0.1:8001/digit/predict-cells`.

## Coding Style & Naming Conventions
- Use 2-space indentation in HTML/CSS/JS.
- Keep files ASCII-only unless there is a strong reason for Unicode.
- Use descriptive, lower-case IDs and class names (e.g., `photo-input`, `module-grid`).
- Prefer clear, small functions in `app.js` and avoid deep nesting.
- Prefer clear, small functions in `src/` modules and avoid deep nesting.

## Testing Guidelines
- No automated tests are configured.
- Manual checks: upload image, run OCR, verify email draft fields, and confirm Gmail draft link.
- Automated browser tests are configured with Playwright.
- `npm run test:e2e`: Runs `tests/e2e/neural-roi.spec.js` (neural ROI failure handling + ROI geometry + strip-only OCR behavior).
- CI: `.github/workflows/e2e.yml` runs on each pull request and on pushes to `master`.
- Frontend manual checks: upload image, run OCR, verify email draft fields, and confirm Gmail draft link.
- OCR test-set checks: run "Run test set" and inspect `MAE`, `Exact Match`, `No-read`, `Failure Reason`, and debug stages.
- Backend sanity checks: `GET /health` and confirm `ready: true`, `roi_ready: true`, and expected `model_path`.
- Prefer running the test set from UI with debug overlay enabled.
- Before committing OCR changes, run both `npm run test:e2e` and the UI "Run test set".
- ROI training policy: always use heavy augmentation and rotation expansion (`90,180,270,360`). `train_roi.py` enforces this by default and only allows weaker runs with `--allow-no-augment-policy`.
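
The backend sanity check above can also be scripted. The following is a minimal sketch, assuming the `GET /health` response is a JSON object carrying the `ready`, `roi_ready`, and `model_path` fields named in this guide; the helper names `health_problems` and `fetch_health` are hypothetical and not part of the repo.

```python
import json
from urllib.request import urlopen

# Default local backend address used throughout this guide.
HEALTH_URL = "http://127.0.0.1:8001/health"


def health_problems(payload, expected_model_path=None):
    """Return a list of problems found in a health payload (empty means OK)."""
    problems = []
    for flag in ("ready", "roi_ready"):
        if payload.get(flag) is not True:
            problems.append(f"{flag} is {payload.get(flag)!r}, expected True")
    if expected_model_path is not None:
        actual = str(payload.get("model_path", ""))
        # Assumption: model_path is a string ending with the checkpoint file name.
        if not actual.endswith(expected_model_path):
            problems.append(
                f"model_path is {actual!r}, expected suffix {expected_model_path!r}"
            )
    return problems


def fetch_health(url=HEALTH_URL, timeout=5.0):
    """GET the health endpoint and decode its JSON body."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

With the backend running, `health_problems(fetch_health(), "roi-rotaug-e30-640.pt")` should return an empty list if the pinned checkpoint is loaded.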

## Commit & Pull Request Guidelines
- No commit message convention is established in this repo.
@@ -33,3 +59,61 @@ Open `http://localhost:8000` after running a serve command.
## Security & Configuration Tips
- The Gmail draft flow opens a client-side draft; no credentials are stored in code.
- OCR runs in the browser; avoid adding API keys to the client without a secure proxy.
- Backend is intended for local use; keep host/CORS scoped to localhost unless explicitly deploying.

## IMPORTANT
- When using Playwright in this environment, global `playwright-cli` may be more reliable than the wrapper if npm network is flaky.

## OCR Working State

- App and backend run locally on `127.0.0.1:8000` and `127.0.0.1:8001`, respectively.
- Neural ROI is mandatory in the frontend OCR flow (heuristic ROI fallback removed).
- On neural ROI failure, the UI shows an explicit reason and asks for manual measurement input.
- Backend default ROI model is pinned to `backend/models/roi-rotaug-e30-640.pt` (override with `ROI_MODEL_PATH`).
- `train_roi.py` enforces augmentation policy by default: heavy online augmentation + rotation expansion `90,180,270,360`.
- Digit-classifier inference is optional behind `OCR_CONFIG.digitClassifier.enabled` (default `false`).
- Backend serves ROI + digit endpoints and reports readiness via `GET /health`.
- Test-set table includes `Detected`, `Absolute Error`, `Failure Reason`, and `Result`.
- Frontend OCR branch evaluation is strip-only (word-pass + sparse scan); the 4-cell refine stage is removed from the active pipeline.
- ROI word-pass defaults to raw candidate input (`roiDeterministic.wordPassModes: ['raw']`); debug stage `6. OCR input candidate` mirrors this mode.
- `roiDeterministic.minWordPassHits` is `1`, but isolated edge-only single hits are rejected unless corroborated by non-edge evidence or very strong per-cell confidence.
- Edge-derived candidate generation is toggleable via `roiDeterministic.useEdgeCandidates` (default `true`) for controlled A/B experiments.
- Current local benchmark set has `15` images.
- Historical checkpoint comparison (March 2, 2026, fallback `OFF`, 14-image snapshot):
  - `roi-rotaug-e30-640.pt` (default pinned): exact-match `0/14`, failure mix `ocr-no-digits` (7), `mismatch` (6), `no-detection` (1).
  - `roi.pt` (challenger): exact-match `0/14`, failure mix `ocr-no-digits` (10), `mismatch` (4), `no-detection` (0).
- Automated diff workflow is available via `npm run benchmark:roi-diff` (recent artifacts: `output/roi-checkpoint-diff/20260303-194206-fallback-off/roi-diff-report.md`).
- Gated digit-classifier fallback is implemented in pipeline but remains disabled by default (`digitClassifier.enabled: false`).
- Historical fallback benchmark (March 2, 2026, 14-image snapshot):
  - Fallback `OFF` (`output/roi-checkpoint-diff/20260302-083324-fallback-off`): baseline `mismatch` 6 / `ocr-no-digits` 7; challenger `mismatch` 4 / `ocr-no-digits` 10.
  - Fallback `ON` (`output/roi-checkpoint-diff/20260302-083529-fallback-on`): baseline `mismatch` 10 / `ocr-no-digits` 3; challenger `mismatch` 13 / `ocr-no-digits` 1.
  - Net: no exact-match gain (`0/14` stays `0/14`), with strong false-positive shift (`ocr-no-digits` -> `mismatch`), so fallback stays disabled.
- Promotion and rollback decisions should now use `MAE` from `roi-diff-report` as the primary signal, with exact-match and no-read as guardrails.
- ROI diff reports now include per-image selected metadata columns (`sourceLabel`, `method`, `preprocessMode`) and explicitly export the last stage `6. OCR input candidate` snapshot.
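
For reference, the `MAE`, exact-match, and no-read signals named above could be computed roughly as follows. This is a sketch under stated assumptions, not the report generator's actual code: `summarize_readings` and its row shape are hypothetical, and it assumes MAE is averaged only over images that produced a reading, which this guide does not specify.

```python
def summarize_readings(rows):
    """Summarize OCR results.

    rows: iterable of (expected, detected) pairs, where detected is None
    when the pipeline rejected the image (a no-read).
    """
    rows = list(rows)
    # Absolute error per image that produced a reading.
    errors = [
        abs(detected - expected)
        for expected, detected in rows
        if detected is not None
    ]
    return {
        "images": len(rows),
        # Assumption: no-reads are excluded from MAE and tracked separately.
        "mae": sum(errors) / len(errors) if errors else None,
        "exact_match": sum(1 for error in errors if error == 0),
        "no_read": len(rows) - len(errors),
    }
```

Keeping no-reads out of the average is one possible design: it stops a cautious pipeline from being penalized twice, but it also means MAE alone can look good when few images are read, which is why the guardrail counts matter.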

## Next TODOs

1. Keep `roi-rotaug-e30-640.pt` as default until a challenger beats it on end-to-end OCR metrics, not only detection presence.
2. Re-run `npm run benchmark:roi-diff` after each ROI challenger to track per-image movement (`Detected`, stage `5/6` snapshots, reject reason), then summarize deltas in notes/PR.
3. Tune strip preprocessing and candidate ranking for the current hard failures (`meter_07012020.JPEG`, `meter_02192026.JPEG`, `meter_02202026.JPEG`, `meter_02242026.JPEG`).
4. Keep classifier fallback disabled until it beats fallback-off on `MAE` while respecting exact-match and no-read guardrails; focus on stricter fallback acceptance/ranking before re-testing.
5. Enforce checkpoint promotion gates from docs: no MAE regression, no exact-match regression, no no-read regression, and no regression in `ocr-no-digits`.
6. Keep running both `npm run test:e2e` and UI `Run test set` before commits; include histogram deltas in commit/PR notes.
7. Medium-term: evaluate YOLO OBB ROI detection to reduce rotation/edge ambiguity; this requires OBB relabeling, retraining, and backend response/schema changes before frontend adoption.
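
The promotion gates in item 5 can be expressed as a small check. This is an illustrative sketch only: `promotion_blockers`, the `GATES` table, and the metric keys are hypothetical names, and "regression" is assumed to mean any strict worsening with no tolerance band.

```python
# (metric key, True if higher values are better)
GATES = (
    ("mae", False),
    ("exact_match", True),
    ("no_read", False),
    ("ocr_no_digits", False),
)


def promotion_blockers(baseline, challenger):
    """Return the metrics on which the challenger regresses against the baseline."""
    blockers = []
    for metric, higher_is_better in GATES:
        before, after = baseline[metric], challenger[metric]
        regressed = after < before if higher_is_better else after > before
        if regressed:
            blockers.append(metric)
    return blockers
```

A challenger is promotable under this sketch only when the returned list is empty. For example, a challenger whose `ocr-no-digits` count rises from 7 to 10, as in the March 2 fallback-off snapshot, would be blocked on that gate regardless of its other metrics.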

### OBB Notes (Re-verify Before Implementation)

- OBB inference outputs rotated geometry (`xywhr`) and polygon corners.
- OBB training labels use corners format: `class x1 y1 x2 y2 x3 y3 x4 y4`.
- OBB angle handling has constraints (Ultralytics OBB uses angles in the `0-90` exclusive range).
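
To make the relationship between the two geometries concrete, here is a pure-math sketch that turns a rotated box (`xywhr`) into four corners and formats a label line in the corners layout above. It does not call Ultralytics; the function names are hypothetical, and the corner order, the rotation direction, and the normalization to `0..1` by image size are assumptions to re-verify against the Ultralytics docs before relabeling.

```python
import math


def xywhr_to_corners(cx, cy, w, h, angle_rad):
    """Rotate the four half-extent offsets around the box center."""
    cos_a, sin_a = math.cos(angle_rad), math.sin(angle_rad)
    offsets = [(-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2)]
    return [
        (cx + dx * cos_a - dy * sin_a, cy + dx * sin_a + dy * cos_a)
        for dx, dy in offsets
    ]


def obb_label_line(class_id, corners, image_width, image_height):
    """Format `class x1 y1 x2 y2 x3 y3 x4 y4` with normalized coordinates."""
    coords = []
    for x, y in corners:
        # Assumption: labels store coordinates divided by the image size.
        coords.extend([x / image_width, y / image_height])
    return " ".join([str(class_id)] + [f"{value:.6f}" for value in coords])
```

For an axis-aligned box (`angle_rad = 0`) the corners reduce to the usual top-left, top-right, bottom-right, bottom-left order in image coordinates.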

## Dataset Expansion Loop (`4/5/6/9`)

1. Refresh capture planning:
   - `cd backend && source .venv/bin/activate && python plan_digit_expansion.py --target-train-per-digit 12 --priority-digits 4,5,6,9`
2. Add labeled captures with QA previews.
3. Validate manifests after each dataset update:
   - `cd backend && source .venv/bin/activate && python validate_digit_dataset.py`
4. Retrain classifier only after class coverage improves:
   - `cd backend && source .venv/bin/activate && python train_digit_classifier.py --device cpu`
5. Keep classifier fallback disabled by default; only enable if benchmarked `MAE` improves without exact-match/no-read guardrail regressions.
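
The idea behind the capture-planning step can be sketched as a per-digit deficit ranking. This is not the logic of `plan_digit_expansion.py`, which is not shown here; `capture_plan` is a hypothetical helper, with defaults mirroring the `--target-train-per-digit 12` and `--priority-digits 4,5,6,9` flags used above.

```python
def capture_plan(train_counts, target_per_digit=12, priority_digits=(4, 5, 6, 9)):
    """Rank digits that are below the per-digit training target.

    train_counts: mapping of digit (0-9) to number of training samples.
    Priority digits come first, then larger deficits, then lower digits.
    """
    plan = []
    for digit in range(10):
        deficit = max(0, target_per_digit - train_counts.get(digit, 0))
        if deficit:
            plan.append(
                {"digit": digit, "deficit": deficit, "priority": digit in priority_digits}
            )
    plan.sort(key=lambda item: (not item["priority"], -item["deficit"], item["digit"]))
    return plan
```

Digits that already meet the target are left out, so an empty plan would indicate that class coverage is sufficient for retraining under this simplified rule.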
116 changes: 112 additions & 4 deletions README.md
@@ -2,14 +2,22 @@

Jarvis is a lightweight personal assistant web app. The first module helps you read a water meter photo, review the detected value, and draft an email in Gmail.

## Documentation

- Docs index: [`docs/README.md`](./docs/README.md)
- OCR app logic flow: [`docs/app-logic.md`](./docs/app-logic.md)
- Backend API guide: [`docs/backend-api.md`](./docs/backend-api.md)
- OCR tuning playbook: [`docs/ocr-tuning-playbook.md`](./docs/ocr-tuning-playbook.md)

## Features
- Upload a meter photo and preview it.
- OCR the reading (manual override supported).
- OCR from a neural-ROI crop with conservative acceptance (unsupported OCR guesses are rejected to manual input).
- Auto-fill an email draft with the current date in Italian format.
- Open a Gmail draft or use a mailto fallback.
- Run a built-in OCR test set table with `Detected`, `Absolute Error`, and `Failure Reason` columns plus MAE/exact-match/no-read summary stats.

## Local Development
1. Install dependencies (none required beyond Python).
1. Ensure Python 3 and Node.js are installed.
2. Run the dev server:

```bash
npm run serve
```
@@ -18,19 +26,119 @@ npm run serve

Then open `http://localhost:8000`.

If you also want to run Playwright checks, install JS dependencies once:

```bash
npm install
```

### Optional Neural ROI Backend (recommended)
You can run a Python backend that detects the meter digit window using a fine-tuned pretrained model.

1. Open a second terminal and set up backend dependencies:

```bash
cd backend
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
```

For CPU-only environments (for example Vercel), install:

```bash
pip install -r requirements-cpu.txt
```


2. Train/fine-tune a model (copies best checkpoint to `backend/models/roi.pt`):

```bash
python train_roi.py \
  --data data/roi_dataset.yaml \
  --base-model yolov8n.pt \
  --rotation-angles 90,180,270,360 \
  --heavy-augment
```

The API default ROI checkpoint is pinned to `backend/models/roi-rotaug-e30-640.pt`.
To run with a newly trained checkpoint, set `ROI_MODEL_PATH` explicitly before starting the backend.
`train_roi.py` now enforces heavy augmentation + rotation expansion by default; weaker runs require explicit `--allow-no-augment-policy`.

Optional: train the per-cell digit classifier checkpoint:

```bash
python train_digit_classifier.py --device cpu
```

For dataset expansion/QA before retraining:

```bash
python plan_digit_expansion.py --target-train-per-digit 12 --priority-digits 4,5,6,9
python validate_digit_dataset.py
```

3. Start the API:

```bash
uvicorn app:app --host 127.0.0.1 --port 8001 --reload
```

By default, the frontend calls `http://127.0.0.1:8001/roi/detect` and requires neural ROI detection before OCR.
The frontend can also call `http://127.0.0.1:8001/digit/predict-cells` when `OCR_CONFIG.digitClassifier.enabled` is set to `true`.
Check backend readiness with:

```bash
curl -s http://127.0.0.1:8001/health
```

### E2E Tests

Run Playwright checks for neural-ROI failure handling and OCR selection guard regressions:

```bash
npm run test:e2e
```

Generate a per-image ROI checkpoint comparison report (`roi-rotaug-e30-640.pt` vs `roi.pt`) with stage `5/6` debug snapshots:

```bash
npm run benchmark:roi-diff
```

Report artifacts are written under `output/roi-checkpoint-diff/<timestamp>/`.
Per-image diff tables include selected OCR metadata (`sourceLabel`, `method`, `preprocessMode`) and stage `6` exports use the last `6. OCR input candidate` frame from each debug session.
To benchmark with digit-classifier fallback enabled (gated to `ocr-no-digits`), run:

```bash
JARVIS_DIGIT_FALLBACK=1 npm run benchmark:roi-diff
```

CI runs these tests on every pull request and on pushes to `master`.

## File Overview
- `index.html`: UI layout.
- `styles.css`: Styling.
- `app.js`: OCR + email draft logic.
- `app.js`: Thin entrypoint that imports `src/main.js`.
- `src/main.js`: UI orchestration and event wiring.
- `src/ocr/`: OCR pipeline and neural ROI integration.
- `src/testset/`: Manual OCR test-set runner.
- `backend/`: Optional FastAPI service for neural ROI and digit-classifier inference/training.
- `AGENTS.md`: Contributor guide.
- `assets/`: Static assets and example uploads.

## Notes
- OCR runs fully in the browser using Tesseract.js.
- OCR now relies on neural ROI detection; if the backend is unavailable or ROI fails, the app asks for manual reading input.
- ROI word-pass defaults to raw strip input; stage `6. OCR input candidate` mirrors the configured OCR input mode.
- Edge-derived ROI strip candidates are enabled by default and can be toggled with `OCR_CONFIG.roiDeterministic.useEdgeCandidates`.
- Digit decoding can optionally use a backend classifier (`src/ocr/config.js` -> `digitClassifier.enabled`), which is `false` by default.
- The selection layer is fail-safe: isolated edge-only single hits are rejected unless independently corroborated.
- Use the UI `Run test set` action plus `npm run test:e2e` for OCR regressions before and after tuning.
- The Gmail flow opens a draft; you always review and send manually.
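
The fail-safe selection rule in the notes above can be pictured with a short sketch. It is written in Python for illustration even though the real selection layer lives in the frontend `src/ocr/` modules; the `Candidate` fields, `accept_candidate`, and the `0.9` confidence threshold are all hypothetical and only approximate the rule as described.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    value: str                  # decoded digit string
    hits: int                   # number of word-pass hits supporting it
    from_edge: bool             # True if derived from an edge-based strip
    min_cell_confidence: float  # weakest per-cell confidence, 0..1


MIN_HITS = 1                   # mirrors roiDeterministic.minWordPassHits
STRONG_CELL_CONFIDENCE = 0.9   # hypothetical threshold for "very strong"


def accept_candidate(candidate, others):
    """Reject isolated edge-only single hits unless corroborated."""
    if candidate.hits < MIN_HITS:
        return False
    if not candidate.from_edge or candidate.hits > 1:
        return True
    # Single edge hit: needs a matching non-edge candidate or strong confidence.
    corroborated = any(
        other.value == candidate.value and not other.from_edge for other in others
    )
    return corroborated or candidate.min_cell_confidence >= STRONG_CELL_CONFIDENCE
```

A rejected candidate in this sketch corresponds to the app falling back to manual reading input rather than showing an unsupported guess.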

## Asset Naming (Meter Images)
- Use the EXIF `DateTimeOriginal` value as the source of truth for the acquisition date.
- Rename files to `meter_mmddyyyy` (zero-padded) and keep the original extension.
- If multiple images share the same date, keep one as-is and add suffixes to the rest (e.g., `_b`, `_c`).
- If multiple images share the same date, keep one as-is and add numeric suffixes to the rest (e.g., `_1`, `_2`).
- If EXIF is missing, prefer a known date from the filename or capture notes and document it.
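
The naming rule above can be sketched as a small helper. This is an illustration rather than tooling from the repo: `meter_filename` is a hypothetical name, the input is assumed to be the raw EXIF `DateTimeOriginal` string in its standard `YYYY:MM:DD HH:MM:SS` form, and reading EXIF from the image itself is out of scope.

```python
from datetime import datetime
from pathlib import Path

# Standard textual layout of the EXIF DateTimeOriginal tag.
EXIF_FORMAT = "%Y:%m:%d %H:%M:%S"


def meter_filename(exif_datetime_original, original_name, existing_names=()):
    """Build `meter_mmddyyyy<ext>`, adding `_1`, `_2`, ... on date collisions."""
    taken = datetime.strptime(exif_datetime_original, EXIF_FORMAT)
    base = f"meter_{taken:%m%d%Y}"
    extension = Path(original_name).suffix  # keep the original extension as-is
    used_stems = {Path(name).stem for name in existing_names}
    stem, suffix = base, 0
    while stem in used_stems:
        suffix += 1
        stem = f"{base}_{suffix}"
    return f"{stem}{extension}"
```

Collisions are checked on the file stem, so two captures from the same date get distinct names even when their extensions differ.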