
Phase 1 — Core pipeline MVP (capture → OCR → translate → terminal) #1

@Swiftburn

Description


Purpose

Implement the end-to-end minimal pipeline to capture a frame, run OCR, translate, and output to terminal. This phase enables a CLI-based MVP that proves pipeline wiring before adding overlays, voice, or UI.

Tasks (atomic, AI-sized)

  • 1.1 Implement capture_region and capture_full

    • File: src/capture/screen.py
    • Work: Add capture_region(region: Tuple[int,int,int,int]) -> numpy.ndarray and capture_full() -> numpy.ndarray using mss.
    • Tests: tests/test_capture.py that mocks mss and validates return type and shape.
    • DoD: functions exist, documented, and unit tests pass locally.
  • 1.2 Implement Region dataclass and profile save/load

    • File: src/capture/regions.py
    • Work: Region dataclass (id, name, coords, profile metadata). Add save_profile(name, regions) and load_profile(name) storing JSON under XDG path or repo-local .kanjilens/profiles.
    • Tests: tests/test_regions.py saves and loads a temp profile.
    • DoD: roundtrip save/load works and is used by capture calls.
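A possible shape for 1.2, using only the stdlib. The `meta` field name and the repo-local default directory are assumptions; the issue leaves the choice between an XDG path and `.kanjilens/profiles` open:

```python
import json
from dataclasses import asdict, dataclass, field
from pathlib import Path
from typing import Dict, List, Tuple


@dataclass
class Region:
    id: str
    name: str
    coords: Tuple[int, int, int, int]  # (left, top, width, height)
    meta: Dict[str, str] = field(default_factory=dict)  # profile metadata


# Repo-local fallback; an XDG path (~/.config/kanjilens/profiles) is the alternative.
PROFILE_DIR = Path(".kanjilens") / "profiles"


def save_profile(name: str, regions: List[Region], base: Path = PROFILE_DIR) -> Path:
    """Serialise regions to <base>/<name>.json and return the written path."""
    base.mkdir(parents=True, exist_ok=True)
    path = base / f"{name}.json"
    path.write_text(json.dumps([asdict(r) for r in regions], indent=2))
    return path


def load_profile(name: str, base: Path = PROFILE_DIR) -> List[Region]:
    """Inverse of save_profile; restores coords as a tuple for dataclass equality."""
    raw = json.loads((base / f"{name}.json").read_text())
    return [Region(d["id"], d["name"], tuple(d["coords"]), d.get("meta", {})) for d in raw]
```

The `base` parameter makes the temp-profile roundtrip in tests/test_regions.py trivial to write.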
  • 1.3 Implement frame change detection

    • File: src/capture/change_detector.py
    • Work: has_changed(prev, new, threshold=0.02) -> bool using OpenCV/numpy diffs; add debounce helper should_ocr.
    • Tests: tests/test_change_detector.py verifying identical vs different frames for multiple thresholds.
    • DoD: change detection used by pipeline to skip OCR when unchanged.
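One way to sketch 1.3 with plain numpy (OpenCV is an alternative backend). The mean-absolute-difference metric and the `Debouncer` class wrapping `should_ocr` are assumptions, not a settled design:

```python
import time
from typing import Optional

import numpy as np


def has_changed(prev: Optional[np.ndarray], new: np.ndarray, threshold: float = 0.02) -> bool:
    """True when the mean absolute pixel difference (normalised to 0..1) exceeds threshold."""
    if prev is None or prev.shape != new.shape:
        return True  # no baseline, or a resize: always treat as changed
    diff = np.abs(prev.astype(np.float32) - new.astype(np.float32)) / 255.0
    return float(diff.mean()) > threshold


class Debouncer:
    """should_ocr helper: allow at most one OCR run per min_interval seconds."""

    def __init__(self, min_interval: float = 0.5):
        self.min_interval = min_interval
        self._last = float("-inf")

    def should_ocr(self, changed: bool, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now  # injectable clock for tests
        if changed and (now - self._last) >= self.min_interval:
            self._last = now
            return True
        return False
```

Passing `now` explicitly keeps the multi-threshold tests in tests/test_change_detector.py deterministic.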
  • 1.4 Add CRAFT detector wrapper (interface only)

    • File: src/ocr/detector.py
    • Work: class CraftDetector with load_model() and detect_text_regions(image) -> List[Rect]. Provide a mockable interface; implement a no-op stub mode for CI.
    • Tests: tests/test_detector.py verifies API usage with a mock.
    • DoD: detector class present, documented I/O, tests pass.
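The interface-only contract in 1.4 might look like this. The `stub` constructor flag and the `Rect` dataclass are assumptions about how the no-op CI mode is toggled:

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Rect:
    left: int
    top: int
    width: int
    height: int


class CraftDetector:
    """Wrapper around CRAFT text detection; stub=True skips model loading for CI."""

    def __init__(self, stub: bool = False):
        self.stub = stub

    def load_model(self) -> None:
        if self.stub:
            return  # no-op: CI never downloads weights
        raise NotImplementedError("real CRAFT loading lands in a later phase")

    def detect_text_regions(self, image) -> List[Rect]:
        if self.stub:
            return []  # deterministic empty result keeps the contract testable
        raise NotImplementedError
```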
  • 1.5 Add MangaOCR reader wrapper (interface only)

    • File: src/ocr/reader.py
    • Work: class MangaOcrReader with load_model() and read_region(image) -> (text, confidence). Provide a fallback/no-op mode for CI.
    • Tests: tests/test_reader.py using a fake model.
    • DoD: reader class present and callable from pipeline.
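A matching sketch for 1.5, mirroring the detector's assumed `stub` flag; the empty-string/zero-confidence fallback is likewise an assumption:

```python
from typing import Tuple


class MangaOcrReader:
    """Wrapper around MangaOCR; stub mode returns an empty, zero-confidence read."""

    def __init__(self, stub: bool = False):
        self.stub = stub

    def load_model(self) -> None:
        if self.stub:
            return  # no-op fallback for CI
        raise NotImplementedError("real MangaOCR loading lands in a later phase")

    def read_region(self, image) -> Tuple[str, float]:
        if self.stub:
            return ("", 0.0)
        raise NotImplementedError
```

tests/test_reader.py can subclass this and override `read_region` to act as the fake model.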
  • 1.6 Compose OCR pipeline

    • File: src/ocr/pipeline.py
    • Work: Create translate_frame(image) -> List[WordResult] combining detector + reader returning bounding box, surface text, and confidence.
    • Tests: tests/test_pipeline.py mocking detector/reader to assert output schema.
    • DoD: pipeline returns deterministic structured output.
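A sketch of the 1.6 composition. Note one deliberate deviation, flagged as an assumption: the issue's signature is `translate_frame(image)`, but detector and reader are injected as parameters here to keep the sketch self-contained and mockable; module-level instances would recover the one-argument form:

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class WordResult:
    box: Tuple[int, int, int, int]  # (left, top, width, height)
    surface: str
    confidence: float


def translate_frame(image, detector, reader) -> List[WordResult]:
    """Detect text regions, then read each one; any object matching the
    1.4/1.5 interfaces (detect_text_regions / read_region) works."""
    results: List[WordResult] = []
    for rect in detector.detect_text_regions(image):
        # Real code would crop `image` to `rect` before reading; stubs ignore the crop.
        text, conf = reader.read_region(image)
        results.append(WordResult(box=rect, surface=text, confidence=conf))
    return results
```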
  • 1.7 Minimal terminal runner (single-frame mode)

    • File: src/core/app.py; add CLI flag --mode terminal
    • Work: Capture single frame, run pipeline, print numbered words (surface + confidence).
    • Tests: tests/test_cli_terminal.py that runs main in dry-run with mocks.
    • DoD: python -m src.core.app --mode terminal prints numbered words in CI dry-run.
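The runner in 1.7 could be wired roughly like this. The `run_terminal` helper, its dict-shaped input, and the exact output format (`N. surface (confidence)`) are all illustrative assumptions:

```python
import argparse
import sys


def run_terminal(words, out=sys.stdout) -> None:
    """Print numbered OCR words as 'N. surface (confidence)' lines."""
    for i, word in enumerate(words, start=1):
        out.write(f"{i}. {word['surface']} ({word['confidence']:.2f})\n")


def main(argv=None) -> int:
    parser = argparse.ArgumentParser(prog="kanjilens")
    parser.add_argument("--mode", choices=["terminal"], default="terminal")
    args = parser.parse_args(argv)
    if args.mode == "terminal":
        # Real wiring: capture_full() -> translate_frame() -> run_terminal().
        run_terminal([])  # empty results until capture/OCR are plugged in
    return 0
```

Splitting printing from argument parsing lets tests/test_cli_terminal.py assert on output without patching stdout globally.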

Notes

  • Keep model-loading optional in Phase 1 (tests use stubs).
  • Focus on interfaces and contract stability for downstream phases.
