This repository now contains a practical document-processing toolkit focused on one main goal:
- Accept camera captures or imported files/folders.
- Run document preprocessing and cleanup.
- Export a clean merged PDF.
- Add optional OCR later as a controlled extension.
Run:
python camscan_hybrid_tool.py
`camscan_hybrid_tool.py` is the current unified variant for your workflow.
A new package-based unified application is being built under src/uniscan.
Run (from repository root):
set PYTHONPATH=src
python -m uniscan.cli
One-script launcher (recommended on Windows):
.\run_uniscan.cmd
Or after installation:
uniscan
Quick workflow (Office Lens style):
- Open tab `1. Import` (main mode) and load files/folder, or use `2. Scan` for camera capture. `Import` and `Scan` are acquisition-only: they load or capture raw pages into the session.
- The app switches to `3. Review`: reorder, rotate, deskew, auto crop, manual corners, and side-by-side `Before/After` preview.
- All processing controls are in `Review`: quick dropdowns (`Lens`, `Post`, `Preset`), `Advanced...` popup sliders, and an `Apply all changes to all files` scope checkbox.
- Review uses lightweight previews by default (`Full HD`); uncheck it to work directly with full-resolution previews.
- `Auto Crop...` opens a page browser with auto-detect and manual corner editing for one page or all pages.
- Open `4. Export`, choose an OCR engine if needed, then save a merged PDF or image files.
Current implemented modules in this new app:
- `Capture`: live preview, single capture, burst capture, camera configuration
- `Import`: folder/files (multi-select)/PDF import into one session
- `Pages`: page list management (preview, reorder, select/delete)
- `Export`: merged PDF and separate image export
Implementation notes:
- Session pages are disk-backed (`uniscancache`) with lazy reads to reduce RAM usage on large batches.
- `Pages` review now shows a `Before/After` preview for preprocessing visibility.
- Capture/import keep originals first; processing is only applied from `Review`.
- Export tab supports OCR engine selection with dependency status checks.
- `Import` supports multi-file selection and background loading.
- Import order is preserved end-to-end: folder order, document page order, and mixed import order are kept as selected.
- Searchable PDF is currently wired for `pytesseract`, `OCRmyPDF`, and `PyMuPDF OCR`.
- `PaddleOCR`, `Surya`, and `MinerU` are available as selectable OCR backends with readiness checks (searchable-PDF wiring pending).
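The disk-backed session with lazy reads mentioned in the notes above could look roughly like the following. This is a minimal sketch, not the code under `src/uniscan`: the class name `DiskBackedSession`, its methods, and the file naming scheme are assumptions made for illustration. It only shows the idea that the session keeps paths in memory and loads page bytes on demand.

```python
import tempfile
from pathlib import Path


class DiskBackedSession:
    """Hypothetical page store: page bytes live on disk, only paths stay in RAM."""

    def __init__(self, cache_dir: Path) -> None:
        self.cache_dir = Path(cache_dir)
        self.cache_dir.mkdir(parents=True, exist_ok=True)
        # List order doubles as import order, so it is preserved end-to-end.
        self._paths: list[Path] = []

    def add_page(self, data: bytes, suffix: str = ".png") -> int:
        """Write one page to the cache directory and return its index."""
        index = len(self._paths)
        path = self.cache_dir / f"page_{index:05d}{suffix}"
        path.write_bytes(data)
        self._paths.append(path)
        return index

    def read_page(self, index: int) -> bytes:
        """Lazy read: the bytes are loaded only when a page is requested."""
        return self._paths[index].read_bytes()

    def __len__(self) -> int:
        return len(self._paths)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        session = DiskBackedSession(Path(tmp) / "cache")
        session.add_page(b"first page bytes")
        session.add_page(b"second page bytes")
        print(len(session), session.read_page(1))
```

Keeping only paths in memory means RAM usage grows with the number of pages, not with their pixel data, which is the property the note above is after.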
`camscan_hybrid_tool.py` supports three source modes:
- `Import folder`
- `Import files`
- `Camera capture`
Processing features:
- Document detection and perspective extraction using third-party logic from `camscan_suhren`: `camscan.scanner.main`
- Postprocessing effects from `camscan_suhren`: `None`, `Sharpen`, `Grayscale`, `Black and White`
- Optional two-page split (`left/right`) for book-like captures.
- Merged PDF export from all source modes.
- Quality profiles (`Fast`, `Balanced`, `Best quality`) for practical output control.
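As an illustration of the two-page split, here is a minimal sketch that cuts a page down the vertical center. The function name `split_left_right` is hypothetical, and the page is represented as a plain list of pixel rows to keep the example dependency-free. The tool itself works on OpenCV images, where the equivalent NumPy slicing would be `image[:, :mid]` and `image[:, mid:]`. Whether the real split uses the exact center is an assumption.

```python
def split_left_right(page):
    """Split a page (a list of pixel rows) into left and right halves.

    Assumes the book gutter sits at the horizontal center of the capture.
    For an odd width, the extra column goes to the right half.
    """
    if not page or not page[0]:
        raise ValueError("page must have at least one non-empty row")
    mid = len(page[0]) // 2
    left = [row[:mid] for row in page]
    right = [row[mid:] for row in page]
    return left, right


if __name__ == "__main__":
    page = [
        [1, 2, 3, 4],
        [5, 6, 7, 8],
    ]
    left, right = split_left_right(page)
    print(left)
    print(right)
```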
Note:
- OCR is intentionally left as the next stage and is not active in `camscan_hybrid_tool.py` yet.
For folder/files mode:
- Load input images (and PDF pages if PDF files are provided and `pymupdf` is installed).
- Optionally detect and extract document contour.
- Apply selected postprocessing function.
- Optionally split each page into left/right halves.
- Convert processed pages into one merged PDF.
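The steps above can be sketched as a small pipeline in which every optional stage is a callable that may be left out. This is an illustrative sketch only: `run_pipeline` and its parameter names are hypothetical and are not taken from `camscan_hybrid_tool.py`. The final PDF conversion is omitted, so the function simply returns the processed pages in order.

```python
from typing import Any, Callable, Iterable, List, Optional, Sequence

Page = Any  # placeholder type; the real tool would pass image arrays


def run_pipeline(
    pages: Iterable[Page],
    detect: Optional[Callable[[Page], Page]] = None,
    postprocess: Optional[Callable[[Page], Page]] = None,
    split: Optional[Callable[[Page], Sequence[Page]]] = None,
) -> List[Page]:
    """Apply the optional stages to each page, preserving input order."""
    output: List[Page] = []
    for page in pages:
        if detect is not None:
            page = detect(page)          # contour detection + extraction
        if postprocess is not None:
            page = postprocess(page)     # e.g. sharpen or grayscale
        if split is not None:
            output.extend(split(page))   # one input page -> left, right
        else:
            output.append(page)
    return output


if __name__ == "__main__":
    # Strings stand in for images so the flow is easy to follow.
    result = run_pipeline(
        [" ab ", "cdef"],
        detect=str.strip,
        postprocess=str.upper,
        split=lambda s: (s[: len(s) // 2], s[len(s) // 2 :]),
    )
    print(result)
```

Passing stages as callables keeps the same function usable for both file and camera input, since only the source of `pages` differs.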
For camera mode:
- Capture N shots from selected camera index.
- Wait configured delay between shots.
- Apply the same processing pipeline as above.
- Export merged PDF.
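The shot-based capture loop can be sketched as follows. The function `capture_shots` and its parameters are hypothetical. The frame source is injected as a callable so the sketch runs without a camera. With OpenCV, such a callable could wrap `cv2.VideoCapture(index).read()`, which returns an `(ok, frame)` pair. How the actual tool handles failed reads is not documented here, so raising an error is an assumption.

```python
import time
from typing import Callable, List, Optional, TypeVar

Frame = TypeVar("Frame")


def capture_shots(
    grab_frame: Callable[[], Optional[Frame]],
    count: int,
    delay_s: float,
    sleep: Callable[[float], None] = time.sleep,
) -> List[Frame]:
    """Capture `count` frames, waiting `delay_s` seconds between shots."""
    if count < 1:
        raise ValueError("count must be at least 1")
    frames: List[Frame] = []
    for shot in range(count):
        frame = grab_frame()
        if frame is None:
            raise RuntimeError(f"camera returned no frame for shot {shot + 1}")
        frames.append(frame)
        if shot < count - 1:
            sleep(delay_s)  # no wait is needed after the final shot
    return frames


if __name__ == "__main__":
    fake_camera = iter(["frame-1", "frame-2", "frame-3"])
    shots = capture_shots(lambda: next(fake_camera), count=3, delay_s=0.1)
    print(shots)
```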
Recommended Python:
- Python `3.11+`
Install dependencies:
pip install opencv-python numpy pillow img2pdf pymupdf
Optional OCR dependencies in the new app:
pip install pytesseract pypdf ocrmypdf paddleocr pymupdf
Also install CLI/system tools where needed:
- Tesseract OCR engine in `PATH` for `pytesseract` and `PyMuPDF OCR` mode.
- `ocrmypdf` command in `PATH` for `OCRmyPDF` mode.
Experimental engine packages:
- `Surya` (or `marker` package path that bundles Surya OCR).
- `MinerU` (`mineru` or `magic_pdf` package).
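A dependency status check for the OCR engines could be sketched with the standard library alone, using `importlib.util.find_spec` for Python packages and `shutil.which` for command-line tools. The names `EngineRequirement`, `REQUIREMENTS`, and `missing_dependencies` are hypothetical, and the mapping of engines to modules and commands is an assumption based on the lists above (PyMuPDF is assumed importable as `fitz`). The app's real readiness checks may differ.

```python
import importlib.util
import shutil
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class EngineRequirement:
    """What one OCR engine needs: importable modules and PATH commands."""

    modules: Tuple[str, ...]
    commands: Tuple[str, ...]


# Assumed mapping for illustration; adjust to the actual app.
REQUIREMENTS = {
    "pytesseract": EngineRequirement(("pytesseract",), ("tesseract",)),
    "OCRmyPDF": EngineRequirement(("ocrmypdf",), ("ocrmypdf",)),
    "PyMuPDF OCR": EngineRequirement(("fitz",), ("tesseract",)),
}


def missing_dependencies(req: EngineRequirement) -> List[str]:
    """Return human-readable names of everything that is not available."""
    missing: List[str] = []
    for module in req.modules:
        # Top-level module names only; find_spec returns None if absent.
        if importlib.util.find_spec(module) is None:
            missing.append(f"python module '{module}'")
    for command in req.commands:
        if shutil.which(command) is None:
            missing.append(f"command '{command}'")
    return missing


if __name__ == "__main__":
    for engine, req in REQUIREMENTS.items():
        problems = missing_dependencies(req)
        status = "ready" if not problems else "missing: " + ", ".join(problems)
        print(f"{engine}: {status}")
```

An empty list means the engine is ready. Otherwise the returned strings can be shown next to the engine in a selection dialog.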
If you plan to use legacy scripts with OCR, install additionally:
pip install ocrmypdf pypdf
External OCR tools for legacy OCR scripts:
- Tesseract OCR
- Ghostscript
- qpdf
- Poppler (`pdftoppm`, `pdfunite`) for `only_tesseract.py` in PDF mode
Main app:
python camscan_hybrid_tool.py
Alternative app (images/file + optional OCR already integrated):
python unified_pdf_tool.py
Legacy apps (kept for reference/fallback):
python fast.py
python img_2_pdf.py
python only_tesseract.py
python "prepare pdf to tesseract.py"

| File | Role |
|---|---|
| `camscan_hybrid_tool.py` | Main hybrid app (camera + files/folder) using third-party processing logic from `camscan_suhren` |
| `unified_pdf_tool.py` | Unified app for folder/file workflows with optional OCR path |
| `fast.py` | OCR-focused GUI with batch PDF support |
| `img_2_pdf.py` | Photo-to-PDF app with OpenCV preprocessing and optional OCR |
| `only_tesseract.py` | OCR pipeline using direct `tesseract.exe` calls |
| `imgs_and_pdfs_ocr_fast_STABLE.py` | Stable previous OCR GUI version |
| `prepare pdf to tesseract.py` | PDF conditioning helper before OCR |
| `camscan_suhren/` | Third-party camera scanner project used as source of preprocessing logic |

Known limitations:
- OCR in `camscan_hybrid_tool.py` is not enabled yet (planned next).
- Camera mode is shot-based capture (not a full continuous preview UI).
- PDF import in hybrid mode requires `pymupdf`.

Troubleshooting:
- Error about missing `camscan` modules: ensure the folder `camscan_suhren` exists directly inside the repo root.
- Cannot open camera: check the camera index and close other apps using the webcam.
- PDF import error: install `pymupdf` (`pip install pymupdf`).

Planned next steps:
- Add optional OCR to `camscan_hybrid_tool.py` with toggle and language setting.
- Add stronger camera UX (preview/retake/selection before export).
- Add job queue for large folder batches.
- Add tests for hybrid pipeline stages.