Transform your document scanning workflow with AI‑powered OCR technology.
Frappe OCR is a passport‑scanning and text‑extraction app for the Frappe Framework. It combines traditional OCR (Tesseract) with Google’s Gemini AI for accurate, intelligent document processing.
- Government Services – Passport applications and renewals
- Travel Agencies – Fast customer data entry from travel documents
- Banks & Financial Institutions – KYC document processing
- Immigration Services – Automated document verification
- Any Business – General document digitization
- Gemini AI Integration – Intelligent parsing and error correction
- Smart Error Correction – Automatically fixes common OCR mistakes
- Contextual Understanding – Extracts structured data (e.g., MRZ fields)
- Multi‑format – PNG, JPG, JPEG, PDF
- Passport‑Optimized – MRZ (Machine Readable Zone) parsing
- Batch Processing – Scan multiple files at once
- Image Preprocessing – Enhancement and noise reduction
- Adaptive Thresholding – Robust to varying lighting
- Skew Correction – Auto‑straighten tilted scans
- Native DocTypes – Leverage File Manager with full Frappe features
- Role‑Based Access – Fine‑grained permissions
- Search & Filter – Full‑text search on extracted content
- RESTful API – Integrate with external systems
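To make the MRZ parsing and error-detection features above more concrete, here is a minimal sketch of how the two MRZ lines of a passport (the TD3 layout from ICAO Doc 9303) can be split into fields and verified with check digits. It is an illustration only, not the app's actual implementation: the function names `mrz_check_digit` and `parse_td3` and the `issuing_country` and `checks_ok` keys are hypothetical. The other keys mirror the sample output shown later in this README, except that dates stay in the raw YYMMDD form.

```python
# Illustrative sketch of TD3 (passport) MRZ parsing per ICAO Doc 9303.
# Function names and the "issuing_country"/"checks_ok" keys are hypothetical.

def mrz_check_digit(field: str) -> int:
    """Check digit: weights 7, 3, 1 repeating; digits count as themselves,
    A-Z as 10-35, the filler '<' as 0; result is the weighted sum mod 10."""
    weights = (7, 3, 1)
    total = 0
    for i, ch in enumerate(field):
        if "0" <= ch <= "9":
            value = int(ch)
        elif "A" <= ch <= "Z":
            value = ord(ch) - ord("A") + 10
        elif ch == "<":
            value = 0
        else:
            raise ValueError(f"invalid MRZ character: {ch!r}")
        total += value * weights[i % 3]
    return total % 10


def parse_td3(line1: str, line2: str) -> dict:
    """Split the two 44-character MRZ lines of a passport into named fields."""
    if len(line1) != 44 or len(line2) != 44:
        raise ValueError("TD3 MRZ lines must be 44 characters each")
    # Names: surname and given names are separated by a double filler.
    surname, _, given = line1[5:44].partition("<<")
    number, dob, expiry = line2[0:9], line2[13:19], line2[21:27]
    return {
        "passport_type": line1[0],
        "issuing_country": line1[2:5],
        "last_name": surname.replace("<", " ").strip(),
        "first_name": given.replace("<", " ").strip(),
        "passport_number": number.rstrip("<"),
        "nationality": line2[10:13],
        "date_of_birth": dob,      # raw YYMMDD
        "sex": line2[20],
        "date_of_expiry": expiry,  # raw YYMMDD
        # A failed check digit usually signals an OCR misread (e.g. O vs 0).
        "checks_ok": (
            str(mrz_check_digit(number)) == line2[9]
            and str(mrz_check_digit(dob)) == line2[19]
            and str(mrz_check_digit(expiry)) == line2[27]
        ),
    }


if __name__ == "__main__":
    # Specimen MRZ from ICAO Doc 9303 (fictional country code UTO).
    specimen_line1 = "P<UTOERIKSSON<<ANNA<MARIA".ljust(44, "<")
    specimen_line2 = "L898902C36UTO7408122F1204159ZE184226B<<<<<10"
    print(parse_td3(specimen_line1, specimen_line2))
```

When a check digit does not match, a pipeline can retry OCR on that region or flag the record for manual review instead of storing a wrong number.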
Install Tesseract and development headers:
# Ubuntu/Debian
sudo apt update && sudo apt install tesseract-ocr libtesseract-dev
# macOS (Homebrew)
brew install tesseract
# CentOS/RHEL
sudo yum install tesseract tesseract-devel

# In your bench directory
cd /path/to/bench
# Clone and install
bench get-app https://github.com/DeliveryDevs-ERP/ocr.git
bench --site your-site install-app ocr
# Install Python dependencies
bench setup requirements
# Restart bench
bench restart

Create a .env file at the app root:
# Gemini AI settings
GEMINI_API_KEY=your_gemini_api_key_here
GEMINI_MODEL=gemini-2.0-flash

🔑 Get your Gemini API Key: Visit Google AI Studio.
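This README does not say how the app reads these values at runtime. As a rough sketch, assuming a plain `KEY=VALUE` file like the one above, the settings could be parsed and validated with the standard library alone. The helpers `load_env` and `gemini_settings` are hypothetical names used only for this illustration.

```python
# Illustrative sketch: parse a .env-style file and validate the Gemini settings.
# load_env and gemini_settings are hypothetical helpers, not part of the app.

def load_env(text: str) -> dict:
    """Parse KEY=VALUE lines, skipping blank lines and '#' comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def gemini_settings(env: dict) -> dict:
    """Return the API key and model, failing early if the key is a placeholder."""
    api_key = env.get("GEMINI_API_KEY", "")
    if not api_key or api_key == "your_gemini_api_key_here":
        raise RuntimeError("GEMINI_API_KEY is not set")
    # Fall back to the model named in this README when none is configured.
    return {"api_key": api_key, "model": env.get("GEMINI_MODEL", "gemini-2.0-flash")}


if __name__ == "__main__":
    from pathlib import Path

    env_path = Path(".env")  # assumed location: the app root
    if env_path.exists():
        print(gemini_settings(load_env(env_path.read_text()))["model"])
```

Failing early on a missing or placeholder key gives a clearer error than a rejected request to the Gemini API later on.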
- Open the OCR module in your Frappe Desk.
- Click New to create a File Manager record.
- Upload an image or PDF.
- Click Save – OCR runs automatically.
- Review extracted text in Scanned Contents.
import frappe
# Scan a passport image
parsed, raw = frappe.call(
'ocr.ocr.doctype.file_manager.file_manager.scan_passport',
file_url='/files/passport_scan.jpg'
)
print(parsed)
print(raw)

{
"passport_type": "P",
"last_name": "SMITH",
"first_name": "JOHN",
"passport_number": "AB1234567",
"nationality": "USA",
"date_of_birth": "1985-06-15",
"sex": "M",
"date_of_expiry": "2030-12-20",
"cnic": "1234567890123"
}

| Raw Scan | Extracted Data |
|---|---|
| (passport scan image) | `{"first_name": "JOHN", "last_name": "SMITH", "passport_number": "AB1234567"}` |
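Once a scan returns a dictionary shaped like the sample output above, calling code will usually want to sanity-check it before saving it to a record. The sketch below assumes dates arrive as ISO `YYYY-MM-DD` strings, as in the sample. The required-field list and both helper names are hypothetical choices for illustration.

```python
# Illustrative sketch: basic checks on a parsed passport dictionary.
# REQUIRED_FIELDS and both helper names are hypothetical.
from datetime import date

REQUIRED_FIELDS = ("passport_number", "last_name", "first_name", "date_of_expiry")


def missing_fields(parsed: dict) -> list:
    """List required keys that are absent or empty in the parsed result."""
    return [key for key in REQUIRED_FIELDS if not parsed.get(key)]


def is_valid_on(parsed: dict, day: date) -> bool:
    """True if the passport has not expired on the given day."""
    return day <= date.fromisoformat(parsed["date_of_expiry"])


if __name__ == "__main__":
    sample = {
        "passport_type": "P",
        "last_name": "SMITH",
        "first_name": "JOHN",
        "passport_number": "AB1234567",
        "date_of_expiry": "2030-12-20",
    }
    print(missing_fields(sample), is_valid_on(sample, date.today()))
```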
`scan_passport(file_url)` extracts structured data from passport files.
Parameters:
- `file_url` (string): Path to the uploaded file

Returns:
- `parsed_data` (dict): Structured fields
- `raw_text` (string): Full OCR output
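For external systems, the same method might be reachable over Frappe's REST interface, which exposes whitelisted Python methods under `/api/method/<dotted.path>` and accepts `Authorization: token <api_key>:<api_secret>` headers. The sketch below only builds the request. It assumes `scan_passport` is whitelisted for API access, which this README does not confirm, and the site URL and credentials are placeholders.

```python
# Illustrative sketch: build a REST request for scan_passport.
# Assumes the method is whitelisted for API access (not confirmed here).
import json
from urllib import request

METHOD = "ocr.ocr.doctype.file_manager.file_manager.scan_passport"


def build_scan_request(base_url: str, file_url: str,
                       api_key: str, api_secret: str) -> request.Request:
    """Prepare a POST to Frappe's /api/method endpoint with token auth."""
    url = f"{base_url.rstrip('/')}/api/method/{METHOD}"
    body = json.dumps({"file_url": file_url}).encode("utf-8")
    return request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"token {api_key}:{api_secret}",
            "Content-Type": "application/json",
            "Accept": "application/json",
        },
    )


if __name__ == "__main__":
    # Placeholder site and credentials; replace with real values before sending.
    req = build_scan_request("https://your-site.example", "/files/passport_scan.jpg",
                             "API_KEY", "API_SECRET")
    print(req.get_method(), req.full_url)
    # To send it (Frappe wraps the return value under "message"):
    # with request.urlopen(req) as resp:
    #     parsed, raw = json.load(resp)["message"]
```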
- Fork this repo
- Create a feature branch (`git checkout -b feature/XYZ`)
- Commit your changes (`git commit -m "feat: add XYZ"`)
- Push to branch (`git push origin feature/XYZ`)
- Open a Pull Request
We welcome contributions of all kinds! Feel free to open issues or submit PRs.
© 2025 Deliverydevs. Licensed under MIT.
