A high-performance receipt OCR and data extraction service using MLX-VLM (Apple Silicon optimized vision-language models).
- 🚀 Fast inference on Apple Silicon (M1/M2/M3)
- 🎯 Specialized for receipt/invoice data extraction
- 🔒 Runs locally - no API costs or data privacy concerns
- 📊 Structured JSON output
- 🔄 RESTful API for easy integration
- Apple Silicon Mac (M1/M2/M3)
- Python 3.9+
- 8GB+ RAM recommended
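The requirements above can be sanity-checked from Python. A minimal sketch (the function name and thresholds are illustrative; RAM is left as a soft recommendation):

```python
# Check the stated requirements: Apple Silicon (arm64) and Python 3.9+.
import platform
import sys

def meets_requirements(machine: str = platform.machine(),
                       version: tuple = sys.version_info) -> bool:
    # Apple Silicon reports "arm64"; version tuples compare element-wise.
    return machine == "arm64" and version >= (3, 9)
```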
```bash
# Create virtual environment
python3 -m venv venv
source venv/bin/activate

# Install dependencies
pip install -r requirements.txt
```

Run the service:

```bash
python app.py
```

The service will start on http://localhost:8000.
Extract receipt data from an image.

Request:

```bash
curl -X POST http://localhost:8000/extract \
  -F "image=@testdata/sample_receipt.png"
```

Response:

```json
{
  "vendor_name": "J.Co",
  "invoice_number": "12345",
  "invoice_date": "2024-01-15",
  "due_date": "2024-01-15",
  "items": [
    {
      "description": "Caffe Latte",
      "details": ["Grande", "Whole Milk"],
      "quantity": 1.0,
      "unit_price": 4.95,
      "currency": "USD",
      "total": 4.95,
      "category": "Food & Beverage"
    }
  ],
  "subtotal": 4.95,
  "tax_rate_percent": 10.0,
  "tax_amount": 0.50,
  "discount": 0.0,
  "total_due": 5.45
}
```

Health check endpoint.
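The sample `/extract` response can be sanity-checked arithmetically: line items should sum to the subtotal, and subtotal plus tax minus discount should equal the total due. A minimal sketch (field names taken from the sample response; `check_receipt` is illustrative):

```python
# Verify internal consistency of an extracted receipt.
import json

def check_receipt(data: dict, tol: float = 0.01) -> bool:
    items_total = sum(item["total"] for item in data["items"])
    ok_items = abs(items_total - data["subtotal"]) <= tol
    expected_due = data["subtotal"] + data["tax_amount"] - data["discount"]
    ok_due = abs(expected_due - data["total_due"]) <= tol
    return ok_items and ok_due

sample = json.loads("""{
  "items": [{"description": "Caffe Latte", "total": 4.95}],
  "subtotal": 4.95, "tax_rate_percent": 10.0,
  "tax_amount": 0.50, "discount": 0.0, "total_due": 5.45
}""")
print(check_receipt(sample))  # → True
```

A tolerance is used rather than exact equality because extracted tax amounts are typically rounded to the cent.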
Edit config.py to customize:
- Model selection
- Port number
- Max image size
- Timeout settings
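A config.py along these lines would cover the options listed above (all names and defaults here are assumptions; check the actual file):

```python
# config.py — illustrative defaults only; the real option names may differ.
MODEL_NAME = "qwen/Qwen2-VL-2B-Instruct"  # model selection
PORT = 8000                               # port number
MAX_IMAGE_SIZE_MB = 10                    # max accepted upload size
REQUEST_TIMEOUT_S = 60                    # per-request timeout
MAX_TOKENS = 1024                         # generation cap
```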
Update your Go backend's .env:

```bash
# Use local MLX service instead of OpenRouter
USE_MLX_SERVICE=true
MLX_SERVICE_URL=http://localhost:8000
```

Default model: `qwen/Qwen2-VL-2B-Instruct`
Other supported models:

- `qwen/Qwen2-VL-7B-Instruct` (better accuracy, slower)
- `llava-hf/llava-1.5-7b-hf` (general purpose)

Change the model in config.py.
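The `USE_MLX_SERVICE` / `MLX_SERVICE_URL` toggle the backend reads can be mirrored like this (a sketch; the remote fallback URL is a placeholder, and the actual Go logic may differ):

```python
# Choose the extraction endpoint from the same env vars the backend uses.
import os

def extraction_url(default_remote: str = "https://openrouter.example/api") -> str:
    if os.environ.get("USE_MLX_SERVICE", "").lower() == "true":
        return os.environ.get("MLX_SERVICE_URL", "http://localhost:8000")
    return default_remote

os.environ["USE_MLX_SERVICE"] = "true"
os.environ["MLX_SERVICE_URL"] = "http://localhost:8000"
print(extraction_url())  # → http://localhost:8000
```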
Typical inference times on M2 Pro:
- First request: ~5-10s (model loading)
- Subsequent requests: ~2-4s
```bash
# Manually download model
python -c "from mlx_vlm import load; load('qwen/Qwen2-VL-2B-Instruct')"
```

- Use a smaller model (2B instead of 7B)
- Close other applications
- Reduce max_tokens in config
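Oversized uploads also add memory pressure. A quick pre-flight check before posting an image (the 10 MB limit here is an assumption; match it to your configured max image size):

```python
# Reject images larger than the service's configured upload limit.
import os

def within_size_limit(path: str, limit_mb: float = 10.0) -> bool:
    return os.path.getsize(path) <= limit_mb * 1024 * 1024
```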
MIT