This guide explains how to set up Google Cloud Vision API for OCR on scanned protocol PDFs.
Problem: Orange County's 93 PDFs are scanned images (photos of paper documents).
Solution: Google Vision API provides high-accuracy OCR that handles poor-quality scans.
Cost: $1.50 per 1,000 pages
- Orange County: 93 PDFs × ~3 pages ≈ 279 pages ≈ $0.42 one-time
- Future LEMSAs with scanned PDFs: ~$0.30-0.50 per LEMSA
Benefits:
- 95%+ accuracy even on low-resolution scans
- 10-20x faster than Tesseract.js
- Handles multi-column layouts
- Detects text orientation automatically
- Go to Google Cloud Console
- Click "Select a project" → "New Project"
- Project name: protocol-guide-ocr
- Click "Create"
- Go to Vision API Library
- Select your project (protocol-guide-ocr)
- Click "Enable"
- Wait 1-2 minutes for API to activate
- Go to Service Accounts
- Click "Create Service Account"
- Name: protocol-guide-ocr-bot
- Description: OCR service for protocol PDF ingestion
- Click "Create and Continue"
- Role: Select "Cloud Vision AI Service Agent"
- Click "Continue"
- Click "Done"
- Click on the service account you just created
- Go to "Keys" tab
- Click "Add Key" → "Create new key"
- Key type: JSON
- Click "Create"
- Save the downloaded JSON file as /Users/tanner-osterkamp/Protocol-Guide/google-cloud-key.json
The .env file has been pre-configured with:
GOOGLE_APPLICATION_CREDENTIALS=./google-cloud-key.json
Verify the file exists:
ls -la google-cloud-key.json
The key file is already in .gitignore to prevent accidental commits:
google-cloud-key.json
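A quick sanity check on the credentials value can catch the most common setup mistakes before any API call is made. The sketch below is illustrative only: the function and type names are hypothetical and not taken from the project. It assumes the application accepts either a file path (the usual meaning of GOOGLE_APPLICATION_CREDENTIALS) or the key JSON itself, which is how the Railway instructions in this guide set the variable.

```typescript
// Sketch: classify and sanity-check a GOOGLE_APPLICATION_CREDENTIALS value.
// All names below are hypothetical; they are not part of the project.

type CredentialSource =
  | { kind: "path"; path: string }
  | { kind: "inline"; json: string };

// Assumption: a value starting with "{" is inline JSON, anything else is a path.
function classifyCredentialValue(value: string): CredentialSource {
  const trimmed = value.trim();
  return trimmed.startsWith("{")
    ? { kind: "inline", json: trimmed }
    : { kind: "path", path: trimmed };
}

// Fields that a downloaded service account key file contains.
const REQUIRED_KEY_FIELDS = ["type", "project_id", "private_key", "client_email"];

// Returns a list of problems; an empty list means the key looks plausible.
function checkServiceAccountKey(jsonText: string): string[] {
  let parsed: unknown;
  try {
    parsed = JSON.parse(jsonText);
  } catch {
    return ["not valid JSON"];
  }
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    return ["not a JSON object"];
  }
  const key = parsed as Record<string, unknown>;
  const problems: string[] = [];
  for (const field of REQUIRED_KEY_FIELDS) {
    if (typeof key[field] !== "string" || key[field] === "") {
      problems.push(`missing field: ${field}`);
    }
  }
  const keyType = key["type"];
  if (typeof keyType === "string" && keyType !== "service_account") {
    problems.push(`unexpected type: ${keyType}`);
  }
  return problems;
}
```

For the path case, the file contents would be read first and then passed to checkServiceAccountKey. The check only confirms the shape of the key, not that Google accepts it.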
npx tsx scripts/test-oc-ocr-vision.ts
Expected output:
Testing Google Vision API OCR...
Vision API: Processing PDF...
Vision API: Extracted 2,847 characters from 2 pages
=== EXTRACTION RESULTS ===
Protocol: SO-M-15
Title: Allergic Reaction / Anaphylaxis
Content: [Full protocol text extracted]
Tesseract.js result (current):
- 2-page PDF → 84 characters (gibberish)
- Accuracy: ~10%
Vision API result (expected):
- 2-page PDF → 2,000-3,000 characters (clean text)
- Accuracy: 95%+
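The gap between the two results (84 characters versus 2,000-3,000 for the same 2-page PDF) suggests a simple way to spot failed extractions automatically. The following is a rough sketch, not project code: the function name and the 200 characters-per-page threshold are assumptions chosen for illustration.

```typescript
// Hypothetical heuristic: flag OCR output that is too short to be real text.
// The default threshold of 200 characters per page is an assumption.
function looksLikeFailedOcr(
  text: string,
  pageCount: number,
  minCharsPerPage = 200,
): boolean {
  if (pageCount <= 0) return true; // nothing to compare against
  return text.trim().length / pageCount < minCharsPerPage;
}

// 84 characters over 2 pages is 42 per page, below the threshold.
console.log(looksLikeFailedOcr("x".repeat(84), 2)); // true
// 2,847 characters over 2 pages is about 1,424 per page.
console.log(looksLikeFailedOcr("x".repeat(2847), 2)); // false
```

A length check says nothing about whether the text is correct, so it only works as a cheap first filter.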
- Pages processed: 279 (93 PDFs × ~3 pages)
- Cost: ~$0.42
- API calls: 93 (one per PDF)
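The one-call-per-PDF figure can be illustrated with the request body for the Vision REST endpoint files:annotate. This is a sketch under assumptions: the guide does not say which endpoint or client library the ingestion script uses, the helper names are hypothetical, and the five-page limit per synchronous request should be checked against the current Vision documentation.

```typescript
// Sketch: build request bodies for POST https://vision.googleapis.com/v1/files:annotate.
// Assumption: the synchronous endpoint handles at most 5 pages per request.
const MAX_PAGES_PER_REQUEST = 5;

interface AnnotateFileBody {
  requests: Array<{
    inputConfig: { content: string; mimeType: string };
    features: Array<{ type: string }>;
    pages: number[];
  }>;
}

// Hypothetical helper: one body per group of up to 5 pages (1-based page numbers).
function buildAnnotateBodies(base64Pdf: string, pageCount: number): AnnotateFileBody[] {
  const bodies: AnnotateFileBody[] = [];
  for (let first = 1; first <= pageCount; first += MAX_PAGES_PER_REQUEST) {
    const last = Math.min(first + MAX_PAGES_PER_REQUEST - 1, pageCount);
    const pages: number[] = [];
    for (let page = first; page <= last; page++) pages.push(page);
    bodies.push({
      requests: [
        {
          inputConfig: { content: base64Pdf, mimeType: "application/pdf" },
          features: [{ type: "DOCUMENT_TEXT_DETECTION" }],
          pages,
        },
      ],
    });
  }
  return bodies;
}
```

Under these assumptions a typical 3-page protocol fits in one request, which matches the estimate of 93 calls. A PDF longer than five pages would need more than one.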
If ingesting 10 new LEMSAs per month with scanned PDFs:
- Pages: ~1,000 per month
- Cost: ~$1.50 per month
- Annual: ~$18
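The estimates above follow from a single formula: billable pages divided by 1,000, times $1.50. Here is a small sketch that reproduces them. The function name is hypothetical, and the optional freePages parameter is an addition that models the free tier described later in this guide; it defaults to 0 so the results match the figures quoted here.

```typescript
// Sketch: reproduce the cost estimates in this guide.
const PRICE_PER_1000_PAGES_USD = 1.5;

// freePages models a monthly free allowance; 0 ignores it, as the estimates above do.
function estimateOcrCostUsd(pages: number, freePages = 0): number {
  const billablePages = Math.max(0, pages - freePages);
  return (billablePages / 1000) * PRICE_PER_1000_PAGES_USD;
}

const orangeCountyCost = estimateOcrCostUsd(93 * 3); // 279 pages
const monthlyCost = estimateOcrCostUsd(1000);

console.log(orangeCountyCost.toFixed(2)); // 0.42
console.log(monthlyCost.toFixed(2)); // 1.50
console.log((monthlyCost * 12).toFixed(2)); // 18.00
console.log(estimateOcrCostUsd(93 * 3, 1000).toFixed(2)); // 0.00 with the free tier applied
```

If the free tier applies to the project, the Orange County run would fall entirely within it.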
- Go to Billing
- Click "Budgets & alerts"
- Create budget: $5/month
- Set alert at 50%, 90%, 100%
✅ Correct: Cloud Vision AI Service Agent (read-only access to Vision API)
❌ Avoid: Owner, Editor (overly broad permissions)
Rotate service account keys every 90 days:
# Revoke old key in Google Cloud Console
# Create new key
# Update google-cloud-key.json
# Restart ingestion services
Local development: ./google-cloud-key.json (gitignored)
Railway production: Upload key as environment variable:
# Minify JSON (remove whitespace)
jq -c . google-cloud-key.json > minified-key.json
# Set in Railway
railway variables set GOOGLE_APPLICATION_CREDENTIALS="$(cat minified-key.json)"
Cause: GOOGLE_APPLICATION_CREDENTIALS not set or file not found
Fix:
export GOOGLE_APPLICATION_CREDENTIALS="./google-cloud-key.json"
npx tsx scripts/ingest-ca-protocols.ts --lemsa "Orange"
Cause: API not enabled for your project
Fix: Go to Vision API Library and click "Enable"
Cause: Service account lacks Vision API permissions
Fix:
- Go to IAM
- Find your service account
- Click "Edit"
- Add role: "Cloud Vision AI Service Agent"
Cause: Hit free tier limit (1,000 images/month)
Fix: Enable billing or wait until next month
Free tier: 1,000 images/month free
After free tier: $1.50 per 1,000 images
# Minify JSON key
jq -c . google-cloud-key.json > minified-key.json
# Upload to Railway
railway link protocol-guide-production
railway variables set GOOGLE_APPLICATION_CREDENTIALS="$(cat minified-key.json)"
# Redeploy
railway up
If Railway is hosted on Google Cloud:
railway variables set GOOGLE_CLOUD_PROJECT=protocol-guide-ocr
The OCR extractor automatically falls back to Tesseract.js if Vision API is unavailable:
- Vision API configured? → Try Vision API first
- Vision API fails? → Fall back to Tesseract.js
- Tesseract fails? → Return error
Log example:
Using Google Vision API for OCR...
Vision API failed: Invalid authentication credentials
Falling back to Tesseract.js...
OCR: Processing 2 pages...
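The fallback order described above can be sketched as a small wrapper. This is not the project's extractor: the type and function names are hypothetical, and the two OCR engines are passed in as plain async functions so the control flow stays visible. The log messages mirror the log example.

```typescript
// Sketch of the fallback chain: Vision API first, then Tesseract.js, then an error.
// OcrEngine and ocrWithFallback are hypothetical names, not project code.
type OcrEngine = (pdf: Uint8Array) => Promise<string>;

async function ocrWithFallback(
  pdf: Uint8Array,
  visionOcr: OcrEngine | null, // null when Vision API is not configured
  tesseractOcr: OcrEngine,
  log: (message: string) => void = console.log,
): Promise<string> {
  if (visionOcr !== null) {
    try {
      log("Using Google Vision API for OCR...");
      return await visionOcr(pdf);
    } catch (err) {
      const reason = err instanceof Error ? err.message : String(err);
      log(`Vision API failed: ${reason}`);
      log("Falling back to Tesseract.js...");
    }
  }
  // If Tesseract also fails, its error propagates to the caller.
  return tesseractOcr(pdf);
}
```

Passing the engines as parameters keeps the wrapper testable without credentials, since a test can supply a Vision stub that always throws.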
After setup:
- Test: npx tsx scripts/test-oc-ocr-vision.ts
- Ingest Orange County: npx tsx scripts/ingest-ca-protocols.ts --lemsa "Orange"
- Verify: Check the database for 600-800 new Orange County chunks
Last Updated: 2026-02-18
Status: Ready for testing