Google Cloud Vision API Setup Guide

This guide explains how to set up Google Cloud Vision API for OCR on scanned protocol PDFs.

Why Vision API?

Problem: Orange County's 93 PDFs are scanned images (photos of paper documents).

Solution: Google Vision API provides high-accuracy OCR that handles poor-quality scans.

Cost: $1.50 per 1,000 pages

Orange County: 93 PDFs × ~3 pages = 279 pages = $0.42 one-time
Future LEMSAs with scanned PDFs: ~$0.30-0.50 per LEMSA
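
The arithmetic above can be reproduced with a small helper. This is a minimal sketch: the function name and the `freePages` parameter are illustrative and not part of the project, and the rate is the $1.50 per 1,000 pages quoted in this guide.

```typescript
// Price quoted in this guide: $1.50 per 1,000 pages.
const USD_PER_1000_PAGES = 1.5;

/**
 * Estimate OCR cost in USD for a number of pages.
 * `freePages` is the monthly free allowance still available (0 to ignore it).
 */
function estimateOcrCostUsd(pages: number, freePages = 0): number {
  const billablePages = Math.max(0, pages - freePages);
  return (billablePages / 1000) * USD_PER_1000_PAGES;
}

// Orange County: 93 PDFs at roughly 3 pages each = 279 pages.
const orangeCountyPages = 93 * 3;
console.log(estimateOcrCostUsd(orangeCountyPages).toFixed(2)); // "0.42"
```

Passing the free-tier allowance as `freePages` (for example 1,000) shows how much of a month's volume would actually be billed.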

Benefits:

95%+ accuracy even on low-resolution scans
10-20x faster than Tesseract.js
Handles multi-column layouts
Detects text orientation automatically

Setup Steps (10 minutes)

1. Create Google Cloud Project

Go to Google Cloud Console
Click "Select a project" → "New Project"
Project name: protocol-guide-ocr
Click "Create"

2. Enable Vision API

Go to Vision API Library
Select your project (protocol-guide-ocr)
Click "Enable"
Wait 1-2 minutes for API to activate

3. Create Service Account

Go to Service Accounts
Click "Create Service Account"
Name: protocol-guide-ocr-bot
Description: OCR service for protocol PDF ingestion
Click "Create and Continue"

4. Grant Permissions

Role: Select "Cloud Vision AI Service Agent"
Click "Continue"
Click "Done"

5. Create JSON Key

Click on the service account you just created
Go to "Keys" tab
Click "Add Key" → "Create new key"
Key type: JSON
Click "Create"
Save the downloaded JSON file in the repository root as google-cloud-key.json (for example /Users/tanner-osterkamp/Protocol-Guide/google-cloud-key.json)

6. Update .env File

The .env file has been pre-configured with:

GOOGLE_APPLICATION_CREDENTIALS=./google-cloud-key.json

Verify the file exists:

ls -la google-cloud-key.json
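
Checking that the file exists does not confirm that it is a usable key. The sketch below goes one step further and checks that the file parses as JSON and carries the fields a downloaded service-account key normally has. `checkKeyFile` is a hypothetical helper, not something the project ships, and it assumes the variable is already exported in the shell that runs it.

```typescript
import { existsSync, readFileSync } from "node:fs";

// Fields that a downloaded service-account key normally contains.
const REQUIRED_FIELDS = ["type", "project_id", "private_key", "client_email"];

/** Return a list of problems with the key file (an empty list means it looks usable). */
function checkKeyFile(path: string | undefined): string[] {
  if (!path) return ["GOOGLE_APPLICATION_CREDENTIALS is not set"];
  if (!existsSync(path)) return [`key file not found: ${path}`];

  let parsed: unknown;
  try {
    parsed = JSON.parse(readFileSync(path, "utf8"));
  } catch {
    return [`could not read or parse key file: ${path}`];
  }
  if (typeof parsed !== "object" || parsed === null) {
    return [`key file is not a JSON object: ${path}`];
  }

  const key = parsed as Record<string, unknown>;
  const problems = REQUIRED_FIELDS
    .filter((field) => typeof key[field] !== "string")
    .map((field) => `missing field: ${field}`);
  if (typeof key.type === "string" && key.type !== "service_account") {
    problems.push(`unexpected key type: ${key.type}`);
  }
  return problems;
}

const startupProblems = checkKeyFile(process.env.GOOGLE_APPLICATION_CREDENTIALS);
console.log(startupProblems.length === 0 ? "Key file looks usable" : startupProblems.join("\n"));
```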

7. Add to .gitignore

The key file is already in .gitignore to prevent accidental commits:

google-cloud-key.json

Testing

Test Vision API on Single PDF

npx tsx scripts/test-oc-ocr-vision.ts

Expected output:

Testing Google Vision API OCR...
  Vision API: Processing PDF...
  Vision API: Extracted 2,847 characters from 2 pages

=== EXTRACTION RESULTS ===
Protocol: SO-M-15
Title: Allergic Reaction / Anaphylaxis
Content: [Full protocol text extracted]
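
For orientation, the sketch below builds the JSON body that the Vision REST endpoint `files:annotate` expects for a PDF passed inline. It is an illustration under assumptions: this guide does not show how `scripts/test-oc-ocr-vision.ts` actually calls the API, the helper names are hypothetical, and the 5-page cap per synchronous request is taken from the Vision API documentation rather than from this project.

```typescript
// Assumption: synchronous file annotation accepts at most 5 pages per request.
const MAX_PAGES_PER_REQUEST = 5;

interface FileAnnotateRequest {
  inputConfig: { mimeType: string; content: string };
  features: { type: string }[];
  pages: number[];
}

/** Split 1-based page numbers into groups of at most `size` pages. */
function pageBatches(pageCount: number, size = MAX_PAGES_PER_REQUEST): number[][] {
  const batches: number[][] = [];
  for (let first = 1; first <= pageCount; first += size) {
    const last = Math.min(first + size - 1, pageCount);
    batches.push(Array.from({ length: last - first + 1 }, (_, i) => first + i));
  }
  return batches;
}

/** Build one JSON body per batch for POST https://vision.googleapis.com/v1/files:annotate. */
function buildAnnotateBodies(
  pdf: Uint8Array,
  pageCount: number,
): { requests: FileAnnotateRequest[] }[] {
  const content = Buffer.from(pdf).toString("base64"); // inline PDF bytes, base64-encoded
  return pageBatches(pageCount).map((pages) => ({
    requests: [
      {
        inputConfig: { mimeType: "application/pdf", content },
        features: [{ type: "DOCUMENT_TEXT_DETECTION" }],
        pages,
      },
    ],
  }));
}
```

A 2-page or 3-page protocol fits in a single body, which is consistent with one API call per PDF; longer documents would need several.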

Compare Vision API vs Tesseract.js

Tesseract.js result (current):

2-page PDF → 84 characters (gibberish)
Accuracy: ~10%

Vision API result (expected):

2-page PDF → 2,000-3,000 characters (clean text)
Accuracy: 95%+

Cost Monitoring

Current Usage (After Orange County)

Pages processed: 279 (93 PDFs × ~3 pages)
Cost: ~$0.42
API calls: 93 (one per PDF)

Monthly Estimates

If ingesting 10 new LEMSAs per month with scanned PDFs:

Pages: ~1,000 per month
Cost: ~$1.50 per month
Annual: ~$18

Set Billing Alert

Go to Billing
Click "Budgets & alerts"
Create budget: $5/month
Set alert at 50%, 90%, 100%

Security Best Practices

1. Service Account Permissions

✅ Correct: Cloud Vision AI Service Agent (read-only access to Vision API)
❌ Avoid: Owner, Editor (overly broad permissions)

2. Key Rotation

Rotate service account keys every 90 days:

# Create a new key in Google Cloud Console
# Update google-cloud-key.json (and the Railway variable, if used)
# Restart ingestion services and confirm OCR still works
# Delete the old key in Google Cloud Console
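
A reminder for the 90-day interval could be as simple as the check below. This is a sketch only: `isRotationDue` is a hypothetical helper, and the creation date has to be supplied by hand (the Keys tab in the console lists it), since the key file itself does not record one.

```typescript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

/** True when a key created at `createdAt` has reached `maxAgeDays` as of `now`. */
function isRotationDue(createdAt: Date, now: Date = new Date(), maxAgeDays = 90): boolean {
  const ageDays = (now.getTime() - createdAt.getTime()) / MS_PER_DAY;
  return ageDays >= maxAgeDays;
}

// Example dates: a key created on 2026-02-18 reaches 90 days on 2026-05-19.
const created = new Date("2026-02-18T00:00:00Z");
console.log(isRotationDue(created, new Date("2026-05-18T00:00:00Z"))); // false (89 days)
console.log(isRotationDue(created, new Date("2026-05-19T00:00:00Z"))); // true (90 days)
```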

3. Key Storage

Local development: ./google-cloud-key.json (gitignored)
Railway production: Upload key as environment variable:

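# Minify JSON (remove whitespace)
jq -c . google-cloud-key.json > minified-key.json

# Set in Railway
# Note: Google's client libraries read GOOGLE_APPLICATION_CREDENTIALS as a path to a key file.
# Inline JSON works only if the ingestion code parses the variable itself before creating the client.
railway variables set GOOGLE_APPLICATION_CREDENTIALS="$(cat minified-key.json)"

# minified-key.json is a second copy of the key; delete it after uploading
rm minified-key.json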
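
Because the same variable holds a file path locally and the key JSON on Railway, the ingestion code has to tell the two apart. The sketch below shows one way to do that. `loadCredentials` is a hypothetical helper, and whether the project's OCR extractor already does something similar is not stated in this guide.

```typescript
import { readFileSync } from "node:fs";

interface ServiceAccountCredentials {
  client_email: string;
  private_key: string;
}

/** Accept either inline key JSON (as stored in a Railway variable) or a path to a key file. */
function loadCredentials(value: string): ServiceAccountCredentials {
  const trimmed = value.trim();
  // A value starting with "{" is treated as the key itself, anything else as a path.
  const json = trimmed.startsWith("{") ? trimmed : readFileSync(trimmed, "utf8");
  const key = JSON.parse(json);
  if (typeof key.client_email !== "string" || typeof key.private_key !== "string") {
    throw new Error("credentials are missing client_email or private_key");
  }
  return {
    client_email: key.client_email,
    // If the variable was double-escaped, the PEM key still holds literal "\n"; restore newlines.
    private_key: key.private_key.replace(/\\n/g, "\n"),
  };
}
```

The returned object has the shape that Google's Node.js clients generally accept through their `credentials` constructor option, so it could be passed there instead of relying on the environment variable being a path.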

Troubleshooting

Error: "Could not load the default credentials"

Cause: GOOGLE_APPLICATION_CREDENTIALS not set or file not found

Fix:

export GOOGLE_APPLICATION_CREDENTIALS="./google-cloud-key.json"
npx tsx scripts/ingest-ca-protocols.ts --lemsa "Orange"

Error: "Vision API has not been enabled"

Cause: API not enabled for your project

Fix: Go to Vision API Library and click "Enable"

Error: "The caller does not have permission"

Cause: Service account lacks Vision API permissions

Fix:

Go to IAM
Find your service account
Click "Edit"
Add role: "Cloud Vision AI Service Agent"

Error: "Quota exceeded"

Cause: Hit free tier limit (1,000 images/month)

Fix: Enable billing or wait until next month

Free tier: 1,000 images/month free
After free tier: $1.50 per 1,000 images

Production Deployment (Railway)

Option 1: Environment Variable (Recommended)

# Minify JSON key
jq -c . google-cloud-key.json > minified-key.json

# Upload to Railway
# (the app must parse this value itself; Google's client libraries expect a file path in this variable)
railway link protocol-guide-production
railway variables set GOOGLE_APPLICATION_CREDENTIALS="$(cat minified-key.json)"

# Redeploy
railway up

Option 2: Application Default Credentials

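This option applies only if the runtime provides Application Default Credentials, for example a Google Cloud environment with an attached service account. The variable below selects the project; it does not supply credentials by itself: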

railway variables set GOOGLE_CLOUD_PROJECT=protocol-guide-ocr

Fallback Behavior

The OCR extractor automatically falls back to Tesseract.js if Vision API is unavailable:

Vision API configured? → Try Vision API first
Vision API fails? → Fall back to Tesseract.js
Tesseract fails? → Return error

Log example:

  Using Google Vision API for OCR...
  Vision API failed: Invalid authentication credentials
  Falling back to Tesseract.js...
  OCR: Processing 2 pages...
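
The fallback order described above can be sketched as a loop over OCR engines. This is an illustration, not the project's extractor: the `Extractor` type and `extractWithFallback` are hypothetical names, and the real Vision and Tesseract calls are left to the caller.

```typescript
type Extractor = {
  name: string;
  extract: (pdf: Uint8Array) => Promise<string>;
};

/** Try each extractor in order; return the first success, or throw if all of them fail. */
async function extractWithFallback(
  pdf: Uint8Array,
  extractors: Extractor[],
): Promise<{ engine: string; text: string }> {
  const failures: string[] = [];
  for (const { name, extract } of extractors) {
    try {
      return { engine: name, text: await extract(pdf) };
    } catch (err) {
      const reason = err instanceof Error ? err.message : String(err);
      failures.push(`${name} failed: ${reason}`);
    }
  }
  throw new Error(failures.join("; ") || "no OCR engine configured");
}
```

The caller would list the Vision API extractor first only when credentials are configured, followed by Tesseract.js, so an unconfigured Vision API is skipped rather than tried and failed.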

Next Steps

After setup:

Test: npx tsx scripts/test-oc-ocr-vision.ts
Ingest Orange County: npx tsx scripts/ingest-ca-protocols.ts --lemsa "Orange"
Verify: Check database for 600-800 new Orange County chunks

Last Updated: 2026-02-18
Status: Ready for testing
