This guide provides comprehensive instructions for running the 잇집 AI OCR service using Docker.
- Docker Desktop installed (Windows/Mac) or Docker Engine (Linux)
- Docker Compose v2.0+
- Google Cloud Vision API credentials
- At least 2GB of available RAM
1. **Clone the repository and navigate to the project**

   ```bash
   cd /path/to/itzip/AI-develop
   ```

2. **Set up environment variables**

   ```bash
   # Copy the example environment file
   # Windows: copy .env.example .env
   # Linux/Mac:
   cp .env.example .env

   # Edit .env and add your API keys:
   # GOOGLE_APPLICATION_CREDENTIALS=credentials/google-vision-key.json
   # GOOGLE_API_KEY=your-google-api-key-here
   ```

3. **Place your Google Cloud credentials**

   - Download your service account JSON from Google Cloud Console
   - Save it as `credentials/google-vision-key.json`

4. **Build and run the service**

   ```bash
   docker-compose up -d
   ```

5. **Verify the service is running**

   ```bash
   # Check health endpoint
   curl http://localhost:8000/health

   # View API documentation
   # Open http://localhost:8000/docs in your browser
   ```
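Before `docker-compose up -d`, a quick preflight check can catch missing setup files. The helper below is a hypothetical convenience script, not part of the project:

```shell
#!/bin/sh
# Preflight check (hypothetical helper): report whether the files the
# compose service expects are present before starting the container.
check_file() {
  if [ -f "$1" ]; then
    echo "ok: $1"
  else
    echo "missing: $1"
  fi
}

check_file .env
check_file credentials/google-vision-key.json
```

If either line prints `missing:`, revisit steps 2 and 3 above before starting the service.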
The Dockerfile provides:

- Base Image: Python 3.11 slim (Debian Bullseye)
- Multi-stage Build: Optimized for size and security
- Dependencies: All OCR, PDF processing, and AI libraries pre-installed
- Health Check: Built-in health monitoring
- Resource Limits: Memory and CPU constraints configured
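A multi-stage layout of this kind looks roughly like the sketch below. This is an illustration rather than the project's actual Dockerfile; the stage name, the `main:app` module path, and the `appuser` account are assumptions:

```dockerfile
# Stage 1: build dependencies into an isolated prefix
FROM python:3.11-slim-bullseye AS builder
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir --prefix=/install -r requirements.txt

# Stage 2: copy only the installed packages into a clean runtime image
FROM python:3.11-slim-bullseye
WORKDIR /app
COPY --from=builder /install /usr/local
COPY . .
# Run as a non-root user for security
RUN useradd --create-home appuser
USER appuser
EXPOSE 8000
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]
```

Keeping the build toolchain in the first stage is what keeps the final image small: only the installed packages and application code reach the runtime layer.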
The docker-compose.yml provides:
- Service Name: `ai-service`
- Container Name: `itzip-ai-service`
- Network: Isolated `itzip-network` for security
- Volume Mounts:
  - `./credentials`: Google Cloud credentials (read-only)
  - `./logs`: Application logs
  - `./data`: Vector store and documents
  - `./temp`: Temporary file processing
  - `./law_system`: Legal documents
`./credentials:/app/credentials:ro`

- Mount your Google Cloud service account JSON here
- Read-only access for security
- Never commit credentials to Git
`./data:/app/data`

- Contains vector store for legal document analysis
- Persists between container restarts
- Initialize with `law_docs/` PDF files
`./logs:/app/logs`

- Application logs persist outside container
- Useful for debugging and monitoring
- Rotate logs periodically to save space
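Taken together, the mounts and network described above correspond to a compose service shaped roughly like this. It is an abbreviated sketch; the project's actual docker-compose.yml is authoritative:

```yaml
services:
  ai-service:
    container_name: itzip-ai-service
    networks:
      - itzip-network
    volumes:
      - ./credentials:/app/credentials:ro   # read-only credentials
      - ./logs:/app/logs
      - ./data:/app/data
      - ./temp:/app/temp
      - ./law_system:/app/law_system

networks:
  itzip-network:
    driver: bridge
```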
| Variable | Description | Default |
|---|---|---|
| `GOOGLE_APPLICATION_CREDENTIALS` | Path to Google Cloud JSON key | `/app/credentials/google-vision-key.json` |
| `GOOGLE_API_KEY` | Google Generative AI API key | Required |
| `PORT` | Service port | `8000` |
| `HOST` | Service host | `0.0.0.0` |
| `LOG_LEVEL` | Logging level | `INFO` |
| `LOG_FILE` | Log file path | `/app/logs/app.log` |
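A .env file covering the variables in the table might look like this; the values are placeholders, not working credentials:

```
GOOGLE_APPLICATION_CREDENTIALS=/app/credentials/google-vision-key.json
GOOGLE_API_KEY=your-google-api-key-here
PORT=8000
HOST=0.0.0.0
LOG_LEVEL=INFO
LOG_FILE=/app/logs/app.log
```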
```bash
# Start the service
docker-compose up -d

# Stop the service
docker-compose down

# Restart the service
docker-compose restart

# View service status
docker-compose ps

# View logs
docker-compose logs -f

# View last 100 lines of logs
docker-compose logs --tail=100

# Access container shell
docker-compose exec ai-service bash

# Check resource usage
docker stats itzip-ai-service

# Rebuild the image (after requirements change)
docker-compose build --no-cache

# Remove unused images
docker image prune

# Clean up everything (careful!)
docker-compose down -v --rmi all
```

The service is configured with resource limits:
- Memory: 2GB limit, 1GB reserved
- CPU: 1.0 CPU limit, 0.5 CPU reserved
Adjust these in docker-compose.yml if needed:

```yaml
deploy:
  resources:
    limits:
      memory: 2G
      cpus: "1.0"
    reservations:
      memory: 1G
      cpus: "0.5"
```

The service includes a health check that:
- Runs every 30 seconds
- Times out after 10 seconds
- Retries 3 times before marking unhealthy
- Waits 40 seconds before first check
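In compose syntax those parameters map onto a healthcheck block like the following; the curl-based test command is an assumption, while the timing values match the schedule listed above:

```yaml
healthcheck:
  test: ["CMD", "curl", "-f", "http://localhost:8000/health"]
  interval: 30s      # runs every 30 seconds
  timeout: 10s       # times out after 10 seconds
  retries: 3         # retries 3 times before marking unhealthy
  start_period: 40s  # grace period before the first check counts
```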
Check health status:

```bash
docker inspect itzip-ai-service --format='{{.State.Health.Status}}'
```

If the service won't start:

- Check logs: `docker-compose logs`
- Verify the credentials file exists: `ls credentials/`
- Ensure the port is available:
  - Windows: `netstat -an | findstr 8000`
  - Linux/Mac: `netstat -an | grep 8000` or `lsof -i :8000`
If OCR requests fail:

- Verify Google Cloud credentials are valid
- Check API quotas in Google Cloud Console
- Review logs for authentication errors

If memory usage is high:

- Increase Docker Desktop memory allocation
- Adjust container memory limits in docker-compose.yml
- Check for memory leaks in logs

If vector store initialization fails:

- Ensure the law_docs directory has PDF files
- Check file permissions
- Verify sufficient disk space
For production environments:
- Use Docker Swarm or Kubernetes for orchestration
- Enable SSL/TLS with a reverse proxy
- Set up monitoring with Prometheus/Grafana
- Configure log aggregation with ELK stack
- Use secrets management instead of .env files
- Set up automated backups for data volumes
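As a sketch of the secrets-management point above, Docker Compose file-based secrets can replace plain environment variables for sensitive values; the secret name and file path here are illustrative, not the project's configuration:

```yaml
services:
  ai-service:
    secrets:
      - google_api_key   # exposed inside the container at /run/secrets/google_api_key

secrets:
  google_api_key:
    file: ./secrets/google_api_key.txt   # keep this file out of Git
```

Unlike .env values, the secret never appears in `docker inspect` output or in the container's environment; the application reads it from `/run/secrets/`.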
Production Server:

```
├── /home/ubuntu/itzip-ai/       # Production (port 8000)
│   ├── credentials/
│   ├── logs/
│   ├── data/
│   └── .env
└── /home/ubuntu/itzip-ai-dev/   # Development (port 8001)
    ├── credentials/
    ├── logs/
    ├── data/
    └── .env
```
- Never expose credentials in images or logs
- Use read-only mounts where possible
- Run as non-root user (already configured)
- Keep base images updated regularly
- Scan images for vulnerabilities with tools like Trivy
- Use network isolation between services
- Enable BuildKit for faster builds:

  ```bash
  set DOCKER_BUILDKIT=1
  docker-compose build
  ```

- Use layer caching effectively
- Minimize image size with multi-stage builds
- Configure appropriate worker counts based on CPU cores
- Monitor and adjust resource limits based on usage
```bash
# Backup vector store
# Windows
docker run --rm -v ai-develop_data:/data -v %cd%:/backup alpine tar czf /backup/data-backup.tar.gz -C /data .
# Linux/Mac
docker run --rm -v ai-develop_data:/data -v $(pwd):/backup alpine tar czf /backup/data-backup.tar.gz -C /data .

# Backup logs
# Windows
docker run --rm -v ai-develop_logs:/logs -v %cd%:/backup alpine tar czf /backup/logs-backup.tar.gz -C /logs .
# Linux/Mac
docker run --rm -v ai-develop_logs:/logs -v $(pwd):/backup alpine tar czf /backup/logs-backup.tar.gz -C /logs .
```

```bash
# Restore vector store (Windows shown; use $(pwd) instead of %cd% on Linux/Mac)
docker run --rm -v ai-develop_data:/data -v %cd%:/backup alpine tar xzf /backup/data-backup.tar.gz -C /data

# Restore logs
docker run --rm -v ai-develop_logs:/logs -v %cd%:/backup alpine tar xzf /backup/logs-backup.tar.gz -C /logs
```

For issues or questions:

- Check the logs first: `docker-compose logs`
- Review this documentation
- Check the main README.md for API usage
- Submit issues to the project repository