An intelligent invoice processing system that automates data extraction, validation, and standardization for Accounts Payable teams.
Team Meta Cognition | National Institute of Technology, Rourkela
- Overview
- Problem Statement
- Solution
- Features
- Architecture
- Technology Stack
- Installation
- Usage
- API Documentation
- Performance
- Roadmap
- Contributing
- Team
- License
PayGo is an AI-powered invoice processing system designed to solve the challenges faced by Accounts Payable (AP) teams when dealing with diverse invoice formats. By combining OCR with Large Language Models, PayGo automatically extracts, validates, and standardizes invoice data, reducing both manual workload and processing errors.
- 95%+ Accuracy in data extraction across diverse invoice formats
- 10x Faster processing compared to manual data entry
- Multi-format Export (JSON, CSV, TXT, XLSX)
- Human-in-the-Loop validation for quality assurance
- Cloud-Ready containerized deployment
- Audit Trail for compliance and tracking
- Invoice Diversity
  - Varied layouts make fixed rules ineffective
  - AP teams struggle to maintain speed and accuracy
  - Scalability issues with increasing invoice volumes
- Manual Processing
  - Slow and error-prone human-driven workflows
  - Incorrect payments and lost discounts
  - Rising operational expenses
- Lack of Standardization
  - Inconsistent date formats, vendor names, and currencies
  - Limited audit trails create compliance risks
  - Difficult system integration
PayGo addresses these challenges through three core components:
- OCR Technology: Azure OCR Intelligence for text extraction
- NLP/LLM Processing: OpenAI GPT-4o-mini for intelligent parsing
- Format Agnostic: Handles any invoice layout or language
- Field Extraction: Automatically identifies Invoice ID, Date, Vendor, Amount, Currency, and more
- Auto-validation: High-confidence fields processed automatically
- Smart Flagging: Low-confidence extractions flagged for review
- Quality Assurance: Maintains accuracy through human review while reducing manual workload
- Confidence Scoring: Transparency in extraction reliability
- Format Canonicalization: Standardizes dates, vendor names, and currencies
- Structured Output: Converts data to JSON, CSV, TXT, or XLSX
- ERP Integration: Seamless data push to accounting systems
- Audit Trail: Complete processing history for compliance
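The format canonicalization step listed above can be illustrated with a minimal sketch. This is not PayGo's actual implementation: the helper names, the list of accepted date formats, and the currency alias table are all assumptions made for illustration, and day-first ordering is assumed for ambiguous numeric dates.

```python
from datetime import datetime
from decimal import Decimal

# Hypothetical lookup tables; a real deployment would maintain its own.
CURRENCY_ALIASES = {"$": "USD", "US$": "USD", "€": "EUR", "£": "GBP", "₹": "INR", "RS": "INR"}
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%d-%m-%Y", "%d %b %Y", "%B %d, %Y")


def normalize_date(raw: str) -> str:
    """Return the date as ISO 8601 (YYYY-MM-DD), trying each known format in order."""
    text = raw.strip()
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue  # Not this format; try the next one.
    raise ValueError(f"Unrecognized date format: {raw!r}")


def normalize_currency(raw: str) -> str:
    """Map a symbol or alias to an upper-case currency code; unknown values pass through."""
    token = raw.strip().upper()
    return CURRENCY_ALIASES.get(token, token)


def normalize_amount(raw: str) -> Decimal:
    """Strip thousands separators and parse the amount as a Decimal."""
    return Decimal(raw.replace(",", "").strip())
```

For example, `normalize_date("15/01/2024")` yields `"2024-01-15"`, which matches the date style used in the sample API responses. Dates that match none of the listed formats raise an error, so they can be routed to human review instead of being guessed.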
- Multi-format Invoice Support - PDF, JPG, PNG, and more
- Intelligent Field Extraction - Invoice ID, Date, Vendor, Amount, Tax, Currency
- Confidence Scoring - AI-powered reliability indicators
- Validation Workflow - Human review for uncertain extractions
- Data Standardization - Consistent format across all outputs
- Multiple Export Formats - JSON, CSV, TXT, XLSX
- Batch Processing - Handle multiple invoices simultaneously
- Audit Logging - Complete processing history
- Parallelized Processing - Multi-threaded invoice handling
- Docker Support - Containerized for easy deployment
- Cloud-Ready - Scalable architecture
- Secure - Data encryption and secure storage
- Database Integration - Persistent storage of processed data
- API-First Design - RESTful API for integration
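To make the multiple export formats feature more concrete, here is a small sketch of a format dispatcher using only the Python standard library. The function names and the `EXPORTERS` table are hypothetical and not taken from the PayGo codebase. XLSX is left out because it needs a third-party library such as openpyxl, which appears in the dependency list.

```python
import csv
import io
import json


def to_json(record: dict) -> str:
    """Serialize the record as pretty-printed JSON."""
    return json.dumps(record, indent=2)


def to_csv(record: dict) -> str:
    """Write one header row and one data row; nested values are JSON-encoded."""
    flat = {
        key: json.dumps(value) if isinstance(value, (dict, list)) else value
        for key, value in record.items()
    }
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(flat))
    writer.writeheader()
    writer.writerow(flat)
    return buffer.getvalue()


def to_txt(record: dict) -> str:
    """Render the record as plain 'key: value' lines."""
    return "\n".join(f"{key}: {value}" for key, value in record.items())


# Hypothetical dispatch table keyed by the format name.
EXPORTERS = {"json": to_json, "csv": to_csv, "txt": to_txt}


def export(record: dict, fmt: str) -> str:
    """Look up the exporter for fmt and apply it to the record."""
    try:
        exporter = EXPORTERS[fmt.lower()]
    except KeyError:
        raise ValueError(f"Unsupported export format: {fmt!r}") from None
    return exporter(record)
```

A dispatch table keeps the set of formats in one place, so adding an XLSX exporter later would only mean registering one more function.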
    +------------------+
    |   User Uploads   |
    |  Invoice (PDF/   |
    |     JPG/PNG)     |
    +--------+---------+
             |
             v
    +------------------+      +--------------+      +----------------+
    |      Image       |----->|  OCR Model   |----->|  Parser Model  |
    |    Processing    |      |   (Azure)    |      | (GPT-4o-mini)  |
    +------------------+      +--------------+      +-------+--------+
                                                            |
                                                            v
    +------------------+      +--------------+      +----------------+
    |     Flagging     |<-----|    Format    |<-----|    Database    |
    | (Low Confidence) |      |  Conversion  |      |    Storage     |
    +--------+---------+      +------+-------+      +----------------+
             |                       |
             v                       v
    +------------------+  +----------------------+
    |   Human Review   |  |     Available to     |
    |   & Validation   |  |  Download (JSON/CSV/ |
    +------------------+  |      XLSX/TXT)       |
                          +----------------------+
    Invoice Capture -> Image Processing -> Multithreading Manager
                                |
             +------------------+------------------+
             v                  v                  v
        [Thread 1]         [Thread 2]         [Thread 3]
             |                  |                  |
             +------------------+------------------+
                                v
                    Aggregation & Structured
                        Output Generation
                                |
                                v
                      Validation and Review
                                |
                                v
                      Integration with ERP
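The fan-out and aggregation flow shown above could be sketched with Python's `concurrent.futures`. This is an illustrative assumption about how the multithreading manager might work, not the project's actual code; `process_invoice` here is a hypothetical placeholder for the OCR and parsing step.

```python
from concurrent.futures import ThreadPoolExecutor


def process_invoice(path: str) -> dict:
    """Hypothetical stand-in for the OCR + LLM parsing of a single invoice."""
    return {"source": path, "status": "completed"}


def process_batch(paths, max_workers=3, worker=process_invoice):
    """Process invoices on a thread pool and aggregate the results.

    Executor.map returns results in input order, so the aggregated list
    lines up with the list of uploaded files.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(worker, paths))


if __name__ == "__main__":
    for result in process_batch(["a.pdf", "b.pdf", "c.pdf"]):
        print(result["source"], result["status"])
```

Threads are a reasonable fit here because most of the time per invoice is spent waiting on remote OCR and LLM calls rather than on local computation.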
| OCR Engine | Parsing Model | Performance | Status |
|---|---|---|---|
| Tesseract | Meta Llama 3 | Low accuracy, free | Not used |
| PaddleOCR | OpenAI GPT-4o | Fast but unreliable on complex layouts | Not used |
| Google Document AI | Google Gemini 1.5 | Good integration, high cost | Not used |
| Azure OCR Intelligence | OpenAI GPT-4o-mini | High accuracy, low cost, fast | Selected |
- OCR: Azure OCR Intelligence
- AI/NLP: OpenAI GPT-4o-mini
- Backend: Python 3.8+
- Database: PostgreSQL/SQLite
- Containerization: Docker
- Cloud Platform: AWS/Azure/GCP (configurable)
- azure-ai-formrecognizer
- openai
- pandas
- pillow
- opencv-python
- openpyxl
- sqlalchemy
- fastapi
- uvicorn
- Python 3.8 or higher
- Docker (optional, for containerized deployment)
- Azure Account with OCR Intelligence enabled
- OpenAI API Key
- Clone the repository

      git clone https://github.com/HrushikeshAnandSarangi/paygo.git
      cd paygo

- Create a virtual environment

      python -m venv venv
      source venv/bin/activate  # On Windows: venv\Scripts\activate

- Install dependencies

      pip install -r requirements.txt

- Configure environment variables

      cp .env.example .env
      # Edit .env with your API keys

  Required environment variables:

      AZURE_OCR_KEY=your_azure_ocr_key
      AZURE_OCR_ENDPOINT=your_azure_endpoint
      OPENAI_API_KEY=your_openai_api_key
      DATABASE_URL=your_database_url

- Run database migrations

      python manage.py migrate

- Start the application

      python app.py
      # Or using uvicorn for FastAPI
      uvicorn main:app --reload
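A missing key is easier to diagnose at startup than in the middle of processing an invoice. The sketch below checks the four required environment variables listed above and fails fast if any is absent. The `Settings` class and `load_settings` function are hypothetical names used for illustration and are not part of the PayGo codebase.

```python
import os
from dataclasses import dataclass

# Names taken from the "Required environment variables" list above.
REQUIRED_VARS = ("AZURE_OCR_KEY", "AZURE_OCR_ENDPOINT", "OPENAI_API_KEY", "DATABASE_URL")


@dataclass(frozen=True)
class Settings:
    azure_ocr_key: str
    azure_ocr_endpoint: str
    openai_api_key: str
    database_url: str


def load_settings(env=None) -> Settings:
    """Build Settings from a mapping (os.environ by default), failing fast on gaps."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    # Field names are the lower-cased variable names.
    return Settings(**{name.lower(): env[name] for name in REQUIRED_VARS})
```

Passing a plain dictionary instead of `os.environ` makes the function easy to exercise in unit tests.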
- Build the Docker image

      docker build -t paygo:latest .

- Run the container

      docker run -d \
        -p 8000:8000 \
        -e AZURE_OCR_KEY=your_key \
        -e OPENAI_API_KEY=your_key \
        --name paygo-app \
        paygo:latest

- Using Docker Compose

      docker-compose up -d
- Access the application at http://localhost:8000
- Upload invoice files (PDF, JPG, PNG)
- Wait for AI processing
- Review flagged fields if any
- Download processed data in your preferred format
Using the REST API with curl:

    # Upload an invoice
    curl -X POST http://localhost:8000/api/upload \
      -F "file=@invoice.pdf"

    # Check processing status
    curl "http://localhost:8000/api/status/{invoice_id}"

    # Download results (URLs are quoted so the shell does not interpret "?")
    # JSON format
    curl "http://localhost:8000/api/download/{invoice_id}?format=json"
    # CSV format
    curl "http://localhost:8000/api/download/{invoice_id}?format=csv"
    # Excel format
    curl "http://localhost:8000/api/download/{invoice_id}?format=xlsx"
    # Text format
    curl "http://localhost:8000/api/download/{invoice_id}?format=txt"

Using the Python package:

    from paygo import InvoiceProcessor

    # Initialize processor
    processor = InvoiceProcessor(
        azure_key="your_azure_key",
        openai_key="your_openai_key"
    )

    # Process invoice
    result = processor.process_invoice("path/to/invoice.pdf")

    # Access extracted data
    print(result.invoice_id)
    print(result.vendor_name)
    print(result.total_amount)

    # Export to different formats
    result.to_json("output.json")
    result.to_csv("output.csv")
    result.to_excel("output.xlsx")

Upload and process an invoice (POST /api/upload).
Request:
- Method: POST
- Content-Type: multipart/form-data
- Body: file (PDF, JPG, PNG)

Response:

    {
      "invoice_id": "uuid",
      "status": "processing",
      "message": "Invoice uploaded successfully"
    }

Get processing status of an invoice (GET /api/status/{invoice_id}).

Response:

    {
      "invoice_id": "uuid",
      "status": "completed",
      "confidence": 0.95,
      "requires_review": false
    }

Retrieve extracted invoice data.

Response:

    {
      "invoice_id": "INV-2024-001",
      "vendor_name": "Acme Corp",
      "invoice_date": "2024-01-15",
      "due_date": "2024-02-15",
      "total_amount": 1250.00,
      "currency": "USD",
      "tax_amount": 125.00,
      "line_items": [
        {
          "description": "Product A",
          "quantity": 10,
          "unit_price": 100.00,
          "total": 1000.00
        }
      ],
      "confidence_scores": {
        "invoice_id": 0.98,
        "total_amount": 0.97,
        "vendor_name": 0.95
      }
    }

Download processed invoice data (GET /api/download/{invoice_id}).

Query Parameters:
- format: json, csv, xlsx, txt
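Because an upload returns a status of "processing" and the status endpoint later reports "completed", a client will usually need to poll. The following is a rough sketch of such a loop using only the standard library. The function names, the polling interval, and the rule that any status other than "processing" is final are assumptions, since the source documents only the "processing" and "completed" values.

```python
import json
import time
import urllib.request

BASE_URL = "http://localhost:8000"  # Local address from the usage steps.


def fetch_status(invoice_id: str) -> dict:
    """Call GET /api/status/{invoice_id} and decode the JSON body."""
    with urllib.request.urlopen(f"{BASE_URL}/api/status/{invoice_id}") as response:
        return json.load(response)


def wait_until_done(invoice_id, fetch=fetch_status, interval=1.0, max_attempts=30):
    """Poll until the status is no longer 'processing', then return the last response.

    Assumption: every status other than 'processing' is terminal.
    """
    for _ in range(max_attempts):
        status = fetch(invoice_id)
        if status.get("status") != "processing":
            return status
        time.sleep(interval)
    raise TimeoutError(f"Invoice {invoice_id} still processing after {max_attempts} checks")
```

The `fetch` parameter is injectable so the loop can be tested without a running server. A caller would typically inspect `requires_review` in the returned dictionary before downloading the results.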
- Processing Time: 2-5 seconds per invoice (average)
- Accuracy: 95%+ on diverse invoice formats
- Throughput: 100+ invoices per minute (with parallelization)
- Confidence Threshold: 85% for auto-validation
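The 85% auto-validation threshold can be pictured as a simple split over per-field confidence scores, like those in the `confidence_scores` object of the sample response. This is a sketch under assumptions: the function name is hypothetical, and treating a score exactly at the threshold as accepted is a choice made here, not something the source specifies.

```python
AUTO_VALIDATE_THRESHOLD = 0.85  # From the performance figures above.


def split_by_confidence(confidence_scores, threshold=AUTO_VALIDATE_THRESHOLD):
    """Return (accepted, flagged) field names based on their confidence scores.

    Fields at or above the threshold are auto-validated; the rest are
    flagged for human review.
    """
    accepted, flagged = [], []
    for field, score in confidence_scores.items():
        if score >= threshold:
            accepted.append(field)
        else:
            flagged.append(field)
    return accepted, flagged


if __name__ == "__main__":
    scores = {"invoice_id": 0.98, "total_amount": 0.97, "vendor_name": 0.80}
    accepted, flagged = split_by_confidence(scores)
    print("auto-validated:", accepted)
    print("needs review:", flagged)
```

Raising the threshold sends more fields to reviewers, while lowering it reduces manual work at the cost of letting more uncertain values through.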
- Export Formats
  - Before: JSON and CSV only
  - After: JSON, CSV, TXT, and XLSX
- Deployment
  - Before: Not deployment-ready, scaling issues
  - After: Dockerized and cloud-deployed
- Feature Scope
  - Before: Off-topic features (due date notifications, balance sheets)
  - After: Focused core functionality
- Core invoice processing
- Azure OCR integration
- OpenAI GPT-4o-mini parsing
- Multiple export formats
- Docker containerization
- Enhanced validation UI
- Batch upload interface
- Advanced reporting dashboard
- API authentication
- Rate limiting
- ERP system integrations (SAP, QuickBooks, Xero)
- Multi-language support
- Custom field training
- Mobile app
- Webhook notifications
- Advanced analytics
We welcome contributions! Please follow these steps:
- Fork the repository
- Create a feature branch (git checkout -b feature/AmazingFeature)
- Commit your changes (git commit -m 'Add some AmazingFeature')
- Push to the branch (git push origin feature/AmazingFeature)
- Open a Pull Request
- Follow PEP 8 style guide for Python code
- Write unit tests for new features
- Update documentation as needed
- Ensure all tests pass before submitting PR
Team Meta Cognition
National Institute of Technology, Rourkela
- Sujal Kumar Agarwal - Team Leader
- Kunal Kushwaha - Member
- Istaprasad Patra - Member
- Hrushikesh Anand Sarangi - Member
This project is licensed under the MIT License - see the LICENSE file for details.
- Azure OCR Intelligence for powerful text extraction
- OpenAI for advanced language understanding
- National Institute of Technology, Rourkela
- All contributors and testers
For questions, issues, or feature requests, please:
- Check the Issues page
- Create a new issue if your question isn't already addressed
- Contact the team at [email]