Skip to content

HrushikeshAnandSarangi/paygo

Folders and files

NameName
Last commit message
Last commit date

Latest commit

Β 

History

33 Commits
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 

Repository files navigation

PayGo - AI-Powered Invoice Processing System

License Python Azure OpenAI

An intelligent invoice processing system that automates data extraction, validation, and standardization for Accounts Payable teams.

Team Meta Cognition | National Institute of Technology, Rourkela


πŸ“‹ Table of Contents


🎯 Overview

PayGo is an AI-powered invoice processing system designed to solve the challenges faced by Accounts Payable (AP) teams when dealing with diverse invoice formats. By leveraging OCR and Large Language Models, PayGo automatically extracts, validates, and standardizes invoice data, reducing manual workload and eliminating processing errors.

Key Highlights

  • 95%+ Accuracy in data extraction across diverse invoice formats
  • 10x Faster processing compared to manual data entry
  • Multi-format Export (JSON, CSV, TXT, XLSX)
  • Human-in-the-Loop validation for quality assurance
  • Cloud-Ready containerized deployment
  • Audit Trail for compliance and tracking

🚨 Problem Statement

The Challenges

  1. Invoice Diversity

    • Varied layouts make fixed rules ineffective
    • AP teams struggle to maintain speed and accuracy
    • Scalability issues with increasing invoice volumes
  2. Manual Processing

    • Slow and error-prone human-driven workflows
    • Incorrect payments and lost discounts
    • Rising operational expenses
  3. Lack of Standardization

    • Inconsistent date formats, vendor names, and currencies
    • Limited audit trails create compliance risks
    • Difficult system integration

✨ Solution

PayGo addresses these challenges through three core components:

1. AI-Powered Invoice Understanding

  • OCR Technology: Azure OCR Intelligence for text extraction
  • NLP/LLM Processing: OpenAI GPT-4o-mini for intelligent parsing
  • Format Agnostic: Handles any invoice layout or language
  • Field Extraction: Automatically identifies Invoice ID, Date, Vendor, Amount, Currency, and more

2. Human-in-the-Loop Validation

  • Auto-validation: High-confidence fields processed automatically
  • Smart Flagging: Low-confidence extractions flagged for review
  • Quality Assurance: Guarantees accuracy while reducing manual workload
  • Confidence Scoring: Transparency in extraction reliability

3. Automated Data Standardization

  • Format Canonicalization: Standardizes dates, vendor names, and currencies
  • Structured Output: Converts data to JSON, CSV, TXT, or XLSX
  • ERP Integration: Seamless data push to accounting systems
  • Audit Trail: Complete processing history for compliance

πŸš€ Features

Core Functionality

  • βœ… Multi-format Invoice Support - PDF, JPG, PNG, and more
  • βœ… Intelligent Field Extraction - Invoice ID, Date, Vendor, Amount, Tax, Currency
  • βœ… Confidence Scoring - AI-powered reliability indicators
  • βœ… Validation Workflow - Human review for uncertain extractions
  • βœ… Data Standardization - Consistent format across all outputs
  • βœ… Multiple Export Formats - JSON, CSV, TXT, XLSX
  • βœ… Batch Processing - Handle multiple invoices simultaneously
  • βœ… Audit Logging - Complete processing history

Technical Features

  • πŸ”§ Parallelized Processing - Multi-threaded invoice handling
  • 🐳 Docker Support - Containerized for easy deployment
  • ☁️ Cloud-Ready - Scalable architecture
  • πŸ”’ Secure - Data encryption and secure storage
  • πŸ“Š Database Integration - Persistent storage of processed data
  • πŸ”Œ API-First Design - RESTful API for integration

πŸ—οΈ Architecture

Processing Workflow

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚  User Uploads   β”‚
β”‚  Invoice (PDF/  β”‚
β”‚  JPG/PNG)       β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”˜
         β”‚
         β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”      β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”      β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚  Image          │─────▢│  OCR Model   │─────▢│  Parser     β”‚
β”‚  Processing     β”‚      β”‚  (Azure)     β”‚      β”‚  Model      β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜      β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜      β”‚  (GPT-4o    β”‚
                                                β”‚  mini)      β”‚
                                                β””β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”˜
                                                       β”‚
         β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
         β”‚
         β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”      β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”      β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚  Flagging       │◀─────│  Format      │◀─────│  Database   β”‚
β”‚  (Low           β”‚      β”‚  Conversion  β”‚      β”‚  Storage    β”‚
β”‚  Confidence)    β”‚      β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜      β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
β””β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”˜              β”‚
         β”‚                       β”‚
         β–Ό                       β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”      β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚  Human Review   β”‚      β”‚  Available to        β”‚
β”‚  & Validation   β”‚      β”‚  Download (JSON/CSV/ β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜      β”‚  XLSX/TXT)          β”‚
                         β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

Parallelization Architecture

Invoice Capture β†’ Image Processing β†’ Multithreading Manager
                                              β”‚
                        β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
                        β–Ό                     β–Ό                     β–Ό
                   [Thread 1]            [Thread 2]            [Thread 3]
                        β”‚                     β”‚                     β”‚
                        β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                                              β–Ό
                                    Aggregation & Structured
                                    Output Generation
                                              β”‚
                                              β–Ό
                                    Validation and Review
                                              β”‚
                                              β–Ό
                                    Integration with ERP

πŸ› οΈ Technology Stack

OCR Models Evaluated

OCR Engine Parsing Model Performance Status
Tesseract Meta Llama 3 Low accuracy, free ❌ Not Used
PaddleOCR OpenAI GPT-4o Fast but unreliable on complex layouts ❌ Not Used
Google Document AI Google Gemini 1.5 Good integration, high cost ❌ Not Used
Azure OCR Intelligence OpenAI GPT-4o-mini High accuracy, low cost, fast βœ… Selected

Core Technologies

  • OCR: Azure OCR Intelligence
  • AI/NLP: OpenAI GPT-4o-mini
  • Backend: Python 3.8+
  • Database: PostgreSQL/SQLite
  • Containerization: Docker
  • Cloud Platform: AWS/Azure/GCP (configurable)

Dependencies

- azure-ai-formrecognizer
- openai
- pandas
- pillow
- opencv-python
- openpyxl
- sqlalchemy
- fastapi
- uvicorn

πŸ“¦ Installation

Prerequisites

  • Python 3.8 or higher
  • Docker (optional, for containerized deployment)
  • Azure Account with OCR Intelligence enabled
  • OpenAI API Key

Local Setup

  1. Clone the repository

    git clone https://github.com/HrushikeshAnandSarangi/paygo.git
    cd paygo
  2. Create a virtual environment

    python -m venv venv
    source venv/bin/activate  # On Windows: venv\Scripts\activate
  3. Install dependencies

    pip install -r requirements.txt
  4. Configure environment variables

    cp .env.example .env
    # Edit .env with your API keys

    Required environment variables:

    AZURE_OCR_KEY=your_azure_ocr_key
    AZURE_OCR_ENDPOINT=your_azure_endpoint
    OPENAI_API_KEY=your_openai_api_key
    DATABASE_URL=your_database_url
    
  5. Run database migrations

    python manage.py migrate
  6. Start the application

    python app.py
    # Or using uvicorn for FastAPI
    uvicorn main:app --reload

Docker Deployment

  1. Build the Docker image

    docker build -t paygo:latest .
  2. Run the container

    docker run -d \
      -p 8000:8000 \
      -e AZURE_OCR_KEY=your_key \
      -e OPENAI_API_KEY=your_key \
      --name paygo-app \
      paygo:latest
  3. Using Docker Compose

    docker-compose up -d

πŸ’» Usage

Web Interface

  1. Access the application at http://localhost:8000
  2. Upload invoice files (PDF, JPG, PNG)
  3. Wait for AI processing
  4. Review flagged fields if any
  5. Download processed data in your preferred format

API Usage

Upload Invoice

curl -X POST http://localhost:8000/api/upload \
  -F "file=@invoice.pdf"

Get Processing Status

curl http://localhost:8000/api/status/{invoice_id}

Download Processed Data

# JSON format
curl http://localhost:8000/api/download/{invoice_id}?format=json

# CSV format
curl http://localhost:8000/api/download/{invoice_id}?format=csv

# Excel format
curl http://localhost:8000/api/download/{invoice_id}?format=xlsx

# Text format
curl http://localhost:8000/api/download/{invoice_id}?format=txt

Python SDK

from paygo import InvoiceProcessor

# Initialize processor
processor = InvoiceProcessor(
    azure_key="your_azure_key",
    openai_key="your_openai_key"
)

# Process invoice
result = processor.process_invoice("path/to/invoice.pdf")

# Access extracted data
print(result.invoice_id)
print(result.vendor_name)
print(result.total_amount)

# Export to different formats
result.to_json("output.json")
result.to_csv("output.csv")
result.to_excel("output.xlsx")

πŸ“š API Documentation

Endpoints

POST /api/upload

Upload and process an invoice.

Request:

  • Method: POST
  • Content-Type: multipart/form-data
  • Body: file (PDF, JPG, PNG)

Response:

{
  "invoice_id": "uuid",
  "status": "processing",
  "message": "Invoice uploaded successfully"
}

GET /api/status/{invoice_id}

Get processing status of an invoice.

Response:

{
  "invoice_id": "uuid",
  "status": "completed",
  "confidence": 0.95,
  "requires_review": false
}

GET /api/invoice/{invoice_id}

Retrieve extracted invoice data.

Response:

{
  "invoice_id": "INV-2024-001",
  "vendor_name": "Acme Corp",
  "invoice_date": "2024-01-15",
  "due_date": "2024-02-15",
  "total_amount": 1250.00,
  "currency": "USD",
  "tax_amount": 125.00,
  "line_items": [
    {
      "description": "Product A",
      "quantity": 10,
      "unit_price": 100.00,
      "total": 1000.00
    }
  ],
  "confidence_scores": {
    "invoice_id": 0.98,
    "total_amount": 0.97,
    "vendor_name": 0.95
  }
}

GET /api/download/{invoice_id}

Download processed invoice data.

Query Parameters:

  • format: json, csv, xlsx, txt

πŸ“Š Performance

Benchmarks

  • Processing Time: 2-5 seconds per invoice (average)
  • Accuracy: 95%+ on diverse invoice formats
  • Throughput: 100+ invoices per minute (with parallelization)
  • Confidence Threshold: 85% for auto-validation

Improvements from Feedback

  1. Export Formats

    • ❌ Before: JSON and CSV only
    • βœ… After: JSON, CSV, TXT, and XLSX
  2. Deployment

    • ❌ Before: Not deployment-ready, scaling issues
    • βœ… After: Dockerized and cloud-deployed
  3. Feature Scope

    • ❌ Before: Off-topic features (due date notifications, balance sheets)
    • βœ… After: Focused core functionality

πŸ—ΊοΈ Roadmap

Phase 1 (Completed) βœ…

  • Core invoice processing
  • Azure OCR integration
  • OpenAI GPT-4o-mini parsing
  • Multiple export formats
  • Docker containerization

Phase 2 (In Progress) 🚧

  • Enhanced validation UI
  • Batch upload interface
  • Advanced reporting dashboard
  • API authentication
  • Rate limiting

Phase 3 (Planned) πŸ“‹

  • ERP system integrations (SAP, QuickBooks, Xero)
  • Multi-language support
  • Custom field training
  • Mobile app
  • Webhook notifications
  • Advanced analytics

🀝 Contributing

We welcome contributions! Please follow these steps:

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/AmazingFeature)
  3. Commit your changes (git commit -m 'Add some AmazingFeature')
  4. Push to the branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

Development Guidelines

  • Follow PEP 8 style guide for Python code
  • Write unit tests for new features
  • Update documentation as needed
  • Ensure all tests pass before submitting PR

πŸ‘₯ Team

Team Meta Cognition
National Institute of Technology, Rourkela

  • Sujal Kumar Agarwal - Team Leader
  • Kunal Kushwaha - Member
  • Istaprasad Patra - Member
  • Hrushikesh Anand Sarangi - Member

πŸ“„ License

This project is licensed under the MIT License - see the LICENSE file for details.


πŸ™ Acknowledgments

  • Azure OCR Intelligence for powerful text extraction
  • OpenAI for advanced language understanding
  • National Institute of Technology, Rourkela
  • All contributors and testers

πŸ“ž Support

For questions, issues, or feature requests, please:

  1. Check the Issues page
  2. Create a new issue if your question isn't already addressed
  3. Contact the team at [email]

πŸ”— Links


#PowerdByMeta_Cognition

Made with ❀️ by Team Meta Cognition

⬆ back to top

About

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors