Intelligent document comparison powered by Microsoft Agent Framework and Azure OpenAI
This application uses specialized AI agents to perform intelligent comparison of PDF documents (e.g., drug documentation, legal contracts, technical specifications) and outputs detailed difference reports with word-level precision.
- 🤖 Multi-Agent Architecture: Two specialized agents orchestrated by Microsoft Agent Framework
- Extraction Agent: Extracts structured content from PDFs with page and section information
- Comparison Agent: Hybrid two-phase approach for optimal accuracy and cost
- ⚡ Hybrid Comparison Approach: Best of both worlds!
- Phase 1: Deterministic diff algorithm finds every difference in the extracted text (free, runs locally in milliseconds, fully reproducible)
- Phase 2: AI adds semantic context and meaning (minimal cost, only for differences found)
- 📄 Dual PDF Processing:
- pdfplumber: Fast, local extraction (default, no cost)
- Azure Document Intelligence: Advanced extraction with better structure detection (optional)
- 💰 Cost-Effective: About 90% cheaper than a pure AI comparison, because only the differences are sent to the LLM, not the full documents
- 📊 Structured Output: Generates comparison tables with page numbers, sections, and specific differences
- 🎯 Three Difference Types: Added, Removed, and Modified content detection
- ✅ Deterministic: Same input always produces same differences (unlike pure LLM approaches)
- 💻 No UI Required: Run directly from command line or IDE
┌──────────────────────────────────────────────────────────────┐
│ PDF Comparison Workflow (Hybrid Approach) │
│ Microsoft Agent Framework + Azure OpenAI │
└──────────────────────────────────────────────────────────────┘
│
▼
┌────────────────────────────────────┐
│ Agent 1: PDF Extraction │
│ • Extract text from both PDFs │
│ • Identify pages & sections │
│ • Create structured JSON │
└────────────────────────────────────┘
│
▼
┌────────────────────────────────────┐
│ Agent 2: Hybrid Comparison │
│ │
│ Phase 1: Deterministic Diff │
│ • difflib algorithm (FREE) │
│ • Find ALL differences │
│ • 100% accurate & reproducible │
│ ↓ │
│ Phase 2: LLM Enhancement │
│ • Azure OpenAI (minimal cost) │
│ • Add semantic context │
│ • Explain meaning & impact │
└────────────────────────────────────┘
│
▼
Output: JSON + CSV files
(Differences + AI Context)
agentic-text-comparison/
├── main.py # Entry point
├── requirements.txt # Dependencies
├── .env.example # Configuration template
├── .gitignore # Git ignore rules
├── setup.sh # Setup script
│
├── input/ # Place your PDFs here
├── output/ # Results saved here
│
└── src/
├── config.py # Configuration management
├── models.py # Data models
├── pdf_extractor.py # PDF extraction logic
├── diff_tool.py # Deterministic diff algorithm
├── agents.py # AI agents (hybrid comparison)
└── workflow.py # Workflow orchestration
- ✅ Python 3.9+
- ✅ Azure OpenAI account with deployed model
- ✅ Two PDF files to compare
# Create virtual environment
python -m venv venv
source venv/bin/activate # Windows: venv\Scripts\activate
# Install dependencies (--pre flag required for Agent Framework)
pip install --pre -r requirements.txt
Create .env file from template:
cp .env.example .env
Edit .env with your Azure credentials:
AZURE_OPENAI_ENDPOINT=https://your-resource.openai.azure.com/
AZURE_OPENAI_API_KEY=your-api-key-here
AZURE_OPENAI_DEPLOYMENT_NAME=gpt-4o
AZURE_OPENAI_API_VERSION=2024-08-01-preview
# Optional: For advanced PDF extraction
AZURE_DOCUMENT_INTELLIGENCE_ENDPOINT=https://your-resource.cognitiveservices.azure.com/
AZURE_DOCUMENT_INTELLIGENCE_API_KEY=your-key-here
How to get Azure credentials:
- Go to Azure Portal
- Navigate to your Azure OpenAI resource
- Go to "Keys and Endpoint"
- Copy the endpoint and one of the keys
- Go to "Model deployments" to see your deployment name
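The variables above can be checked once at startup so that a missing key fails early with a clear message. The following is a minimal sketch, not the contents of src/config.py: the names `AzureSettings` and `load_settings` are hypothetical, and it assumes the .env values have already been loaded into the process environment (for example with python-dotenv).

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Variable names taken from the .env template above.
REQUIRED_VARS = (
    "AZURE_OPENAI_ENDPOINT",
    "AZURE_OPENAI_API_KEY",
    "AZURE_OPENAI_DEPLOYMENT_NAME",
    "AZURE_OPENAI_API_VERSION",
)


@dataclass(frozen=True)
class AzureSettings:  # hypothetical name, for illustration only
    endpoint: str
    api_key: str
    deployment_name: str
    api_version: str
    doc_intelligence_endpoint: Optional[str] = None
    doc_intelligence_api_key: Optional[str] = None


def load_settings(env: Optional[Mapping[str, str]] = None) -> AzureSettings:
    """Read the settings from a mapping (os.environ by default) and validate them."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name, "").strip()]
    if missing:
        raise ValueError("Missing required settings: " + ", ".join(missing))
    endpoint = env["AZURE_OPENAI_ENDPOINT"].strip()
    if not endpoint.startswith("https://"):
        raise ValueError("AZURE_OPENAI_ENDPOINT should start with https://")
    return AzureSettings(
        endpoint=endpoint,
        api_key=env["AZURE_OPENAI_API_KEY"].strip(),
        deployment_name=env["AZURE_OPENAI_DEPLOYMENT_NAME"].strip(),
        api_version=env["AZURE_OPENAI_API_VERSION"].strip(),
        # The Document Intelligence settings are optional, so absent values become None.
        doc_intelligence_endpoint=env.get("AZURE_DOCUMENT_INTELLIGENCE_ENDPOINT") or None,
        doc_intelligence_api_key=env.get("AZURE_DOCUMENT_INTELLIGENCE_API_KEY") or None,
    )
```

Passing the mapping as a parameter keeps the function easy to test without touching the real environment.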
Place two PDF files in the input/ folder:
ls input/
# Should show your two PDF files
python main.py
The application will:
- ✓ Load your Azure configuration
- ✓ Find the 2 PDFs in input/ folder
- ✓ Extract content using pdfplumber (free, local)
- ✓ Phase 1: Run deterministic diff algorithm (finds ALL differences, free)
- ✓ Phase 2: Enhance differences with AI context (minimal Azure OpenAI cost)
- ✓ Generate results in output/ folder
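The "find the 2 PDFs" step can be pictured with the sketch below. It is an illustration under assumptions, not the project's actual code: the function name `find_pdf_pair` is hypothetical, and it matches only the lowercase `.pdf` extension and picks the first two files in alphabetical order.

```python
from pathlib import Path
from typing import List, Tuple


def find_pdf_pair(input_dir: str = "input") -> Tuple[Path, Path]:
    """Return the first two .pdf files (sorted by name) found in input_dir."""
    folder = Path(input_dir)
    if not folder.is_dir():
        raise FileNotFoundError(f"Input folder not found: {folder}")
    # Exact match on the lowercase extension; use p.suffix.lower() to accept .PDF too.
    pdfs: List[Path] = sorted(
        p for p in folder.iterdir() if p.is_file() and p.suffix == ".pdf"
    )
    if len(pdfs) < 2:
        raise ValueError(
            f"Expected at least 2 PDF files in {folder}, found {len(pdfs)}"
        )
    return pdfs[0], pdfs[1]
```

Sorting by name makes the choice of "first" and "second" document reproducible across runs.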
The application creates two files in the output/ folder:
A CSV file in a spreadsheet-friendly table format:
| page_number | section | difference_type | original_text | new_text | context |
|---|---|---|---|---|---|
| 1 | Introduction | modified | "version 1.0" | "version 2.0" | "This is version..." |
| 2 | Dosage | added | "" | "New dosage info" | "Section 2.1..." |
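The CSV can be post-processed with the standard library alone. The sketch below assumes the file has a header row with exactly the column names shown in the table; the helper names are hypothetical, and an in-memory sample stands in for the real output file.

```python
import csv
import io
from collections import Counter
from typing import Dict, Iterable, List, TextIO

# Column names as shown in the table above.
EXPECTED_COLUMNS = [
    "page_number", "section", "difference_type",
    "original_text", "new_text", "context",
]


def read_differences(handle: TextIO) -> List[Dict[str, str]]:
    """Load all rows and check that the expected columns are present."""
    reader = csv.DictReader(handle)
    missing = set(EXPECTED_COLUMNS) - set(reader.fieldnames or [])
    if missing:
        raise ValueError("CSV is missing columns: " + ", ".join(sorted(missing)))
    return list(reader)


def count_by_type(rows: Iterable[Dict[str, str]]) -> Counter:
    """Count rows per difference type (added, removed, modified)."""
    return Counter(row["difference_type"] for row in rows)


# Sample data mirroring the two rows in the table above.
sample = io.StringIO(
    "page_number,section,difference_type,original_text,new_text,context\n"
    "1,Introduction,modified,version 1.0,version 2.0,This is version...\n"
    "2,Dosage,added,,New dosage info,Section 2.1...\n"
)
rows = read_differences(sample)
counts = count_by_type(rows)
print(counts["modified"], counts["added"])  # prints: 1 1
```

For a real run, open the generated file with `open(path, newline="", encoding="utf-8")` and pass the handle to `read_differences`.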
Detailed JSON with complete analysis:
{
"pdf1_name": "document_v1.pdf",
"pdf2_name": "document_v2.pdf",
"total_differences": 42,
"differences": [
{
"page_number": 1,
"section": "Introduction",
"difference_type": "modified",
"original_text": "version 1.0",
"new_text": "version 2.0",
"context": "This is version 1.0 of the document"
}
]
}
For complex PDFs with tables, forms, or intricate layouts:
- Add credentials to .env:
  AZURE_DOCUMENT_INTELLIGENCE_ENDPOINT=https://your-resource.cognitiveservices.azure.com/
  AZURE_DOCUMENT_INTELLIGENCE_API_KEY=your-key-here
- Modify src/agents.py line 85:
  extraction1 = self.pdf_extractor.extract(pdf1_path, use_document_intelligence=True)
Update in .env:
AZURE_OPENAI_DEPLOYMENT_NAME=gpt-4o-mini # or gpt-4, gpt-4o, etc.
Recommended models:
- gpt-4o: Best quality, higher cost
- gpt-4o-mini: Faster, cheaper, good quality
- gpt-4: Good balance
Edit the agent instructions in src/agents.py (lines 115-151) to customize:
- Comparison focus (semantic vs lexical)
- Level of detail
- Types of differences to detect
pdfplumber (default):
- ✅ Free - runs locally
- ✅ Fast - no API calls
- ✅ Good for text-based PDFs
- Used by default for all extractions
Azure Document Intelligence (optional):
- ✅ Better structure recognition
- ✅ Handles complex layouts, tables
- ✅ OCR for scanned documents
- ❌ Costs money (Azure service)
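As a rough picture of the default path, the sketch below pulls per-page text with pdfplumber and flags documents that yield no text at all, which usually means scanned images that need OCR. The function names and the record layout are assumptions for illustration and are not taken from src/pdf_extractor.py; only `pdfplumber.open`, `pdf.pages`, and `page.extract_text()` are real library calls.

```python
from typing import Dict, Iterable, List, Optional


def pages_to_records(page_texts: Iterable[Optional[str]]) -> List[Dict[str, object]]:
    """Turn per-page text into 1-based page records with cleaned, non-empty lines."""
    records: List[Dict[str, object]] = []
    for number, text in enumerate(page_texts, start=1):
        lines = [line.strip() for line in (text or "").splitlines() if line.strip()]
        records.append({"page_number": number, "lines": lines})
    return records


def extract_with_pdfplumber(pdf_path: str) -> List[Dict[str, object]]:
    """Extract text locally; no API calls are made."""
    import pdfplumber  # third-party; imported lazily so the helpers need no dependency

    with pdfplumber.open(pdf_path) as pdf:
        # extract_text() may return None or "" for pages without a text layer.
        return pages_to_records(page.extract_text() for page in pdf.pages)


def looks_scanned(records: List[Dict[str, object]]) -> bool:
    """True when no page produced any text, a hint to try OCR-based extraction."""
    return all(not record["lines"] for record in records)
```

A check like `looks_scanned` is one way to decide when the optional Azure Document Intelligence path is worth its cost.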
The system uses a two-phase process for optimal results:
- Uses Python's difflib algorithm
- Finds every difference between the extracted texts of the two documents
- Line-by-line comparison with similarity detection
- Cost: $0 (runs locally)
- Time: Instant (milliseconds)
- Accuracy: Exact on the extracted text, with the same results every time
- Output: Raw differences (Added, Removed, Modified)
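The deterministic phase can be sketched with `difflib.SequenceMatcher` as below. This is a simplified illustration, not the code in src/diff_tool.py: the function names are hypothetical, and the mapping of difflib's `replace`, `delete`, and `insert` opcodes to Modified, Removed, and Added is an assumption about how the three difference types are derived.

```python
import difflib
from typing import Dict, List, Tuple

# Assumed mapping from difflib opcode tags to the three difference types.
TAG_TO_TYPE = {"replace": "modified", "delete": "removed", "insert": "added"}


def diff_lines(old_lines: List[str], new_lines: List[str]) -> List[Dict[str, str]]:
    """Line-level diff: every non-equal opcode becomes one difference record."""
    matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)
    differences = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue
        differences.append({
            "difference_type": TAG_TO_TYPE[tag],
            "original_text": "\n".join(old_lines[i1:i2]),
            "new_text": "\n".join(new_lines[j1:j2]),
        })
    return differences


def changed_words(old_line: str, new_line: str) -> List[Tuple[str, str]]:
    """Word-level refinement of a modified line: (old words, new words) pairs."""
    old_words, new_words = old_line.split(), new_line.split()
    matcher = difflib.SequenceMatcher(a=old_words, b=new_words, autojunk=False)
    return [
        (" ".join(old_words[i1:i2]), " ".join(new_words[j1:j2]))
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


old = ["This is version 1.0", "Dosage: 10 mg", "Store at room temperature"]
new = ["This is version 2.0", "Dosage: 10 mg", "Store at room temperature",
       "New dosage info"]
result = diff_lines(old, new)
print([d["difference_type"] for d in result])  # prints: ['modified', 'added']
print(changed_words(old[0], new[0]))           # prints: [('1.0', '2.0')]
```

Because no randomness or network call is involved, running this twice on the same input always returns the same list.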
- Only processes differences found in Phase 1
- Uses Azure OpenAI to add semantic context
- Explains the meaning and impact of each change
- Groups related differences for efficient processing
- Cost: ~$0.002-$0.01 per comparison (90% cheaper than full-document AI)
- Time: Seconds (depends on number of differences)
- Settings: Temperature=0.0 for consistent explanations
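One way to picture "only processes differences" and "groups related differences" is to batch the Phase 1 records and build one prompt per batch, so the full documents never reach the model. The sketch below stops short of calling Azure OpenAI; the function names, the batch size of 20, and the prompt wording are all assumptions, not the instructions used in src/agents.py.

```python
import json
from typing import Dict, Iterator, List


def batched(differences: List[Dict[str, str]],
            batch_size: int = 20) -> Iterator[List[Dict[str, str]]]:
    """Yield consecutive slices so each LLM request carries a bounded payload."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(differences), batch_size):
        yield differences[start:start + batch_size]


def build_prompt(batch: List[Dict[str, str]]) -> str:
    """Build a prompt that contains only the differences, never the full text."""
    payload = json.dumps(batch, ensure_ascii=False, indent=2)
    return (
        "For each difference below, explain its meaning and likely impact "
        "in one or two sentences. Respond as a JSON list in the same order.\n\n"
        + payload
    )


# Hypothetical Phase 1 output: 45 small difference records.
diffs = [
    {"difference_type": "added", "original_text": "", "new_text": str(i)}
    for i in range(45)
]
prompts = [build_prompt(batch) for batch in batched(diffs)]
print(len(prompts))  # prints: 3
```

Each prompt would then be sent with temperature 0.0, as noted above, so that repeated runs give consistent explanations.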
Cost Comparison:
- ❌ Traditional AI approach: ~15,000 tokens → $0.05-$0.10 per run
- ✅ Hybrid approach: ~500-1,500 tokens → $0.002-$0.01 per run
- Savings: Roughly 90% lower AI cost, while the deterministic diff still catches every difference in the extracted text
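The savings figure follows from the token counts quoted above. A short calculation, using only those token estimates (actual dollar amounts depend on the model's current pricing and are not recomputed here):

```python
def reduction_percent(baseline: float, hybrid: float) -> float:
    """Percentage reduction of `hybrid` relative to `baseline`."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (1 - hybrid / baseline) * 100


FULL_DOCUMENT_TOKENS = 15_000   # traditional approach, from the estimate above
HYBRID_TOKENS = (500, 1_500)    # hybrid approach, low and high estimate

best = reduction_percent(FULL_DOCUMENT_TOKENS, HYBRID_TOKENS[0])
worst = reduction_percent(FULL_DOCUMENT_TOKENS, HYBRID_TOKENS[1])
print(f"Token reduction: {worst:.1f}% to {best:.1f}%")
# prints: Token reduction: 90.0% to 96.7%
```

The 90% figure is therefore the conservative end of the range; documents with few differences save more.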
Why This Works Better:
- ✅ Guaranteed to find ALL differences (unlike pure LLM)
- ✅ Deterministic results (same input = same output)
- ✅ Cost-effective (only pay for context enhancement)
- ✅ Fast (diff algorithm is instant, minimal LLM calls)
- Ensure .env file exists in project root
- Verify all Azure credentials are correct
- Check endpoint format: https://your-resource.openai.azure.com/
- Verify input/ folder contains at least 2 .pdf files
- Check file extensions are lowercase .pdf
- Try simpler PDFs first to test setup
- Check if PDFs are text-based (not scanned images)
- For scanned PDFs, enable Azure Document Intelligence
- Verify PDFs aren't password-protected or encrypted
- Large PDFs may take time (be patient)
- Check your Azure OpenAI quota limits
- Consider splitting very large documents
- Use gpt-4o-mini for faster processing
- Verify Azure OpenAI endpoint is correct
- Check API key is valid and not expired
- Ensure your Azure subscription is active
- ⚠️ Never commit .env file to version control
- 🔒 Keep Azure API keys secure
- 👥 Use Azure RBAC for production deployments
- 🔄 Rotate keys regularly
- 📝 Audit access logs in Azure Portal
Microsoft Agent Framework (Python)
- Latest preview version with Azure AI integration
- Multi-agent orchestration with WorkflowBuilder
- Async execution with streaming support
- Flexible executor pattern for custom agents
Azure OpenAI Service
- GPT-4o/GPT-4 models for intelligent comparison
- Handles complex document analysis
- Identifies semantic and lexical differences
PDF Processing
- pdfplumber: Fast, local extraction (default)
- Azure Document Intelligence: Advanced extraction (optional)
- Structured data models for comparison
Output Formats
- JSON: Complete detailed analysis
- CSV: Spreadsheet-friendly table
✅ Multi-agent architecture with Microsoft Agent Framework
✅ PDF extraction with pdfplumber (fast, free)
✅ Optional Azure Document Intelligence integration
✅ Azure OpenAI-powered intelligent comparison
✅ Structured JSON output with page & section info
✅ CSV export for spreadsheet applications
✅ Word-level difference detection
✅ Three types of differences: added, removed, modified
✅ Error handling and validation
✅ Colored console output
✅ Async/await pattern throughout
✅ Environment-based configuration
- Type hints throughout
- Dataclasses for models
- Async/await for I/O operations
- Context managers for resources
- Separation of concerns
- Configuration management
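To illustrate the type-hint and dataclass conventions listed above, here is a minimal sketch of a difference record whose fields mirror the JSON output. The class names `Difference` and `DifferenceType` are hypothetical and may not match what src/models.py actually defines.

```python
from dataclasses import asdict, dataclass
from enum import Enum
from typing import Any, Dict, Mapping


class DifferenceType(str, Enum):
    """The three difference types reported by the tool."""
    ADDED = "added"
    REMOVED = "removed"
    MODIFIED = "modified"


@dataclass(frozen=True)
class Difference:  # hypothetical model; field names follow the JSON output
    page_number: int
    section: str
    difference_type: DifferenceType
    original_text: str
    new_text: str
    context: str = ""

    def to_dict(self) -> Dict[str, Any]:
        """Serialize to plain JSON-compatible values."""
        data = asdict(self)
        data["difference_type"] = self.difference_type.value
        return data

    @classmethod
    def from_dict(cls, data: Mapping[str, Any]) -> "Difference":
        """Parse one entry of the `differences` list; rejects unknown types."""
        return cls(
            page_number=int(data["page_number"]),
            section=str(data["section"]),
            difference_type=DifferenceType(data["difference_type"]),
            original_text=str(data["original_text"]),
            new_text=str(data["new_text"]),
            context=str(data.get("context", "")),
        )
```

Using an enum for the type means a typo such as "modifed" raises a `ValueError` at parse time instead of slipping into the report.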
- Microsoft Agent Framework - Official documentation
- Azure OpenAI Service - Service overview
- Azure Document Intelligence - Advanced PDF extraction
- pdfplumber Documentation - PDF extraction library
MIT
Built with ❤️ using Microsoft Agent Framework and Azure OpenAI