A production-grade, AI-powered email management system optimized for the USA, India, and Germany markets, designed so that critical communications are never deleted (zero false positives).
- Multi-Regional Support: Optimized protected domain lists for USA, India, and Germany
- Bilingual Classification: Native English and German language support
- 5-Gate Safety System: ALL gates must pass before deletion
- Dual-Agent Verification: Classifier + Verifier for promotional emails
- Three-Tier Confidence: High (≥90%), Medium (70-89%), Low (<70%)
- Human-in-the-Loop: Medium confidence emails flagged for manual review
- Precision-Recall Optimization: Data-driven threshold tuning
- Comprehensive Audit Trail: Full decision logging with AI reasoning
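
The three confidence tiers listed above can be expressed as a small mapping. This is a minimal sketch, not the project's actual code: the function name `confidence_level` is hypothetical, and it assumes scores are on a 0-100 scale with fractional values between 89 and 90 treated as medium.

```python
def confidence_level(confidence: float) -> str:
    """Map a 0-100 confidence score to a tier name (hypothetical helper)."""
    if confidence >= 90:
        return "high"      # eligible for automatic handling
    if confidence >= 70:
        return "medium"    # flagged for manual review
    return "low"           # never deletion eligible


print(confidence_level(95.0))  # high
print(confidence_level(82.5))  # medium
```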
- ✅ ZERO investment/brokerage emails deleted (Zerodha, Schwab, Trade Republic, etc.)
- ✅ ZERO banking emails deleted (HDFC, Chase, Deutsche Bank, etc.)
- ✅ ZERO government emails deleted (.gov, .gov.in, Finanzamt)
- ✅ ZERO starred/important emails deleted
- ✅ ZERO deletions without dual-agent verification
# Clone repository
cd email
# Create virtual environment
python3 -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt

- Go to Google Cloud Console
- Create a new project or select existing one
- Enable the Gmail API:
  - Navigate to "APIs & Services" > "Library"
  - Search for "Gmail API"
  - Click "Enable"
- Go to "APIs & Services" > "Credentials"
- Click "Create Credentials" > "OAuth client ID"
- Configure consent screen (if prompted):
  - User type: External (for personal use)
  - App name: "Gmail Email Classifier"
  - Add your email as test user
- Application type: Desktop app
- Download credentials JSON
- Save as credentials.json in project root
# First run will open browser for OAuth consent
python gmail_client.py
# This creates token.pickle for future use

- Get API key from Google AI Studio
- Add to .env:
GEMINI_API_KEY=your-gemini-api-key

- Get API key from Anthropic Console
- Add to .env:
ANTHROPIC_API_KEY=your-anthropic-api-key

Copy .env.example to .env and configure:
cp .env.example .env

Edit .env:
# Gmail API
GMAIL_CLIENT_ID=your-client-id.apps.googleusercontent.com
GMAIL_CLIENT_SECRET=your-client-secret
# AI Provider (choose one or both)
GEMINI_API_KEY=your-gemini-key
ANTHROPIC_API_KEY=your-claude-key # Optional
# Market & Language
TARGET_MARKET=all # usa, india, germany, or all
SUPPORTED_LANGUAGES=both # en, de, or both
# Safety Settings
DEFAULT_CONFIDENCE_THRESHOLD=90
ENABLE_HUMAN_REVIEW=true
AUTO_DELETE_HIGH_CONFIDENCE=false

# Process up to 100 emails (default: no deletion)
python email_classifier.py --max-emails 100
# Process all unread emails
python email_classifier.py --query "is:unread"
# Optimize for specific market
python email_classifier.py --market india --max-emails 50

# Show what would be deleted (dry-run)
python email_classifier.py --max-emails 100
# Actually delete approved emails (moves to trash, recoverable)
python email_classifier.py --max-emails 100 --delete
# Only delete high confidence (≥90%) emails
python email_classifier.py --delete-high-confidence --max-emails 100

# USA market optimization
python email_classifier.py --market usa --language en
# India market optimization
python email_classifier.py --market india --language en
# Germany market optimization (bilingual)
python email_classifier.py --market germany --language both

# Use Gemini (default - faster, cheaper)
python email_classifier.py --provider gemini
# Use Claude (higher quality, more expensive)
python email_classifier.py --provider anthropic

# Custom confidence threshold
python email_classifier.py --confidence-threshold 95
# Disable human review for medium confidence
python email_classifier.py --disable-human-review
# Debug mode with verbose logging
python email_classifier.py --debug --max-emails 10

# Production-ready command for India market
python email_classifier.py \
  --market india \
  --language en \
  --provider gemini \
  --confidence-threshold 90 \
  --max-emails 500 \
  --delete-high-confidence

- PROMOTIONAL (deletion eligible)
  - Marketing emails, newsletters, sales, discounts
- TRANSACTIONAL (protected)
  - Receipts, shipping confirmations, payment notifications
- SYSTEM_SECURITY (protected)
  - Password resets, 2FA codes, security alerts
- SOCIAL_PLATFORM (protected)
  - Social network notifications, friend requests
- PERSONAL_HUMAN (protected)
  - Direct correspondence, work emails, personal communications
ALL gates must pass for deletion approval:
- Gate 1: Category Check - Must be PROMOTIONAL
- Gate 2: Verification Check - Dual-agent verification passed
- Gate 3: Confidence Threshold - Meets confidence requirement (default ≥90%)
- Gate 4: Protected Domain Check - NOT from protected domain
- Gate 5: Manual Flags Check - NOT starred or marked important
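
The five gates above can be sketched as a list of named boolean checks that must all pass. This is an illustrative sketch only: the `Email` dataclass, its field names, and `evaluate_gates` are assumptions made for this example and are not taken from `decision_engine.py`.

```python
from dataclasses import dataclass


@dataclass
class Email:
    """Hypothetical record holding only the fields the gates need."""
    category: str               # e.g. "promotional"
    verified: bool              # True if the verifier agent agreed
    confidence: float           # 0-100
    from_protected_domain: bool
    starred_or_important: bool


def evaluate_gates(email: Email, threshold: float = 90.0):
    """Return (approved, failed_gate_names); approval needs every gate to pass."""
    gates = [
        ("category", email.category == "promotional"),
        ("verification", email.verified),
        ("confidence", email.confidence >= threshold),
        ("protected_domain", not email.from_protected_domain),
        ("manual_flags", not email.starred_or_important),
    ]
    failed = [name for name, passed in gates if not passed]
    return (not failed, failed)


approved, failed = evaluate_gates(Email("promotional", True, 95.0, False, True))
print(approved, failed)  # False ['manual_flags']
```

Returning the names of the failed gates, rather than a bare boolean, keeps the decision auditable: a rejected email records exactly which check blocked deletion.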
Agent 1 (Classifier):
- Batch classify emails in English/German
- Categorize into 5 categories with confidence scores
Agent 2 (Verifier):
- Review ALL promotional classifications
- Focus on investment/banking/government false positives
- Correct misclassifications
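
The classifier/verifier hand-off described above might look roughly like the following. It is a sketch under stated assumptions: `classify` and `verify` stand in for the real model calls (their signatures are hypothetical), emails are processed one at a time rather than in batches, and the `corrected` field is invented for illustration.

```python
from typing import Callable, Dict, List


def classify_and_verify(
    emails: List[Dict],
    classify: Callable[[Dict], Dict],
    verify: Callable[[Dict, Dict], Dict],
) -> List[Dict]:
    """Run agent 1 on every email and agent 2 on promotional results only."""
    results = []
    for email in emails:
        first = classify(email)  # {"category": ..., "confidence": ...}
        result = {**first, "verified": False, "corrected": False}
        if first["category"] == "promotional":
            second = verify(email, first)
            if second["category"] == "promotional":
                result["verified"] = True      # both agents agree
            else:
                result.update(second)          # verifier overrides agent 1
                result["corrected"] = True
        results.append(result)
    return results
```

Only promotional classifications reach the verifier, so a non-promotional email is never marked as verified and therefore cannot pass the verification gate.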
USA:
- Investment: Schwab, Fidelity, Robinhood, E*TRADE, Vanguard
- Banking: Chase, Bank of America, Wells Fargo, Citi
- Government: .gov domains, IRS, USPS
- Healthcare: UnitedHealthcare, Anthem, Cigna
India:
- Investment: Zerodha, Groww, Upstox, Angel One
- Banking: HDFC, ICICI, SBI, Axis, Kotak
- Government: .gov.in, income tax, EPFO
- Digital: Paytm, PhonePe, Google Pay
Germany:
- Investment: Trade Republic, Scalable Capital, Flatex
- Banking: Deutsche Bank, Sparkasse, Commerzbank
- Government: .gov.de, Finanzamt
- Healthcare: TK, AOK, Barmer (Krankenkassen)
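
A protected-domain lookup over lists like these could be sketched as a suffix match on the sender's domain. The nested `PROTECTED_DOMAINS[market][category]` shape follows the `config.py` snippet shown in this README; the entries below are a small illustrative subset, and `is_protected` is a hypothetical helper, not the API of `domain_checker.py`.

```python
from email.utils import parseaddr

# Illustrative subset only; the real lists live in config.py.
PROTECTED_DOMAINS = {
    "usa": {"investment_brokerage": ["schwab.com"], "government": ["gov"]},
    "india": {"investment_brokerage": ["zerodha.com"], "government": ["gov.in"]},
}


def is_protected(from_header: str, market: str = "all") -> bool:
    """True if the sender's domain equals or is a subdomain of a protected entry."""
    _, address = parseaddr(from_header)
    domain = address.rsplit("@", 1)[-1].lower()
    if market == "all":
        markets = PROTECTED_DOMAINS.values()
    else:
        markets = [PROTECTED_DOMAINS[market]]
    for categories in markets:
        for entries in categories.values():
            for entry in entries:
                # Match on a dot boundary so "notzerodha.com" is not a hit.
                if domain == entry or domain.endswith("." + entry):
                    return True
    return False


print(is_protected("Zerodha <no-reply@mail.zerodha.com>"))  # True
print(is_protected("deals@shop.com"))                       # False
```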
# Test decision engine
python test_decision_engine.py
# Test individual components
python domain_checker.py
python decision_engine.py
python gmail_client.py

The test suite includes critical safety tests for:
- Investment platform false positives (USA, India, Germany)
- Government email false positives
- Banking email false positives
- Starred/important email protection
- All 5 gates functioning correctly
# Run classifier to generate validation data first
python email_classifier.py --max-emails 500
# Analyze confidence thresholds
python confidence_analyzer.py

- plots/precision_recall_curve.png - PR curve
- plots/f1_scores.png - F1 scores by threshold
- plots/deletion_rate.png - Deletion rate vs threshold
- plots/threshold_analysis.txt - Detailed report
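
To make the threshold analysis concrete, the sketch below computes precision, recall, and F1 at a given confidence threshold. It is not the code in `confidence_analyzer.py`: the function name, the `(confidence, is_truly_promotional)` sample format, and the toy data are all assumptions, and precision is defined as 1.0 when nothing is flagged.

```python
def threshold_metrics(samples, threshold):
    """samples: iterable of (confidence, is_truly_promotional) pairs.

    An email counts as flagged for deletion when confidence >= threshold.
    """
    samples = list(samples)
    flagged = [is_promo for conf, is_promo in samples if conf >= threshold]
    true_pos = sum(flagged)
    total_pos = sum(is_promo for _, is_promo in samples)
    precision = true_pos / len(flagged) if flagged else 1.0  # convention
    recall = true_pos / total_pos if total_pos else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy, hand-labelled data for illustration only.
toy = [(97, True), (93, True), (91, False), (85, True), (72, False)]
for t in (90, 95):
    p, r, f1 = threshold_metrics(toy, t)
    print(f"threshold={t}: precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```

On this toy data, raising the threshold from 90 to 95 lifts precision to 1.0 but drops recall from two thirds to one third, which is the trade-off the PR curve visualizes.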
email/
├── config.py # Regional config & protected domains
├── domain_checker.py # Domain protection & market ID
├── decision_engine.py # 5-gate safety system
├── gmail_client.py # Gmail API OAuth & email fetching
├── ai_classifier.py # Dual-agent AI classification
├── confidence_analyzer.py # Precision-recall optimization
├── email_classifier.py # Main CLI application
├── test_decision_engine.py # Comprehensive test suite
├── requirements.txt # Python dependencies
├── .env.example # Environment template
├── credentials.json # Gmail OAuth credentials (gitignored)
├── token.pickle # Gmail access token (gitignored)
└── classification_results/ # Output directory
- Real-time progress with color-coded status
- Category breakdown by market
- Decision engine statistics
- Gate failure analysis
Results saved to classification_results/classification_results_YYYYMMDD_HHMMSS.json:
{
"email_id": "...",
"from": "deals@shop.com",
"subject": "50% OFF Sale",
"category": "promotional",
"confidence": 95.0,
"language": "en",
"verified": true,
"decision": "approved",
"confidence_level": "high",
"final_reason": "All 5 safety gates passed",
"gates": [...]
}

- Run in dry-run mode first (default)
- Review flagged emails manually
- Check protected domain coverage for your region
- Start with small batches (--max-emails 50)
- Use high confidence threshold (≥90%)
- Week 1: Dry-run only, analyze results
- Week 2: Delete high confidence only (≥95%)
- Week 3: Lower to 90% if no false positives
- Week 4+: Enable medium confidence with human review
- Check classification_results/ regularly
- Monitor false positive rate (<5% target)
- Update protected domains as needed
- Review correction rate from verifier agent
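
For routine monitoring, a short script can tally decisions in the most recent results file. This sketch assumes each file holds a JSON list of records shaped like the example record in this README (a single object is also tolerated); `load_latest` and `summarize` are hypothetical helpers, not part of the project.

```python
import json
from collections import Counter
from pathlib import Path


def load_latest(results_dir="classification_results"):
    """Load records from the newest results file, or [] if none exist."""
    # Timestamped names (YYYYMMDD_HHMMSS) sort chronologically as strings.
    files = sorted(Path(results_dir).glob("classification_results_*.json"))
    if not files:
        return []
    with files[-1].open(encoding="utf-8") as handle:
        data = json.load(handle)
    return data if isinstance(data, list) else [data]


def summarize(records):
    """Count records by decision and by confidence level."""
    decisions = Counter(r.get("decision", "unknown") for r in records)
    levels = Counter(r.get("confidence_level", "unknown") for r in records)
    return decisions, levels


if __name__ == "__main__":
    decisions, levels = summarize(load_latest())
    print("decisions:", dict(decisions))
    print("confidence levels:", dict(levels))
```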
# Delete token and re-authenticate
rm token.pickle
python gmail_client.py

- Gemini: 60 requests/minute (default)
- 
- Claude: 50 requests/minute (tier-dependent)
- Gmail API: 250 queries/minute
Adjust in config.py if needed.
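
One simple way to stay under a requests-per-minute limit is to space calls evenly. The class below is a generic sketch of that idea, not the mechanism used in `config.py`; `MinIntervalLimiter` and its parameters are hypothetical, and the clock and sleep functions are injectable only to make it easy to test.

```python
import time


class MinIntervalLimiter:
    """Space calls so that at most `requests_per_minute` happen per minute."""

    def __init__(self, requests_per_minute, clock=time.monotonic, sleep=time.sleep):
        self.interval = 60.0 / requests_per_minute
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = 0.0

    def wait(self):
        """Block until the next call is allowed; return the seconds waited."""
        now = self._clock()
        delay = self._next_allowed - now
        if delay > 0:
            self._sleep(delay)
            now = self._clock()
        # Schedule the next slot one interval after this call.
        self._next_allowed = max(now, self._next_allowed) + self.interval
        return max(delay, 0.0)


limiter = MinIntervalLimiter(requests_per_minute=60)
limiter.wait()  # call before each API request
```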
Add to config.py in appropriate market/category:
PROTECTED_DOMAINS["india"]["investment_brokerage"].append("newbroker.com")

- Stop deletion immediately
- Review classification_results/ for pattern
- Add domain to protected list
- Increase confidence threshold
- Re-run in dry-run mode

- Edit config.py
- Add to appropriate market + category
- Run validation: python config.py
- Test: python domain_checker.py

- Edit prompts in ai_classifier.py
- Test with sample emails
- Verify no regression in safety tests
This is a safety-critical system. Use at your own risk. Always start with dry-run mode and thoroughly test with your specific email patterns before enabling deletion.
For issues, feature requests, or questions:
- Check existing classification results in classification_results/
- Run tests: python test_decision_engine.py
- Enable debug mode: --debug
- Review logs for specific error messages
- Web UI for human review workflow
- Adaptive learning from user feedback
- Multi-account support
- Email scheduling (run daily/weekly)
- Slack/email notifications for flagged emails
- A/B testing for prompt optimization
- Export to CSV/Excel for analysis
Remember: When in doubt, DO NOT DELETE. This system prioritizes safety over deletion efficiency.