Multi-modal Misinformation & Deepfake Detection Platform
A Next.js 14 + LangChain/LangGraph platform that analyzes text, images, audio, and video through specialized AI agents to detect both misinformation (false claims) and AI-generated content (deepfakes).
The exponential rise of AI-generated misinformation threatens the fabric of informed society. As generative AI becomes more sophisticated, distinguishing authentic content from synthetic or manipulated media has become nearly impossible for average users. This crisis manifests in:
- Deepfake videos of political figures making fabricated statements
- AI-generated images used to spread false narratives during elections and crises
- Voice cloning enabling fraud and impersonation at scale
- Synthetic news articles flooding social media feeds with fabricated claims
- Out-of-context media repurposed to mislead audiences
| Impact Area | Consequence |
|---|---|
| Democracy | Manipulated media undermines elections and public trust in institutions |
| Public Health | Medical misinformation leads to vaccine hesitancy and dangerous "cures" |
| Financial Markets | Fake news can manipulate stock prices and cause economic harm |
| Personal Safety | Deepfakes enable harassment, fraud, and non-consensual content |
| Journalism | Authentic reporting becomes indistinguishable from fabrication |
The key insight: content can be AI-generated yet factually accurate, and authentic media can be used to spread false claims. MisIntel therefore scores both dimensions independently:
| Detection Type | What It Detects | Output |
|---|---|---|
| Misinformation | False claims, out-of-context content, manipulated facts | isMisinfo, misinfoConfidence |
| Deepfake/AI | AI-generated images/audio/video, voice cloning, synthetic media | isAiGenerated, aiConfidence |
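A minimal TypeScript sketch of this dual-axis result shape (field names come from the table above; the `summarize` helper is illustrative, not part of the codebase):

```typescript
// Dual-axis result: misinformation and AI-generation are scored independently.
interface DetectionResult {
  isMisinfo: boolean;
  misinfoConfidence: number; // 0-100
  isAiGenerated: boolean;
  aiConfidence: number;      // 0-100
}

// All four combinations are valid, e.g. an AI-generated image with a true caption.
function summarize(r: DetectionResult): string {
  const synthetic = r.isAiGenerated ? "AI-generated" : "authentic media";
  const misinfo = r.isMisinfo ? "misinformation" : "factually plausible";
  return `${synthetic}, ${misinfo}`;
}
```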
┌─────────────────────────────────────────────────────────────────────┐
│ INPUT LAYER │
│ Text | URL | Image | Video | Audio │
└───────────────────────┬─────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────────┐
│ PREPROCESSING LAYER │
│ • Extract content (OCR, transcription, web scraping) │
│ • Extract metadata (EXIF, timestamps, source info) │
│ • Normalize format │
└───────────────────────┬─────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────────┐
│ VECTOR DATABASE CHECK (QDRANT) │
│ • Search similar content embeddings │
│ • Check fact-check cache │
│ • Check AI pattern cache │
│ • Check source credibility cache │
│ │
│ IF HIGH SIMILARITY (90%+) → Return cached result │
│ IF LOW SIMILARITY → Continue to agents │
└───────────────────────┬─────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────────┐
│ SPECIALIZED AGENT SYSTEM │
│ (All agents run in parallel) │
│ │
│ ┌─────────────────────────────────────────────────────────────┐ │
│ │ MISINFORMATION DETECTION AGENTS │ │
│ │ • Fact-Check Agent → Google Fact Check API │ │
│ │ • Source Credibility Agent → Domain/author analysis │ │
│ │ • Safety Agent → URL security check │ │
│ │ • Cross-Reference Agent → Multiple source comparison │ │
│ └─────────────────────────────────────────────────────────────┘ │
│ │
│ ┌─────────────────────────────────────────────────────────────┐ │
│ │ AI GENERATION DETECTION AGENTS │ │
│ │ • Image Deepfake Agent → HuggingFace ViT + Gemini Vision │ │
│ │ • Video Deepfake Agent → Gemini Files API + temporal check │ │
│ │ • Audio Deepfake Agent → Gemini audio forensics │ │
│ │ • Source Verification Agent → Reverse search │ │
│ └─────────────────────────────────────────────────────────────┘ │
│ │
│ Each agent returns: │
│ • Verdict (true/false/uncertain) │
│ • Confidence (0-100) — used as voting weight │
│ • Evidence list │
└───────────────────────┬─────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────────┐
│ VOTING SYSTEM │
│ │
│ • Collect all agent verdicts │
│ • Weighted voting (confidence = weight) │
│ • Calculate consensus score │
│ │
│ IF CONSENSUS ≥ 70% → Return result │
│ IF CONSENSUS < 70% → Trigger adversarial debate (TODO) │
└───────────────────────┬─────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────────┐
│ RESULT SYNTHESIS │
│ │
│ • Combine voting results + debate outcome │
│ • Calculate final scores: │
│ - Misinformation Score (0-100) │
│ - AI Generation Score (0-100) │
│ • Generate evidence report │
│ • Store in Vector DB for future cache │
└─────────────────────────────────────────────────────────────────────┘
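The voting step in the diagram above could be implemented roughly as follows. The 70% consensus threshold comes from the diagram; the `AgentVerdict` shape and tie handling are assumptions, not the actual implementation:

```typescript
// Weighted voting: each agent's confidence (0-100) acts as its vote weight.
interface AgentVerdict {
  verdict: "true" | "false" | "uncertain";
  confidence: number; // 0-100, used as voting weight
}

function consensus(verdicts: AgentVerdict[]): { verdict: string; score: number } {
  const totals: Record<string, number> = { true: 0, false: 0, uncertain: 0 };
  let weightSum = 0;
  for (const v of verdicts) {
    totals[v.verdict] += v.confidence;
    weightSum += v.confidence;
  }
  // Winner is the verdict holding the largest share of total confidence weight.
  const [winner, weight] = Object.entries(totals).sort((a, b) => b[1] - a[1])[0];
  const score = weightSum > 0 ? (weight * 100) / weightSum : 0;
  // Below 70% consensus the pipeline would escalate to adversarial debate;
  // here we simply report "uncertain" as a placeholder for that branch.
  return { verdict: score >= 70 ? winner : "uncertain", score };
}
```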
The system is built on LangGraph for stateful agent orchestration:
┌─────────────┐ ┌──────────────┐ ┌────────────┐ ┌────────┐ ┌──────────┐
│ Preprocess │───▶│ Vector DB │───▶│ Run Agents │───▶│ Voting │───▶│Synthesize│
│ (embedding) │ │ (90%+ cache) │ │ (parallel) │ │(<70%?) │ │ (store) │
└─────────────┘ └──────────────┘ └────────────┘ └────┬───┘ └──────────┘
│ │
▼ ▼
[Use Cached Result] [Adversarial Debate]
Qdrant is the backbone of our caching and similarity search system:
| Use Case | How Qdrant Helps |
|---|---|
| Avoid Re-analysis | Content that is ≥92% similar returns the cached verdict instantly |
| Claim Deduplication | Same misinformation claim → reuse fact-check result |
| Pattern Matching | Find similar deepfake patterns across analyzed media |
| Source Credibility | Cache domain/author reputation scores |
| Scalability | Sub-second search across millions of embeddings |
Collections Used:
```typescript
const COLLECTIONS = {
  FACT_CHECKS: 'fact_checks',               // Cached claim verdicts
  AI_PATTERNS: 'ai_patterns',               // Known deepfake signatures
  SOURCE_CREDIBILITY: 'source_credibility', // Domain reputation
  IMAGES: 'images',                         // Analyzed image results
  VIDEOS: 'videos',                         // Analyzed video results
  AUDIO: 'audio',                           // Analyzed audio results
  AGENT_RESULTS: 'agent_results',           // Full agent verdict cache
};
```

Similarity Threshold: 92% cosine similarity triggers a cache hit (tuned to balance speed against accuracy)
| Input Type | Preprocessing | Detection Focus |
|---|---|---|
| Text | Claim extraction, entity recognition | Misinformation only |
| URL | Web scraping, content extraction | Misinformation + source credibility |
| Image | OCR, EXIF extraction, base64 encoding | Deepfake + Misinformation (smart routing) |
| Audio | Waveform analysis, transcription | Deepfake (voice cloning) + Misinformation (claims) |
| Video | Frame extraction, audio separation | Deepfake + Misinformation (parallel analysis) |
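The "smart routing" for images mentioned in the table might look like the following sketch. The detection flags and the fallback behavior are assumptions for illustration:

```typescript
type ImageRoute = "deepfake" | "misinfo" | "both";

// Route image analysis based on what preprocessing found:
// faces suggest deepfake checks; OCR-extracted text suggests claim checks.
function routeImage(hasFaces: boolean, hasOcrText: boolean): ImageRoute {
  if (hasFaces && hasOcrText) return "both";
  if (hasFaces) return "deepfake";
  if (hasOcrText) return "misinfo";
  return "both"; // nothing detected → run both agent groups to be safe
}
```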
Text Embeddings (768 dimensions):

```typescript
// Using the HuggingFace Inference API with a sentence-transformers model
import { HfInference } from '@huggingface/inference';

const hfClient = new HfInference(process.env.HUGGINGFACE_API_KEY);
const EMBEDDING_MODEL = 'sentence-transformers/all-mpnet-base-v2';

async function generateEmbedding(text: string): Promise<number[]> {
  const truncatedText = text.slice(0, 2000); // model input limit ~2000 chars
  const result = await hfClient.featureExtraction({
    model: EMBEDDING_MODEL,
    inputs: truncatedText,
  });
  return result as number[]; // 768-dimensional vector
}
```

Image Embeddings:
- With description: Use semantic embedding of extracted text/scene description
- Without: SHA-256 hash-based embedding for exact match detection
- Future: CLIP embeddings for true visual similarity
Audio/Video Embeddings:
- Content hash (SHA-256) converted to 768-dim vector
- Cached with full analysis results for fast retrieval
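A minimal sketch of the hash-based pseudo-embedding described above, assuming the SHA-256 digest is cyclically expanded to 768 dimensions (the expansion scheme is illustrative; as noted, it supports exact-match lookup only, not semantic similarity):

```typescript
import { createHash } from "node:crypto";

// Deterministic pseudo-embedding: expand a 32-byte SHA-256 digest into a
// 768-dim vector by cycling over the digest bytes, normalized to [-1, 1].
function hashEmbedding(content: Buffer | string, dim = 768): number[] {
  const digest = createHash("sha256").update(content).digest(); // 32 bytes
  const vector: number[] = [];
  for (let i = 0; i < dim; i++) {
    vector.push((digest[i % digest.length] / 255) * 2 - 1);
  }
  return vector;
}
```

Identical content always maps to the identical vector, so a cosine search returns a perfect match; any other content lands far away, which is exactly the exact-match behavior the cache needs.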
```typescript
// Search with cosine similarity
const results = await qdrantClient.search(collectionName, {
  vector: embedding,
  limit: 5,
  score_threshold: 0.92, // 92% similarity threshold
});
// If match found → return cached verdict
// If no match → run full agent analysis → store result
```

┌──────────────────────────────────────────────────────────────────┐
│ RETRIEVAL FLOW │
├──────────────────────────────────────────────────────────────────┤
│ │
│ 1. EMBED INPUT │
│ └─→ Generate 768-dim vector from content │
│ │
│ 2. SEARCH QDRANT │
│ └─→ Query relevant collection(s) with embedding │
│ └─→ Return top-5 matches above 0.92 threshold │
│ │
│ 3. EVALUATE MATCHES │
│ ├─→ Score ≥ 0.92: Cache HIT → Return stored verdict │
│ └─→ Score < 0.92: Cache MISS → Continue to agents │
│ │
│ 4. STORE NEW RESULTS │
│ └─→ After agent analysis, upsert with payload: │
│ { verdict, confidence, evidence, timestamp, inputHash } │
│ │
└──────────────────────────────────────────────────────────────────┘
Storage Schema:
```typescript
// Qdrant point structure (schema sketch; number[768] denotes a 768-dim vector)
{
  id: number,                     // Hash-based or timestamp
  vector: number[768],            // Semantic embedding
  payload: {
    verdict: 'true' | 'false' | 'uncertain',
    confidence: number,           // 0-100
    isAiGenerated: boolean,
    aiConfidence: number,
    isMisinfo: boolean,
    misinfoConfidence: number,
    evidence: string[],
    sources: string[],
    inputHash: string,            // For exact match
    timestamp: string,
    agentResults: AgentVerdict[], // Full breakdown
  }
}
```

Update Strategy:
- Upsert: Same content hash → updates existing record
- TTL: No automatic expiration (fact-checks remain valid)
- Manual refresh: Re-analysis overrides cached result
Reuse Patterns:
| Scenario | Action |
|---|---|
| Identical content | Instant return (100% match) |
| Near-duplicate (>92%) | Return cached with "similar content" flag |
| Rephrased claim | Semantic match finds original verdict |
| New content | Full analysis → store for future |
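The routing in the table above condenses into a small decision function. The 0.92 threshold comes from this document; the action names and the exact-match cutoff are illustrative:

```typescript
type CacheAction = "instant_return" | "similar_return" | "full_analysis";

// Route an incoming item based on the best Qdrant match score (cosine, 0-1).
function routeByScore(bestScore: number | null): CacheAction {
  if (bestScore === null) return "full_analysis";   // no prior content at all
  if (bestScore >= 0.9999) return "instant_return"; // identical content
  if (bestScore >= 0.92) return "similar_return";   // near-duplicate, flagged "similar content"
  return "full_analysis";                           // new content → analyze & store
}
```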
| Metric | Value |
|---|---|
| Cache hit latency | <100ms |
| Full analysis latency | 3-8 seconds |
| Cache hit rate (production) | ~40% for viral content |
| Storage efficiency | ~1KB per analyzed item |
| Limitation | Description | Mitigation |
|---|---|---|
| Novel deepfake techniques | New AI models may evade detection | Regular model updates, ensemble approach |
| Low-quality input | Heavily compressed media reduces accuracy | Warn users when quality is too low |
| Adversarial attacks | Crafted inputs designed to fool detectors | Multiple independent tools, voting system |
| Language coverage | Best accuracy for English content | Translation layer for other languages |
| Satire/Parody | May flag intentional fiction as "misinfo" | Context signals, disclaimer detection |
| Breaking news | No fact-checks available yet | Low confidence score, "unverified" verdict |
| Partial deepfakes | Only face swapped in otherwise real video | Per-element analysis (face, voice, background) |
Bias Risks:
| Bias Type | Concern | Mitigation |
|---|---|---|
| Training data bias | HuggingFace models trained on Western media | Diverse model ensemble, confidence thresholds |
| Political bias | Fact-check sources may have editorial slant | Multiple cross-referenced sources |
| False positive harm | Incorrectly flagging authentic content | High threshold (>70% confidence), uncertainty option |
| Automation bias | Users may over-trust AI verdicts | Always show evidence, encourage critical thinking |
Privacy Considerations:
- ✅ No user data stored beyond analysis cache
- ✅ Content embeddings are non-reversible (can't reconstruct original)
- ✅ No PII extraction or storage
- ⚠️ Uploaded media is processed only temporarily (deleted after analysis)
- ⚠️ Qdrant caches analyzed content (configurable retention)
Safety Measures:
| Measure | Implementation |
|---|---|
| Rate limiting | Prevent abuse of API endpoints |
| Safe browsing check | Flag malicious URLs before analysis |
| Content moderation | Gemini refuses to analyze illegal content |
| Transparency | Full evidence trail for every verdict |
| Appeal mechanism | Users can request manual review (future) |
Ethical Principles:
- Transparency over opacity — Show why a verdict was reached
- Uncertainty is valid — "Uncertain" is better than a wrong answer
- Human in the loop — AI assists, humans decide
- No censorship — Detect and inform, don't block or remove
- Open methodology — Document how detection works
```bash
# Install dependencies
npm install

# Start Qdrant (required for caching)
docker run -p 6333:6333 qdrant/qdrant

# Set environment variables
cp .env.example .env.local
# Add: FACT_GEMINI_API_KEY, HUGGINGFACE_API_KEY, etc.

# Run development server
npm run dev
```

Text → Preprocessing → Vector DB (check similar claims)
→ [Fact-Check Agent] → Voting → Result
Image → Classify (faces? text?) → Route to focus
→ [Deepfake Tools | Misinfo Tools | Both] → Voting → Result
Video → Upload to Gemini Files API → Parallel analysis
→ [Deepfake Analysis + Content Analysis] → Combine → Result
Audio → Gemini audio analysis (parallel)
→ [Deepfake Detection + Transcript Fact-Check] → Combine → Result
| Component | Technology |
|---|---|
| Framework | Next.js 14 (App Router) |
| AI Orchestration | LangChain + LangGraph |
| Vector Database | Qdrant |
| LLM | Google Gemini 2.5 Flash |
| ML Models | HuggingFace Inference API |
| Embeddings | sentence-transformers/all-mpnet-base-v2 |
| Deployment | Docker + Cloud Run |
MIT