Uncovering stories hidden in court records through automated intelligence.
This platform empowers investigative journalists to analyze thousands of legal documents instantly. By combining automated PACER scraping with state-of-the-art Large Language Models (LLMs), it extracts key entities, analyzes sentiment, and visualizes hidden connections between cases—turning raw court filings into actionable leads.
Your command center for legal investigation. View real-time stats on cases, processed documents, and discovered entities.
Seamlessly search and import federal court cases directly from the PACER system. No manual downloading required.
Instant summaries, key point extraction, and sentiment analysis for complex legal filings.
Visualize relationships between people, organizations, and corporations across different cases.
- PACER Scraping: Secure, automated browser agents (Selenium) log in to federal court systems to search and retrieve case dockets.
- Document Pipeline: Automatically downloads, parses (OCR/Text Extraction), and indexes PDF filings.
- Multi-Model Support: Plug-and-play support for Ollama (Local), Google Gemini, Groq, or HuggingFace.
- Sentiment Analysis: Detects aggressive legal posturing, judicial frustration, or whistleblower urgency.
- Entity Extraction (NER): Identifies and categorizes Attorneys, Judges, Companies, and Monetary Amounts.
- Trend Analysis: Track the sentiment of a case over time—spot when a legal battle turns ugly.
- Cross-Case Linking: Find that one obscure shell company mentioned in five different lawsuits.
- Journalist-Ready Reports: Export clean, cited summaries ready for editorial review.
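As a flavor of how the sentiment feature could be wired, here is a minimal sketch of prompt construction and response parsing (the prompt wording, JSON schema, and helper names are illustrative assumptions, not the platform's actual code):

```python
import json

# Hypothetical prompt template -- the platform's real prompts may differ.
SENTIMENT_PROMPT = """You are analyzing a legal filing for a journalist.
Return JSON with keys: "sentiment" (hostile, neutral, or conciliatory),
"urgency" (low, medium, or high), and "rationale" (one sentence).

Filing excerpt:
{excerpt}
"""

def build_sentiment_prompt(excerpt: str) -> str:
    """Fill the prompt template with a document excerpt."""
    return SENTIMENT_PROMPT.format(excerpt=excerpt)

def parse_sentiment(raw: str) -> dict:
    """Parse the model's JSON reply, tolerating a markdown code fence."""
    cleaned = raw.strip().removeprefix("```json").removesuffix("```").strip()
    return json.loads(cleaned)
```

Asking for strict JSON keeps the output machine-checkable, which matters when results feed the trend and cross-case views.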
The system is built as a modern full-stack application with a modular AI service layer.
graph TD
User[Journalist] -->|Web UI| Frontend[React + Vite]
Frontend -->|REST API| Backend[Flask API]
subgraph Backend Services
Backend -->|Scraping| PACER["PACER Service (Selenium)"]
Backend -->|Inference| LLM[LLM Service Layer]
Backend -->|Storage| DB[(SQLAlchemy / SQLite)]
end
subgraph AI Providers
LLM -->|Local| Ollama["Ollama (Llama 3)"]
LLM -->|Cloud| Gemini[Google Gemini Pro]
LLM -->|Fast| Groq[Groq API]
end
PACER -->|Downloads| Docs[Document Store]
Backend -->|OCR/Parse| Docs
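The Document Pipeline box in the diagram can be sketched as a small orchestration function (the `Document` dataclass and step names are assumptions for illustration; the real pipeline lives in the backend services):

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Document:
    """Minimal stand-in for a stored filing; the real model is an assumption."""
    case_id: str
    path: str
    text: str = ""
    indexed: bool = False

def run_pipeline(doc: Document,
                 extract: Callable[[str], str],
                 index: Callable[[Document], None]) -> Document:
    """Parse (OCR/text extraction) then index, matching the diagram's flow.

    The step functions are injected so the OCR engine and storage backend
    stay pluggable, mirroring the modular service layer.
    """
    doc.text = extract(doc.path)  # e.g. pdfminer text extraction or Tesseract OCR
    index(doc)                    # e.g. persist via SQLAlchemy to SQLite
    doc.indexed = True
    return doc
```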
- Frontend: React 18, Recharts (Visualization), Lucide (UI Icons), CSS Variables (Theming).
- Backend: Python Flask, SQLAlchemy (ORM), Selenium (Web Automation).
- AI/ML: LangChain-style prompting, integration with local and cloud LLM providers.
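The plug-and-play multi-model support could be modeled as a simple provider registry; a sketch under stated assumptions (the registry shape and function names are invented here, and the provider bodies are stubs where real API calls would go):

```python
from typing import Callable, Dict

# Hypothetical provider registry -- the real LLM service layer may differ.
PROVIDERS: Dict[str, Callable[[str], str]] = {}

def register(name: str):
    """Decorator that records a provider's completion function."""
    def wrap(fn: Callable[[str], str]) -> Callable[[str], str]:
        PROVIDERS[name] = fn
        return fn
    return wrap

@register("ollama")
def ollama_complete(prompt: str) -> str:
    # A real deployment would POST to the local Ollama HTTP API here.
    return f"[ollama] {prompt[:40]}"

@register("gemini")
def gemini_complete(prompt: str) -> str:
    # A real deployment would call the Google Gemini SDK here.
    return f"[gemini] {prompt[:40]}"

def complete(prompt: str, provider: str = "ollama") -> str:
    """Route a prompt to the provider selected by configuration."""
    if provider not in PROVIDERS:
        raise ValueError(f"Unknown LLM provider: {provider}")
    return PROVIDERS[provider](prompt)
```

A registry like this lets new providers (Groq, HuggingFace) be added without touching the routing logic.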
- Python 3.10+
- Node.js 18+
- (Optional) Chrome (for PACER scraping)
- (Optional) Ollama (for local AI)
git clone https://github.com/yourusername/journalist-sentiment-platform.git
cd journalist-sentiment-platform
# Backend Setup
cd backend
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt
# Frontend Setup
cd ../frontend
npm install

Create a .env file in the backend/ directory:
# PACER Credentials (Optional)
PACER_USERNAME=your_username
PACER_PASSWORD=your_password
# Choose your AI Provider
LLM_PROVIDER=gemini
GOOGLE_API_KEY=your_key_here

Start the backend server:
# In terminal 1
cd backend
python app.py

Start the frontend interface:
# In terminal 2
cd frontend
npm run dev

Visit http://localhost:3000 to begin your investigation.
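At startup the backend would typically read the .env values shown above; a minimal sketch using os.getenv (the variable names match the sample .env, but the default provider and error handling are assumptions):

```python
import os

def load_llm_config() -> dict:
    """Read LLM settings from the environment; default to local Ollama."""
    provider = os.getenv("LLM_PROVIDER", "ollama")
    cfg = {"provider": provider}
    if provider == "gemini":
        key = os.getenv("GOOGLE_API_KEY")
        if not key:
            raise RuntimeError("LLM_PROVIDER=gemini requires GOOGLE_API_KEY")
        cfg["api_key"] = key
    return cfg
```

Failing fast on a missing API key surfaces configuration mistakes before the first analysis request, rather than mid-investigation.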
- Search & Import: Use the "Import from PACER" button to find a case by party name (e.g., "Enron").
- Analyze: Click on a downloaded document and select "Analyze with AI".
- Review: Read the generated Executive Summary and Key Points.
- Explore: Switch to the "Entities" tab to see who is involved and how they connect to other cases.
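Under the hood, each workflow step maps to a REST call against the Flask API; a sketch of how a client might build the PACER import request (the port, endpoint path, and payload shape are hypothetical, so check the actual Flask routes in backend/ before relying on them):

```python
# Hypothetical endpoint and payload -- verify against the real Flask routes.
BASE_URL = "http://localhost:5000/api"

def build_import_request(party_name: str) -> tuple[str, dict]:
    """Build the URL and JSON body for a PACER import-by-party call."""
    return f"{BASE_URL}/pacer/import", {"party": party_name}
```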
Built for investigative journalists seeking to uncover truth in legal records.
Powered by open-source AI and the legal transparency movement.
Helping journalists find the stories that matter.




