aaronmunford/CourtScraper

⚖️ Legal AI Platform for Investigative Journalism

License Python React AI Powered

Uncovering stories hidden in court records through automated intelligence.

This platform empowers investigative journalists to analyze thousands of legal documents instantly. By combining automated PACER scraping with state-of-the-art Large Language Models (LLMs), it extracts key entities, analyzes sentiment, and visualizes hidden connections between cases—turning raw court filings into actionable leads.


📸 Application Overview

1. The Dashboard

Your command center for legal investigation. View real-time stats on cases, processed documents, and discovered entities.

Dashboard

2. Automated PACER Integration

Seamlessly search and import federal court cases directly from the PACER system. No manual downloading required.

PACER Import

3. AI Analysis & Insights

Instant summaries, key point extraction, and sentiment analysis for complex legal filings.

AI Analysis 1 AI Analysis 2

4. Entity Network Graph

Visualize relationships between people, organizations, and corporations across different cases.

Entity Network


✨ Key Features

🔍 Automated Source Discovery

  • PACER Scraping: Secure, automated browser agents (Selenium) log in to federal court systems to search and retrieve case dockets.
  • Document Pipeline: Automatically downloads, parses (OCR/Text Extraction), and indexes PDF filings.
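The indexing step of that pipeline can be sketched with a simple inverted index over already-extracted document text. This is an illustrative stdlib-only sketch (the document ids and text are hypothetical, and real extraction would come from the OCR/text-extraction stage):

```python
import re
from collections import defaultdict

def index_documents(docs):
    """Build an inverted index mapping each term to the docket entries containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in re.findall(r"[a-z0-9]+", text.lower()):
            index[term].add(doc_id)
    return index

# Hypothetical already-extracted text, keyed by docket entry id.
docs = {
    "case-101-dkt-5": "Motion to dismiss filed by Acme Holdings LLC.",
    "case-207-dkt-2": "Complaint names Acme Holdings LLC as defendant.",
}
index = index_documents(docs)
print(sorted(index["acme"]))  # ['case-101-dkt-5', 'case-207-dkt-2']
```

An index like this is what lets later features (entity search, cross-case linking) answer "which filings mention X?" without re-reading every PDF.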

🧠 Advanced AI Intelligence

  • Multi-Model Support: Plug-and-play support for Ollama (Local), Google Gemini, Groq, or HuggingFace.
  • Sentiment Analysis: Detects aggressive legal posturing, judicial frustration, or whistleblower urgency.
  • Entity Extraction (NER): Identifies and categorizes Attorneys, Judges, Companies, and Monetary Amounts.
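"Plug-and-play" provider support typically means all backends share one interface behind a registry. A minimal sketch of that pattern (class and function names here are assumptions, not the project's actual API; the stub provider stands in for real Ollama/Gemini/Groq clients so the example runs offline):

```python
from abc import ABC, abstractmethod

class LLMProvider(ABC):
    """Minimal provider interface; every backend implements the same methods."""

    @abstractmethod
    def summarize(self, text: str) -> str:
        ...

class StubProvider(LLMProvider):
    """Offline stand-in; a real OllamaProvider, GeminiProvider, or
    GroqProvider would call its API inside summarize()."""

    def summarize(self, text: str) -> str:
        return text.split(".")[0] + "."

PROVIDERS = {"stub": StubProvider}

def get_provider(name: str) -> LLMProvider:
    # Swapping providers becomes a one-line config change, not a code change.
    return PROVIDERS[name]()

summary = get_provider("stub").summarize("The motion was denied. Costs were awarded.")
print(summary)  # The motion was denied.
```

Registering a new backend is then just another entry in `PROVIDERS`, which is what makes an `LLM_PROVIDER` config switch (see Configuration below) possible.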

📊 Investigative Tools

  • Trend Analysis: Track the sentiment of a case over time—spot when a legal battle turns ugly.
  • Cross-Case Linking: Find that one obscure shell company mentioned in five different lawsuits.
  • Journalist-Ready Reports: Export clean, cited summaries ready for editorial review.
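Cross-case linking reduces to grouping extracted entities by the cases that mention them. A small sketch under the assumption that NER output is a per-case list of entity strings (the case numbers and names below are invented for illustration):

```python
from collections import defaultdict

def cross_case_entities(case_entities, min_cases=2):
    """Return entities mentioned in at least `min_cases` distinct cases."""
    seen = defaultdict(set)
    for case_id, entities in case_entities.items():
        for entity in entities:
            seen[entity].add(case_id)
    return {e: sorted(cases) for e, cases in seen.items() if len(cases) >= min_cases}

# Hypothetical NER output per case.
extracted = {
    "1:21-cv-0042": ["Acme Holdings LLC", "J. Doe"],
    "2:22-cv-0107": ["Acme Holdings LLC", "R. Roe"],
    "3:23-cv-0311": ["Other Corp"],
}
links = cross_case_entities(extracted)
print(links)  # {'Acme Holdings LLC': ['1:21-cv-0042', '2:22-cv-0107']}
```

The surviving entries are exactly the "obscure shell company in five lawsuits" leads, and they double as edges for the entity network graph.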

🛠️ Technical Architecture

The system is built as a modern full-stack application with a modular AI service layer.

graph TD
    User[Journalist] -->|Web UI| Frontend[React + Vite]
    Frontend -->|REST API| Backend[Flask API]
    
    subgraph Backend Services
        Backend -->|Scraping| PACER["PACER Service (Selenium)"]
        Backend -->|Inference| LLM[LLM Service Layer]
        Backend -->|Storage| DB[(SQLAlchemy / SQLite)]
    end
    
    subgraph AI Providers
        LLM -->|Local| Ollama["Ollama (Llama 3)"]
        LLM -->|Cloud| Gemini[Google Gemini Pro]
        LLM -->|Fast| Groq[Groq API]
    end
    
    PACER -->|Downloads| Docs[Document Store]
    Backend -->|OCR/Parse| Docs

Stack Details

  • Frontend: React 18, Recharts (Visualization), Lucide (UI Icons), CSS Variables (Theming).
  • Backend: Python Flask, SQLAlchemy (ORM), Selenium (Web Automation).
  • AI/ML: LangChain-style prompting, integration with local and cloud LLM providers.

🚀 Getting Started

Prerequisites

  • Python 3.10+
  • Node.js 18+
  • (Optional) Chrome (for PACER scraping)
  • (Optional) Ollama (for local AI)

1. Clone & Install

git clone https://github.com/yourusername/journalist-sentiment-platform.git
cd journalist-sentiment-platform

# Backend Setup
cd backend
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt

# Frontend Setup
cd ../frontend
npm install

2. Configuration

Create a .env file in the backend/ directory:

# PACER Credentials (Optional)
PACER_USERNAME=your_username
PACER_PASSWORD=your_password

# Choose your AI Provider
LLM_PROVIDER=gemini
GOOGLE_API_KEY=your_key_here
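A minimal sketch of how the backend might resolve these settings at startup. This is an assumption about the config code, not its actual implementation; `GROQ_API_KEY` in particular is a hypothetical variable name, since only `GOOGLE_API_KEY` appears above:

```python
import os

def load_llm_config():
    """Resolve the active provider and its API key from the environment.
    Assumes the .env file has already been loaded (e.g. via python-dotenv)."""
    provider = os.environ.get("LLM_PROVIDER", "ollama")  # local-first default
    api_keys = {
        "gemini": os.environ.get("GOOGLE_API_KEY"),
        "groq": os.environ.get("GROQ_API_KEY"),  # hypothetical variable name
    }
    return provider, api_keys.get(provider)  # Ollama needs no key -> None
```

Keeping keys in the environment rather than in code means PACER credentials and API keys never end up in version control.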

3. Run the Application

Start the backend server:

# In terminal 1
cd backend
python app.py

Start the frontend interface:

# In terminal 2
cd frontend
npm run dev

Visit http://localhost:3000 to begin your investigation.


💡 Usage Workflow

  1. Search & Import: Use the "Import from PACER" button to find a case by party name (e.g., "Enron").
  2. Analyze: Click on a downloaded document and select "Analyze with AI".
  3. Review: Read the generated Executive Summary and Key Points.
  4. Explore: Switch to the "Entities" tab to see who is involved and how they connect to other cases.

Built for investigative journalists seeking to uncover the truth in legal records. Powered by open-source AI and the legal transparency movement.


Helping journalists find the stories that matter.
