aaronmunford/CourtScraper

⚖️ Legal AI Platform for Investigative Journalism

License Python React AI Powered

Uncovering stories hidden in court records through automated intelligence.

This platform empowers investigative journalists to analyze thousands of legal documents instantly. By combining automated PACER scraping with state-of-the-art Large Language Models (LLMs), it extracts key entities, analyzes sentiment, and visualizes hidden connections between cases—turning raw court filings into actionable leads.


📸 Application Overview

1. The Dashboard

Your command center for legal investigation. View real-time stats on cases, processed documents, and discovered entities.

Dashboard

2. Automated PACER Integration

Seamlessly search and import federal court cases directly from the PACER system. No manual downloading required.

PACER Import

3. AI Analysis & Insights

Instant summaries, key point extraction, and sentiment analysis for complex legal filings.

AI Analysis 1 AI Analysis 2

4. Entity Network Graph

Visualize relationships between people, organizations, and corporations across different cases.

Entity Network


✨ Key Features

🔍 Automated Source Discovery

  • PACER Scraping: Secure, automated browser agents (Selenium) log in to federal court systems to search and retrieve case dockets.
  • Document Pipeline: Automatically downloads, parses (OCR/Text Extraction), and indexes PDF filings.
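The indexing step of that pipeline can be sketched with a simple inverted index over already-extracted document text. This is an illustrative stdlib-only sketch (the document ids and text are hypothetical, and real extraction would come from the OCR/text-extraction stage):

```python
import re
from collections import defaultdict

def index_documents(docs):
    """Build an inverted index mapping each term to the docket entries containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in re.findall(r"[a-z0-9]+", text.lower()):
            index[term].add(doc_id)
    return index

# Hypothetical already-extracted text, keyed by docket entry id.
docs = {
    "case-101-dkt-5": "Motion to dismiss filed by Acme Holdings LLC.",
    "case-207-dkt-2": "Complaint names Acme Holdings LLC as defendant.",
}
index = index_documents(docs)
print(sorted(index["acme"]))  # ['case-101-dkt-5', 'case-207-dkt-2']
```

An index like this is what lets later features (entity search, cross-case linking) answer "which filings mention X?" without re-reading every PDF.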

🧠 Advanced AI Intelligence

  • Multi-Model Support: Plug-and-play support for Ollama (Local), Google Gemini, Groq, or HuggingFace.
  • Sentiment Analysis: Detects aggressive legal posturing, judicial frustration, or whistleblower urgency.
  • Entity Extraction (NER): Identifies and categorizes Attorneys, Judges, Companies, and Monetary Amounts.
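"Plug-and-play" provider support typically means all backends share one interface behind a registry. A minimal sketch of that pattern (class and function names here are assumptions, not the project's actual API; the stub provider stands in for real Ollama/Gemini/Groq clients so the example runs offline):

```python
from abc import ABC, abstractmethod

class LLMProvider(ABC):
    """Minimal provider interface; every backend implements the same methods."""

    @abstractmethod
    def summarize(self, text: str) -> str:
        ...

class StubProvider(LLMProvider):
    """Offline stand-in; a real OllamaProvider, GeminiProvider, or
    GroqProvider would call its API inside summarize()."""

    def summarize(self, text: str) -> str:
        return text.split(".")[0] + "."

PROVIDERS = {"stub": StubProvider}

def get_provider(name: str) -> LLMProvider:
    # Swapping providers becomes a one-line config change, not a code change.
    return PROVIDERS[name]()

summary = get_provider("stub").summarize("The motion was denied. Costs were awarded.")
print(summary)  # The motion was denied.
```

Registering a new backend is then just another entry in `PROVIDERS`, which is what makes an `LLM_PROVIDER` config switch (see Configuration below) possible.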

📊 Investigative Tools

  • Trend Analysis: Track the sentiment of a case over time—spot when a legal battle turns ugly.
  • Cross-Case Linking: Find that one obscure shell company mentioned in five different lawsuits.
  • Journalist-Ready Reports: Export clean, cited summaries ready for editorial review.
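Cross-case linking reduces to grouping extracted entities by the cases that mention them. A small sketch under the assumption that NER output is a per-case list of entity strings (the case numbers and names below are invented for illustration):

```python
from collections import defaultdict

def cross_case_entities(case_entities, min_cases=2):
    """Return entities mentioned in at least `min_cases` distinct cases."""
    seen = defaultdict(set)
    for case_id, entities in case_entities.items():
        for entity in entities:
            seen[entity].add(case_id)
    return {e: sorted(cases) for e, cases in seen.items() if len(cases) >= min_cases}

# Hypothetical NER output per case.
extracted = {
    "1:21-cv-0042": ["Acme Holdings LLC", "J. Doe"],
    "2:22-cv-0107": ["Acme Holdings LLC", "R. Roe"],
    "3:23-cv-0311": ["Other Corp"],
}
links = cross_case_entities(extracted)
print(links)  # {'Acme Holdings LLC': ['1:21-cv-0042', '2:22-cv-0107']}
```

The surviving entries are exactly the "obscure shell company in five lawsuits" leads, and they double as edges for the entity network graph.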

🛠️ Technical Architecture

The system is built as a modern full-stack application with a modular AI service layer.

graph TD
    User[Journalist] -->|Web UI| Frontend[React + Vite]
    Frontend -->|REST API| Backend[Flask API]
    
    subgraph Backend Services
        Backend -->|Scraping| PACER["PACER Service (Selenium)"]
        Backend -->|Inference| LLM[LLM Service Layer]
        Backend -->|Storage| DB[(SQLAlchemy / SQLite)]
    end
    
    subgraph AI Providers
        LLM -->|Local| Ollama["Ollama (Llama 3)"]
        LLM -->|Cloud| Gemini[Google Gemini Pro]
        LLM -->|Fast| Groq[Groq API]
    end
    
    PACER -->|Downloads| Docs[Document Store]
    Backend -->|OCR/Parse| Docs

Stack Details

  • Frontend: React 18, Recharts (Visualization), Lucide (UI Icons), CSS Variables (Theming).
  • Backend: Python Flask, SQLAlchemy (ORM), Selenium (Web Automation).
  • AI/ML: LangChain-style prompting, integration with local and cloud LLM providers.

🚀 Getting Started

Prerequisites

  • Python 3.10+
  • Node.js 18+
  • (Optional) Chrome (for PACER scraping)
  • (Optional) Ollama (for local AI)

1. Clone & Install

git clone https://github.com/yourusername/journalist-sentiment-platform.git
cd journalist-sentiment-platform

# Backend Setup
cd backend
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt

# Frontend Setup
cd ../frontend
npm install

2. Configuration

Create a .env file in the backend/ directory:

# PACER Credentials (Optional)
PACER_USERNAME=your_username
PACER_PASSWORD=your_password

# Choose your AI Provider
LLM_PROVIDER=gemini
GOOGLE_API_KEY=your_key_here
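A minimal sketch of how the backend might resolve these settings at startup. This is an assumption about the config code, not its actual implementation; `GROQ_API_KEY` in particular is a hypothetical variable name, since only `GOOGLE_API_KEY` appears above:

```python
import os

def load_llm_config():
    """Resolve the active provider and its API key from the environment.
    Assumes the .env file has already been loaded (e.g. via python-dotenv)."""
    provider = os.environ.get("LLM_PROVIDER", "ollama")  # local-first default
    api_keys = {
        "gemini": os.environ.get("GOOGLE_API_KEY"),
        "groq": os.environ.get("GROQ_API_KEY"),  # hypothetical variable name
    }
    return provider, api_keys.get(provider)  # Ollama needs no key -> None
```

Keeping keys in the environment rather than in code means PACER credentials and API keys never end up in version control.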

3. Run the Application

Start the backend server:

# In terminal 1
cd backend
python app.py

Start the frontend interface:

# In terminal 2
cd frontend
npm run dev

Visit http://localhost:3000 to begin your investigation.


💡 Usage Workflow

  1. Search & Import: Use the "Import from PACER" button to find a case by party name (e.g., "Enron").
  2. Analyze: Click on a downloaded document and select "Analyze with AI".
  3. Review: Read the generated Executive Summary and Key Points.
  4. Explore: Switch to the "Entities" tab to see who is involved and how they connect to other cases.

Built for investigative journalists seeking to uncover the truth in legal records. Powered by open-source AI and the legal transparency movement.


Helping journalists find the stories that matter.
