Reference document for Wilson deployment requirements. Updated: April 2, 2026
Base URL: https://www.courtlistener.com/api/rest/v4/
Auth: Token-based (free registration at courtlistener.com)
Docs: https://www.courtlistener.com/api/rest/docs/
| Endpoint | Method | Purpose | Status |
|---|---|---|---|
| /citation-lookup/ | POST | Verify citation existence, get cluster ID and case name | ✓ Working |
| /opinions/ | GET | Fetch full opinion text (html_with_citations field) | ✓ Working |
| /clusters/ | GET | Case metadata, judges, dates, citations | ✓ Working |
| /search/ | GET | Full-text search across opinions and dockets | ✓ Working |
| /courts/ | GET | Court metadata | ✓ Working |
Endpoints that require PACER access:
| Endpoint | Method | Purpose | Status |
|---|---|---|---|
| /docket-entries/ | GET | List documents filed in a case | ✗ PACER required |
| /recap-documents/ | GET | Access RECAP-collected PACER documents | ✗ PACER required |
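As a minimal sketch of how a call to /citation-lookup/ might be assembled, the snippet below builds (but does not send) a token-authenticated POST request using only the standard library. The form field name `text` and the helper name `build_citation_lookup_request` are assumptions made for illustration; check the API docs linked above before relying on them.

```python
import urllib.parse
import urllib.request

BASE_URL = "https://www.courtlistener.com/api/rest/v4/"


def build_citation_lookup_request(text: str, token: str) -> urllib.request.Request:
    """Build (but do not send) a POST request to /citation-lookup/."""
    # Assumption: the endpoint reads the text to scan from a form field named "text".
    body = urllib.parse.urlencode({"text": text}).encode("utf-8")
    return urllib.request.Request(
        url=urllib.parse.urljoin(BASE_URL, "citation-lookup/"),
        data=body,
        method="POST",
        # Token-based auth, as described in the Auth line above.
        headers={"Authorization": f"Token {token}"},
    )


req = build_citation_lookup_request("410 U.S. 113", token="YOUR_TOKEN_HERE")
print(req.get_method(), req.full_url)
```

Passing `req` to `urllib.request.urlopen` would perform the call; the response is expected to be JSON, which `json.load` can decode.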
- Citation lookup returns cluster ID, case name, docket ID, sub-opinion URLs
- Opinion text via html_with_citations includes hyperlinked citations with data-id attributes
- Full opinion text is available for most federal cases; text quality varies by source
- Rate limiting applies; Wilson waits 0.5 s between API calls
- Bulk data download available separately (see below)
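Two of the notes above lend themselves to a short sketch: spacing API calls 0.5 s apart, and pulling data-id values out of html_with_citations. The class names `Throttle` and `CitationLinkParser` are hypothetical, and the sample HTML fragment (including the ID 12345) is made up; real markup may differ.

```python
import time
from html.parser import HTMLParser


class Throttle:
    """Enforce a minimum delay between consecutive calls to wait()."""

    def __init__(self, min_interval=0.5, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous call; None before the first

    def wait(self) -> float:
        """Sleep if the previous call was too recent; return seconds slept."""
        now = self._clock()
        pause = 0.0
        if self._last is not None:
            pause = max(0.0, self.min_interval - (now - self._last))
            if pause:
                self._sleep(pause)
        self._last = now + pause
        return pause


class CitationLinkParser(HTMLParser):
    """Collect the data-id attribute of every element that carries one."""

    def __init__(self):
        super().__init__()
        self.data_ids = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "data-id" and value:
                self.data_ids.append(value)


# Hypothetical fragment for illustration only.
sample = '<p>See <span class="citation" data-id="12345"><a href="#">410 U.S. 113</a></span>.</p>'
parser = CitationLinkParser()
parser.feed(sample)
print(parser.data_ids)
```

Calling `Throttle().wait()` immediately before each request keeps consecutive calls at least 0.5 s apart without sleeping longer than necessary.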
URL: https://com-courtlistener-storage.s3-us-west-2.amazonaws.com/bulk-data/
License: CC BY-ND (attribution required, no derivative databases)
Auth: None required (public S3 bucket)
| File | Size | Records | Purpose |
|---|---|---|---|
| citations-2026-03-31.csv | 1.9GB | 18,116,834 | Offline citation existence verification |
| courts-2026-03-31.csv | 748KB | — | Court metadata |
| load-bulk-data-2026-03-31.sh | 18KB | — | Load script |
Columns in the citations CSV: id, volume, reporter, page, type, cluster_id, date_created, date_modified
The cluster_id column links each citation to full case data via the API. It is the bridge between offline existence checking and online full-text retrieval.
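The column layout above suggests a simple lookup structure keyed on volume, reporter, and page. The following is a sketch only: it assumes the CSV has a header row with exactly those column names, uses a tiny in-memory stand-in with made-up rows, and the function name `build_index` is hypothetical.

```python
import csv
import io

# Tiny stand-in for the bulk citations CSV; these rows and IDs are invented.
SAMPLE_CSV = """id,volume,reporter,page,type,cluster_id,date_created,date_modified
1,410,U.S.,113,1,1001,2020-01-01,2020-01-01
2,347,U.S.,483,1,1002,2020-01-01,2020-01-01
"""


def build_index(csv_file) -> dict:
    """Map (volume, reporter, page) to cluster_id for every row."""
    index = {}
    for row in csv.DictReader(csv_file):
        key = (row["volume"].strip(), row["reporter"].strip(), row["page"].strip())
        index[key] = int(row["cluster_id"])
    return index


index = build_index(io.StringIO(SAMPLE_CSV))
print(index.get(("410", "U.S.", "113")))   # cluster_id to pass to the API
print(index.get(("999", "F.4th", "1")))    # None: no such citation in the sample
```

For the full 18M-row file, an in-memory dict may be too large on modest hardware; loading the same columns into an indexed SQLite table is one alternative.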
Wilson can verify citation existence entirely offline against the bulk CSV; no API calls are needed for existence checking. The API is required for:
- Case name verification (catching misattributed citations)
- Full opinion text retrieval (quote verification)
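Quote verification, once the opinion text has been fetched, can be approximated by a normalized substring check. This is a rough sketch rather than Wilson's actual method: the helper names `normalize` and `quote_appears` are hypothetical, and the sample sentence is invented rather than taken from a real opinion.

```python
import html
import re

# Map curly quotes to straight ones so typography does not cause mismatches.
_QUOTES = str.maketrans({"\u201c": '"', "\u201d": '"', "\u2018": "'", "\u2019": "'"})


def normalize(text: str) -> str:
    """Strip tags, unescape entities, unify quotes, collapse whitespace, lowercase."""
    text = re.sub(r"<[^>]+>", " ", text)           # drop HTML tags
    text = html.unescape(text).translate(_QUOTES)  # &amp; -> &, curly -> straight
    text = re.sub(r"\s+", " ", text)               # collapse runs of whitespace
    text = re.sub(r"\s+([.,;:!?])", r"\1", text)   # no space before punctuation
    return text.strip().lower()


def quote_appears(quote: str, opinion_html: str) -> bool:
    """True if the normalized quote is a substring of the normalized opinion."""
    return normalize(quote) in normalize(opinion_html)


opinion = "<p>The court held that the claim was <em>not</em> barred.</p>"
print(quote_appears("the claim was not barred", opinion))
print(quote_appears("the claim was barred", opinion))
```

An exact substring test is strict by design: a quote with altered wording fails, which is the behavior wanted when checking for false quotes, though it will also reject legitimate quotes that use ellipses or bracketed edits.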
URL: https://case.law / https://huggingface.co/datasets/free-law/Caselaw_Access_Project
License: CC0 (no restrictions, commercial use permitted)
Auth: Hugging Face account required for bulk download
- 6.7 million US court decisions
- Federal and state courts
- 1658 through 2020 (not updated after 2020)
- All jurisdictions including territories
- Not yet downloaded — deferred pending need
- CourtListener API covers post-2020 cases and is more practical for the Charlotin dataset (mostly recent cases)
- CAP useful for historical case research and offline state court coverage
- CourtListener API: Recent federal cases, online operation
- CAP bulk download: Historical cases, air-gap operation, state courts
- Both: Cross-validation, coverage gaps
URL: https://pacer.uscourts.gov
Auth: PACER account required (free registration)
Cost: $0.10/page for documents (fee waiver available for researchers)
- /docket-entries/ — list of all documents filed in a case
- /recap-documents/ — RECAP-collected copies of PACER documents
- Access to original filed briefs, motions, and exhibits
PACER access enables Wilson to audit the original filing rather than reconstructing citations from secondary sources. This is the gold standard for pre-filing audit and post-filing forensic reconstruction.
- PACER registration: PENDING
- CourtListener PACER link: NOT CONFIGURED
- Workaround: Using published opinions and known citation lists for testing
URL: https://www.damiencharlotin.com/hallucinations/
License: Unspecified (research use; do not redistribute as a derivative)
Auth: None (public website, CSV download available)
- 1,222 total cases (as of download date)
- 304 US lawyer cases (Wilson's immediate scope)
- Fields: Case Name, Court, Date, Party, AI Tool, Hallucination Items, Outcome, Monetary Penalty, Professional Sanction
| Type | Count | Wilson Coverage |
|---|---|---|
| Fabricated | 538 | ✓ Phase 1 — existence checking |
| Misrepresented | 361 | ✗ Phase 3 — requires LLM reasoning |
| False Quotes | 297 | ✓ Phase 2 — quote verification |
| Outdated Advice | 11 | ✗ Phase 4 — requires citator |
Wilson's current pipeline addresses 835 of 1,207 hallucination items (69%): the Fabricated and False Quotes rows above.
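The coverage figure follows directly from the table: the two types handled by Phases 1 and 2 are summed and divided by the total across all four rows. A short check of that arithmetic:

```python
# Counts copied from the hallucination-type table above.
counts = {
    "Fabricated": 538,
    "Misrepresented": 361,
    "False Quotes": 297,
    "Outdated Advice": 11,
}
covered_types = {"Fabricated", "False Quotes"}  # Phase 1 and Phase 2

total = sum(counts.values())
covered = sum(n for kind, n in counts.items() if kind in covered_types)
print(f"{covered}/{total} = {covered / total:.0%}")  # 835/1207 = 69%
```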
Charlotin documents that hallucinations occurred but does not always preserve the exact fabricated citation text. Many entries describe hallucinations in prose rather than providing parseable citation strings. Original court filings (via PACER) provide the actual citation text.
Repo: https://github.com/freelawproject/eyecite
License: BSD (attribution required, commercial use permitted)
Auth: None (local Python library)
- Extracts legal citations from any text
- Parses volume, reporter, page, court, year, parties
- Handles standard reporters (F.3d, U.S., S. Ct., WL, etc.)
- Returns structured FullCaseCitation objects
- Cannot parse citations without reporter information
- Name-only citations (e.g., "McIntyre v. Phx. Newspapers") not parseable
- Westlaw (WL) citations parseable but not verifiable via CourtListener bulk data
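A sketch of how eyecite output might feed the offline lookup: each FullCaseCitation exposes a `groups` dict, from which a (volume, reporter, page) key can be built. The helper names `to_lookup_key` and `extract_keys` are hypothetical, and eyecite is imported lazily so that the pure helper runs even where the library is not installed.

```python
def to_lookup_key(groups: dict) -> tuple:
    """Turn a citation's parsed groups into a (volume, reporter, page) key."""
    return (groups["volume"], groups["reporter"], groups["page"])


def extract_keys(text: str) -> list:
    """Return lookup keys for every full case citation eyecite finds in text."""
    # Lazy import: requires `pip install eyecite`.
    from eyecite import get_citations
    from eyecite.models import FullCaseCitation

    keys = []
    for cite in get_citations(text):
        # Short-form, supra, and id. citations are skipped in this sketch.
        if isinstance(cite, FullCaseCitation):
            keys.append(to_lookup_key(cite.groups))
    return keys


print(to_lookup_key({"volume": "410", "reporter": "U.S.", "page": "113"}))
```

With eyecite installed, `extract_keys("See Roe v. Wade, 410 U.S. 113 (1973).")` would be the typical entry point. Reporter strings may need normalizing before they match the bulk CSV; whether they do is not confirmed here.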
Requirements: CourtListener API token (free), bulk CSV download
Capabilities:
- Citation existence verification against 18M records (offline)
- Case name mismatch detection (online)
- Quote verification against opinion text (online)
- Verdicts: FABRICATED, MISATTRIBUTED, EXISTS
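One way the three verdicts could be derived from the two Tier 1 checks is sketched below. This is an assumption about the decision order, not a description of Wilson's code: the function names, the use of `difflib` for name comparison, and the 0.6 similarity threshold are all illustrative choices.

```python
from difflib import SequenceMatcher
from enum import Enum
from typing import Optional


class Verdict(Enum):
    FABRICATED = "FABRICATED"
    MISATTRIBUTED = "MISATTRIBUTED"
    EXISTS = "EXISTS"


def names_match(cited: str, actual: str, threshold: float = 0.6) -> bool:
    """Fuzzy comparison of case names; the threshold is an arbitrary assumption."""
    ratio = SequenceMatcher(None, cited.lower(), actual.lower()).ratio()
    return ratio >= threshold


def classify(found_offline: bool,
             cited_name: Optional[str],
             actual_name: Optional[str]) -> Verdict:
    """Combine the offline existence check with the online name check."""
    if not found_offline:
        return Verdict.FABRICATED          # citation absent from the bulk data
    if cited_name and actual_name and not names_match(cited_name, actual_name):
        return Verdict.MISATTRIBUTED       # real citation, wrong case name
    return Verdict.EXISTS


print(classify(False, "Smith v. Jones", None).value)
print(classify(True, "Smith v. Jones", "Roe v. Wade").value)
print(classify(True, "Roe v. Wade", "Roe v. Wade").value)
```

When no case name is available for comparison, this sketch falls back to EXISTS; a stricter pipeline might report that situation separately.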
Requirements: Tier 1 + PACER account
Capabilities: All of Tier 1 plus:
- Original filing retrieval
- Complete document audit without reconstruction
- Retroactive case reconstruction from filed documents
Requirements: Tier 2 + LLM inference capability
Capabilities: All of Tier 2 plus:
- Proposition support analysis (does the cited case actually support the cited proposition?)
- Holding verification
- Argument coherence analysis
This document is maintained as part of Wilson's open source commitment. Auditability requires transparency about data sources and access requirements.