A terminal application for searching case law using the CourtListener API with dual search strategy (keyword + semantic).
🌐 Try the web version - Test Yevrah in your browser (slower than local terminal, but no setup required)
Yevrah takes a natural language query, interprets it using an LLM, and executes dual searches against CourtListener's database of millions of court opinions:
- User inputs query - Natural language describing their legal research need
- LLM interprets query - Groq (groq/compound) extracts search parameters and formulates two queries
- Dual search executes - Runs keyword (BM25) and semantic (vector) searches in parallel
- Results displayed - Top 5 keyword + top 5 semantic results (10 total) with source tags
- Optional opinion analysis - User can select any result to fetch and analyze the full opinion text
The LLM receives your query and extracts:
- Semantic query - Natural language for semantic search
- Keyword query - Boolean operators and exact terms for keyword search
- Jurisdiction - Court codes (e.g., "California" →
cal calctapp ca9 cand cacd casd) - Date range - Parsed from natural language (e.g., "last 5 years" →
filed_after: 2020-01-10)
Boolean Operator Handling:
If you use CourtListener Boolean operators (AND, OR, NOT, &, %) in your query:
- Your exact query with operators is used for keyword search (preserves Boolean logic)
- A natural language version is automatically generated for semantic search (better vector matching)
Example: "employment AND discrimination" becomes:
- Keyword:
"employment AND discrimination"(exact Boolean) - Semantic:
"employment discrimination"(natural language)
This ensures both search strategies get queries in their optimal format.
Both searches run in parallel:
Keyword Search (BM25):
- Uses Boolean operators and exact term matching
- Returns 20 results from CourtListener
- Shows top 5 by BM25 relevance score
- No reranking - trusts CourtListener's scoring
Semantic Search (Vector):
- Uses natural language and concept matching
- Returns 20 results from CourtListener's vector index
- Reranks all 20 results using Cohere rerank-v4.0-fast (if API key available)
- Shows top 5 after reranking
- Falls back to CourtListener's order if no Cohere key
- 10 total results (5 keyword + 5 semantic)
- Each result tagged with source: KEYWORD (yellow) or SEMANTIC (green)
- No deduplication - see separate results from each approach
- Results display: case name, court, date, citation, snippet, relevance scores
- Python 3.8+
- GROQ_API_KEY - Required (get from https://console.groq.com/)
- COURTLISTENER_API_KEY - Required (get from https://www.courtlistener.com/api/rest-info/)
- COHERE_API_KEY - Optional (get from https://cohere.com/) - enables semantic reranking
-
Clone and navigate to directory:
git clone https://github.com/legaltextai/yevrah_terminal.git cd yevrah_terminal -
Create virtual environment:
python3 -m venv venv source venv/bin/activate # On Windows: venv\Scripts\activate
-
Install dependencies:
pip install -r requirements.txt
-
Create
.envfile with API keys:GROQ_API_KEY=your_groq_key_here COURTLISTENER_API_KEY=your_courtlistener_key_here COHERE_API_KEY=your_cohere_key_here # Optional - for semantic reranking -
Run:
python main.py
You: customer slip fall store florida last 3 years
[Dual search executes in parallel]
Keyword Search API: https://www.courtlistener.com/api/rest/v4/search/?q=...&semantic=false
Semantic Search API: https://www.courtlistener.com/api/rest/v4/search/?q=...&semantic=true
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Dual (Keyword + Semantic) Search
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
┌─ #1 KEYWORD ─────────────────────────────────┐
│ Case Name: Smith v. Publix Super Markets │
│ Court: Florida Supreme Court │
│ Date Filed: 2022-03-15 │
│ Citation: 350 So. 3d 123 │
│ Snippet: Customer slipped on wet floor... │
│ Relevance: BM25: 15.32 │
└──────────────────────────────────────────────┘
[... 4 more keyword results ...]
┌─ #6 SEMANTIC ────────────────────────────────┐
│ Case Name: Jones v. Walmart Stores, Inc. │
│ Court: Florida District Court of Appeal │
│ Date Filed: 2023-01-20 │
│ Citation: 360 So. 3d 456 │
│ Snippet: Premises liability for customer... │
│ Relevance: Semantic: 0.87 │
└──────────────────────────────────────────────┘
[... 4 more semantic results ...]
Enter a number (1-10) to analyze any opinion in detail, or describe what else you'd like to search for.
You: employment AND discrimination, california
[System detects Boolean operators and optimizes for each search type]
╭─────────────────────── Search Parameters ───────────────────────╮
│ Dual (Keyword + Semantic) Search │
│ Keyword Query: "employment AND discrimination" │
│ Semantic Query: "employment discrimination" │
│ Jurisdiction: cal calctapp ca9 cacd caed cand casd │
╰──────────────────────────────────────────────────────────────────╯
[Results display with 5 keyword + 5 semantic results]
Note: When you use Boolean operators (AND, OR, NOT, &, %), the system automatically:
- Preserves your exact Boolean query for keyword search
- Converts it to natural language for semantic search
- This gives you the best of both approaches!
| Command | Action |
|---|---|
exit, quit, q |
Exit application |
new, reset |
Start new search session |
help |
Show help information |
jurisdictions |
List all available court codes |
1-10 |
Analyze full opinion for result #1-10 |
The LLM automatically maps natural language to CourtListener court codes:
| You Say | Court Codes |
|---|---|
| "California" | cal calctapp ca9 cand cacd casd caed |
| "California state" | cal calctapp |
| "Ninth Circuit" | ca9 |
| "New York federal" | ca2 nysd edny ndny wdny |
| "Supreme Court" | scotus |
| "Florida" | fla flactapp ca11 flsd flmd flnd |
See jurisdictions.py for the complete list of court codes.
Yevrah is designed with an idea to become in the future a modular system that can be extended and adapted to different use cases. The goal is to allow for:
- Multiple tool integrations - Connect various APIs and data sources beyond CourtListener
- Multi-agent orchestration - Coordinate specialized agents with different skills and capabilities
- Evaluation mechanisms - Built-in testing and validation frameworks for quality assurance
- Pluggable components - Swap out LLM providers, search backends, or reranking services
This modular approach should enable researchers, developers, and legal tech teams to customize the system for their specific needs while maintaining a clean, maintainable codebase.
User Input
↓
Groq LLM (groq/compound)
├─ Extracts: semantic query, keyword query, jurisdiction, dates
└─ Calls: execute_search_case_law tool
↓
CourtListener API (parallel execution)
├─ Keyword Search (20 results) → Top 5 by BM25
└─ Semantic Search (20 results) → Cohere rerank → Top 5
↓
Display: 10 results with source tags (KEYWORD/SEMANTIC)
↓
User selects result # → Fetch full opinion → LLM analysis
| File | Purpose |
|---|---|
main.py |
Entry point, main conversation loop |
llm_client.py |
Groq API client, system prompts, tool calling |
tools.py |
Tool definitions, dual search logic, jurisdiction mapping, date parsing |
courtlistener.py |
CourtListener API client (keyword & semantic search) |
reranker.py |
Cohere rerank integration for semantic results |
formatter.py |
Terminal UI formatting with Rich library |
jurisdictions.py |
Court code reference (200+ courts) |
| Component | Model | Purpose |
|---|---|---|
| Query interpretation | Groq groq/compound | Extract search params, formulate queries |
| Opinion analysis | Groq groq/compound | Analyze full opinion text |
| Keyword search | CourtListener BM25 | Traditional Boolean + exact term matching |
| Semantic search | CourtListener vectors | Concept-based similarity search |
| Semantic reranking | Cohere rerank-v4.0-fast | Reorder semantic results by relevance |
- CourtListener API Documentation
- CourtListener Search Endpoint
- CourtListener Jurisdictions
- Groq API Documentation
- Cohere Rerank Documentation
Future development will focus on building specialized agents to enhance the research workflow:
Takes the user's legal issue and intelligently identifies whether it's a state or federal matter, then recommends the most relevant courts for the query. This improves search precision by targeting the right jurisdictions automatically.
Provides deeper analysis of search results by summarizing each case and evaluating how relevant it is to the user's specific research query. Helps researchers quickly identify the most useful cases from search results.
Takes a case and searches for the most recent opinions that cite it, then analyzes whether the case has been followed, distinguished, criticized, or overruled. Essential for ensuring cited precedents remain valid authority.
We welcome contributions to Yevrah! Whether you're interested in:
- Core legal logic - Improving jurisdiction mapping, date parsing, or search strategies
- New integrations - Adding support for additional legal databases or APIs
- Multi-agent features - Building specialized agents for different legal research tasks
- Testing and evaluation - Enhancing the test suite and validation mechanisms
- Documentation - Improving guides and examples
If you'd like to contribute or have feedback about the project, please reach out to admin@legaltext.ai
This repository is available under the permissive BSD license, making it easy and safe to incorporate in your own libraries. Links and mentions of LegalTextLab / legaltext.dev are much appreciated.
