[FEAT] Comprehensive Court Decision Fetching - All Court Types (2022-2024)

## Feature Description

Add comprehensive court decision fetching capabilities to Law7, covering all Russian court types (arbitration, general jurisdiction, supreme/constitutional) for the last 2 years (2022-2024).

## Problem Statement

Law7 currently has **no court decisions** in the database. Court decisions show how laws are **actually interpreted and applied** in practice, which is invaluable for:

- **AI Assistance**: Better understanding of how legal articles work in real cases
- **Legal Research**: Finding precedents for specific articles
- **Article Context**: Seeing practical applications of legal codes
- **Historical Tracking**: How court interpretations change over time

## Related Work

This feature **expands significantly** on Phase 7C (Issue #22), which currently covers:
- Supreme Court + Constitutional Court only (~1K-2K docs)

This issue adds:
- **Arbitration courts** (kad.arbitr.ru) - economic disputes
- **General jurisdiction** (sudrf.ru) - civil, criminal, administrative cases
- **Supreme/Constitutional courts** (vsrf.ru, ksrf.ru) - high-level precedents (carried over from Phase 7C)
- **Time scope**: Last 2 years (2022-2024) instead of all-time

## Proposed Solution

**Hybrid approach** (balances reliability and coverage):
1. **Start with pravo.gov.ru API** (official, stable) - quick wins
2. **Add scraping for comprehensive coverage**:
   - kad.arbitr.ru (arbitration courts)
   - sudrf.ru (general jurisdiction)
   - vsrf.ru / ksrf.ru (supreme/constitutional)

### Architecture

Follow the established `country_modules` pattern from Phase 7A:

```
scripts/country_modules/russia/scrapers/
├── court_scraper.py          # Base court scraper (extend BaseScraper)
├── kad_scraper.py            # Arbitration courts scraper
├── sudrf_scraper.py          # General jurisdiction scraper
└── supreme_scraper.py        # Supreme + Constitutional courts
```
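A rough sketch of what `court_scraper.py` could provide to the three source-specific scrapers. The real `BaseScraper` interface is not shown in this issue, so the example uses a stand-in ABC; `CourtDecision`, `CourtScraper`, `iter_decisions`, and `fetch` are hypothetical names, and the record fields mirror a subset of the `court_decisions` columns proposed below.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from datetime import date
from typing import Iterable, Iterator, Optional


@dataclass(frozen=True)
class CourtDecision:
    """Hypothetical record; fields mirror a subset of court_decisions columns."""
    case_number: str
    decision_date: date
    court_name: str
    source_url: Optional[str] = None
    decision_text: Optional[str] = None


class CourtScraper(ABC):
    """Stand-in base class; the real one would extend BaseScraper instead."""

    @abstractmethod
    def iter_decisions(self) -> Iterable[CourtDecision]:
        """Yield raw decisions from the source (network access would live here)."""

    def fetch(self, date_from: date, date_to: date) -> Iterator[CourtDecision]:
        """Yield only decisions dated within [date_from, date_to], inclusive."""
        for decision in self.iter_decisions():
            if date_from <= decision.decision_date <= date_to:
                yield decision
```

Keeping the date-window filter in the shared base class would let `kad_scraper.py`, `sudrf_scraper.py`, and `supreme_scraper.py` implement only source-specific fetching and parsing.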

### Database Schema

```sql
-- Court decisions metadata
CREATE TABLE court_decisions (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    country_id VARCHAR(3) REFERENCES countries(id),
    case_number VARCHAR(255) UNIQUE NOT NULL,
    decision_date DATE NOT NULL,
    court_name TEXT NOT NULL,
    court_code VARCHAR(50),
    case_type VARCHAR(100),
    instance VARCHAR(50), -- first, appeal, cassation, supreme
    decision_text TEXT,
    source_url TEXT,
    created_at TIMESTAMP DEFAULT NOW(),
    updated_at TIMESTAMP DEFAULT NOW()
);

-- Link court decisions to articles they interpret
CREATE TABLE court_decision_article_references (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    court_decision_id UUID REFERENCES court_decisions(id) ON DELETE CASCADE,
    code_id VARCHAR(50), -- e.g., 'GK_RF', 'UK_RF'
    article_number VARCHAR(50), -- e.g., '123', '124.1'
    reference_context TEXT, -- excerpt showing how article was interpreted
    reference_type VARCHAR(50), -- 'cited', 'interpreted', 'applied'
    created_at TIMESTAMP DEFAULT NOW()
);

-- Court metadata reference table
CREATE TABLE courts (
    id VARCHAR(50) PRIMARY KEY, -- court code
    name TEXT NOT NULL,
    court_type VARCHAR(50), -- 'arbitration', 'general', 'supreme', 'constitutional'
    url TEXT,
    jurisdiction TEXT
);
```

### Article Reference Extraction

Parse court decisions to extract article citations using regex patterns:

```python
import re

# Russian court decision citation patterns
patterns = [
    # "ст. 15 ГК РФ": captures (article number, code abbreviation)
    re.compile(r'(?:ст\.?\s*|статья\s+)(\d+(?:\.\d+)*)\s+([А-ЯЁA-Z]{2,}(?:\s+[А-ЯЁA-Z]{2,})?(?:\s*РФ)?)'),
    # "п. 2 ст. 15": captures (paragraph number, article number)
    re.compile(r'(?:п\.?\s*|пункт\s+)(\d+(?:\.\d+)*)\s*(?:ст\.?\s*|статья\s+)(\d+)'),
]
```
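A minimal sketch of how the first pattern above might be applied to decision text. The `extract_article_refs` helper and the `CODE_IDS` abbreviation map are hypothetical (neither exists in the codebase yet), and the mapping to `code_id` values such as `GK_RF` is an assumption based on the schema comments. Paragraph-level citations ("п. 2 ст. 15") are not handled here.

```python
import re

# First pattern from above: "ст. 15 ГК РФ" / "статья 15 ГК РФ"
ARTICLE_RE = re.compile(
    r'(?:ст\.?\s*|статья\s+)(\d+(?:\.\d+)*)\s+'
    r'([А-ЯЁA-Z]{2,}(?:\s+[А-ЯЁA-Z]{2,})?(?:\s*РФ)?)'
)

# Hypothetical mapping from Russian code abbreviations to code_id values.
CODE_IDS = {"ГК РФ": "GK_RF", "УК РФ": "UK_RF"}


def extract_article_refs(text):
    """Return (code_id, article_number) pairs in order of appearance, deduplicated."""
    refs = []
    for match in ARTICLE_RE.finditer(text):
        article_number = match.group(1)
        # Collapse any run of whitespace so "ГК  РФ" still matches the map.
        code_abbr = " ".join(match.group(2).split())
        code_id = CODE_IDS.get(code_abbr)
        # Unknown abbreviations are skipped rather than guessed.
        if code_id is not None and (code_id, article_number) not in refs:
            refs.append((code_id, article_number))
    return refs
```

For example, `extract_article_refs("руководствуясь ст. 15 ГК РФ и статья 158.1 УК РФ")` returns `[("GK_RF", "15"), ("UK_RF", "158.1")]`. Each pair would become one row in `court_decision_article_references`.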

### MCP Tool

New MCP tool: `get-court-decisions-for-article`

```typescript
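{
  name: "get-court-decisions-for-article",
  description: "Get court decisions that interpret or apply a specific legal article",
  // MCP tool inputs are described with JSON Schema
  inputSchema: {
    type: "object",
    properties: {
      code_id: { type: "string" },        // e.g., "GK_RF"
      article_number: { type: "string" }, // e.g., "123"
      court_type: { type: "string" },     // optional filter
      limit: { type: "number", default: 10 }
    },
    required: ["code_id", "article_number"]
  }
}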
```
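A sketch of the query this tool implies against the proposed schema, written in Python for illustration (the actual handler would live in the TypeScript server). `build_decisions_query` is a hypothetical helper; the `%s` placeholder style assumes a psycopg-style driver, and the join on `courts` assumes `court_decisions.court_code` holds a `courts.id` value, which the schema above suggests but does not enforce.

```python
def build_decisions_query(code_id, article_number, court_type=None, limit=10):
    """Build (sql, params) selecting decisions that reference one article."""
    if limit < 1:
        raise ValueError("limit must be positive")
    sql = [
        "SELECT cd.case_number, cd.decision_date, cd.court_name,",
        "       r.reference_type, r.reference_context, cd.source_url",
        "FROM court_decision_article_references AS r",
        "JOIN court_decisions AS cd ON cd.id = r.court_decision_id",
    ]
    where = ["r.code_id = %s", "r.article_number = %s"]
    params = [code_id, article_number]
    if court_type is not None:
        # Assumption: court_decisions.court_code stores a courts.id value.
        sql.append("JOIN courts AS c ON c.id = cd.court_code")
        where.append("c.court_type = %s")
        params.append(court_type)
    sql.append("WHERE " + " AND ".join(where))
    sql.append("ORDER BY cd.decision_date DESC")
    sql.append("LIMIT %s")
    params.append(limit)
    return "\n".join(sql), params
```

All user-supplied values travel as bound parameters, so the optional `court_type` filter changes the shape of the statement without interpolating input into the SQL text.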

## Alternatives Considered

1. **Only pravo.gov.ru API** - simpler but limited coverage
2. **Only commercial APIs** (ConsultantPlus, Garant) - rejected because it violates the official-sources-only constraint
3. **Scraping only** - comprehensive but high maintenance burden
4. **Hybrid approach (selected)** - starts with API, adds scraping for coverage

## Official Sources Only

**Constraint**: Use only official government sources
- ✅ pravo.gov.ru API (official legal publication portal)
- ✅ kad.arbitr.ru (arbitration courts database)
- ✅ sudrf.ru (general jurisdiction courts database)
- ✅ vsrf.ru (Supreme Court official site)
- ✅ ksrf.ru (Constitutional Court official site)
- ❌ ConsultantPlus, Garant, Sudact (commercial - excluded)

## Additional Context

**Current Database**: 157K+ legal documents with full consolidation history (2011-present)

**Target**: Court decisions for last 2 years (2022-2024) with:
- Article reference links
- Partial embeddings (summaries only) for semantic search
- Metadata filtering by court, case type, date

**Related Files**:
- `scripts/country_modules/base/scraper.py` - BaseScraper ABC to extend
- `scripts/country_modules/registry.py` - Register new scrapers
- `docker/postgres/init.sql` - Database schema
- `src/server.ts` - MCP server tool registration

## Implementation Ideas (Optional)

### Implementation Timeline (~8 weeks)

**Week 1: Foundation**
- Create database schema (court_decisions, court_decision_article_references, courts)
- Extend PravoApiClient for court decision endpoints
- Implement court decision parser (article reference extraction)

**Week 2: Official API Integration**
- Fetch court decisions from pravo.gov.ru API (last 2 years)
- Parse and store in database
- Extract article references
- Generate partial embeddings (summaries only)

**Week 3-4: Arbitration Courts**
- Implement kad_scraper.py (arbitration courts)
- Fetch last 2 years of decisions
- Parse and merge with existing data

**Week 5-6: General Jurisdiction**
- Implement sudrf_scraper.py (general courts)
- Fetch last 2 years of decisions
- Parse and merge with existing data

**Week 7: Supreme/Constitutional Courts**
- Implement vsrf/ksrf scrapers
- Fetch last 2 years of high-level decisions

**Week 8: MCP Tool & Search**
- Implement get-court-decisions-for-article tool
- Add semantic search for summaries
- Test end-to-end functionality

### Reference Scrapers

Leverage existing GitHub implementations as reference:
- yuglebov/kad_arbitr_ru - KAD scraper
- tochno-st/sudrfscraper - SUDRF scraper

Add respectful rate limiting (10-30s delays) per AI_WORKFLOW.md guidelines.
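One way the 10-30s delay could be sketched, assuming a hypothetical `PoliteDelay` helper that each scraper calls before issuing a request. The randomized interval and the injectable `sleep` function are design choices of this example, not something AI_WORKFLOW.md is known to prescribe.

```python
import random
import time


class PoliteDelay:
    """Sleep a random interval between requests (hypothetical helper)."""

    def __init__(self, min_seconds=10.0, max_seconds=30.0,
                 sleep=time.sleep, rng=random.uniform):
        if min_seconds < 0 or max_seconds < min_seconds:
            raise ValueError("expected 0 <= min_seconds <= max_seconds")
        self.min_seconds = min_seconds
        self.max_seconds = max_seconds
        self._sleep = sleep  # injectable so tests do not actually wait
        self._rng = rng
        self._first = True

    def wait(self):
        """Pause before every request except the first; return the delay used."""
        if self._first:
            self._first = False
            return 0.0
        delay = self._rng(self.min_seconds, self.max_seconds)
        self._sleep(delay)
        return delay
```

A scraper would create one `PoliteDelay()` per target host and call `wait()` immediately before each HTTP request.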

## Reference

- **Related**: Issue #22 - Phase 7C: Priority 1 Enhancements (Supreme + Constitutional courts only)
- **Expands**: Phase 7C scope to all court types
- **Follows**: AI_WORKFLOW.md guidelines (official sources only, batch operations, bias mitigation)
- **Uses**: country_modules architecture from Phase 7A (Issues #15-#21)

## Priority

**HIGH** - Valuable context for AI and users, official sources available, architecture ready

## Success Criteria

1. Database schema created for court decisions and article references
2. Scrapers implemented for all 4 official court sources
3. Court decisions fetched for last 2 years (2022-2024)
4. Article references extracted and linked
5. MCP tool functional for querying decisions by article
6. Partial embeddings generated for semantic search
7. End-to-end test: Query court decisions for article 15 of Civil Code
