Open
Labels
critical, enhancement (New feature or request), phase-7 (Structure Refactoring)
Description
Task
Restructure the codebase to support pluggable country modules, enabling future multi-country expansion. The refactoring is informed by the P2P research so that the architecture can support both centralized and decentralized modes.
Overview
Duration: Months 3-4 (parallel with Phase 3)
The current codebase is Russia-specific. This phase refactors it into a country-agnostic core with pluggable country-specific modules.
7.1 Pluggable Country Module Architecture
Current Structure
scripts/
├── crawler/ # pravo.gov.ru API (Russia-specific)
├── parser/ # Russian legal document parser
├── consolidation/ # Russian code consolidation
├── sync/ # Russian data sync
└── import/ # Russian legal codes
Target Structure
scripts/
├── core/ # Country-independent (existing, expand)
│ ├── config.py
│ ├── db.py
│ └── batch_saver.py
│
├── country_modules/ # Country-specific modules (NEW)
│ ├── base/ # Abstract base classes
│ │ ├── scraper.py # BaseScraper interface
│ │ ├── parser.py # BaseParser interface
│ │ ├── consolidator.py # BaseConsolidator interface
│ │ └── schema.py # Base schema definitions
│ │
│ ├── russia/ # Russian Federation (refactor existing)
│ │ ├── scrapers/
│ │ ├── parsers/
│ │ ├── consolidation/
│ │ └── schemas/
│ │
│ └── germany/ # Germany (future)
│
├── legal_systems/ # Legal system adapters (NEW)
│ ├── civil_law/ # Code-based systems (Russia, Germany, France)
│ └── common_law/ # Case law systems (UK, USA, Canada)
│
└── indexer/ # Country-agnostic (unchanged)
Files to Create
- scripts/country_modules/base/scraper.py - Abstract base class for scrapers
- scripts/country_modules/base/parser.py - Abstract base class for parsers
- scripts/country_modules/base/consolidator.py - Abstract base class for consolidation
- scripts/legal_systems/civil_law/schema.py - Civil law common schema
- scripts/legal_systems/common_law/schema.py - Common law common schema
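A minimal sketch of what the three base interfaces could look like, using Python's abc module. The method names (fetch_documents, parse, consolidate), their signatures, and the DummyScraper class are assumptions for illustration only; this issue does not define them.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, Iterator, List


class BaseScraper(ABC):
    """Fetches raw documents from a country's official sources."""

    @abstractmethod
    def fetch_documents(self, since: str) -> Iterator[Dict[str, Any]]:
        """Yield raw documents published on or after `since` (hypothetical signature)."""


class BaseParser(ABC):
    """Turns one raw document into a structured record."""

    @abstractmethod
    def parse(self, raw: Dict[str, Any]) -> Dict[str, Any]:
        """Return a structured record for `raw` (hypothetical signature)."""


class BaseConsolidator(ABC):
    """Merges amendments into a consolidated text."""

    @abstractmethod
    def consolidate(self, amendments: List[Dict[str, Any]]) -> Dict[str, Any]:
        """Return the consolidated document (hypothetical signature)."""


class DummyScraper(BaseScraper):
    """Illustrative concrete scraper that yields one hard-coded document."""

    def fetch_documents(self, since: str) -> Iterator[Dict[str, Any]]:
        yield {"id": "doc-1", "published": since, "html": "<p>text</p>"}


docs = list(DummyScraper().fetch_documents("2024-01-01"))
```

Because the methods are marked abstract, instantiating BaseScraper directly raises TypeError, so a country module that forgets to implement a method fails at construction time rather than mid-crawl.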
Files to Refactor
- scripts/crawler/pravo_api_client.py → scripts/country_modules/russia/scrapers/pravo_api_client.py
- scripts/parser/html_parser.py → scripts/country_modules/russia/parsers/html_parser.py
- scripts/consolidation/consolidate.py → scripts/country_modules/russia/consolidation/consolidate.py
7.2 Country Registry and Configuration
Create Country Registry
# scripts/country_modules/registry.py
from typing import Dict, Type

from .base.parser import BaseParser
from .base.scraper import BaseScraper
# RussiaPravoScraper and RussiaHtmlParser are provided by country_modules/russia/
# (their exact module paths are not specified in this issue).


class CountryModule:
    """Country-specific module configuration"""

    def __init__(
        self,
        country_id: str,  # ISO 3166-1 alpha-3 (e.g., "RUS", "DEU")
        country_name: str,
        legal_system: str,  # "civil_law", "common_law", "mixed"
        scraper_class: Type[BaseScraper],
        parser_class: Type[BaseParser],
        data_sources: Dict[str, str],
        jurisdiction_levels: list,
    ):
        ...


# Country registry, keyed by country_id
COUNTRIES: Dict[str, CountryModule] = {
    "RUS": CountryModule(
        country_id="RUS",
        country_name="Russia",
        legal_system="civil_law",
        scraper_class=RussiaPravoScraper,
        parser_class=RussiaHtmlParser,
        data_sources={
            "federal": "http://pravo.gov.ru",
            "supreme_court": "https://vsrf.ru",
            "constitutional_court": "http://www.ksrf.ru",
        },
        jurisdiction_levels=["federal", "regional", "municipal"],
    ),
}
7.3 Database Schema for Multi-Country
Schema Updates
-- Create the countries table first; the foreign key below needs it to exist
CREATE TABLE countries (
    id VARCHAR(3) PRIMARY KEY, -- ISO 3166-1 alpha-3
    name_en VARCHAR(100),
    name_native VARCHAR(100),
    legal_system_type VARCHAR(50), -- 'civil_law', 'common_law', 'mixed'
    federal_structure BOOLEAN,
    official_languages VARCHAR(100)[],
    data_sources JSONB,
    scraper_config JSONB,
    parser_config JSONB,
    is_active BOOLEAN DEFAULT TRUE,
    created_at TIMESTAMP DEFAULT NOW()
);
-- Seed Russia so existing rows (defaulted to 'RUS') satisfy the foreign key
INSERT INTO countries (id, name_en, legal_system_type)
    VALUES ('RUS', 'Russia', 'civil_law');
-- Add country_id to existing tables
ALTER TABLE documents ADD COLUMN country_id VARCHAR(3) NOT NULL DEFAULT 'RUS';
ALTER TABLE documents ADD CONSTRAINT fk_country
    FOREIGN KEY (country_id) REFERENCES countries(id);
ALTER TABLE documents ADD COLUMN jurisdiction_level VARCHAR(20);
ALTER TABLE documents ADD COLUMN jurisdiction_id VARCHAR(100);
7.4 MCP Server Country Parameter
Update MCP Tools
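// src/tools/query-laws.ts
export const queryLawsTool = {
  name: "query-laws",
  description: "Search legal documents by country",
  // MCP tool inputs are described with JSON Schema
  inputSchema: {
    type: "object",
    properties: {
      country: { type: "string", default: "RUS" }, // NEW: optional country code
      query: { type: "string" },
      filters: { type: "object" }, // optional, shaped like SearchFilters
      use_hybrid: { type: "boolean" }, // optional
    },
    required: ["query"],
  },
};
7.5 Migration Path for Russia Module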
Migration Steps
- Create new structure without touching existing code
- Move Russia module to country_modules/russia/
- Create shims for backward compatibility
- Update imports gradually
- Remove shims after all imports updated
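One way the shim step could work is to register the old module name as a thin alias that forwards attribute access to the new location and emits a DeprecationWarning. The sketch below is only an illustration under assumptions: install_shim and the demo_* module names are hypothetical, and in-memory stand-in modules are used so the example runs without the real package layout.

```python
import importlib
import sys
import types
import warnings


def install_shim(old_name: str, new_name: str) -> types.ModuleType:
    """Register `old_name` in sys.modules as a deprecated alias of `new_name`."""
    target = importlib.import_module(new_name)
    shim = types.ModuleType(old_name)

    def _forward(attr: str):
        # Let import machinery probes such as __path__ fail quietly.
        if attr.startswith("__"):
            raise AttributeError(attr)
        warnings.warn(
            f"{old_name} is deprecated; import from {new_name} instead",
            DeprecationWarning,
            stacklevel=2,
        )
        return getattr(target, attr)

    # Module-level __getattr__ (PEP 562) is called for missing attributes.
    shim.__getattr__ = _forward
    sys.modules[old_name] = shim
    return shim


# Demo with an in-memory stand-in for the relocated module (hypothetical names).
new_mod = types.ModuleType("demo_new_pravo_client")
new_mod.PravoApiClient = type("PravoApiClient", (), {})
sys.modules["demo_new_pravo_client"] = new_mod

install_shim("demo_old_pravo_client", "demo_new_pravo_client")

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    # The old import path still works, but is flagged as deprecated.
    from demo_old_pravo_client import PravoApiClient
```

The warnings give a concrete checklist for the "update imports gradually" step: once no DeprecationWarning from a shim appears in test runs, that shim can be removed.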
Backward Compatibility
- All existing scripts continue to work
- Gradual migration via shims
- No breaking changes to MCP tools
- Database migration uses default country_id='RUS'
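The "no breaking changes" guarantee depends on callers that omit the country still getting Russian results. A minimal sketch of that defaulting rule follows; resolve_country, DEFAULT_COUNTRY, and KNOWN_COUNTRIES are hypothetical names, and the set stands in for the keys of the country registry.

```python
from typing import Optional

DEFAULT_COUNTRY = "RUS"    # matches the database default country_id='RUS'
KNOWN_COUNTRIES = {"RUS"}  # stand-in for the registry keys (assumption)


def resolve_country(country: Optional[str] = None) -> str:
    """Return a validated country code, falling back to the default."""
    if country is None:
        # Existing callers that never pass a country keep their old behavior.
        return DEFAULT_COUNTRY
    code = country.strip().upper()
    if code not in KNOWN_COUNTRIES:
        raise ValueError(f"Unknown country code: {country!r}")
    return code


legacy_call = resolve_country()        # no country given
explicit_call = resolve_country("rus")  # normalized to upper case
```

Rejecting unknown codes explicitly, rather than silently falling back to the default, avoids returning Russian documents for a query that asked for another country.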
Deliverables
- Refactored codebase with country modules
- Country registry and configuration
- Multi-country database schema
- MCP server country parameter support
- Backward-compatible migration completed
Timeline
Month 3: Create base classes, refactor core modules
Month 4: Move Russia module, create shims, test migration
Reference
- See PHASE7_STRUCTURE_REFACTORING.md for detailed requirements
- Informed by: Phase 0: P2P Research
- Parallel with: Phase 3
- Enables: Phase 4 (country architecture foundation)
Priority
HIGH - Enables multi-country expansion
Projects
Status
Todo