Overview
Transform CodeWeaver's architecture by introducing Dependency Injection (DI) and a monorepo structure together, cutting circular dependencies by roughly 75% (from 164 to about 30-40) and enabling clean package separation.
The Discovery
Critical Insight: The DI architecture eliminates most of the circular dependencies that block the monorepo split, so the two efforts should be implemented together rather than sequentially.
Key Finding: DI factories replace manual registry access, which is the root cause of the provider → engine → config circular dependencies.
The Problem
The current architecture has:
- 164 circular dependencies between modules
- Manual provider instantiation scattered throughout the codebase
- 40+ providers, each with verbose setup
- Poor testability (manual mocking required)
- Problems that will get worse as the provider count grows to 100+
The Solution: Integrated DI + Monorepo
Phase 1: DI Foundation + Prep (Week 1)
- Build core DI infrastructure (container, Depends(), factories)
- Extract ready packages (tokenizers, daemon)
- Move SearchResult types to core
- Outcome: DI system ready, 2 packages extracted
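As a rough illustration of what the Phase 1 "core DI infrastructure" might involve, the sketch below shows a minimal container that maps keys to factories and caches what they build. The `Container` class, its `register` and `resolve` methods, and the `Settings` and `EmbeddingProvider` stand-ins are all hypothetical names chosen for this example, not CodeWeaver's actual API.

```python
from typing import Any, Callable, Dict


class Container:
    """Hypothetical minimal DI container: maps a key to a factory and caches results."""

    def __init__(self) -> None:
        self._factories: Dict[str, Callable[["Container"], Any]] = {}
        self._instances: Dict[str, Any] = {}

    def register(self, key: str, factory: Callable[["Container"], Any]) -> None:
        # Registering (or re-registering) a factory drops any cached instance.
        self._factories[key] = factory
        self._instances.pop(key, None)

    def resolve(self, key: str) -> Any:
        # Build lazily on first use, then reuse the same instance.
        if key not in self._instances:
            if key not in self._factories:
                raise KeyError(f"no factory registered for {key!r}")
            self._instances[key] = self._factories[key](self)
        return self._instances[key]


class Settings:
    """Stand-in for configuration (assumption for this example)."""

    def __init__(self, model: str) -> None:
        self.model = model


class EmbeddingProvider:
    """Stand-in for a provider that needs configuration (assumption)."""

    def __init__(self, settings: Settings) -> None:
        self.settings = settings


container = Container()
container.register("settings", lambda c: Settings(model="example-model"))
# The provider's factory asks the container for its own dependency,
# so callers never import or construct Settings themselves.
container.register("embedding", lambda c: EmbeddingProvider(c.resolve("settings")))

provider = container.resolve("embedding")
```

Because each factory receives the container, a consumer only needs to know the key it wants. That is the property that lets a package depend on the DI layer instead of on the package that owns the concrete implementation.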
Phase 2: DI Integration (Week 2)
- Migrate core services to DI
- Eliminates 120-130 of the 164 violations (roughly 75%)
- Breaks circular dependencies
- Validates package boundaries
- Outcome: Clean dependency flow established
Phase 3: Monorepo Structure (Week 3)
- Organize into 9 packages with uv workspace
- All packages build independently
- Remaining violations: ~30-40 (down from 164)
- Outcome: Clean monorepo structure
How DI Solves Monorepo Blockers
Problem: Registry Circular Dependencies
# Before: manual registry access creates coupling
from codeweaver.common.registry import get_provider_registry

registry = get_provider_registry()
provider = registry.get_embedding_provider()

# After: a DI factory handles the complexity
from codeweaver.di.providers import EmbeddingDep

class MyService:
    def __init__(self, embedding: EmbeddingDep):
        self.embedding = embedding  # Clean!
Result: ✅ Eliminates providers → engine dependency entirely
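One way an alias like `EmbeddingDep` could work is as a `typing.Annotated` type that carries a factory marker, in the spirit of FastAPI's `Depends`. The sketch below is an assumption about the shape of such an alias, not the actual contents of `codeweaver.di.providers`; the `Depends` class, `get_embedding_provider`, `build`, and the toy `EmbeddingProvider` are all hypothetical.

```python
from typing import Annotated, Any, Callable, get_type_hints


class Depends:
    """Hypothetical marker holding the factory that builds a dependency."""

    def __init__(self, factory: Callable[[], Any]) -> None:
        self.factory = factory


class EmbeddingProvider:
    """Toy provider (assumption); a real one would call an embedding model."""

    def embed(self, text: str) -> list[float]:
        return [float(len(text))]


def get_embedding_provider() -> EmbeddingProvider:
    # All construction detail lives here, not in the consuming service.
    return EmbeddingProvider()


# Assumed shape of the alias: the type plus the factory that provides it.
EmbeddingDep = Annotated[EmbeddingProvider, Depends(get_embedding_provider)]


class MyService:
    def __init__(self, embedding: EmbeddingDep) -> None:
        self.embedding = embedding


def build(cls: type) -> Any:
    """Instantiate cls, filling each Annotated[..., Depends(...)] parameter."""
    hints = get_type_hints(cls.__init__, include_extras=True)
    kwargs = {}
    for name, hint in hints.items():
        if name == "return":
            continue
        # Annotated types expose their extra arguments as __metadata__.
        for meta in getattr(hint, "__metadata__", ()):
            if isinstance(meta, Depends):
                kwargs[name] = meta.factory()
    return cls(**kwargs)


service = build(MyService)
```

In this arrangement `MyService` imports only the alias, so the module that defines the service has no import path to the registry or the engine.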
Violations Eliminated by DI
Original: 164 violations
After DI: ~30-40 violations (75% reduction)
Eliminated:
- providers → engine (20 imports)
- telemetry → engine (3 imports)
- telemetry → config (3 imports)
- engine → CLI (5 imports)
- config → CLI (multiple)
Remaining (manual fixes):
- core → utils (9 imports) - Move utilities
- semantic → utils (4 imports)
- providers → agent_api (4 imports) - Move types
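Counts like the ones above can be gathered by scanning import statements against a layering rule. The sketch below is one possible way to do that with the standard `ast` module; the `ALLOWED` table and the `find_violations` helper are hypothetical, and the rule shown is an illustrative assumption rather than CodeWeaver's real package policy.

```python
import ast
from collections import Counter

# Hypothetical layering rule: each package may import only from the
# codeweaver packages listed for it (plus itself).
ALLOWED = {
    "core": set(),
    "providers": {"core"},
    "engine": {"core", "providers"},
}


def find_violations(package: str, source: str) -> Counter:
    """Count imports in `source` (a module of `package`) that break ALLOWED."""
    violations: Counter = Counter()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            modules = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            modules = [node.module]
        else:
            continue
        for module in modules:
            parts = module.split(".")
            # Only absolute imports of codeweaver subpackages are checked.
            if len(parts) < 2 or parts[0] != "codeweaver":
                continue
            target = parts[1]
            if target != package and target not in ALLOWED.get(package, set()):
                violations[f"{package} -> {target}"] += 1
    return violations


sample = """
from codeweaver.engine.indexer import Indexer
import codeweaver.core.types
import os
"""
result = find_violations("providers", sample)
```

Here the import of `codeweaver.engine` from a providers module is flagged, while the `core` and standard-library imports pass. Relative imports are skipped in this sketch, so a real check would need to resolve them as well.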
Implementation Phases
Phase 1: Foundation + Prep (#117)
- Core DI infrastructure
- Extract tokenizers, daemon
- Move SearchResult to core
- Risk: Low | Duration: 5-7 days
Phase 2: Core Migration (#118)
- Migrate Indexer, search services
- Break circular dependencies
- Update tests with DI overrides
- Risk: Medium | Duration: 7-10 days
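To illustrate what "DI overrides" in tests could look like, here is a minimal sketch of a container with a temporary override table, loosely modeled on FastAPI's `dependency_overrides`. The `Container` class, its `override` context manager, and the `RealEmbedding` and `FakeEmbedding` classes are hypothetical names for this example only.

```python
from contextlib import contextmanager
from typing import Any, Callable, Dict, Iterator


class Container:
    """Hypothetical container with a temporary override table for tests."""

    def __init__(self) -> None:
        self._factories: Dict[str, Callable[[], Any]] = {}
        self._overrides: Dict[str, Callable[[], Any]] = {}

    def register(self, key: str, factory: Callable[[], Any]) -> None:
        self._factories[key] = factory

    def resolve(self, key: str) -> Any:
        # An active override wins over the registered factory.
        if key in self._overrides:
            return self._overrides[key]()
        return self._factories[key]()

    @contextmanager
    def override(self, key: str, factory: Callable[[], Any]) -> Iterator[None]:
        self._overrides[key] = factory
        try:
            yield
        finally:
            # Always restore the real factory, even if the test body raises.
            self._overrides.pop(key, None)


class RealEmbedding:
    def embed(self, text: str) -> list[float]:
        raise RuntimeError("would call a remote API")


class FakeEmbedding:
    def embed(self, text: str) -> list[float]:
        return [0.0, 0.0]


container = Container()
container.register("embedding", RealEmbedding)

with container.override("embedding", FakeEmbedding):
    inside = container.resolve("embedding")  # the fake, for the test body
outside = container.resolve("embedding")     # the real provider again
```

The test swaps a single factory and leaves the rest of the object graph untouched, which is what removes the need for patching module-level registry lookups by hand.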
Phase 3: Monorepo Structure (NEW ISSUE NEEDED)
- Organize into packages/
- Build system setup
- Final validation
- Risk: Low | Duration: 5-7 days
Phase 4: pydantic-ai Integration (#119)
- Agent DI support
- Data provider injection
- Risk: Medium
Phase 5: Advanced Features (#120)
- Health checks, telemetry
- Plugin system
- Risk: Low
Phase 6: Cleanup (#121)
- Deprecate old patterns
- Simplify/eliminate registry
- Risk: Low
Timeline
Total Duration: 2-3 weeks for the core transformation (Phases 1-3)
Extended: Additional 2-3 weeks for advanced features (Phases 4-6)
Week 1: DI Foundation + Prep
Week 2: DI Integration (breaks dependencies)
Week 3: Monorepo Structure (natural organization)
Benefits
Immediate (After Phase 2)
- 60-70% less boilerplate in dependency management
- 75% reduction in circular dependencies
- Better testing with clean override mechanism
- Backward compatible during migration
Long-term
- Scalability: Adding providers becomes mechanical
- Maintainability: Clear package boundaries
- Type safety: Full inference and checking
- FastAPI alignment: Familiar patterns
Monorepo Structure (After Phase 3)
packages/
codeweaver-core/ # Types, exceptions, DI
codeweaver-tokenizers/ # ✅ Extracted
codeweaver-daemon/ # ✅ Extracted
codeweaver-utils/ # Common utilities
codeweaver-semantic/ # Semantic chunking
codeweaver-telemetry/ # Telemetry (DI-enabled)
codeweaver-providers/ # All provider implementations
codeweaver-engine/ # Engine, config (DI-enabled)
codeweaver/ # CLI, server, MCP, agent_api
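A layout like this only builds package by package if the dependency graph between packages is acyclic. The sketch below shows one way to check that with the standard library's `graphlib`. The dependency edges in `PACKAGE_DEPS` are assumptions made for illustration (the source lists the packages but not their exact dependencies), and `build_order` is a hypothetical helper.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical edges: each package maps to the packages it depends on.
PACKAGE_DEPS = {
    "codeweaver-core": set(),
    "codeweaver-tokenizers": {"codeweaver-core"},
    "codeweaver-daemon": {"codeweaver-core"},
    "codeweaver-utils": {"codeweaver-core"},
    "codeweaver-semantic": {"codeweaver-core", "codeweaver-utils"},
    "codeweaver-telemetry": {"codeweaver-core"},
    "codeweaver-providers": {"codeweaver-core", "codeweaver-tokenizers"},
    "codeweaver-engine": {
        "codeweaver-core",
        "codeweaver-providers",
        "codeweaver-semantic",
        "codeweaver-telemetry",
    },
    "codeweaver": {"codeweaver-engine", "codeweaver-daemon"},
}


def build_order(deps: dict) -> list:
    """Return a valid build order; raises CycleError if packages form a cycle."""
    return list(TopologicalSorter(deps).static_order())


order = build_order(PACKAGE_DEPS)

# A deliberately circular pair shows the failure mode the check guards against.
try:
    build_order({"a": {"b"}, "b": {"a"}})
    cycle_detected = False
except CycleError:
    cycle_detected = True
```

Running a check of this kind in CI would turn the "zero circular dependencies between packages" goal into something enforced automatically rather than audited by hand.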
Success Criteria
Technical
- DI container fully functional
- All core services use DI
- Dependency violations < 50 (down from 164)
- All packages build independently
- Zero circular dependencies between packages
- Tests pass with 100% success rate
Quality
- 60-70% reduction in boilerplate
- Test code 50% less verbose
- New provider integration < 1 hour
- Performance within 5% of baseline
Related Issues
This work will obsolete or significantly simplify:
- #84 Register default services in services registry (DI handles this)
- #100 Improve EmbeddingRegistry: configurable size, dynamic sizing, persistence (registry becomes a thin layer)
- #103 Integrate Indexer with services registry and implement persistence loading (DI replaces this approach)
Documentation
See comprehensive analysis in monorepo planning docs:
- INTEGRATED_DI_MONOREPO_STRATEGY.md - Main strategy
- DI_IMPACT_VISUALIZATION.md - Visual impact
- plans/dependency-injection-architecture-plan.md - Original DI plan
Constitutional Compliance
✅ Proven Patterns (Principle II): Mirrors FastAPI's DI system
✅ Evidence-Based (Principle III): Incremental, testable phases
✅ Simplicity (Principle V): Hides complexity in factories
✅ AI-First Context (Principle I): Self-documenting dependencies
Key Insight
Your DI architecture plan is the missing piece that makes the monorepo split elegant.
Combined Strategy:
- DI eliminates 75% of violations
- Monorepo split becomes far simpler (about 30-40 violations remain for manual fixes)
- Better architecture (DI is valuable independently)
- 2-3 weeks combined instead of 3-4 weeks done sequentially
Recommendation: PROCEED with integrated DI + Monorepo strategy