[EPIC] DI + Monorepo: Unified architecture transformation #116

@bashandbone

Overview

Transform CodeWeaver's architecture through integrated Dependency Injection and monorepo structure, reducing circular dependencies by 75% and enabling clean package separation.

The Discovery

Critical Insight: The DI architecture eliminates most of the circular dependencies that block the monorepo split, so the two should be implemented together rather than sequentially.

Key Finding: DI factories replace manual registry access, which is the root cause of the provider → engine → config circular dependencies.

The Problem

The current architecture has these problems:

  • 164 circular dependencies between modules
  • Manual provider instantiation scattered throughout the codebase
  • 40+ providers, each with verbose setup
  • Services that are hard to test (manual mocking required)
  • Problems that will compound as the provider count grows to 100+
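To make "circular dependency" concrete, here is a minimal sketch of how a cycle can be found in a package-level import graph using depth-first search. The graph contents are hypothetical and only mirror the provider → engine → config loop named above; this is not CodeWeaver's actual analysis tooling.

```python
from typing import Dict, List, Optional, Set


def find_cycle(graph: Dict[str, Set[str]]) -> Optional[List[str]]:
    """Return one import cycle as a list of package names, or None."""
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    color = {node: WHITE for node in graph}
    path: List[str] = []

    def visit(node: str) -> Optional[List[str]]:
        color[node] = GRAY
        path.append(node)
        for dep in sorted(graph.get(node, ())):
            state = color.get(dep, WHITE)
            if state == GRAY:
                # dep is already on the current path, so we closed a loop
                return path[path.index(dep):] + [dep]
            if state == WHITE:
                found = visit(dep)
                if found:
                    return found
        path.pop()
        color[node] = BLACK
        return None

    for node in sorted(graph):
        if color[node] == WHITE:
            found = visit(node)
            if found:
                return found
    return None


# Hypothetical "before" graph: the registry ties the three packages into a loop.
before = {"providers": {"engine"}, "engine": {"config"}, "config": {"providers"}}
# Hypothetical "after" graph: everything points down toward core.
after = {"core": set(), "providers": {"core"}, "engine": {"core", "providers"}}

print(find_cycle(before))
print(find_cycle(after))
```

The first call reports a loop through all three packages; the second returns None because every edge points toward core.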

The Solution: Integrated DI + Monorepo

Phase 1: DI Foundation + Prep (Week 1)

  • Build core DI infrastructure (container, Depends(), factories)
  • Extract ready packages (tokenizers, daemon)
  • Move SearchResult types to core
  • Outcome: DI system ready, 2 packages extracted
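As a rough illustration of what "container, Depends(), factories" could look like, the sketch below resolves constructor parameters whose default is a Depends marker. All names here (Container, Depends, get_settings, get_embedding, Indexer's signature) are assumptions made for the example, not the planned CodeWeaver API.

```python
import inspect
from typing import Any, Callable, Dict


class Depends:
    """Marker used as a parameter default; names the factory to call."""

    def __init__(self, factory: Callable[..., Any]) -> None:
        self.factory = factory


class Container:
    """Resolves Depends() defaults recursively, caching one instance per factory."""

    def __init__(self) -> None:
        self._cache: Dict[Callable[..., Any], Any] = {}

    def resolve(self, target: Callable[..., Any]) -> Any:
        kwargs = {}
        for name, param in inspect.signature(target).parameters.items():
            if isinstance(param.default, Depends):
                kwargs[name] = self._get(param.default.factory)
        return target(**kwargs)

    def _get(self, factory: Callable[..., Any]) -> Any:
        if factory not in self._cache:
            # Factories may declare their own Depends() parameters.
            self._cache[factory] = self.resolve(factory)
        return self._cache[factory]


# Hypothetical settings and provider types, for illustration only.
class Settings:
    def __init__(self) -> None:
        self.model = "demo-model"


class EmbeddingProvider:
    def __init__(self, settings: Settings) -> None:
        self.model = settings.model


def get_settings() -> Settings:
    return Settings()


def get_embedding(settings: Settings = Depends(get_settings)) -> EmbeddingProvider:
    return EmbeddingProvider(settings)


class Indexer:
    def __init__(self, embedding: EmbeddingProvider = Depends(get_embedding)) -> None:
        self.embedding = embedding


container = Container()
indexer = container.resolve(Indexer)
print(indexer.embedding.model)
```

The consumer (Indexer) never imports a registry; it only names the factory it needs, and the container walks the chain of factories for it.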

Phase 2: DI Integration (Week 2)

  • Migrate core services to DI
  • Eliminates 120-130 of the 164 violations (about 75%)
  • Breaks circular dependencies
  • Validates package boundaries
  • Outcome: Clean dependency flow established

Phase 3: Monorepo Structure (Week 3)

  • Organize into 9 packages with uv workspace
  • All packages build independently
  • Remaining violations: ~30-40 (down from 164)
  • Outcome: Clean monorepo structure

How DI Solves Monorepo Blockers

Problem: Registry Circular Dependencies

# Before: Manual registry creates coupling
from codeweaver.common.registry import get_provider_registry
registry = get_provider_registry()
provider = registry.get_embedding_provider()

# After: DI factory handles complexity
from codeweaver.di.providers import EmbeddingDep

class MyService:
    def __init__(self, embedding: EmbeddingDep):
        self.embedding = embedding  # Clean!

Result: Eliminates the providers → engine dependency entirely
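One way an alias like EmbeddingDep could be defined is with typing.Annotated, the same pattern FastAPI uses for its dependencies. The sketch below is an assumption about the shape, not the planned implementation: Depends, get_embedding_provider, FakeEmbedding, and build are hypothetical names introduced for illustration.

```python
from typing import Annotated, Any, Callable, List, Protocol, get_type_hints


class Depends:
    """Metadata marker carrying the factory for a dependency."""

    def __init__(self, factory: Callable[[], Any]) -> None:
        self.factory = factory


class EmbeddingProvider(Protocol):
    def embed(self, text: str) -> List[float]: ...


class FakeEmbedding:
    """Stand-in provider so the example runs without external services."""

    def embed(self, text: str) -> List[float]:
        return [float(len(text))]


def get_embedding_provider() -> EmbeddingProvider:
    # A real factory would read config and pick a provider; assumed here.
    return FakeEmbedding()


# The alias bundles the type with the factory that produces it.
EmbeddingDep = Annotated[EmbeddingProvider, Depends(get_embedding_provider)]


def build(cls: type) -> Any:
    """Instantiate cls, filling every Annotated[..., Depends(...)] parameter."""
    hints = get_type_hints(cls.__init__, include_extras=True)
    kwargs = {}
    for name, hint in hints.items():
        for meta in getattr(hint, "__metadata__", ()):
            if isinstance(meta, Depends):
                kwargs[name] = meta.factory()
    return cls(**kwargs)


class MyService:
    def __init__(self, embedding: EmbeddingDep):
        self.embedding = embedding


service = build(MyService)
print(service.embedding.embed("hello"))
```

Because the factory lives in the alias, MyService depends only on the module that defines EmbeddingDep, not on the engine or the registry.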

Violations Eliminated by DI

Original: 164 violations
After DI: ~30-40 violations (75% reduction)

Eliminated:

  • providers → engine (20 imports)
  • telemetry → engine (3 imports)
  • telemetry → config (3 imports)
  • engine → CLI (5 imports)
  • config → CLI (multiple)

Remaining (manual fixes):

  • core → utils (9 imports) - Move utilities
  • semantic → utils (4 imports)
  • providers → agent_api (4 imports) - Move types
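A violation count like the ones above can be produced by comparing each module's imports against an allow-list of package dependencies. The following is a minimal sketch of that idea using the standard ast module; the ALLOWED map and the codeweaver.<package> module paths are hypothetical, and the real analysis may work differently.

```python
import ast
from typing import List, Set

# Hypothetical allow-list: each package maps to the packages it may import.
ALLOWED = {
    "core": set(),
    "providers": {"core"},
    "engine": {"core", "providers"},
}


def imported_packages(source: str, root: str = "codeweaver") -> Set[str]:
    """Return second-level package names imported from `root`.

    Only absolute imports of the form `root.package...` are detected;
    `from root import package` and relative imports are ignored here.
    """
    found: Set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            names = [node.module]
        else:
            continue
        for name in names:
            parts = name.split(".")
            if parts[0] == root and len(parts) > 1:
                found.add(parts[1])
    return found


def violations(package: str, source: str) -> List[str]:
    """List imports in `source` that `package` is not allowed to make."""
    allowed = ALLOWED[package] | {package}
    return sorted(f"{package} -> {dep}" for dep in imported_packages(source) - allowed)


example = """
from codeweaver.engine.indexer import Indexer
from codeweaver.core.types import SearchResult
import os
"""

print(violations("providers", example))
```

Run over every file in a package, a check like this gives a number that can be tracked across phases and enforced in CI once the target is reached.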

Implementation Phases

Phase 1: Foundation + Prep (#117)

  • Core DI infrastructure
  • Extract tokenizers, daemon
  • Move SearchResult to core
  • Risk: Low | Duration: 5-7 days

Phase 2: Core Migration (#118)

  • Migrate Indexer, search services
  • Break circular dependencies
  • Update tests with DI overrides
  • Risk: Medium | Duration: 7-10 days
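The "DI overrides" used in tests could work along these lines: a test temporarily swaps a factory, and the original is restored when the block exits. This is a sketch under assumptions; Container, override, and the string key "embedding" are hypothetical and stand in for whatever the Phase 1 container exposes.

```python
from contextlib import contextmanager

_MISSING = object()


class Container:
    """Tiny key-to-factory container with scoped overrides for tests."""

    def __init__(self):
        self._factories = {}
        self._overrides = {}

    def register(self, key, factory):
        self._factories[key] = factory

    def get(self, key):
        factory = self._overrides.get(key) or self._factories[key]
        return factory()

    @contextmanager
    def override(self, key, factory):
        previous = self._overrides.get(key, _MISSING)
        self._overrides[key] = factory
        try:
            yield
        finally:
            # Restore whatever was in place before, including nested overrides.
            if previous is _MISSING:
                del self._overrides[key]
            else:
                self._overrides[key] = previous


class RealEmbedding:
    def embed(self, text):
        raise RuntimeError("would call a remote embedding API")


class FakeEmbedding:
    def embed(self, text):
        return [0.0, 1.0]


class SearchService:
    def __init__(self, embedding):
        self.embedding = embedding

    def query_vector(self, text):
        return self.embedding.embed(text)


container = Container()
container.register("embedding", RealEmbedding)

# In a test: swap the provider without patching module internals.
with container.override("embedding", FakeEmbedding):
    service = SearchService(container.get("embedding"))
    result = service.query_vector("find the indexer")

restored = container.get("embedding")
print(result, type(restored).__name__)
```

Compared with manual mocking, the test names the dependency it replaces and nothing else, which is where the reduction in test verbosity would come from.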

Phase 3: Monorepo Structure (NEW ISSUE NEEDED)

  • Organize into packages/
  • Build system setup
  • Final validation
  • Risk: Low | Duration: 5-7 days

Phase 4: pydantic-ai Integration (#119)

  • Agent DI support
  • Data provider injection
  • Risk: Medium

Phase 5: Advanced Features (#120)

  • Health checks, telemetry
  • Plugin system
  • Risk: Low

Phase 6: Cleanup (#121)

  • Deprecate old patterns
  • Simplify/eliminate registry
  • Risk: Low

Timeline

Total Duration: 2-3 weeks for core transformation
Extended: Additional 2-3 weeks for advanced features

Week 1: DI Foundation + Prep
Week 2: DI Integration (breaks dependencies)
Week 3: Monorepo Structure (natural organization)

Benefits

Immediate (After Phase 2)

  • 60-70% less boilerplate in dependency management
  • 75% reduction in circular dependencies
  • Better testing with clean override mechanism
  • Backward compatible during migration

Long-term

  • Scalability: Adding providers becomes mechanical
  • Maintainability: Clear package boundaries
  • Type safety: Full inference and checking
  • FastAPI alignment: Familiar patterns

Monorepo Structure (After Phase 3)

packages/
  codeweaver-core/          # Types, exceptions, DI
  codeweaver-tokenizers/    # ✅ Extracted
  codeweaver-daemon/        # ✅ Extracted
  codeweaver-utils/         # Common utilities
  codeweaver-semantic/      # Semantic chunking
  codeweaver-telemetry/     # Telemetry (DI-enabled)
  codeweaver-providers/     # All provider implementations
  codeweaver-engine/        # Engine, config (DI-enabled)
  codeweaver/               # CLI, server, MCP, agent_api

Success Criteria

Technical

  • DI container fully functional
  • All core services use DI
  • Dependency violations < 50 (down from 164)
  • All packages build independently
  • Zero circular dependencies between packages
  • Tests pass with 100% success rate

Quality

  • 60-70% reduction in boilerplate
  • Test code 50% less verbose
  • New provider integration < 1 hour
  • Performance within 5% of baseline

Related Issues

This work will obsolete or significantly simplify:

Documentation

See comprehensive analysis in monorepo planning docs:

  • INTEGRATED_DI_MONOREPO_STRATEGY.md - Main strategy
  • DI_IMPACT_VISUALIZATION.md - Visual impact
  • plans/dependency-injection-architecture-plan.md - Original DI plan

Constitutional Compliance

Proven Patterns (Principle II): Follows FastAPI's dependency injection pattern
Evidence-Based (Principle III): Incremental, testable phases
Simplicity (Principle V): Hides complexity in factories
AI-First Context (Principle I): Self-documenting dependencies

Key Insight

The DI architecture plan is the missing piece that makes the monorepo split elegant.

Combined Strategy:

  • DI eliminates 75% of violations
  • Monorepo split reduces to ~30-40 manual fixes
  • Better architecture (DI is valuable independently)
  • 2-3 weeks instead of 3-4 weeks

Recommendation: PROCEED with integrated DI + Monorepo strategy
