DSPy Feature Gap Implementation Roadmap
This tracking issue covers the DSPy features still missing from Desiru (Ruby DSPy). Each feature has been analyzed and assigned a priority.
Overview
This roadmap is based on the comprehensive feature gap analysis comparing Desiru to Python DSPy.
🔴 High Priority Features
These features are critical for core functionality and should be implemented first.
Modules
- ProgramOfThought (#2) - Generate executable code for problem-solving
- MultiChainComparison (#3) - Compare multiple reasoning paths
- BestOfN (#4) - Sample N outputs and select best
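As an illustration of the BestOfN idea (sample N outputs, keep the best), here is a minimal sketch. The class name `BestOfNSketch` and its `generator`/`scorer` callables are hypothetical and are not Desiru's API; a real module would call an LLM and would likely support an early-exit threshold, which this sketch leaves out.

```ruby
# Hypothetical sketch: not Desiru's actual API.
# Calls a generator N times and keeps the candidate with the highest score.
class BestOfNSketch
  def initialize(n:, generator:, scorer:)
    raise ArgumentError, "n must be positive" unless n.positive?

    @n = n
    @generator = generator # callable: (input, attempt_index) -> candidate
    @scorer = scorer       # callable: (candidate) -> numeric score
  end

  def call(input)
    candidates = Array.new(@n) { |i| @generator.call(input, i) }
    # On ties, max_by returns the first candidate with the top score.
    candidates.max_by { |candidate| @scorer.call(candidate) }
  end
end

# Stand-in callables; a real generator would query a language model.
generator = ->(question, i) { "#{question} v#{i}" + "!" * i }
scorer = ->(text) { text.length }

best = BestOfNSketch.new(n: 3, generator: generator, scorer: scorer).call("Why?")
puts best
```

Passing the scorer in as a callable keeps the selection logic independent of any particular metric, which may make it easier to reuse the evaluation metrics planned elsewhere in this roadmap.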
Optimizers
- MIPROv2 (#7) - Advanced Bayesian optimization (CRITICAL)
- COPRO (#8) - Collaborative prompt optimization
- BootstrapFewShotWithRandomSearch (#9) - Hyperparameter optimization
Core Features
- Example and Prediction classes (#12) - Core data containers
- Trace Collection System (#13) - Execution tracking for optimization
- Compilation Infrastructure (#14) - Program compilation pipeline
- Typed Predictors (#15) - Type-safe field handling
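To make the "core data containers" item concrete, the sketch below shows one possible shape for an Example container that separates input fields from label fields, loosely modelled on Python DSPy's `Example.with_inputs`, `inputs`, and `labels`. The Ruby class and method names here are assumptions for illustration only, not Desiru's design.

```ruby
# Hypothetical sketch: class and method names are assumptions, not Desiru's API.
class ExampleSketch
  attr_reader :fields, :input_keys

  def initialize(fields, input_keys: [])
    @fields = fields.transform_keys(&:to_sym).freeze
    @input_keys = input_keys.map(&:to_sym).freeze
  end

  # Returns a new container with the given keys marked as inputs.
  def with_inputs(*keys)
    self.class.new(@fields, input_keys: keys)
  end

  # Raises KeyError for unknown fields instead of silently returning nil.
  def [](key)
    @fields.fetch(key.to_sym)
  end

  # Fields a program receives at call time.
  def inputs
    @fields.slice(*@input_keys)
  end

  # Remaining fields, typically the expected outputs used by a metric.
  def labels
    @fields.reject { |key, _| @input_keys.include?(key) }
  end
end

# A prediction only carries output fields, so it can reuse the same container.
class PredictionSketch < ExampleSketch; end

example = ExampleSketch.new({ "question" => "2 + 2?", answer: "4" }).with_inputs(:question)
p example.inputs
p example.labels
```

Keeping the container immutable (frozen hash, `with_inputs` returning a copy) is one design option; it avoids optimizers accidentally mutating shared training examples.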
Utilities
- Multi-provider LLM Abstractions (#17) - Support beyond OpenAI
- Data Loaders (#18) - CSV, JSON, HuggingFace support
🟡 Medium Priority Features
Important enhancements that improve functionality.
Modules
- Refine (#5) - Iterative output refinement
- ChainOfThoughtWithHint (#6) - Guided reasoning
Optimizers
- SignatureOptimizer (#10) - Optimize field descriptions
- KNNFewShot (#11) - Dynamic example selection
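The "dynamic example selection" behind KNNFewShot can be sketched as picking the k training examples most similar to the incoming query and using them as demonstrations. The helper names below are hypothetical, and word-overlap (Jaccard) similarity is only a stand-in: a real implementation would most likely compare embedding vectors.

```ruby
require "set"

# Hypothetical sketch: Jaccard word overlap stands in for embedding similarity.
def jaccard(a, b)
  set_a = a.downcase.split.to_set
  set_b = b.downcase.split.to_set
  union = set_a | set_b
  return 0.0 if union.empty?

  (set_a & set_b).size.to_f / union.size
end

# Returns the k examples whose :question is most similar to the query,
# ordered from most to least similar.
def nearest_examples(query, trainset, k:)
  trainset.max_by(k) { |example| jaccard(query, example[:question]) }
end

trainset = [
  { question: "capital of France", answer: "Paris" },
  { question: "capital of Japan", answer: "Tokyo" },
  { question: "largest planet", answer: "Jupiter" }
]

demos = nearest_examples("what is the capital of Japan", trainset, k: 2)
demos.each { |demo| puts "#{demo[:question]} -> #{demo[:answer]}" }
```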
Core Features
- Suggestions (Soft Constraints) (#16) - Guide optimization
Utilities
- Advanced Metrics (#19) - F1, BLEU, LLM-as-Judge
- Streaming Support (#20) - Token-by-token output
- Serialization (#21) - Save/load compiled programs
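Of the metrics listed under Advanced Metrics, F1 is the simplest to illustrate. The sketch below computes token-level F1 between a predicted answer and a reference answer. The function name `token_f1` and the whitespace tokenization are assumptions; production metrics usually also strip punctuation and articles before comparing.

```ruby
# Hypothetical sketch of a token-level F1 metric; not Desiru's API.
def token_f1(prediction, reference)
  pred_tokens = prediction.downcase.split
  ref_tokens = reference.downcase.split
  return 0.0 if pred_tokens.empty? || ref_tokens.empty?

  # Count overlapping tokens, respecting how often each token appears.
  remaining = ref_tokens.tally
  overlap = 0
  pred_tokens.each do |token|
    next unless remaining.fetch(token, 0).positive?

    remaining[token] -= 1
    overlap += 1
  end
  return 0.0 if overlap.zero?

  precision = overlap.to_f / pred_tokens.size
  recall = overlap.to_f / ref_tokens.size
  2 * precision * recall / (precision + recall)
end

puts token_f1("the cat sat", "the cat").round(3)
```

In the usage line, two of three predicted tokens match (precision 2/3) and both reference tokens are covered (recall 1), so the harmonic mean is 0.8.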
🟢 Low Priority Features
Nice-to-have features that can be implemented later.
Optimizers
- LabeledFewShot - Simple baseline optimizer
- BootstrapFinetune - Generate finetuning data
- Ensemble - Combine multiple programs
- BayesianSignatureOptimizer - Bayesian signature optimization
Utilities
- ColBERTv2 Integration - Advanced retrieval
- Advanced Caching - Semantic caching
- Settings Management - Global configuration
Implementation Phases
Phase 1: Core Functionality (1-2 months)
Focus on high-priority core features that enable basic DSPy functionality:
- Example/Prediction classes (#12)
- ProgramOfThought module (#2)
- MIPROv2 optimizer (#7)
- Trace collection system (#13)
Phase 2: Enhanced Optimization (2-3 months)
Add critical modules and optimization capabilities:
- MultiChainComparison and BestOfN modules (#3, #4)
- COPRO and signature optimizers (#8, #10)
- Compilation pipeline (#14)
- Suggestions system (#16)
Phase 3: Ecosystem Integration (3-4 months)
Expand provider support and utilities:
- Multi-provider LLM support (#17)
- Advanced metrics and evaluation (#19)
- Data loaders and transformers (#18)
- Serialization framework (#21)
Phase 4: Advanced Features (4-6 months)
Complete remaining features:
- Remaining modules and optimizers
- ColBERTv2 and advanced retrieval
- Streaming support (#20)
- Performance optimizations
Success Criteria
- All high-priority features implemented and tested
- Integration tests passing for all feature combinations
- Performance benchmarks meet or exceed Python DSPy
- Comprehensive documentation for all features
- Example applications demonstrating key capabilities
Resources
Note: This is a living document. As features are implemented, please check them off and update progress notes.