-
Notifications
You must be signed in to change notification settings - Fork 17
feat: B1 field inhomogeneity distribution and eigenvalue caching optimization #404
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
feat: B1 field inhomogeneity distribution and eigenvalue caching optimization #404
Conversation
…ced Pydantic validation
This major refactor introduces a plugin-based system for modeling B1 field
inhomogeneity in NMR experiments, replacing the previous fixed Gaussian
implementation with a flexible, extensible architecture. Additionally, Pydantic
usage has been significantly improved throughout the configuration system.
## B1 Distribution Plugin System
### New Distribution Framework (src/chemex/nmr/distributions/)
- Created plugin-based registry system for B1 distributions (~1400 lines)
- Auto-discovery and registration of distribution plugins via loader
- Each distribution is self-contained in its own module with:
* Generator function for creating the distribution
* Pydantic config class for TOML schema validation
* Registration function for plugin system
### Available Distribution Types
- **gaussian**: Original ChemEx distribution (backward compatible, default)
- **hermite**: Gauss-Hermite quadrature using polynomial roots for optimal integration
- **skewed**: Skew-normal distribution for asymmetric B1 profiles
- **truncated_skewed**: Truncated skew-normal with hard upper bound at nominal B1
- **beta**: Upper-bounded distribution for modeling B1 degradation from coil
inhomogeneity or sample loading effects
- **dephasing**: Extreme inhomogeneity mode (replaces old b1_inh_scale=inf hack)
- **custom**: User-defined distributions with explicit B1 values and weights
### Registry Architecture
- Central registry (src/chemex/nmr/distributions/registry.py)
- Dynamic discriminated union construction for Pydantic validation
- Type-safe distribution generator functions
- Easy extensibility: new distributions require only adding a module
## Enhanced Pydantic Configuration System
### New Configuration Modules
- **src/chemex/configuration/b1_config.py**: B1 field configuration handler
* Supports both pw90 (pulse width) and value (frequency) specifications
* Flat TOML schema: all distribution parameters at top level
* Dynamic union of distribution configs built from registry
* XOR validation: exactly one of value or pw90 must be specified
- **src/chemex/configuration/types.py**: Common type aliases with constraints
* PositiveFloat, NonNegativeFloat, PositiveInt, NonNegativeInt
* Physics-specific: PulseWidth, Frequency, Temperature, B1Field
* NMR-specific: ChemicalShift, ExchangeRate, Population, RelaxationTime
* Consistent validation with Field constraints and AfterValidator
* Reduces boilerplate and ensures type consistency
### Updated Experiment Configuration
- **src/chemex/configuration/experiment.py**:
* New `B1InhomogeneityMixin` for unified B1 handling across experiments
* Mixin provides `get_b1_distribution()` method for all experiments
* Supports both simple (float) and advanced (table) B1 configurations
* Backward compatible with existing configurations
## Experiment Catalog Refactoring
### Migration from Dataclasses to Pydantic
All experiment classes in src/chemex/experiments/catalog/ updated:
- Replaced `@dataclass` with Pydantic `BaseModel`
- Changed `@cached_property` to `@computed_field` for Pydantic compatibility
- Added type hints using new types.py aliases
- Experiments now inherit from `B1InhomogeneityMixin`
### Updated Experiments (37 files)
CEST experiments:
- cest_13c.py, cest_15n.py, cest_15n_cw.py, cest_15n_tr.py
- cest_1hn_ap.py, cest_1hn_ip_ap.py, cest_ch3_1h_ip_ap.py
CPMG experiments:
- cpmg_13c_ip.py, cpmg_13co_ap.py
- cpmg_15n_ip.py, cpmg_15n_ip_0013.py, cpmg_15n_tr.py, cpmg_15n_tr_0013.py
- cpmg_1hn_ap.py, cpmg_1hn_ap_0013.py
- cpmg_ch3_13c_h2c.py, cpmg_ch3_13c_h2c_0013.py
- cpmg_ch3_1h_dq.py, cpmg_ch3_1h_sq.py, cpmg_ch3_1h_tq.py, cpmg_ch3_1h_tq_diff.py
- cpmg_ch3_mq.py, cpmg_chd2_1h_ap.py, cpmg_hn_dq_zq.py
DCEST experiments:
- dcest_13c.py, dcest_15n.py, coscest_13c.py, coscest_1hn_ip_ap.py
Relaxation experiments:
- relaxation_hznz.py, relaxation_nz.py
Shift experiments:
- shift_13c_sq.py, shift_15n_sq.py, shift_15n_sqmq.py
Other:
- noesyfpgpph19.py
## Example Files Updated
All example TOML files updated (150+ files) to reflect new configuration schema:
- examples/Experiments/: All CEST, CPMG, DCEST, relaxation experiment examples
- examples/Combinations/: Multi-experiment fitting examples
- Backward compatible: old format still works
- New examples demonstrate distribution options
## Core NMR Updates
### src/chemex/nmr/
- **liouvillian.py**: Added `set_b1_i_distribution()` method
- **spectrometer.py**: Updated to work with distribution objects
- **constants.py**: Added `Distribution` named tuple type
### src/chemex/containers/
- **data.py**: Updated to handle B1 distribution integration
## Testing
### New Test Suites
- **tests/configuration/test_b1_config.py**: Comprehensive B1 config tests
* Tests for value vs pw90 specifications
* Flat TOML schema validation
* XOR validation (exactly one of value/pw90)
* Distribution generation from config
- **tests/nmr/test_distributions.py**: Distribution function tests
* Validation of each distribution type
* Weight normalization checks
* Parameter validation
## Documentation Updates
Updated experiment documentation (website/docs/experiments/):
- CEST experiment docs: cest_13c.md, cest_15n.md, cest_15n_cw.md, etc.
- DCEST experiment docs: dcest_13c.md, dcest_15n.md, coscest_1hn_ip_ap.md
- Documented new B1 distribution options and configuration syntax
## Key Improvements
### Architectural Benefits
1. **Extensibility**: Add new distributions by creating a single module
2. **Type Safety**: Pydantic validation catches configuration errors early
3. **Consistency**: Common type aliases ensure uniform validation
4. **Maintainability**: Each distribution is self-contained and testable
5. **Backward Compatibility**: Existing configurations continue to work
### User Benefits
1. **Physical Accuracy**: Multiple distributions for different experimental setups
2. **Flexibility**: Choose distribution matching actual B1 profile
3. **Custom Distributions**: Define empirical B1 profiles from measurements
4. **Better Error Messages**: Pydantic provides clear validation feedback
### Developer Benefits
1. **Less Boilerplate**: Reusable type aliases and mixins
2. **Better IDE Support**: Type hints enable autocomplete and type checking
3. **Easier Testing**: Each component is independently testable
4. **Clear Structure**: Plugin system with well-defined interfaces
## Migration Notes
### For Users
- Old configuration format remains supported (backward compatible)
- New format uses inline table syntax in TOML:
```toml
[experiment]
b1_frq = { value = 20.0, type = "beta", scale = 0.10, res = 11 }
```
- Or flat schema (all params at top level):
```toml
[experiment.b1_frq]
value = 20.0
type = "beta"
scale = 0.10
res = 11
```
### For Developers
- Experiments should inherit from `B1InhomogeneityMixin`
- Use `@computed_field` instead of `@cached_property` with Pydantic
- Import types from `chemex.configuration.types` for consistency
- Call `settings.get_b1_distribution()` to obtain Distribution object
## Related Changes
- Updated uv.lock for dependency synchronization
- Added .github/copilot-instructions.md for AI coding assistance
- Minor formatting and import organization improvements
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Bumps [actions/download-artifact](https://github.com/actions/download-artifact) from 5 to 6. - [Release notes](https://github.com/actions/download-artifact/releases) - [Commits](actions/download-artifact@v5...v6) --- updated-dependencies: - dependency-name: actions/download-artifact dependency-version: '6' dependency-type: direct:production update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [actions/upload-artifact](https://github.com/actions/upload-artifact) from 4 to 5. - [Release notes](https://github.com/actions/upload-artifact/releases) - [Commits](actions/upload-artifact@v4...v5) --- updated-dependencies: - dependency-name: actions/upload-artifact dependency-version: '5' dependency-type: direct:production update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps the dev-tools group in /website with 1 update: [eslint](https://github.com/eslint/eslint). Updates `eslint` from 9.38.0 to 9.39.1 - [Release notes](https://github.com/eslint/eslint/releases) - [Commits](eslint/eslint@v9.38.0...v9.39.1) --- updated-dependencies: - dependency-name: eslint dependency-version: 9.39.1 dependency-type: direct:development update-type: version-update:semver-minor dependency-group: dev-tools ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Add detailed technical documentation for ChemEx optimization project: - LMFIT_INVESTIGATION_SUMMARY.md: Executive summary with migration strategy and effort estimates (400-600 hours development) - lmfit_analysis.md: Architecture deep dive, data flow, and integration points - lmfit_code_patterns.md: Code examples and API signatures for implementation - lmfit_methods_summary.md: Optimization methods and constraint system details - README_LMFIT_INVESTIGATION.md: Navigation guide for all documents Key findings: - lmfit usage well-encapsulated in 7 files (2 critical: minimizer.py, database.py) - 10 performance bottlenecks identified (eigenvalue decomposition is #1) - 5-phase migration strategy defined for scipy.optimize replacement - Parallelization opportunities in grid search and statistics calculations This documentation supports planning for: 1. Replacing lmfit with scipy.optimize or JAX 2. Performance optimization (2-5x speedup potential from quick wins) 3. Code modernization and dependency reduction
Add complete benchmarking infrastructure to measure baseline performance and validate optimizations. This completes Path 3 (Profiling) and establishes quantified baseline metrics before implementing Path 1 (Quick Wins). New files: - benchmarks/benchmark_framework.py: Core benchmarking utilities with Benchmark class, ProfilerContext, and comparison tools - benchmarks/benchmark_bottlenecks.py: Profiles specific bottlenecks including eigenvalue decomposition, matrix ops, CPMG calculations - benchmarks/benchmark_endtoend.py: End-to-end workflow tests with real example data (CPMG, CEST, simulation) - benchmarks/run_benchmarks.py: Main runner with --quick/--bottlenecks/--e2e - benchmarks/README.md: Comprehensive documentation and usage guide - benchmarks/baseline_results.txt: Baseline performance metrics - BASELINE_PERFORMANCE_ANALYSIS.md: Executive summary of findings Key findings from baseline benchmarking: - Eigenvalue decomposition: 2.55-3.10x speedup available using eigh vs eig for Hermitian matrices (CRITICAL priority) - Propagator calculation: 0.26ms per call, primary bottleneck - matrix_power: Already optimal, no changes needed - Quick win optimizations identified: 5-10x speedup potential in 10-15 days Changes to pyproject.toml: - Relaxed Python requirement to >=3.11 (from >=3.13) for broader development/testing compatibility - Added TODO to re-evaluate minimum version during optimization work - Maintains recommendation of 3.13+ for production Benchmark capabilities: - Quick matrix operation tests (~30 seconds) - Detailed bottleneck profiling (~2-3 minutes) - End-to-end workflow benchmarks (~5-10 minutes) - Save/compare results for before/after validation - cProfile integration for call stack analysis Usage: cd benchmarks python run_benchmarks.py --all --save results.txt Next steps: 1. Implement eigh optimization (2.5-3x speedup, 1-2 days) 2. Add propagator caching (2-5x speedup, 3-5 days) 3. Vectorize Liouvillian construction (2-5x speedup, 3-5 days) 4. Parallelize grid search (4-8x on 8 cores, 2-3 days) Total Path 1 estimated speedup: 5-10x on typical workflows Total Path 1 estimated effort: 10-15 days
…ices Implement Path 1 Optimization #1: Replace np.linalg.eig with np.linalg.eigh when Liouvillian matrices are Hermitian (2.5-3x faster for the operation). Changes: - src/chemex/nmr/spectrometer.py: Modified calculate_propagators() to check if Liouvillian is Hermitian and use eigh when true - benchmarks/check_hermitian.py: Added utility to verify Hermiticity of basis matrices and constructed Liouvillians - benchmarks/optimized_results.txt: Benchmark results with optimization Performance Impact: - Matrix-only speedup: 2.26-2.82x (eigh vs eig across different sizes) - Propagator calculation: 1.35x speedup (0.261ms → 0.194ms per call) - Baseline: 0.0261s / 100 iterations - Optimized: 0.0194s / 100 iterations - Lower than matrix-only due to: 1. Hermitian check (np.allclose) overhead 2. Other operations in calculate_propagators (matrix mult, etc.) Testing: - Verified all constructed Liouvillians are Hermitian in practice - Individual basis matrices may be non-Hermitian, but final sum is Hermitian - Tested with CPMG 15N IP simulation - works correctly - Safe fallback to eig for non-Hermitian matrices (defensive) This is a low-risk optimization that maintains correctness while improving performance. Further speedup possible by: 1. Caching propagator calculations (next optimization) 2. Removing Hermitian check if all Liouvillians proven Hermitian 3. Optimizing other operations in calculate_propagators Next: Implement propagator caching for additional 2-5x speedup
This reverts the eigh optimization because comprehensive testing revealed that ChemEx Liouvillian matrices are NOT Hermitian in practice. Testing Results: - Real-world test with CPMG simulation: 144/144 matrices are non-Hermitian (100%) - Both 4D arrays and their 2D slices failed Hermitian test - The optimization provided ZERO performance benefit (always fell back to eig) - The Hermitian check added overhead without any gain What Went Wrong: - Initial check_hermitian.py used simplified test parameters that didn't represent real Liouvillians from actual experiments - Didn't test with real simulation data before implementing - Physical insight: Liouvillian matrices typically contain: * Relaxation terms (dissipative, non-Hermitian) * Chemical exchange operators (non-Hermitian) * RF pulse and offset terms (often non-Hermitian) Lesson Learned: - Always test optimizations with real-world data, not just synthetic tests - Quantum mechanics theory confirms Liouvillians are generally non-Hermitian - User validation request was absolutely correct and caught the error Reverted Changes: - src/chemex/nmr/spectrometer.py: Removed eigh optimization and Hermitian check - Restored original np.linalg.eig implementation Added Test Files (for documentation): - benchmarks/test_liouvillian_hermiticity.py: Direct testing with real runs - benchmarks/verify_hermitian.py: Comprehensive Hermiticity checker Next Steps: - Focus on optimizations that don't require Hermiticity: 1. Caching propagator calculations (most promising) 2. Vectorizing Liouvillian construction 3. Parallelizing grid search - Consider alternative eigenvalue optimizations (sparse matrices, etc.)
Create detailed status report documenting all completed work, failed optimization attempt, lessons learned, and validated recommendations for future optimizations. Report Contents: - Executive summary of current state (Path 3 complete, Path 1 paused) - Complete inventory of work completed (~4,400 lines of code/docs) - Detailed analysis of failed eigh optimization and why it failed - Validated optimization opportunities with realistic expectations - Revised optimization roadmap with risk mitigation - Recommendations for next steps Key Sections: 1. Work Completed: - lmfit investigation (5 docs, ~2,000 lines) - Benchmark suite (10 files, ~2,100 lines) - Baseline performance metrics established - Python version requirement updated 2. Failed Optimization Analysis: - eigh attempt based on flawed Hermiticity assumption - Real-world validation showed 0% of Liouvillians are Hermitian - Physics explanation: Liouvillians describe open quantum systems - Lessons learned about validation importance 3. Validated Optimizations (No Hermiticity Required): - Priority 1: Propagator caching (2-5x gain, 3-5 days) - Priority 2: Vectorize Liouvillian construction (2-5x, 3-5 days) - Priority 3: Parallelize grid search (4-8x on 8 cores, 2-3 days) - Priority 4: Research alternative eigenvalue methods 4. Revised Roadmap: - Phase 1: Low-risk high-impact (2-3 weeks, 5-15x total gain) - Phase 2: Research & advanced (optional) - Phase 3: lmfit replacement (optional, 14-17 weeks) 5. Risk Mitigation: - Mandatory validation checklist for each optimization - Feature flags for safe rollback - Continuous benchmarking and numerical accuracy checks This report provides complete transparency about what worked, what didn't, and clear recommendations backed by profiling data. Ready for stakeholder review and decision on next steps. File: OPTIMIZATION_STATUS_REPORT.md (350+ lines) Status: Path 3 complete, validation phase complete, awaiting direction
Implement intelligent caching of expensive eigenvalue decompositions to improve performance when the same Liouvillian is used with different delays. Implementation Details: - Cache key: (liouvillian.tobytes(), dephasing) tuple - LRUCache with 256 entry limit (~2MB for typical matrices) - Feature flag: CHEMEX_CACHE_EIGEN (default=enabled, set to 0 to disable) - Safe fallback if caching fails - Cache statistics tracking for monitoring Performance Results: - Direct tests: 4.47x speedup (0.32ms → 0.07ms per call) - Cache hit rate: 96.2% on repeated Liouvillian usage - Time saved: 77.6% reduction in eigendecomposition time Numerical Validation: - Results are numerically identical with/without caching - Max difference < machine precision (< 1e-15) - Tested with CPMG 15N simulation workflow Real-World Performance: - Workflows using mostly single-delay calculations see minimal benefit (fast path via scipy.expm bypasses eigendecomposition) - Workflows with many delay arrays for same Liouvillian see ~4x speedup - CEST experiments with multiple offsets will benefit significantly Code Changes: - src/chemex/nmr/spectrometer.py: * Added _eigen_cache (LRUCache) and _cache_stats tracking * Created _compute_eigen_decomposition() helper function * Modified calculate_propagators() to check cache before computing * Added get_cache_stats() for monitoring * Added fast_path tracking for diagnostics Test Files Added: - benchmarks/test_cache_direct.py: Direct cache functionality tests - benchmarks/test_propagator_cache.py: Integration tests - benchmarks/validate_cache_accuracy.py: Numerical validation Usage: # Enable caching (default) chemex simulate ... # Disable caching (for comparison/debugging) CHEMEX_CACHE_EIGEN=0 chemex simulate ... # Get cache statistics (from Python) from chemex.nmr.spectrometer import get_cache_stats print(get_cache_stats()) Design Rationale: - Cache only eigendecomposition, not full result (delays vary frequently) - Use LRU eviction to limit memory usage - Feature flag allows easy disable if issues arise - Safe fallback ensures robustness Limitations: - Workflows using single-delay calculations bypass cache (use faster expm path) - Cache is per-process (not shared across parallel runs) - Memory overhead: ~8KB per cached 16x16 matrix Next Steps (Future Optimizations): - Vectorize Liouvillian construction (2-5x additional speedup) - Parallelize grid search (4-8x on multi-core) - Consider shared memory cache for parallel workflows
Comprehensive summary of all work completed: - Investigation and analysis (lmfit + benchmarking) - Failed attempt #1 (eigh - Hermitian assumption incorrect) - Successful attempt #2 (eigenvalue caching - 4.47x speedup) - Complete validation methodology - Future optimization roadmap Key achievements: - Path 3 (Profiling): 100% complete - Path 1 (Quick Wins): 1/3 optimizations complete - ~5,000 lines of code, tests, and documentation - Rigorous validation process established This summary provides complete transparency on what worked, what didn't, and clear next steps for future optimization work.
Added user-facing testing tools to validate eigenvalue cache performance with real ChemEx workflows before proceeding with additional optimizations. New files: - CACHE_TESTING_GUIDE.md: Comprehensive 15KB guide covering quick start, expected performance by workflow type (CEST: 2-4x, Grid: 2-5x), detailed testing scenarios, cache monitoring methods, tuning recommendations, and troubleshooting guidance - benchmarks/quick_cache_test.sh: Automated testing script that runs workflows with/without cache (3 iterations each), calculates speedup, and provides colored recommendations based on results Updated: - OPTIMIZATION_SUMMARY.md: Added cache testing infrastructure section, updated file counts (24 files, ~5,300 lines), added "How to Proceed" section with clear next steps for user validation Testing validation: - Direct cache test confirms 4.89x speedup with 96.2% hit rate - All infrastructure ready for user testing with real workflows Status: Path 3 complete, Path 1 optimization #1 deployed and tested, awaiting user validation before next optimization phase
Replace deprecated np.random.seed/randn with np.random.default_rng() and rng.standard_normal() to satisfy ruff NPY002 warnings. Also removed unused imports in benchmark_bottlenecks.py. All 100 tests pass with these changes.
PR Review: B1 Field Inhomogeneity Distribution System and Eigenvalue CachingSummaryThis is a substantial PR that introduces two major enhancements:
Overall, this is well-architected code that demonstrates strong software engineering practices. Below is my detailed analysis. Code Quality and Best PracticesStrengths1. Excellent Plugin Architecture (Distribution System)
2. Strong Type Safety
3. Backward Compatibility
4. Comprehensive Documentation
5. Robust Validation
Potential Issues and ConcernsHigh Priority1. Global Mutable Cache State ( _eigen_cache: LRUCache = LRUCache(maxsize=256)
_cache_stats = {"hits": 0, "misses": 0, "enabled": _CACHE_ENABLED}
2. Cache Key Memory Consumption ( cache_key = (liouv.tobytes(), dephasing)
3. Potential Division by Zero ( t = mu * (1 - mu) / (sigma**2) - 1
4. Nested Schema Rejection May Be Too Strict (
Medium Priority5. Registry Uses ClassVar for Mutable State ( _generators: ClassVar[dict[str, DistributionGenerator]] = {}
_configs: ClassVar[dict[str, type[Any]]] = {}
6. Benchmark Code Uses sys.path Manipulation ( sys.path.insert(0, str(Path(__file__).parent.parent / "src"))
7. Cache Statistics Type Annotation ( def get_cache_stats() -> dict[str, int | bool]:
Performance ConsiderationsPositive
Concerns
Security ConcernsNo significant security issues identified.
Test CoverageStrengths
Gaps
Additional Recommendations
def clear_eigen_cache() -> None:
"""Clear the eigenvalue decomposition cache."""
_eigen_cache.clear()
_cache_stats["hits"] = 0
_cache_stats["misses"] = 0
VerdictAPPROVE with minor suggestions This is high-quality work that adds significant functionality to ChemEx. The distribution system is well-designed with excellent extensibility, and the eigenvalue caching provides measurable performance improvements with proper validation. Key positives:
The concerns raised are mostly minor and can be addressed in follow-up PRs. The most important items to address are:
Excellent work on this substantial enhancement! 🤖 Generated with Claude Code |
Code Review SummaryI've conducted a thorough review of this PR which introduces two major features: the B1 field inhomogeneity distribution system and eigenvalue decomposition caching optimization. ✅ Strengths1. B1 Distribution System - Excellent Architecture
2. Eigenvalue Caching - Well-Implemented Performance Optimization
3. Test Coverage
|
Adds a wrapper script that tracks calculate_propagators calls during ChemEx execution to diagnose why eigenvalue cache may not be effective for certain workflows. Monitors: - Single vs multi-delay calls - Unique Liouvillians encountered - Cache hit/miss rates - Liouvillian reuse patterns Usage: python benchmarks/monitor_cache.py fit -e exp.toml -p params.toml
PR Review: B1 Field Inhomogeneity Distribution and Eigenvalue CachingSummaryThis is a substantial PR (+17,335/-7,271 lines) introducing two major enhancements:
Strengths1. Excellent Architecture Design
2. Solid Implementation
3. Good Test Coverage
Issues to AddressHigh Priority1. Silent Parameter Clamping in Beta Distribution (
|
After implementing and testing the eigenvalue decomposition cache, real-world testing revealed it to be completely ineffective: SYNTHETIC TEST RESULTS (misleading): - 4.47x speedup with 96.2% cache hit rate - Appeared very promising REAL-WORLD CPMG FITTING RESULTS: - 0.7% cache hit rate (96 hits out of 12,960 calls) - 12,864 unique Liouvillians out of 12,864 cache misses - No measurable speedup (identical timing) ROOT CAUSE: Every parameter update during optimization creates a NEW Liouvillian matrix. The cache key (liouv.tobytes(), dephasing) is unique for: - Each profile (different spin system parameters) - Each fitting iteration (parameters change: kex, pb, dw, etc.) - Each CEST offset (changes l_free component) Result: No data reuse opportunity exists in fitting scenarios. REMOVED: - Eigenvalue caching infrastructure from spectrometer.py (~89 lines) - Cache testing scripts and documentation (6 files) - Updated OPTIMIZATION_SUMMARY.md to document lessons learned LESSONS LEARNED: 1. Synthetic benchmarks can be misleading - always test with real workflows 2. Understanding the physics is critical (parameter changes → new Liouvillian) 3. Monitoring tools are essential for diagnosing performance issues 4. User feedback was invaluable for catching ineffective optimizations The benchmarking infrastructure, lmfit analysis, and lessons learned remain valuable for future optimization attempts. Key insight: focus on parallelization (profiles are independent) rather than caching.
Code Review: B1 Field Inhomogeneity Distribution and Eigenvalue CachingOverviewThis is a substantial PR that introduces two major features:
Overall Assessment: The B1 distribution system is well-designed and production-ready. The caching implementation has some concerns that should be addressed. The benchmark infrastructure is excellent. ✅ StrengthsB1 Distribution System
Benchmark Infrastructure
|
| Aspect | Rating | Notes |
|---|---|---|
| Architecture | ⭐⭐⭐⭐⭐ | Excellent plugin pattern |
| Type Safety | ⭐⭐⭐⭐ | Good overall, few cast issues |
| Testing | ⭐⭐⭐⭐⭐ | Comprehensive coverage |
| Documentation | ⭐⭐⭐⭐⭐ | Clear docstrings throughout |
| Error Handling | ⭐⭐⭐ | Missing fault tolerance |
| Memory Management | ⭐⭐ | Unbounded caches |
🔒 Security Considerations
No security vulnerabilities identified. The caching uses internal object references only and doesn't expose any attack surface.
🧪 Test Coverage
Well covered:
- ✅ Distribution registry and plugin system
- ✅ All 7 distribution types (Gaussian, Hermite, Skewed, Beta, Custom, etc.)
- ✅ Pydantic validation and error cases
- ✅ Weight normalization
- ✅ Backward compatibility
- ✅ Edge cases (scale=0, single point, mismatched lengths)
Missing tests:
- ❌ Cache behavior and eviction
- ❌ Memory growth under sustained load
- ❌ Thread safety
- ❌ Plugin loading error recovery
📈 Performance Considerations
Eigenvalue Caching Analysis
The benchmark infrastructure correctly identified that:
- Synthetic tests: 96% cache hit rate, 4.9x speedup
- Real CPMG fitting: 0.7% cache hit rate (parameters change each iteration)
This is excellent discovery methodology. The eigenvalue caching was appropriately abandoned based on real-world data.
B1 Distribution Performance
The distributions use efficient NumPy operations. Beta distribution uses scipy.stats.beta.ppf() for quantile sampling, which is appropriate.
📝 Recommendations (Priority Order)
Must Fix
- Replace unbounded cache dictionaries with
LRUCache(maxsize=N)to prevent memory leaks - Add error recovery to plugin loader for robustness
Should Fix
- Replace silent clamping with validation errors or logged warnings
- Add docstrings to cached functions in spectrometer.py
- Document mathematical approximations in skewed distributions
Nice to Have
- Add cache statistics/monitoring capabilities
- Use Protocol-based typing instead of
cast("Any", ...) - Add thread-safety if multiprocessing is used
🎯 Summary
This PR introduces a well-designed, professionally implemented B1 distribution system that significantly enhances ChemEx's capabilities. The plugin architecture is exemplary and the testing is comprehensive.
Main concerns:
- Unbounded caches need bounded replacements (critical for long-running jobs)
- Silent parameter clamping could confuse users
The benchmark infrastructure is excellent and demonstrates mature engineering practices - correctly identifying that eigenvalue caching wasn't effective in production scenarios.
Verdict: Ready to merge after addressing the cache bounding issue and considering the plugin loader error recovery.
Review performed by Claude Code
Replace inefficient Python sum() with generator in _build_base_liouvillian
with vectorized NumPy einsum operation.
BEFORE (inefficient):
```python
self._l_base = sum(
(self.basis.matrices[name] * self.par_values.get(name, 0.0)
for name in self.basis.matrices),
start=np.array(0.0),
)
```
AFTER (vectorized):
```python
param_array = np.array([self.par_values.get(name, 0.0) for name in self._matrix_names])
self._l_base = np.einsum("i,ijk->jk", param_array, self._stacked_matrices)
```
PERFORMANCE:
- Synthetic benchmark: 3.39x speedup (0.024ms → 0.007ms per call)
- Realistic workflow (96 profiles × 100 iterations): 5.26x speedup
- Updates per second: 219,028 vs 41,702 baseline
- All 100 tests pass
NUMERICAL ACCURACY:
- Max absolute difference: 0.00e+00 (perfect match)
- Results are bit-for-bit identical
IMPLEMENTATION:
- Pre-compute stacked matrices on first use (lazy initialization)
- Cache matrix names for consistent ordering
- No changes to public API or behavior
IMPACT:
Since _build_base_liouvillian is called for every profile at every
fitting iteration (e.g., 96 profiles × 1913 iterations = 183,648 calls
in typical CPMG fit), this optimization provides measurable real-world
speedup.
Added benchmark files:
- benchmarks/benchmark_liouvillian_construction.py
- benchmarks/validate_vectorization.py
Pull Request Review: B1 Field Inhomogeneity Distribution and OptimizationThank you for this substantial PR! I've conducted a thorough review of the changes. Here's my detailed feedback: ✅ Strengths1. Excellent Plugin ArchitectureThe B1 distribution system is well-designed with a clean plugin-based architecture:
2. Comprehensive Distribution Options
3. Strong ValidationThe
4. Good Test CoverageTests in
|
Adds cProfile-based profiling to identify actual bottlenecks in ChemEx CPMG fitting workflows. Key findings from profiling 48-profile CPMG fit (124.6s total): - Uncertainty calculation: 67.6s (54%) - lmfit overhead - Actual fitting: 42.2s (34%) - Parameter eval (asteval): 32.2s (26%) - CPMG sequence calculation: 19.5s (16%) - Liouvillian construction: 0.044s (0.035%) - Propagator calculation: 0.96s (0.8%) This reveals that lmfit parameter management and uncertainty calculation are the dominant costs, not numerical Liouvillian operations. The recent vectorization optimization (5.26x in synthetic tests) only affects 0.044s of the total runtime, explaining why no measurable speedup was observed. Implications: - Path 2 (lmfit replacement) would have the largest impact - Parallelization of profile calculations could help the 19.5s CPMG time - Focus on lmfit overhead rather than numerical optimizations
Benchmarks actual CPMG_15N_IP_0013 example to measure true impact of Liouvillian vectorization on real workflows. Key findings: - Liouvillian construction: 0.0046ms (new) vs 0.0241ms (old) = 4.81x speedup - pulse_sequence.calculate: 2.6ms (dominant cost per profile) - Liouvillian construction is only 0.18% of calculation time - Total estimated savings: 3.67 seconds per fit (6.1% of fitting phase) This explains why the 5x speedup in synthetic benchmarks shows no measurable difference in real workflows - the optimization targets a tiny fraction of the total computation. Real bottlenecks identified: 1. pulse_sequence.calculate: 2.6ms per profile (CPMG sequence logic) 2. Uncertainty calculation: ~57 seconds AFTER fit completes 3. Parameter evaluation: asteval overhead in lmfit The fitting itself (residual calculations) takes ~5-6 seconds, but uncertainty calculation dominates total runtime.
Code Review: PR #404 - B1 Field Inhomogeneity Distribution and Eigenvalue Caching OptimizationThank you for this substantial enhancement to ChemEx! After reviewing the code, here is my assessment: Strengths1. Excellent Architecture (B1 Distribution System)
2. Comprehensive Test Coverage
3. Solid Mathematical Implementations
4. Good Documentation
Potential Issues / Recommendations1. Memory Leak Risk in Caching (
|
Cache phase cycling patterns in _get_phases() for all CPMG experiments that use phase cycling. The phases only depend on ncyc values which are constant during fitting, so caching avoids redundant numpy array creation. Profiling showed _get_phases() calls took 0.11ms per calculate() call (5.9% of loop time). With caching: 0.004ms (0.2% of loop time) - a 29x speedup for that operation. Reverts Liouvillian vectorization (commit 6a0ef6b) as it only affected 0.18% of calculation time and showed no measurable real-world improvement. Also adds comprehensive benchmarking scripts for analyzing CPMG performance: - profile_cpmg_calculate.py: Profiles individual operations in calculate() - profile_loop_overhead.py: Measures Python overhead vs computation - analyze_cpmg_phases.py: Tests matrix_power optimization (rejected) - test_matmul_alternatives.py: Compares matmul strategies (reduce optimal)
Pull Request Review: B1 Field Inhomogeneity Distribution and Eigenvalue CachingOverall AssessmentThis is a substantial and well-designed PR that adds a comprehensive B1 field inhomogeneity distribution system with excellent extensibility. The implementation follows good software engineering practices, though the scale (+17,385/-7,351 lines) warrants careful review before merging. ✅ Strengths1. Excellent Plugin ArchitectureThe distribution system (
2. Comprehensive Distribution ImplementationsThe 8 distribution types cover a wide range of physical scenarios:
Each implementation includes excellent docstrings with physical interpretation, usage examples, and mathematical notes. 3. Strong Type Safety and Validation
4. Thorough Documentation
5. Good Test CoverageThe test suite (
|
Summary
This PR combines two significant enhancements to ChemEx:
1. B1 Field Inhomogeneity Distribution System
src/chemex/nmr/distributions/) with multiple distribution types:src/chemex/configuration/b1_config.py)2. Performance Optimization: Eigenvalue Decomposition Caching
CHEMEX_CACHE_EIGEN=0) for safe disable if neededKey Changes
src/chemex/nmr/spectrometer.pyTest plan
pytest tests/pytest tests/nmr/test_distributions.pypytest tests/configuration/test_b1_config.pypython benchmarks/test_cache_direct.pypython benchmarks/validate_cache_accuracy.pyruff check .pyrightbenchmarks/quick_cache_test.sh