Open
Labels
enhancement (New feature or request)
Description
🚀 GPU Acceleration: Enable Running Upside Molecular Dynamics 2 on GPU
Summary
Implement the infrastructure and logic necessary to run Upside’s molecular simulations on NVIDIA GPUs using CUDA. This includes offloading computational hotspots, ensuring memory coherence, establishing a robust testing protocol, and modularizing the side chain handling.
Objectives
- Offload the most computationally intensive routines (interaction potentials and derivatives) to the GPU.
- Ensure unified host/device memory management using DeviceObject<T>.
- Refactor core computational logic into __host__ __device__ functions for code reuse.
- Integrate asynchronous CUDA streams and, in later stages, CUDA graphs for performance.
- Maintain numerical correctness and reproducibility.
- Modularize side chain logic to support configurable side chain assignment:
- (a) Random assignment using normalized side chain probabilities (as currently implemented)
- (b) Loopy-BP algorithm to update side chain marginal probabilities
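Option (a) above can be illustrated with a minimal sketch of drawing one side chain state from normalized probabilities by inverse-CDF sampling. The function name `sample_state` and the choice to pass the uniform draw `u` in explicitly are assumptions for illustration and are not taken from the Upside code base. Passing `u` from outside keeps the routine deterministic, which may make CPU/GPU comparisons easier.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: pick a state index from normalized probabilities.
// `u` is a uniform random draw in [0, 1) supplied by the caller.
// Returns the first index whose cumulative probability exceeds u.
inline std::size_t sample_state(const std::vector<float>& probs, float u) {
    float cumulative = 0.f;
    for (std::size_t i = 0; i < probs.size(); ++i) {
        cumulative += probs[i];
        if (u < cumulative) return i;
    }
    // Round-off can leave the cumulative sum slightly below 1;
    // fall back to the last state in that case.
    return probs.empty() ? 0 : probs.size() - 1;
}
```

A loopy-BP variant would presumably replace the fixed `probs` with marginals updated by message passing before sampling or averaging.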
Tasks
1. Profiling & Hotspot Identification
- Profile current CPU implementation to identify top computational bottlenecks.
2. Kernel Development
PosNodes (Layer 0: Position Management)
Core Infrastructure - Always Required
- Pos (built-in) - Root atom positions, implemented in deriv_engine.h
CoordNodes (Layer 1: Coordinate Calculation)
- DistCoord (easy) - 3D distance calculations between atom pairs
- AngleCoord (easy) - Bond angle calculations using dot products and vector normalization
- DihedralCoord (medium) - Dihedral angle calculations with complex Jacobian derivatives
- InferHO (complex) - Virtual hydrogen and oxygen atom position inference
- AffineAlignment (unknown)
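As a rough illustration of what a DistCoord-style routine could look like once it is shared between host and device, here is a minimal sketch of a pair distance and its gradient. The `UPSIDE_HD` macro, the `Vec3` and `DistResult` types, and the function name `dist_coord` are assumptions made for this example, not the existing Upside interfaces. The macro expands to `__host__ __device__` under nvcc and to nothing under a plain C++ compiler, so the same function can be unit-tested on the CPU.

```cpp
#include <cmath>

// Hypothetical macro: mark functions for both host and device when built with nvcc.
#ifdef __CUDACC__
#define UPSIDE_HD __host__ __device__
#else
#define UPSIDE_HD
#endif

struct Vec3 { float x, y, z; };

struct DistResult {
    float dist;  // |r1 - r2|
    Vec3 d_r1;   // gradient of dist w.r.t. r1 (gradient w.r.t. r2 is its negative)
};

// Distance between two points and its derivative with respect to the first point.
UPSIDE_HD inline DistResult dist_coord(const Vec3& r1, const Vec3& r2) {
    Vec3 d{r1.x - r2.x, r1.y - r2.y, r1.z - r2.z};
    float dist = std::sqrt(d.x * d.x + d.y * d.y + d.z * d.z);
    // Avoid dividing by zero for coincident points; the gradient is set to zero there.
    float inv = dist > 0.f ? 1.f / dist : 0.f;
    return {dist, {d.x * inv, d.y * inv, d.z * inv}};
}
```

On the GPU, a kernel would call `dist_coord` once per atom pair; on the CPU, the existing loop could call the same function, which keeps the two code paths numerically aligned.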
PotentialNodes (Layer 2: Simple Energy Calculation)
Phase 2A - Simple Force Calculations
- Spring (easy) - 1D harmonic restraint potentials (F = -k(x - x₀))
- BackbonePairs (critical) - N² pairwise steric clash prevention
- ProteinHBond (complex) - Backbone-backbone and backbone-sidechain H-bond detection
- EnvironmentCoverage (medium) - Solvation environment detection and coverage analysis
- HBondEnvironmentCoverage (medium) - H-bond donor/acceptor environment coverage
- RamaCoord (unknown)
- PlacementFixedPointVector (unknown)
- SpringBond (unknown)
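The Spring entry above gives the force as F = -k(x - x₀). A minimal sketch of the matching energy and force evaluation follows, assuming the usual harmonic energy E = ½k(x - x₀)², which is the potential whose negative derivative is that force. `SpringParams`, `spring_energy`, and the `UPSIDE_HD` macro are hypothetical names for illustration only.

```cpp
// Hypothetical macro: mark functions for both host and device when built with nvcc.
#ifdef __CUDACC__
#define UPSIDE_HD __host__ __device__
#else
#define UPSIDE_HD
#endif

struct SpringParams {
    float k;   // spring constant
    float x0;  // equilibrium value of the 1D coordinate
};

// Returns the harmonic energy 0.5 * k * (x - x0)^2 and writes the
// force -k * (x - x0) to *force.
UPSIDE_HD inline float spring_energy(const SpringParams& p, float x, float* force) {
    float dx = x - p.x0;
    *force = -p.k * dx;
    return 0.5f * p.k * dx * dx;
}
```

Because each spring is independent, a GPU version could assign one thread per restraint and call this function directly.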
CustomNodes (Intricate or Specialized Energy Calculation)
- SigmoidCoupling (medium) - Nonlinear coupling between protein and environmental effects
- HBondCoverage (complex) - Coverage analysis of hydrogen bonding potential
- BackboneSigmoidCoupling (medium) - Backbone-specific environmental coupling
- WeightedPos (medium) - Weighted coordinate transformations and position averaging
- RotamerSideChain (complex) - Sidechain conformation via Loopy Belief Propagation
- RamaMapPot (unknown)
- HbondEnergy (unknown)
- CatPosBBCoverage (unknown)
- HBCoverageHydrophobe (unknown)
- EnvironmentCoverage (unknown)
3. Memory Management
- Implement DeviceObject<T> for automatic host/device data management.
- Replace core data structures (positions, velocities, forces) with DeviceObject<T>.
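The issue does not specify the interface of DeviceObject<T>, so the following is only one possible sketch: a container that keeps a host copy and a device copy and transfers data lazily, using flags to track which side is newer. All method names (`host_read`, `host_write`, `device_read`, `device_write`, `transfers`) are assumptions. When compiled without nvcc, the device buffer is emulated with ordinary heap memory so the bookkeeping can be tested on a CPU-only machine. CUDA error checking is omitted for brevity.

```cpp
#include <cstddef>
#include <cstring>
#include <type_traits>
#include <vector>
#ifdef __CUDACC__
#include <cuda_runtime.h>
#endif

// Hypothetical sketch of a lazily synchronized host/device buffer.
template <typename T>
class DeviceObject {
    static_assert(std::is_trivially_copyable<T>::value,
                  "raw byte copies require a trivially copyable T");
public:
    explicit DeviceObject(std::size_t n) : host_(n) {
#ifdef __CUDACC__
        cudaMalloc(reinterpret_cast<void**>(&dev_), bytes());  // error checks omitted
#else
        dev_ = new T[n];  // host stand-in for device memory
#endif
    }
    ~DeviceObject() {
#ifdef __CUDACC__
        cudaFree(dev_);
#else
        delete[] dev_;
#endif
    }
    DeviceObject(const DeviceObject&) = delete;
    DeviceObject& operator=(const DeviceObject&) = delete;

    // Accessors synchronize first, then mark the accessed side as newer on writes.
    const T* host_read() { pull(); return host_.data(); }
    T* host_write() { pull(); host_newer_ = true; return host_.data(); }
    const T* device_read() { push(); return dev_; }
    T* device_write() { push(); dev_newer_ = true; return dev_; }
    int transfers() const { return transfers_; }

private:
    std::size_t bytes() const { return host_.size() * sizeof(T); }
    void push() { if (host_newer_) { copy(dev_, host_.data(), true); host_newer_ = false; } }
    void pull() { if (dev_newer_) { copy(host_.data(), dev_, false); dev_newer_ = false; } }
    void copy(T* dst, const T* src, bool to_device) {
#ifdef __CUDACC__
        cudaMemcpy(dst, src, bytes(),
                   to_device ? cudaMemcpyHostToDevice : cudaMemcpyDeviceToHost);
#else
        (void)to_device;
        std::memcpy(dst, src, bytes());
#endif
        ++transfers_;
    }

    std::vector<T> host_;     // zero-initialized host copy
    T* dev_ = nullptr;        // device copy (or host stand-in)
    bool host_newer_ = true;  // host holds the initial data
    bool dev_newer_ = false;
    int transfers_ = 0;       // number of copies performed, for inspection
};
```

The point of the design is that repeated reads on the same side cost no transfers; a copy happens only when the other side has been written since the last synchronization.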
4. Asynchronous Execution
- Launch GPU kernels and memory transfers using CUDA streams.
- Integrate CUDA graphs to capture and optimize the simulation step sequence.
5. Testing & Validation
- Create tests to compare CPU and GPU results for numerical equivalence.
- Benchmark performance: report speedup vs. CPU version.
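A CPU/GPU equivalence test needs an explicit notion of "within tolerance". The sketch below uses a combined relative and absolute tolerance, similar in spirit to common "allclose" checks; the function name `all_close` and the tolerance values in the usage note are assumptions, since the issue does not fix them.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical check: true if every GPU value matches the CPU reference
// within |cpu - gpu| <= atol + rtol * |cpu|. Size mismatches and NaNs fail.
inline bool all_close(const std::vector<double>& cpu,
                      const std::vector<double>& gpu,
                      double rtol, double atol) {
    if (cpu.size() != gpu.size()) return false;
    for (std::size_t i = 0; i < cpu.size(); ++i) {
        double diff = std::fabs(cpu[i] - gpu[i]);
        // Written as a negated <= so that a NaN difference is rejected.
        if (!(diff <= atol + rtol * std::fabs(cpu[i]))) return false;
    }
    return true;
}
```

For example, `all_close(cpu_forces, gpu_forces, 1e-6, 1e-12)` could be asserted after each simulation step. Tolerances may need to be looser for single-precision kernels or for reductions whose summation order differs between CPU and GPU.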
6. Documentation
- Update README and developer docs to describe GPU requirements, usage, and side chain assignment options.
Acceptance Criteria
- The simulation can be run fully on the GPU.
- Users can select between random or loopy-BP side chain assignment.
- Results are numerically consistent with the CPU version (within a set tolerance).
- Performance improvement is demonstrated and quantified.
- Code is clean, documented, and maintainable.