Releases: wang-xianghao/cuPyLMA
cuPyLMA Release 0.1
This is a beta release of cuPyLMA.
Changelog
- First release.
Known Issues and Future Work
- Multi-GPU acceleration is limited by kernel-launch overhead: we will explore CUDA Graphs to reduce this overhead.
- The optimizer does not inherit from torch.optim.Optimizer, which means extra work when migrating existing code: we will restructure the optimizer so that it follows the PyTorch optimizer interface.
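For readers unfamiliar with the interface mentioned in the last item, the following is a minimal sketch of what subclassing torch.optim.Optimizer usually involves. It is not cuPyLMA code: the class name SketchOptimizer, the lr parameter, and the run_demo helper are hypothetical, and the update rule is plain gradient descent used as a placeholder, not Levenberg-Marquardt.

```python
import torch


class SketchOptimizer(torch.optim.Optimizer):
    """Hypothetical optimizer; the update is a placeholder, not LM."""

    def __init__(self, params, lr=0.1):
        if lr <= 0.0:
            raise ValueError(f"Invalid learning rate: {lr}")
        # The base class copies these defaults into every parameter group.
        super().__init__(params, dict(lr=lr))

    @torch.no_grad()
    def step(self, closure=None):
        loss = None
        if closure is not None:
            # The closure re-evaluates the model, so it needs gradients on.
            with torch.enable_grad():
                loss = closure()
        for group in self.param_groups:
            for p in group["params"]:
                if p.grad is None:
                    continue
                # In-place gradient-descent update: p <- p - lr * grad.
                p.add_(p.grad, alpha=-group["lr"])
        return loss


def run_demo(steps=50):
    """Minimize w^2 from w = 3 and return the first and last loss values."""
    w = torch.nn.Parameter(torch.tensor([3.0]))
    opt = SketchOptimizer([w], lr=0.1)

    def closure():
        opt.zero_grad()
        loss = (w ** 2).sum()
        loss.backward()
        return loss

    first = opt.step(closure).item()
    last = first
    for _ in range(steps - 1):
        last = opt.step(closure).item()
    return first, last
```

Code written against this interface (param_groups, zero_grad, and step with an optional closure) can generally swap one optimizer for another without changing the training loop, which is the migration benefit the item above refers to.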