DinoML compiles machine learning models into optimized standalone modules.

| System | Support |
|---|---|
| Windows | ✅ |
| Linux | ✅ |

| Platform | Support |
|---|---|
| CUDA | ✅ |
| ROCm | Partial |

- Custom kernels for standard operators
- Custom fused kernels that combine several operations into a single kernel
- Graph transformations deduplicate operations and reduce how many are needed
- Graph transformations fuse common patterns of operations
- Profiling to find the best kernel for a problem size
- Intermediate workspace is reused to reduce memory usage

and more!

| | Dynamic shape | Portability | Dynamic memory usage |
|---|---|---|---|
| DinoML | ✅ | ✅ | ✅ |
| torch.compile | Limited | Limited | ❌ |
| TensorRT | Limited | Limited | ❌ |
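To make the workspace-reuse idea from the feature list concrete, here is a minimal sketch of lifetime-based buffer planning. It is not DinoML's actual planner: the `Tensor` record, the `plan_workspace` function, and the greedy first-fit strategy are all assumptions chosen for illustration. The idea is that intermediate tensors whose lifetimes do not overlap can share the same bytes of a single workspace.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tensor:
    """Hypothetical record describing one intermediate tensor."""
    name: str
    size: int       # bytes needed
    first_use: int  # index of the op that produces the tensor
    last_use: int   # index of the last op that reads it


def lifetimes_overlap(a: Tensor, b: Tensor) -> bool:
    """Two tensors conflict if both are alive during at least one op."""
    return not (a.last_use < b.first_use or b.last_use < a.first_use)


def plan_workspace(tensors):
    """Greedy first-fit planner (illustrative, not DinoML's algorithm).

    Returns a dict of byte offsets per tensor name and the total
    workspace size. Tensors that are alive at the same time never
    share bytes; tensors with disjoint lifetimes may.
    """
    placed = []   # list of (tensor, offset) already assigned
    offsets = {}
    # Place larger tensors first; ties are broken by production order.
    for t in sorted(tensors, key=lambda t: (-t.size, t.first_use)):
        # Byte ranges occupied by tensors that are alive alongside t.
        busy = sorted(
            (off, off + other.size)
            for other, off in placed
            if lifetimes_overlap(t, other)
        )
        offset = 0
        for start, end in busy:
            if offset + t.size <= start:
                break  # t fits in the gap before this range
            offset = max(offset, end)
        placed.append((t, offset))
        offsets[t.name] = offset
    total = max((off + t.size for t, off in placed), default=0)
    return offsets, total


# A chain of three ops: "a" feeds op 1, "b" feeds op 2, "c" feeds op 3.
tensors = [
    Tensor("a", 100, 0, 1),
    Tensor("b", 100, 1, 2),
    Tensor("c", 100, 2, 3),
]
offsets, total = plan_workspace(tensors)
print(offsets, total)
```

In this example `c` is placed at the same offset as `a` because `a` is no longer needed once `c` is produced, so the workspace is 200 bytes instead of the 300 bytes that separate allocations would require.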
