Windows 10
VS 2017
Python 3.10
Aside from the missing instruction to install the CUDA Toolkit (someone has already opened a PR for that), compiling the CUDA kernel with the default requirements fails on Windows 10.
The error is: too few arguments for template template parameter "Tuple". Many others seem to be hitting it as well, e.g. facebookresearch/pytorch3d#1024, possibly for the same reasons.
The combination that worked for me was CUDA Toolkit 11.3.1 with torch==1.12.1+cu113, but even then nvcc fails to compile the kernel, complaining about mismatched parameters for atomicAdd in quant_cuda_kernel.cu.
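For anyone wanting to sanity-check their own setup, here is a minimal sketch of comparing the CUDA tag in the torch version string against the installed toolkit version. The helper names (cuda_tag_from_torch_version, toolkit_matches) are made up for illustration and are not part of torch. It assumes the wheel uses the usual "+cuXYZ" local version tag where the last digit is the minor version.

```python
import re


def cuda_tag_from_torch_version(torch_version: str):
    """Return the CUDA (major, minor) encoded in a tag like '+cu113', or None."""
    match = re.search(r"\+cu(\d+)$", torch_version)
    if match is None:
        return None  # CPU-only build or an unrecognised tag
    digits = match.group(1)
    # Assumption: the last digit is the minor version, e.g. 'cu113' -> (11, 3).
    return int(digits[:-1]), int(digits[-1])


def toolkit_matches(torch_version: str, toolkit_version: str) -> bool:
    """Check that the torch CUDA tag agrees with the toolkit's major.minor."""
    tag = cuda_tag_from_torch_version(torch_version)
    if tag is None:
        return False
    major, minor = (int(part) for part in toolkit_version.split(".")[:2])
    return tag == (major, minor)


if __name__ == "__main__":
    # The combination reported as working in this issue.
    print(toolkit_matches("1.12.1+cu113", "11.3.1"))
```

In a real environment you would pass torch.__version__ as the first argument and the release number reported by nvcc --version as the second.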
I found that a quick fix is to add #include <THC/THCAtomics.cuh> to quant_cuda_kernel.cu, as in https://github.com/mit-han-lab/torchsparse/pull/75/files. I'm not sure what impact that has on performance, and perhaps atomicAdd would be available if the correct platform flags were passed to nvcc, but I wanted to see results fast and didn't investigate further.
In any case, this is a neat project, and I'm looking forward to more updates.
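If you would rather script the include fix than edit the file by hand, something like the sketch below should do it. The function name add_atomics_include is hypothetical, and placing the new line right after the last existing #include is my own choice rather than something taken from the linked PR. It works on the file contents as a string, so reading and writing quant_cuda_kernel.cu is left to the caller.

```python
INCLUDE_LINE = "#include <THC/THCAtomics.cuh>"


def add_atomics_include(source: str) -> str:
    """Return source with INCLUDE_LINE added after the last #include, if missing."""
    lines = source.splitlines()
    if any(line.strip() == INCLUDE_LINE for line in lines):
        return source  # already patched, leave the text untouched
    include_positions = [
        index for index, line in enumerate(lines)
        if line.lstrip().startswith("#include")
    ]
    # Fall back to the top of the file when there are no includes at all.
    insert_at = include_positions[-1] + 1 if include_positions else 0
    lines.insert(insert_at, INCLUDE_LINE)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Stand-in text for illustration only, not the real kernel source.
    sample = "#include <torch/all.h>\n#include <cuda.h>\n\n__global__ void k() {}\n"
    print(add_atomics_include(sample))
```

Running it twice on the same text is harmless, since the second call sees the include and returns the input unchanged.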