lixi-zhou left a comment:
The CUDA code looks good to me. Please see the comments for the library configuration, thank you.
In Makefile:
 endif

-NUM_THREADS ?= $(shell getconf _NPROCESSORS_CONF 2>/dev/null || echo 1)
+#NUM_THREADS ?= $(shell getconf _NPROCESSORS_CONF 2>/dev/null || echo 1)
Please drop the hard-coded num_threads; it should be set automatically.
In velox/experimental/gpu/tests/CMakeLists.txt:
 target_link_libraries(velox_gpu_hash_table_test Folly::folly gflags::gflags)
 set_target_properties(velox_gpu_hash_table_test PROPERTIES CUDA_ARCHITECTURES
- native)
+ 75)
Could you please clarify the reason for this change, as well as for the similar changes that follow?
In velox/ml_functions/CMakeLists.txt:
 find_package(Torch REQUIRED)
 find_package(xgboost REQUIRED)
+find_package(CUDA REQUIRED)
This may lead to a compilation error when compiling with a CPU-only option. I see Velox provides a flag, VELOX_ENABLE_GPU. I think it would be better to tie this configuration code to that flag.
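A minimal sketch of the guarded lookup suggested here, assuming VELOX_ENABLE_GPU is a CMake option visible from velox/ml_functions/CMakeLists.txt. The VELOX_ML_ENABLE_GPU compile definition is a hypothetical name used only for illustration; it does not appear in the PR.

```cmake
find_package(Torch REQUIRED)
find_package(xgboost REQUIRED)

# Only look for CUDA when a GPU build is requested, so that a CPU-only
# configure does not fail on machines without the CUDA toolkit.
if(VELOX_ENABLE_GPU)
  find_package(CUDA REQUIRED)
  # Hypothetical definition that sources could test to compile out GPU paths.
  add_compile_definitions(VELOX_ML_ENABLE_GPU)
endif()
```

With a guard like this, the GPU code paths in the sources would also need to be wrapped in a matching preprocessor check. Note that the FindCUDA module behind find_package(CUDA) has been deprecated since CMake 3.10; find_package(CUDAToolkit), available from CMake 3.17, is the newer alternative.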
In velox/ml_functions/tests/GPUFunctions.cu:
@@ -0,0 +1,53 @@
+#include "velox/ml_functions/gpufunctions.h"
Please move your .cu file to the ml_functions folder.
Hi Lixi,
In the Makefile, I hard-coded the number of threads because I ran into an
out-of-memory problem when building with all the threads. I uploaded this
file by mistake, and I can revert it.
set_target_properties(velox_gpu_hash_table_test PROPERTIES CUDA_ARCHITECTURES
- native)
+ 75)
I changed 'native' to '75' because CMake does not support 'native' before
version 3.24. The CMakeLists.txt does not need to change if you use
CMake 3.24 or above.
https://cmake.org/cmake/help/latest/prop_tgt/CUDA_ARCHITECTURES.html
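One way to keep 'native' on newer CMake while still building on older versions is to gate the value on the CMake version. This is only a sketch: VELOX_GPU_TEST_CUDA_ARCH is a hypothetical variable name, and 75 is simply the fallback value used in this PR.

```cmake
# "native" is accepted as a CUDA_ARCHITECTURES value starting with CMake 3.24.
if(CMAKE_VERSION VERSION_GREATER_EQUAL "3.24")
  set(VELOX_GPU_TEST_CUDA_ARCH native) # hypothetical variable name
else()
  # Fallback for older CMake; 75 is the value hard-coded in this PR.
  set(VELOX_GPU_TEST_CUDA_ARCH 75)
endif()

set_target_properties(velox_gpu_hash_table_test
                      PROPERTIES CUDA_ARCHITECTURES ${VELOX_GPU_TEST_CUDA_ARCH})
```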
+find_package(CUDA REQUIRED)
For now, we use the 'use_gpu' flag and mix the CUDA code with the CPU code,
so the CUDA package is required to compile the GPU function.
If we want to support a CPU-only build, I think one way is to separate the
matrix-multiply class into two versions (CPU and GPU); another way is to
compile the GPU function first and provide it as a library.
On Tue, Mar 19, 2024 at 10:22 AM Lixi Zhou wrote:
lixi-zhou commented on this pull request.
The CUDA code looks good to me. Please see the comments for the library
configuration, thank you.
------------------------------
In Makefile (#42):
> @@ -63,7 +63,8 @@ GENERATOR += -DVELOX_FORCE_COLORED_OUTPUT=ON
endif
endif
-NUM_THREADS ?= $(shell getconf _NPROCESSORS_CONF 2>/dev/null || echo 1)
+#NUM_THREADS ?= $(shell getconf _NPROCESSORS_CONF 2>/dev/null || echo 1)
Please drop the hard-coded num_threads; it should be set automatically.
------------------------------
In velox/experimental/gpu/tests/CMakeLists.txt (#42):
> @@ -15,4 +15,4 @@
add_executable(velox_gpu_hash_table_test HashTableTest.cu)
target_link_libraries(velox_gpu_hash_table_test Folly::folly gflags::gflags)
set_target_properties(velox_gpu_hash_table_test PROPERTIES CUDA_ARCHITECTURES
- native)
+ 75)
Could you please clarify the reason for this change, as well as for the
similar changes that follow?
------------------------------
In velox/ml_functions/CMakeLists.txt (#42):
> @@ -19,10 +19,14 @@ set(CMAKE_PREFIX_PATH "$CONDA_PREFIX")
find_package(Torch REQUIRED)
find_package(xgboost REQUIRED)
+find_package(CUDA REQUIRED)
This may lead to a compilation error when compiling with a CPU-only option.
I see Velox provides a flag, VELOX_ENABLE_GPU. I think it would be better to
tie this configuration code to that flag.
------------------------------
In velox/ml_functions/tests/GPUFunctions.cu (#42):
> @@ -0,0 +1,53 @@
+#include "velox/ml_functions/gpufunctions.h"
Please move your .cu file to the ml_functions folder.
Thanks for the feedback. In Velox, I think we can leverage this line, CMakeLists.txt#L371, to set the CUDA architecture automatically.
Yes, I agree with you. Let me think about this and get back to you later on how to resolve it more conveniently.
Yeah, but it did not work for me. CMake did not set the CUDA architecture, so I hard-coded it.
This branch supports GPU. It requires installing CUDA and switching from libtorch-cpu to libtorch-gpu.