8 changes: 8 additions & 0 deletions .dockerignore
Original file line number Diff line number Diff line change
@@ -0,0 +1,8 @@
build/
graphbolt/build/
dgl_sparse/build/
tensoradapter/pytorch/build/
python/build/
python/dist/
python/*.egg-info/
python/libdgl.so
1 change: 1 addition & 0 deletions CMakeLists.txt
@@ -634,6 +634,7 @@ if (BUILD_GRAPHBOLT)
ALL
${CMAKE_COMMAND} -E env
CMAKE_COMMAND=${CMAKE_CMD}
CMAKE_PREFIX_PATH="${CMAKE_PREFIX_PATH}"
PYTORCH_ROCM_ARCH=${CMAKE_HIP_ARCHITECTURES}
GPU_TARGETS=${CMAKE_HIP_ARCHITECTURES}
CMAKE_HIP_ARCHITECTURES=${CMAKE_HIP_ARCHITECTURES}
2 changes: 1 addition & 1 deletion CMakePresets.json
@@ -23,7 +23,7 @@
"CMAKE_CXX_FLAGS": "-fdiagnostics-color=always",
"CMAKE_HIP_FLAGS": "-ftime-trace -fdiagnostics-color=always",
"CMAKE_EXPORT_COMPILE_COMMANDS": "ON",
"CMAKE_PREFIX_PATH": "/opt/rocm/lib/cmake",
"CMAKE_PREFIX_PATH": "/opt/rocm/lib/cmake:/opt/rocm/lib/rapids/cmake",
"CMAKE_COLOR_DIAGNOSTICS": "ON"
}
},
9 changes: 1 addition & 8 deletions docker/Dockerfile.ci_gpu_rocm
@@ -2,7 +2,7 @@
# Licensed under the Apache License Version 2.0"

#############################################################################
ARG BASE_IMAGE=rocm/pytorch:rocm7.0_ubuntu24.04_py3.12_pytorch_release_2.6.0
ARG BASE_IMAGE=rocm/pytorch:rocm7.0.2_ubuntu24.04_py3.12_pytorch_release_2.6.0
FROM ${BASE_IMAGE} AS dgl_build

# NOTE: This dockerfile **assumes** that BASE_IMAGE comes with the appropriate
@@ -28,13 +28,6 @@ ENV DGL_SRC_DIR="/src/dgl"
RUN mkdir -p ${DGL_SRC_DIR}
COPY . ${DGL_SRC_DIR}

# Clean up remnants of any previous builds
RUN rm -rf ${DGL_SRC_DIR}/build
RUN rm -rf ${DGL_SRC_DIR}/graphbolt/build
RUN rm -rf ${DGL_SRC_DIR}/dgl_sparse/build
RUN rm -rf ${DGL_SRC_DIR}/tensoradapter/pytorch/build
RUN rm -rf ${DGL_SRC_DIR}/python/build ${DGL_SRC_DIR}/python/dist ${DGL_SRC_DIR}/python/*.egg-info ${DGL_SRC_DIR}/python/libdgl.so

# Set GPU build targets
ARG ARG_GPU_BUILD_TARGETS="gfx90a,gfx942"
ENV GPU_BUILD_TARGETS=${ARG_GPU_BUILD_TARGETS}
7 changes: 1 addition & 6 deletions graphbolt/src/cuda/cooperative_minibatching_utils.cu
@@ -54,12 +54,7 @@ torch::Tensor RankAssignment(
THRUST_CALL(
transform, nodes_ptr, nodes_ptr + nodes.numel(), part_ids_ptr,

#ifdef GRAPHBOLT_USE_HIP
::proclaim_return_type
#else
::cuda::proclaim_return_type
#endif
<part_t>(
::cuda::proclaim_return_type<part_t>(
[rank = static_cast<uint32_t>(rank),
world_size = static_cast<uint32_t>(
world_size)] __device__(index_t id) -> part_t {
14 changes: 4 additions & 10 deletions graphbolt/src/cuda/extension/gpu_graph_cache.cu
@@ -23,11 +23,7 @@

#include <cstddef>
#ifdef GRAPHBOLT_USE_HIP
#include <cuco/cuda_stream_ref.hpp>
#include <hipcub/hipcub.hpp>
namespace cuda {
using stream_ref = cuco::cuda_stream_ref;
}
#define C10_CUDA_KERNEL_LAUNCH_CHECK C10_HIP_KERNEL_LAUNCH_CHECK
#else
#include <cub/cub.cuh>
@@ -510,12 +506,10 @@ std::tuple<torch::Tensor, std::vector<torch::Tensor>> GpuGraphCache::Replace(
}
if (edge_id_offsets) {
// Append the edge ids as the last element of the output.
output_edge_tensors.push_back(
ops::IndptrEdgeIdsImpl(
output_indptr, output_indptr.scalar_type(),
*edge_id_offsets,
static_cast<int64_t>(
static_cast<indptr_t>(output_size))));
output_edge_tensors.push_back(ops::IndptrEdgeIdsImpl(
output_indptr, output_indptr.scalar_type(),
*edge_id_offsets,
static_cast<int64_t>(static_cast<indptr_t>(output_size))));
Comment on lines +509 to +512
Collaborator

Is this just a reformatting? Maybe revert, as it seems unrelated?

Collaborator Author

This is just reformatting, but the formatting was incorrect before. Once I changed the file, the CI checks flagged it, so even though it's unrelated to the substantive changes here, I think we should keep it.

Collaborator

Ah, I think such linters should really only be looking at modified lines, but I see it comes from upstream, so no need to fight it.

}

{
33 changes: 5 additions & 28 deletions graphbolt/src/cuda/extension/unique_and_compact_map.cu
@@ -25,10 +25,6 @@

#ifdef GRAPHBOLT_USE_HIP
#include <hipcub/hipcub.hpp>
#include <cuco/cuda_stream_ref.hpp>
namespace cuda{
using stream_ref = cuco::cuda_stream_ref;
}
#define C10_CUDA_KERNEL_LAUNCH_CHECK C10_HIP_KERNEL_LAUNCH_CHECK
#else
#include <cub/cub.cuh>
@@ -209,12 +205,8 @@ UniqueAndCompactBatchedHashMapBased(
cub::ArgIndexInputIterator index_it(indexes.data_ptr<int32_t>());
auto input_it = thrust::make_transform_iterator(
index_it,
#ifdef GRAPHBOLT_USE_HIP
::proclaim_return_type
#else
::cuda::proclaim_return_type
#endif
<::cuda::std::tuple<int64_t*, index_t, int32_t, bool>>(
::cuda::proclaim_return_type<
::cuda::std::tuple<int64_t*, index_t, int32_t, bool>>(
[=, map = map.ref(cuco::find)] __device__(auto it)
-> ::cuda::std::tuple<int64_t*, index_t, int32_t, bool> {
const auto i = it.key;
@@ -247,12 +239,7 @@ UniqueAndCompactBatchedHashMapBased(
auto unique_ids_offsets_dev_ptr =
unique_ids_offsets_dev.data_ptr<int64_t>();
auto output_it = thrust::make_tabulate_output_iterator(
#ifdef GRAPHBOLT_USE_HIP
::proclaim_return_type
#else
::cuda::proclaim_return_type
#endif
<void>(
::cuda::proclaim_return_type<void>(
[=, unique_ids_ptr = unique_ids.data_ptr<index_t>(),
part_ids_ptr =
part_ids ? part_ids->data_ptr<cuda::part_t>() : nullptr,
@@ -276,12 +263,7 @@ UniqueAndCompactBatchedHashMapBased(
DeviceSelect::If, input_it, output_it,
unique_ids_offsets_dev_ptr + num_batches,
offsets_ptr[2 * num_batches],
#ifdef GRAPHBOLT_USE_HIP
::proclaim_return_type
#else
::cuda::proclaim_return_type
#endif
<bool>([] __device__(const auto& t) {
::cuda::proclaim_return_type<bool>([] __device__(const auto& t) {
return ::cuda::std::get<3>(t);
}));
auto unique_ids_offsets = torch::empty(
@@ -300,12 +282,7 @@ UniqueAndCompactBatchedHashMapBased(
thrust::make_zip_iterator(
unique_ids_offsets_dev2.data_ptr<int64_t>(),
unique_ids_offsets.data_ptr<int64_t>()),
#ifdef GRAPHBOLT_USE_HIP
::proclaim_return_type
#else
::cuda::proclaim_return_type
#endif
<
::cuda::proclaim_return_type<
thrust::tuple<int64_t, int64_t>>(
[=] __device__(const auto x) {
return thrust::make_tuple(x, x);
85 changes: 30 additions & 55 deletions script/install_graphbolt_deps.sh
@@ -1,66 +1,41 @@
#!/usr/bin/env bash
export CC=/opt/rocm/llvm/bin/clang
export CXX=/opt/rocm/llvm/bin/clang++

# set the install prefix to the cwd/install
# INSTALL_PREFIX=$(pwd)/install
INSTALL_PREFIX=/opt/rocm
FILE_SOURCE_DIR=$(dirname $(realpath $0))
DEPS_DIR=$(pwd)
ROCM_ROOT=/opt/rocm

# Not installed by default
git clone https://github.com/ROCm/libhipcxx.git
cd libhipcxx
git checkout v2.2.0
cmake -B build \
-DCMAKE_INSTALL_PREFIX=${INSTALL_PREFIX}
cmake --build build --target install
cd ${DEPS_DIR}
export CC=${ROCM_ROOT}/llvm/bin/clang
export CXX=${ROCM_ROOT}/llvm/bin/clang++

# Need to patch for https://github.com/ROCm/rocm-libraries/issues/101.
# Should be fixed in
# https://github.com/ROCm/rocm-libraries/commit/e403601a2abe4a305cafd6526af2dc9bc69823e2#diff-7579081ee4dda43a07274a2397b8277bfa022af6d485ba086efc66a124ee8f5b
git clone https://github.com/tpopp/rocThrust.git
cd rocThrust
git checkout 613db9a025709fb18f2a676543a17850bd231b04
cmake -B build \
-DCMAKE_INSTALL_PREFIX=${INSTALL_PREFIX}
cmake --build build --target install
cd ${DEPS_DIR}
set -x
INSTALL_PREFIX=${ROCM_ROOT}
FILE_SOURCE_DIR=$(dirname $(realpath $0))
DEPS_DIR=$(pwd)
export CMAKE_PREFIX_PATH="/opt/rocm/hip/lib/cmake;/opt/rocm/lib/cmake"

# Need to patch for https://github.com/ROCm/hipCollections/issues/7, https://github.com/ROCm/hipCollections/issues/8, https://github.com/ROCm/hipCollections/issues/9
git clone https://github.com/tpopp/hipCollections.git
git clone https://github.com/ROCm/hipCollections.git -b release/rocmds-25.10
export RAPIDS_CMAKE_SCRIPT_BRANCH=release/rocmds-25.10
cd hipCollections
git checkout 6e31da8fd309f229d28adde8583a30bb4efaf1b7
cmake -B build \
-DCMAKE_INSTALL_PREFIX=${INSTALL_PREFIX} -DINSTALL_CUCO=ON -DBUILD_TESTS=OFF -DBUILD_BENCHMARKS=OFF -DBUILD_EXAMPLES=OFF
cmake --build build --target install
cd ${DEPS_DIR}


# if ROCM < 7.0 we also need to install rocThrust
ROCM_VERSION=$(/opt/rocm/bin/hipconfig --version)
#strip the major version from the ROCM_VERSION (before the dot)
ROCM_VERSION=${ROCM_VERSION%%.*}
echo "Working with ROCm Major Version: $ROCM_VERSION"
if [ "$ROCM_VERSION" -lt "7" ]; then

# Need to patch for https://github.com/ROCm/rocm-libraries/issues/94. Fixed in https://github.com/ROCm/rocm-libraries/commit/2539bb2e1cd17d287f532a65125b662bf0b658dc
git clone https://github.com/tpopp/hipCUB.git
cd hipCUB
git checkout f342111197dd020f1c4210b16aa550b08992e97b
cmake -B build \
-DCMAKE_INSTALL_PREFIX=${INSTALL_PREFIX}
cmake --build build --target install
cd ${DEPS_DIR}
else
echo "ROCm Major Version is 7.0 or higher, skipping hipCUB installation"
# TODO remove this once the patches are merged
# Right now we need to patch the rocPRIM headers to fix the build because these
# config headers are missing gfx942 (I've added them manually)
cp ${FILE_SOURCE_DIR}/*.hpp ${INSTALL_PREFIX}/include/rocprim/device/detail/config/.

fi



# TODO this is an unacceptable way to do this,
# see https://github.com/ROCm/libhipcxx/issues/10 for more details
# This was implicitly not allowed in previous releases we were using,
# but with v2.7.0 they are explicitly not allowed.

# We only use semaphores for a counter of IO operations in graphbolt,
# that only runs on the host (not on the device) so we should be "safe"
# to use this for now.
sed -i '/#error semaphore is not supported on AMD hardware and should not be included/d' ${INSTALL_PREFIX}/include/rapids/libhipcxx/cuda/semaphore
sed -i '/#error semaphore is not supported on AMD hardware and should not be included/d' ${INSTALL_PREFIX}/include/rapids/libhipcxx/hip/semaphore
sed -i '/#error semaphore is not supported on AMD hardware and should not be included/d' ${INSTALL_PREFIX}/include/rapids/libhipcxx/cuda/std/semaphore
sed -i '/#error semaphore is not supported on AMD hardware and should not be included/d' ${INSTALL_PREFIX}/include/rapids/libhipcxx/hip/std/semaphore

# TODO remove this once the patches are merged
Collaborator

I recommend tagging all TODOs with a corresponding GitHub issue, e.g. TODO(#123)

Collaborator Author

Yeah, this was a miss on my part when I added this TODO a couple of weeks ago. It's tracked in an internal ticket, and I'm trying to find the public reference. I think the fix has already been merged into rocm-libraries, but it wasn't included in several important ROCm releases (e.g. 7.0.0 and 7.1.0). I'll track down the links and add them here.

Collaborator Author

It was merged into the develop branch of rocm-libraries here: ROCm/rocm-libraries#1883. I'm not sure when it will be included in a ROCm release, though. I've added a link to this PR and more comments here to better document this.

Collaborator

Yeah, I was suggesting something like opening an issue in this repo to fix this and linking this TODO to that issue. Same with the TODO above. So it would be

Suggested change
# TODO remove this once the patches are merged
# TODO(#1234) remove this once the patches are merged upstream

and then issue dmlc#1234 can have all the necessary detail. I think it would also be more self-explanatory if, rather than copying all the loose header files in this directory, the script applied the upstream commit as a patch. But anyway, this is all somewhat unrelated to your change here.
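
As a rough illustration of the TODO(#123) convention suggested in this thread, the check below lists TODO comments that carry no issue tag. The function name `find_untagged_todos` and the regular expression are assumptions made for this sketch; nothing like it is known to exist in this repository or its CI.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of this repo): print every line in the given
# files that mentions TODO but has no issue tag of the form TODO(#<digits>).
# Limitation: a line holding both a tagged and an untagged TODO is not reported.
find_untagged_todos() {
  # First grep keeps lines containing TODO (with file name and line number);
  # second grep drops the lines that already have a TODO(#<digits>) tag.
  grep -nH 'TODO' "$@" | grep -vE 'TODO\(#[0-9]+\)'
}
```

It could be run as `find_untagged_todos script/install_graphbolt_deps.sh`; the exit status is that of the final `grep`, so it is 0 when at least one untagged TODO is found.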

# the patches for this were merged in https://github.com/ROCm/rocm-libraries/pull/1883
# but may take more time to be released.

# Right now we need to patch the rocPRIM headers to fix the build because these
# config headers are missing gfx942 (I've added them manually)
cp ${FILE_SOURCE_DIR}/*.hpp ${INSTALL_PREFIX}/include/rocprim/device/detail/config/.
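
The four `sed` invocations earlier in this script differ only in the header path, so the semaphore workaround could be written as a loop. This is a sketch only: the function name `strip_semaphore_error` is hypothetical, the header list simply repeats the paths already used in the script, and `sed -i` without a suffix argument assumes GNU sed, as the script itself does.

```shell
#!/usr/bin/env bash
# Hypothetical helper: delete the "#error semaphore ..." guard line from each
# semaphore header under the given libhipcxx include root.
# Headers that do not exist are skipped with a warning on stderr.
strip_semaphore_error() {
  local root="$1"
  local header
  for header in cuda/semaphore hip/semaphore cuda/std/semaphore hip/std/semaphore; do
    if [ -f "${root}/${header}" ]; then
      # Same expression as in the script: remove any line containing the guard.
      sed -i '/#error semaphore is not supported on AMD hardware and should not be included/d' \
        "${root}/${header}"
    else
      echo "warning: ${root}/${header} not found, skipping" >&2
    fi
  done
}
```

With the prefix used in this script, the call would be `strip_semaphore_error "${INSTALL_PREFIX}/include/rapids/libhipcxx"`. Running it twice is harmless, since the second pass finds no matching line to delete.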