Merge `source/lib/src/cuda` and `source/lib/src/rocm` into `source/lib/src/gpu`.

- Define macros `gpuGetLastError`, `gpuDeviceSynchronize`, `gpuMemcpy`, `gpuMemcpyDeviceToHost`, `gpuMemcpyHostToDevice`, and `gpuMemset` to make them available for both CUDA and ROCm.
- Use the `<<< >>>` syntax for both CUDA and ROCm. Per ROCm/HIP@cf78d85, it has been supported in HIP since 2018.
- Fix several integer constants that should be `double` or `float`.
- For tabulate:
  - Fix `WARP_SIZE` for ROCm. Per pytorch/pytorch#64302, `WARP_SIZE` can be 32 or 64, so it should not be hardcoded to 64.
  - Add `GpuShuffleSync`. Per ROCm/HIP#1491, `__shfl_sync` is not supported by HIP.
  - After merging the code, #1274 should also work for ROCm.
  - Use the same `ii` for #830 and #2357. Both work, but `ii` had a different meaning in each PR; it now has the same meaning in both.
    - However, `ii` in `tabulate_fusion_se_a_fifth_order_polynomial` (rocm), added by #2532, is wrong. It is corrected once the code is merged.
  - The optimization in #830 was not applied to ROCm.
  - `__syncwarp` is not supported by ROCm.
- After merging the code, #2661 will be applied to ROCm. The TF ROCm stream is still blocking (https://github.com/tensorflow/tensorflow/blob/9d1262082e761cd85d6726bcbdfdef331d6d72c6/tensorflow/compiler/xla/stream_executor/rocm/rocm_driver.cc#L566), but we don't know whether it will change to non-blocking.
- There are several other differences between CUDA and ROCm.

---------

Signed-off-by: Jinzhe Zeng <[email protected]>
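The macro aliasing described in the commit message could look roughly like the sketch below. It is not the code from this commit. Only the six names `gpuGetLastError`, `gpuDeviceSynchronize`, `gpuMemcpy`, `gpuMemcpyDeviceToHost`, `gpuMemcpyHostToDevice`, and `gpuMemset` come from the message. The build switches `SKETCH_USE_CUDA` and `SKETCH_USE_HIP`, the extra aliases `gpuError_t`, `gpuSuccess`, `gpuMalloc`, and `gpuFree`, the helpers `check` and `round_trip`, and the host-only fallback branch are all assumptions added so the example compiles without a GPU toolkit.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <stdexcept>
#include <vector>

#if defined(SKETCH_USE_CUDA)  // hypothetical switch, not from the commit
#include <cuda_runtime.h>
#define gpuError_t cudaError_t
#define gpuSuccess cudaSuccess
#define gpuMalloc cudaMalloc
#define gpuFree cudaFree
#define gpuGetLastError cudaGetLastError
#define gpuDeviceSynchronize cudaDeviceSynchronize
#define gpuMemcpy cudaMemcpy
#define gpuMemcpyDeviceToHost cudaMemcpyDeviceToHost
#define gpuMemcpyHostToDevice cudaMemcpyHostToDevice
#define gpuMemset cudaMemset
#elif defined(SKETCH_USE_HIP)  // hypothetical switch, not from the commit
#include <hip/hip_runtime.h>
#define gpuError_t hipError_t
#define gpuSuccess hipSuccess
#define gpuMalloc hipMalloc
#define gpuFree hipFree
#define gpuGetLastError hipGetLastError
#define gpuDeviceSynchronize hipDeviceSynchronize
#define gpuMemcpy hipMemcpy
#define gpuMemcpyDeviceToHost hipMemcpyDeviceToHost
#define gpuMemcpyHostToDevice hipMemcpyHostToDevice
#define gpuMemset hipMemset
#else
// Host-only stand-ins (assumption): they mimic the call shapes with plain
// memory so the sketch builds and runs without CUDA or ROCm installed.
using gpuError_t = int;
constexpr gpuError_t gpuSuccess = 0;
enum gpuMemcpyKind { gpuMemcpyHostToDevice, gpuMemcpyDeviceToHost };
inline gpuError_t gpuMalloc(void** ptr, std::size_t n) {
  *ptr = std::malloc(n);
  return *ptr ? gpuSuccess : 1;
}
inline gpuError_t gpuFree(void* ptr) {
  std::free(ptr);
  return gpuSuccess;
}
inline gpuError_t gpuGetLastError() { return gpuSuccess; }
inline gpuError_t gpuDeviceSynchronize() { return gpuSuccess; }
inline gpuError_t gpuMemcpy(void* dst, const void* src, std::size_t n,
                            gpuMemcpyKind) {
  std::memcpy(dst, src, n);
  return gpuSuccess;
}
inline gpuError_t gpuMemset(void* dst, int value, std::size_t n) {
  std::memset(dst, value, n);
  return gpuSuccess;
}
#endif

// Turn a failed runtime call into an exception (hypothetical helper).
inline void check(gpuError_t err) {
  if (err != gpuSuccess) throw std::runtime_error("GPU runtime call failed");
}

// Backend-agnostic caller: zero a buffer, upload `host`, copy it back.
// Error paths leak `dev`; acceptable for a sketch, not for production code.
inline std::vector<float> round_trip(const std::vector<float>& host) {
  assert(!host.empty());
  const std::size_t bytes = host.size() * sizeof(float);
  float* dev = nullptr;
  check(gpuMalloc(reinterpret_cast<void**>(&dev), bytes));
  check(gpuMemset(dev, 0, bytes));
  check(gpuMemcpy(dev, host.data(), bytes, gpuMemcpyHostToDevice));
  std::vector<float> out(host.size());
  check(gpuMemcpy(out.data(), dev, bytes, gpuMemcpyDeviceToHost));
  check(gpuDeviceSynchronize());
  check(gpuGetLastError());
  check(gpuFree(dev));
  return out;
}
```

With this pattern the calling code (`round_trip` here) is written once against the `gpu*` names, and only the alias block differs between the CUDA and ROCm builds, which is what allows two source trees to collapse into one.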
Showing 41 changed files with 490 additions and 3,878 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,3 +1,3 @@ | ||
[submodule "source/lib/src/cuda/cub"] | ||
path = source/lib/src/cuda/cub | ||
[submodule "source/lib/src/gpu/cub"] | ||
path = source/lib/src/gpu/cub | ||
url = https://github.com/NVIDIA/cub.git |
This file was deleted.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,95 @@ | ||
if(USE_CUDA_TOOLKIT) | ||
# required cmake version 3.23: CMAKE_CUDA_ARCHITECTURES all | ||
cmake_minimum_required(VERSION 3.23) | ||
# project name | ||
project(deepmd_op_cuda) | ||
set(GPU_LIB_NAME deepmd_op_cuda) | ||
|
||
set(CMAKE_CUDA_ARCHITECTURES all) | ||
enable_language(CUDA) | ||
set(CMAKE_CUDA_STANDARD 11) | ||
add_compile_definitions( | ||
"$<$<COMPILE_LANGUAGE:CUDA>:_GLIBCXX_USE_CXX11_ABI=${OP_CXX_ABI}>") | ||
|
||
find_package(CUDAToolkit REQUIRED) | ||
|
||
# take dynamic open cudart library replace of static one so it's not required | ||
# when using CPUs | ||
add_subdirectory(cudart) | ||
|
||
# nvcc -o libdeepmd_op_cuda.so -I/usr/local/cub-1.8.0 -rdc=true | ||
# -DHIGH_PREC=true -gencode arch=compute_61,code=sm_61 -shared -Xcompiler | ||
# -fPIC deepmd_op.cu -L/usr/local/cuda/lib64 -lcudadevrt very important here! | ||
# Include path to cub. for searching device compute capability, | ||
# https://developer.nvidia.com/cuda-gpus | ||
|
||
# cub has been included in CUDA Toolkit 11, we do not need to include it any | ||
# more see https://github.com/NVIDIA/cub | ||
if(${CMAKE_CUDA_COMPILER_VERSION} VERSION_LESS "11") | ||
include_directories(cub) | ||
endif() | ||
if(${CMAKE_CUDA_COMPILER_VERSION} VERSION_LESS "9") | ||
message(FATAL_ERROR "CUDA version must be >= 9.0") | ||
endif() | ||
|
||
message(STATUS "NVCC version is " ${CMAKE_CUDA_COMPILER_VERSION}) | ||
|
||
# arch will be configured by CMAKE_CUDA_ARCHITECTURES | ||
set(CMAKE_CUDA_FLAGS | ||
"${CMAKE_CUDA_FLAGS} -DCUB_IGNORE_DEPRECATED_CPP_DIALECT -DCUB_IGNORE_DEPRECATED_CPP_DIALECT" | ||
) | ||
|
||
file(GLOB SOURCE_FILES "*.cu") | ||
|
||
add_library(${GPU_LIB_NAME} SHARED ${SOURCE_FILES}) | ||
target_link_libraries(${GPU_LIB_NAME} PRIVATE deepmd_dyn_cudart) | ||
|
||
elseif(USE_ROCM_TOOLKIT) | ||
|
||
# required cmake version | ||
cmake_minimum_required(VERSION 3.21) | ||
# project name | ||
project(deepmd_op_rocm) | ||
set(GPU_LIB_NAME deepmd_op_rocm) | ||
set(CMAKE_LINK_WHAT_YOU_USE TRUE) | ||
|
||
# set c++ version c++11 | ||
set(CMAKE_CXX_STANDARD 14) | ||
set(CMAKE_HIP_STANDARD 14) | ||
add_definitions("-DCUB_IGNORE_DEPRECATED_CPP_DIALECT") | ||
add_definitions("-DCUB_IGNORE_DEPRECATED_CPP_DIALECT") | ||
|
||
message(STATUS "HIP major version is " ${HIP_VERSION_MAJOR}) | ||
|
||
set(HIP_HIPCC_FLAGS -fno-gpu-rdc; -fPIC --std=c++14 ${HIP_HIPCC_FLAGS} | ||
)# --amdgpu-target=gfx906 | ||
if(HIP_VERSION VERSION_LESS 3.5.1) | ||
set(HIP_HIPCC_FLAGS -hc; ${HIP_HIPCC_FLAGS}) | ||
endif() | ||
|
||
file(GLOB SOURCE_FILES "*.cu") | ||
|
||
hip_add_library(${GPU_LIB_NAME} SHARED ${SOURCE_FILES}) | ||
|
||
endif() | ||
|
||
target_include_directories( | ||
${GPU_LIB_NAME} | ||
PUBLIC $<BUILD_INTERFACE:${CMAKE_CURRENT_SOURCE_DIR}/../../include/> | ||
$<INSTALL_INTERFACE:include>) | ||
target_precompile_headers(${GPU_LIB_NAME} PUBLIC [["device.h"]]) | ||
if(APPLE) | ||
set_target_properties(${GPU_LIB_NAME} PROPERTIES INSTALL_RPATH @loader_path) | ||
else() | ||
set_target_properties(${GPU_LIB_NAME} PROPERTIES INSTALL_RPATH "$ORIGIN") | ||
endif() | ||
|
||
if(BUILD_CPP_IF AND NOT BUILD_PY_IF) | ||
install( | ||
TARGETS ${GPU_LIB_NAME} | ||
EXPORT ${CMAKE_PROJECT_NAME}Targets | ||
DESTINATION lib/) | ||
endif(BUILD_CPP_IF AND NOT BUILD_PY_IF) | ||
if(BUILD_PY_IF) | ||
install(TARGETS ${GPU_LIB_NAME} DESTINATION deepmd/lib/) | ||
endif(BUILD_PY_IF) |
Submodule cub updated from 000000 to b22981
Ten files are shown as renamed without changes.