Summary
Merge the CUDA and ROCm code into one file and distinguish the two using macros.
Detailed Description
There is a lot of duplicated code between the CUDA and ROCm implementations, and most of it is identical. Adding a new GPU method is troublesome because it has to be added twice, once for CUDA and once for ROCm.
I notice that TensorFlow implements CUDA and ROCm in a single file, for example https://github.com/tensorflow/tensorflow/blob/00a17d7451a789a4df994dac7d616ce2f4438ff0/tensorflow/core/kernels/relu_op_gpu.cu.cc#L16. The parts that differ between the two are controlled by macros, i.e., `GOOGLE_CUDA` and `TENSORFLOW_USE_ROCM`. A universal launch function, `GpuLaunchKernel`, is used for both CUDA and ROCm.
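To make the idea concrete, here is a minimal sketch of what a merged source file could look like. Only the macros `GOOGLE_CUDA` and `TENSORFLOW_USE_ROCM` come from the text above; the names `relu_kernel` and `relu` are hypothetical and are not taken from DeePMD-kit or TensorFlow. The sketch calls each runtime's own launch form directly instead of TensorFlow's `GpuLaunchKernel`, and it adds a plain C++ fallback (an assumption for illustration) so the file also builds without a GPU toolchain.

```cpp
// Single-source sketch: one file for CUDA, ROCm, and (for illustration) plain C++.
// GOOGLE_CUDA / TENSORFLOW_USE_ROCM are the macros named in the issue;
// relu_kernel and relu are hypothetical names.
#include <cassert>

#if GOOGLE_CUDA
#include <cuda_runtime.h>
#elif TENSORFLOW_USE_ROCM
#include <hip/hip_runtime.h>
#endif

#if GOOGLE_CUDA || TENSORFLOW_USE_ROCM
// Device code shared by both backends: written once, compiled by nvcc or hipcc.
__global__ void relu_kernel(const float* in, float* out, int n) {
  int i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i < n) {
    out[i] = in[i] > 0.f ? in[i] : 0.f;
  }
}
#endif

// Suffix-free entry point: callers never see a _cuda or _rocm name.
// On GPU builds `in` and `out` must be device pointers; on the host
// fallback they are ordinary host pointers.
void relu(const float* in, float* out, int n) {
  assert(n >= 0);
  if (n == 0) {
    return;
  }
#if GOOGLE_CUDA
  const int threads = 256;
  const int blocks = (n + threads - 1) / threads;
  relu_kernel<<<blocks, threads>>>(in, out, n);
#elif TENSORFLOW_USE_ROCM
  const int threads = 256;
  const int blocks = (n + threads - 1) / threads;
  hipLaunchKernelGGL(relu_kernel, dim3(blocks), dim3(threads), 0, 0, in, out, n);
#else
  // Host fallback, an assumption of this sketch only.
  for (int i = 0; i < n; ++i) {
    out[i] = in[i] > 0.f ? in[i] : 0.f;
  }
#endif
}
```

Only the launch line differs between the two GPU branches; the kernel body and the public function name are shared, which is the duplication this issue aims to remove.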
Remove all `_cuda` or `_rocm` suffixes in function names, as proposed in #2838. They can be merged in the following PRs.
(Replace all: `gpu_cuda` -> `gpu`; `gpu_rocm` -> `gpu`)
Signed-off-by: Jinzhe Zeng <[email protected]>
Below is the roadmap for DeePMD-kit:
- `gpu_cuda.h` and `gpu_rocm.h` (merge cuda and rocm files #2844)
- No `_rocm` suffix anymore (remove `_cuda` or `_rocm` suffix #2839)
- `GOOGLE_CUDA || TENSORFLOW_USE_ROCM` for the same code
- `lib/cuda` and `lib/rocm` (merge cuda and rocm files #2844)
- `op` (merge CUDA and ROCm codes in op #2847)
- `lib/tests` (merge CUDA and ROCm codes in tests #2846)
- `lib/include` (merge CUDA and ROCm in header files #2845)
Further Information, Files, and Links
No response