Profile rocFFT kernels
In bash, enable HIP API tracing with "export HIP_TRACE_API=1" (turn it off again with "export HIP_TRACE_API=0").
Then launch your application. Every HIP API call is traced, including rocFFT kernel launches, memory copies, and memory allocation/deallocation.
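If you would rather not leave tracing switched on for the whole shell session, the variable can be set for a single run only. The following is a minimal sketch, not part of HIP or rocFFT: trace_run is a hypothetical helper name, and ./your_executable and hip_trace.log are placeholders. It captures both stdout and stderr, so it does not depend on which stream the trace is printed to.

```bash
#!/usr/bin/env bash
# trace_run LOG_FILE COMMAND [ARGS...]
# Hypothetical helper: runs one command with HIP API tracing enabled
# for that command only, and saves everything it prints to LOG_FILE.
trace_run() {
    local log_file="$1"
    shift
    # A VAR=value prefix applies only to this one command, so the
    # calling shell's environment is left unchanged afterwards.
    HIP_TRACE_API=1 "$@" >"$log_file" 2>&1
}

# Usage (placeholder names):
#   trace_run hip_trace.log ./your_executable
```

Because the variable is scoped to one command, there is no need to reset it with =0 afterwards.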
For more profiling tools, see Profiling and Debugging HIP Code
The IR and ISA can be dumped by setting the following environment variables before building and running the application.
export KMDUMPISA=1
export KMDUMPLLVM=1
export KMDUMPDIR=/path/to/dump
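The three exports above can be wrapped in a small function that also creates the dump directory first. This is only a sketch: setup_kernel_dump is a hypothetical helper name, the default directory is a placeholder, and creating the directory up front is a precaution in case the toolchain does not create it itself.

```bash
#!/usr/bin/env bash
# setup_kernel_dump [DUMP_DIR]
# Hypothetical helper: creates the dump directory and exports the three
# variables listed above. The default path is a placeholder.
setup_kernel_dump() {
    local dump_dir="${1:-$PWD/kernel_dump}"
    # Create the directory up front so the dump has somewhere to go.
    mkdir -p "$dump_dir" || return 1
    export KMDUMPISA=1
    export KMDUMPLLVM=1
    export KMDUMPDIR="$dump_dir"
}

# Usage: call it in the current shell (not a subshell) so the exports
# persist, then build and run the application.
#   setup_kernel_dump /path/to/dump
```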
rcprof is a command line tool for profiling HIP kernels, very similar to nvprof. It is located in /opt/rocm/profiler/bin.
Example usage:
/opt/rocm/profiler/bin/rcprof -A ./your_executable
After the run, the trace output apitrace.atp will be in your home directory.
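The run and the check for its output can be combined in one step. The sketch below uses only the -A option and the output location described above; run_rcprof_trace and the RCPROF override variable are hypothetical names introduced for illustration, and ./your_executable is a placeholder.

```bash
#!/usr/bin/env bash
# RCPROF may be overridden if rcprof is installed somewhere else.
RCPROF="${RCPROF:-/opt/rocm/profiler/bin/rcprof}"

# run_rcprof_trace COMMAND [ARGS...]
# Hypothetical helper: runs rcprof -A on the command, then checks that
# apitrace.atp exists in the home directory.
run_rcprof_trace() {
    if [ ! -x "$RCPROF" ]; then
        echo "rcprof not found at $RCPROF" >&2
        return 1
    fi
    "$RCPROF" -A "$@" || return 1
    if [ -f "$HOME/apitrace.atp" ]; then
        echo "trace written to $HOME/apitrace.atp"
    else
        echo "apitrace.atp not found in $HOME" >&2
        return 1
    fi
}

# Usage (placeholder name):
#   run_rcprof_trace ./your_executable
```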
View it with the CodeXL GUI. Download and install CodeXL.
Open CodeXL and create a project, then import the *.atp file into the session. Note: switch to profile mode and select HSA mode (the default is OpenCL mode) before importing the *.atp file.
Run "/opt/rocm/profiler/bin/rcprof --help" for more options.
Use "nvprof ./your_executable" to profile every CUDA runtime invocation, including kernel launches and memory copies.