This repository has been archived by the owner on Jun 9, 2023. It is now read-only.

Add CPU kernel builders #25

Open
wants to merge 1 commit into
base: master
from


maxhgerlach

Currently, it is not possible to add NVTX tracing to ops that may be executed on the CPU rather than the GPU. In that case, one runs into exceptions like:
tensorflow.python.framework.errors_impl.InvalidArgumentError: Cannot assign a device for operation .../NvtxStart: Could not satisfy explicit device specification '/device:CPU:0' because no supported kernel for CPU devices is available.

This can be fixed in a straightforward manner by registering CPU kernels for NvtxStart and NvtxEnd. As far as I can tell, NVTX tracing should work fine for non-CUDA code, so this feels generally useful to me.
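
To illustrate why the error above goes away once CPU kernels are registered, here is a minimal, self-contained toy model of a kernel registry keyed by op name and device type. It is only a sketch: it does not use TensorFlow, and `ToyKernelRegistry`, `GpuOnlyRegistry`, and `WithCpuKernels` are hypothetical names invented for this example, not part of TensorFlow or of this repository.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <utility>

// Toy stand-in for a kernel registry. NOT TensorFlow's implementation;
// every name in this block is hypothetical and exists only for illustration.
class ToyKernelRegistry {
 public:
  // Record that a kernel for `op` is available on `device`.
  void Register(const std::string& op, const std::string& device) {
    assert(!op.empty() && !device.empty());
    kernels_.emplace(op, device);
  }

  // Placement can only succeed if a kernel exists for this exact
  // (op, device) pair.
  bool HasKernel(const std::string& op, const std::string& device) const {
    return kernels_.count(std::make_pair(op, device)) > 0;
  }

 private:
  std::set<std::pair<std::string, std::string>> kernels_;
};

// Models the situation described above: kernels exist for GPU only.
inline ToyKernelRegistry GpuOnlyRegistry() {
  ToyKernelRegistry registry;
  registry.Register("NvtxStart", "GPU");
  registry.Register("NvtxEnd", "GPU");
  return registry;
}

// Models the proposed fix: CPU kernels are registered in addition.
inline ToyKernelRegistry WithCpuKernels() {
  ToyKernelRegistry registry = GpuOnlyRegistry();
  registry.Register("NvtxStart", "CPU");
  registry.Register("NvtxEnd", "CPU");
  return registry;
}
```

In this model, `GpuOnlyRegistry().HasKernel("NvtxStart", "CPU")` is false, which corresponds to the "no supported kernel for CPU devices" failure, while the same lookup on `WithCpuKernels()` succeeds.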
