-
Notifications
You must be signed in to change notification settings - Fork 475
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[PJRT] PJRT_Client_Compile ignores xla_gpu_cuda_data_dir #21428
Comments
I see that the CompilerProvider is being created at startup time, and I don't see the options being passed to do using compilation, unfortunately. I see it being created using |
This is highly likely the offending commit: 7d8d569 |
Hi Steeve, Since this is a debug option it is supposed to be provided through XLA_FLAGS. Can you explain a little more about your use case? Are you using the flag to change between different CUDA data dirs with each Compile call or is it about setting the data dir programmatically? |
This is the way it should be [1]. Thankfully, other plugins don't use it. 1. openxla/xla#21428
Hi, we are working on https://github.com/zml/zml. Would moving the cuda data dir out of debug options and into GpuClientOptions an acceptable solution ? xla/xla/pjrt/plugin/xla_gpu/xla_gpu_client_options.h Lines 30 to 40 in 3e87afa
|
Putting something into xla_gpu_client_options.h sounds good to me. But I'm afraid I don't think that's gonna be enough. The main problem is that currently there is no way to pass any kind of parameter to the constructor of the Compiler instance (see
Then it has to be plumbed through all the way from the PJRT Client to where the Compiler class is accessed. I haven't looked yet how feasible that is. I'm happy to review a PR for a change along those lines. A hacky but easier solution is probably to set XLA_FLAGS (the environment variable) before doing anything in XLA. Then on first usage it should parse the flags from there and instantiate the compiler correctly. |
Change of plans. Please hold off from investing any time into this right now. I got two other reports about the same issue (also today interestingly). Since the problem is a lot more widespread than I thought I will try to fix it (do some hacky thing) in NVPTXCompiler directly so that no downstream changes are needed until we can fully move to library compilation which is hopefully coming soon. I hope that I have a fix by tomorrow. Does that work for you? |
yes, it works for us. Our current mitigation is to set XLA_FLAGS early in the process like you suggested. |
Hey, Can you please verify that it works for you and then close the bug as fixed - or complain to me if it doesn't. :-) |
Hi,
Since 5f1fe6a, it seems
CompileOptions.executable_build_options.debug_options.xla_gpu_cuda_data_dir
in theCompileOptions
pb passed as parameter is ignored.We're trying to bisect the offending commit, but this is a rather (a few weeks/months ago) regression.
Setting it via
XLA_FLAGS
works fine.Regards,
The text was updated successfully, but these errors were encountered: