Cross-platform native launchers for Python #275
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,78 @@ | ||
--- | ||
created: 2022-09-12 | ||
last updated: 2022-09-12 | ||
status: To be reviewed | ||
reviewers: | ||
- TODO | ||
title: Cross-platform native launchers for Python | ||
authors: | ||
- groodt | ||
--- | ||
|
||
|
||
# Abstract | ||
|
||
This document describes an approach for launching `py_binary` artifacts hermetically using the resolved Python toolchain. | ||
|
||
|
||
# Background | ||
|
||
Currently, `py_binary` is non-hermetic and launches inconsistently between platforms. | ||
Review comment: I think something worth mentioning is the "Python Launcher for Windows". Basically, a python.org Windows installs have a See |
||
|
||
On macOS and Linux, there is a [python_stub](https://github.com/bazelbuild/bazel/blob/master/src/main/java/com/google/devtools/build/lib/bazel/rules/python/python_stub_template.txt) | ||
that is non-hermetic and requires a bootstrap Python interpreter on the host. The "shebang" can be overridden, but | ||
a "shebang" always depends on the runtime host. | ||
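To illustrate why a "shebang" ties the launch to the runtime host, here is a minimal Python sketch of how a shebang line resolves to an interpreter. The function name `resolve_shebang` and its `which` parameter are hypothetical and exist only for illustration. This is not how the `python_stub` itself is implemented.

```python
import shutil
from typing import Callable, Optional


def resolve_shebang(
    first_line: str,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Optional[str]:
    """Return the interpreter a shebang line would resolve to on this host.

    Returns None when the line is not a shebang or when the interpreter
    cannot be found on the host.
    """
    if not first_line.startswith("#!"):
        return None
    parts = first_line[2:].strip().split()
    if not parts:
        return None
    # "#!/usr/bin/env python3" defers to a PATH lookup on the runtime host,
    # so the result differs from machine to machine.
    if parts[0].endswith("/env") and len(parts) > 1:
        return which(parts[1])
    # An absolute path is fixed, but it still has to exist on the host.
    return parts[0]
```

Either branch depends on what is installed on the machine that runs the binary, which is the non-hermetic behaviour described above.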
|
||
On Windows, there is a [native launcher](https://github.com/meteorcloudy/bazel/blob/master/src/tools/launcher/python_launcher.cc) | ||
Review comment: I think you meant to link to bazelbuild here, not meteorcloudy? https://github.com/bazelbuild/bazel/blob/master/src/tools/launcher/python_launcher.cc |
||
that launches `python.exe` on the host, which then launches the `py_binary` with the same `python_stub` as macOS and Linux. | ||
|
||
Related issues: | ||
* [py_binary with hermetic toolchain requires a system interpreter](https://github.com/bazelbuild/rules_python/issues/691) | ||
* [Neither python_top nor python toolchain works with starlark actions on windows](https://github.com/bazelbuild/bazel/issues/7947#issuecomment-495265016) | ||
|
||
This situation is undesirable because it assumes that the target platform has a bootstrapping Python interpreter | ||
available, and it makes the hermetic Python interpreters provided by `rules_python` less useful. It is also surprising to | ||
users who expect Bazel to output self-contained binary artifacts for a target platform. | ||
|
||
This situation exists because of bootstrapping. Ultimately, *something* needs to find the Python | ||
interpreter in the runfiles and use that to launch the program. Currently, Bazel assumes the target platform will | ||
provide that bootstrapping functionality. | ||
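As a rough illustration of the bootstrapping step, the sketch below locates a runfiles directory in Python. It assumes two Bazel conventions: the `RUNFILES_DIR` environment variable and a `<binary>.runfiles` directory next to the executable. The function name `find_runfiles_dir` is hypothetical, and manifest-based lookup (for example via `RUNFILES_MANIFEST_FILE`) is deliberately left out.

```python
import os
from typing import Mapping, Optional


def find_runfiles_dir(argv0: str, env: Mapping[str, str]) -> Optional[str]:
    """Best-effort lookup of the runfiles directory for a launched binary."""
    # 1. An explicit RUNFILES_DIR wins if it points at a real directory.
    candidate = env.get("RUNFILES_DIR")
    if candidate and os.path.isdir(candidate):
        return candidate
    # 2. Otherwise look for "<binary>.runfiles" next to the executable.
    candidate = os.path.abspath(argv0) + ".runfiles"
    if os.path.isdir(candidate):
        return candidate
    # Nothing found: a real launcher would fall back to a manifest or fail.
    return None
```

In the proposal, this logic would live in the native launcher, so it would not need a host interpreter to run.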
|
||
|
||
# Proposal | ||
|
||
Extend the native launcher functionality to all platforms and use it to locate the relevant Python interpreter and | ||
Python program in the `runfiles` tree to launch the `py_binary`. No assumptions should be made about the target platform. | ||
|
||
In pseudo-code, the proposal is as follows: | ||
Review comment: This is pretty high-level pseudo-code :). Something a little more concrete would be better. e.g., it has to find the runfiles directory to resolve the relative path names. |
||
|
||
``` | ||
exec(env, runfiles-interpreter, ["interpreter_arg1",], "main.py", ["arg1",]) | ||
``` | ||
|
||
| Token | Description | | ||
| ---------------------- | ----------- | | ||
| env | Dictionary of key-value pairs for the environment of the process | | ||
Review comment: I don't see why env is one of the inputs? This basically implies that the launcher process may need to use a modified environment from the actual program -- what's the motivation case for this? Why would it not just inherit the existing environment? Ah, one case I just thought of: LD_PRELOAD (or equiv). Basically, a binary might require such a setting and we wouldn't want the launcher itself to use that (and by "might" I mean, we have this feature internally at Google) |
||
| runfiles-interpreter | The resolved python toolchain in runfiles | | ||
Review comment: The description here doesn't quite make sense with what the arg name implies. The arg name sounds like a string path. The description is "the python toolchain", which is a complex object. |
||
| ["interpreter_arg1",] | An array of arguments to provide to the python interpreter | | ||
| "main.py" | The python program to launch in runfiles | | ||
| ["arg1",] | An array of arguments to provide to the python program as sys.argv[1:] | | ||
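The tokens in the table can be made slightly more concrete with a minimal Python sketch. All names here (`build_launch_command`, `launch`, and the runfiles-relative path parameters) are hypothetical, and the sketch assumes the runfiles directory has already been located. The launcher in the proposal would be a native binary, so this only illustrates the shape of the final `exec` call.

```python
import os
from typing import List, Mapping, Sequence, Tuple


def build_launch_command(
    runfiles_dir: str,
    interpreter_rel: str,            # runfiles-relative path to the interpreter
    interpreter_args: Sequence[str],
    main_rel: str,                   # runfiles-relative path to "main.py"
    program_args: Sequence[str],
) -> Tuple[str, List[str]]:
    """Resolve runfiles-relative paths and assemble the argv to execute."""
    interpreter = os.path.join(runfiles_dir, interpreter_rel)
    main = os.path.join(runfiles_dir, main_rel)
    # argv[0] is the interpreter; program_args end up in sys.argv[1:].
    argv = [interpreter, *interpreter_args, main, *program_args]
    return interpreter, argv


def launch(env: Mapping[str, str], runfiles_dir: str, interpreter_rel: str,
           interpreter_args: Sequence[str], main_rel: str,
           program_args: Sequence[str]) -> None:
    """Replace the current process with the interpreter running the program."""
    interpreter, argv = build_launch_command(
        runfiles_dir, interpreter_rel, interpreter_args, main_rel, program_args)
    # env is passed explicitly so the program's environment can differ from
    # the launcher's own (for example, settings like LD_PRELOAD).
    os.execve(interpreter, argv, dict(env))
```

On POSIX systems `os.execve` replaces the launcher process. On Windows the semantics differ, so a real launcher there would more likely spawn a child process and wait for it.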
|
||
This native launcher idea has been proposed a few times by Bazel contributors and the community: | ||
* [Greg Roodt (Community)](https://github.com/bazelbuild/rules_python/issues/691#issuecomment-1174935972) | ||
* [Yun Peng (Google)](https://github.com/bazelbuild/bazel/issues/7947#issuecomment-495265016) | ||
* [Richard Levasseur (Google)](https://github.com/bazelbuild/rules_python/issues/691#issuecomment-1186379617) | ||
|
||
Some related work has already fixed Linux to Windows cross-builds of the Windows launcher. See: [Fix Linux to Windows cross compilation of py_binary, java_binary and sh_binary using MinGW](https://github.com/bazelbuild/bazel/pull/16019) | ||
This proposal aims to go further and make these launchers available on all platforms, including cross-builds where appropriate toolchains are in place. | ||
|
||
Once this proposal is implemented, it would enable cross-builds of hermetic `py_binary` for all major platforms. It | ||
would also remove the complexity introduced by having so many chains of nested execution to launch a Python program. | ||
|
||
Finally, while this proposal is specific to Python, this solution could perhaps be reused for `java_binary`, `sh_binary` | ||
and perhaps be made available for any custom rules that require an interpreter to launch. | ||
|
||
|
||
# Backward-compatibility | ||
|
||
This proposal could require users to set up a CC toolchain for remote execution. | ||
Review comment: Is it required to be CC? e.g., what if someone wants to write a launcher in Rust?

Review comment: No, it does not need to be CC. I think the launcher needs to be compiled as native in some way. So Go, Zig, Rust, CC all come to mind. Whatever has the most minimal toolchain requirements on the user and gives the functionality we require, I think. I think whatever is used as a launcher needs to be standalone from Bazel once built. I don't want to ship Bazel to run a binary.

Review comment: I agree. The part I'm trying to think through is re-use of what is essentially the same binary (the launcher itself). For a target-config build target, I agree, yes, the launcher essentially needs to be self-contained and standalone. I don't see how to do it otherwise because the invocation of

For a build tool, the situation is different[1] -- stuff run during the build doesn't need the stricter isolation requirements. For example, when Bazel runs an executable in an action, it could avoid having to build the launcher entirely by doing the exec() call itself when it runs the subprocess. Maybe a target could return e.g.

This then leads me to think that, if a rule returned that, Bazel itself could invoke the launcher building instead of the rule having to do so. Which has a Just Works sort of appeal; but risks coupling behavior to the Bazel release (which might be more of a headache).

[1] This case is particularly on my mind because Python is often used for build tools, and building the runtime and C dependencies is pretty expensive, so reuse of that is highly beneficial. |
Review comment: Overall, I'm +1 on the core idea.