diff --git a/.gitignore b/.gitignore
index 4401ef1..5bac597 100644
--- a/.gitignore
+++ b/.gitignore
@@ -146,3 +146,5 @@ dmypy.json
 
 # Cython debug symbols
 cython_debug/
+
+docs/site/
diff --git a/docs/docs/assets/extra.css b/docs/docs/assets/extra.css
new file mode 100644
index 0000000..b755bd8
--- /dev/null
+++ b/docs/docs/assets/extra.css
@@ -0,0 +1,7 @@
+.md-typeset__table {
+    min-width: 100%;
+}
+
+.md-typeset table:not([class]) {
+    display: table;
+}
diff --git a/docs/docs/index.md b/docs/docs/index.md
index f9b7c18..c2dab23 100644
--- a/docs/docs/index.md
+++ b/docs/docs/index.md
@@ -2,4 +2,35 @@
 
 Rules for creating conda environments in Bazel 💚
 
-TODO
+See [here](usage/example.md) for a usage example.
+
+## Who should use this?
+
+These rules allow you to download and install `conda`, create `conda` environments and register a Python toolchain from those environments.
+This means you can achieve truly reproducible and hermetic local Python environments.
+
+Pros:
+
+- easy to use
+- no existing `conda` installation necessary
+- no global `conda` installation, no global `PATH` modifications
+- virtually impossible to corrupt your environment by mistake as it always reflects your `environment.yml`
+- all Python targets will implicitly have access to the whole environment (the one registered in toolchain)
+
+Cons:
+
+- every time you update your environment configuration in `environment.yml`, the whole environment will be recreated from scratch (but cached package data can be reused)
+- on Windows you need to add environment location to `PATH` or set `CONDA_DLL_SEARCH_MODIFICATION_ENABLE=1` during runtime, so DLLs can be loaded properly (more on that [here](usage/issues.md#path-issue))
+
+So I think these rules suit you if:
+
+- you want to use Bazel (e.g. you fell into Python monorepo trap)
+- you want to use `conda` for Python environment management
+- you don't want to set up your Python environment manually or want your Python targets to _just work_ on clean systems
+- you are okay with environments being recreated every time something changes
+
+## Requirements
+
+`rules_conda` don't have any strict requirements by themselves.
+
+Just make sure you are able to use [`conda`](https://docs.conda.io/en/latest/miniconda.html#system-requirements).
diff --git a/docs/docs/usage/api.md b/docs/docs/usage/api.md
new file mode 100644
index 0000000..51e0893
--- /dev/null
+++ b/docs/docs/usage/api.md
@@ -0,0 +1,45 @@
+# API
+
+## `load_conda`
+
+!!! quote ""
+
+    Downloads `conda`.
+
+    **Parameters:**
+
+    | Name | Description | Default |
+    | ------------- | ------------------------------------------------------------------------------------------ | ---------------------- |
+    | `version` | Version of `conda` to download | `4.10.3` |
+    | `quiet` | `True` if `conda` output should be hidden | `True` |
+    | `timeout` | How many seconds each execute action can take | `3600` |
+
+## `conda_create`
+
+!!! quote ""
+
+    Creates a `conda` environment.
+
+    **Parameters:**
+
+    | Name | Description | Default |
+    | ------------- | ------------------------------------------------------------------------------------------ | ---------------------- |
+    | `environment` | Label pointing to environment configuration file (typically named `environment.yml`) | |
+    | `name` | Name of the environment | `my_env` |
+    | `quiet` | `True` if `conda` output should be hidden | `True` |
+    | `timeout` | How many seconds each execute action can take | `3600` |
+    | `clean` | `True` if `conda` cache should be cleaned (less space taken, but slower subsequent builds) | `False` |
+
+## `register_toolchain`
+
+!!! quote ""
+
+    Registers a Python toolchain from a `conda` environment for all Python targets to use.
+    The main environment is used in the python3 toolchain. Optionally you can specify another one to use in the python2 toolchain.
+
+    **Parameters:**
+
+    | Name | Description | Default |
+    | ------------- | ------------------------------------------------------------------------------------------ | ---------------------- |
+    | `py3_env` | Name of the environment to use | |
+    | `py2_env` | Name of the python2 environment to use (optional) | `None` |
diff --git a/docs/docs/usage/example.md b/docs/docs/usage/example.md
new file mode 100644
index 0000000..f775db0
--- /dev/null
+++ b/docs/docs/usage/example.md
@@ -0,0 +1,64 @@
+# Example usage
+
+Let's say you want to write some Python code.
+
+The simplest structure would be something like this:
+
+```
+BUILD
+environment.yml
+main.py
+WORKSPACE
+```
+
+First get familiar with [`rules_python`](https://github.com/bazelbuild/rules_python).
+You should use these rules to configure your Python project to work with Bazel.
+I recommend that you first set everything up so that it works with your local Python.
+Once that works, you can move to using `rules_conda` for creating environments automatically.
+
+To use `rules_conda` you need to add the following to your `WORKSPACE` file:
+
+```python
+load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive")
+
+http_archive(
+    name = "rules_conda",
+    sha256 = "8298379474beb05f815afc33a42eb1732f8ebdab3aa639569473eae75e6e072b",
+    url = "https://github.com/spietras/rules_conda/releases/download/0.0.5/rules_conda-0.0.5.zip"
+)
+
+load("@rules_conda//:defs.bzl", "conda_create", "load_conda", "register_toolchain")
+
+load_conda(
+    quiet = False,  # use True to hide conda output
+    version = "4.10.3",  # optional, defaults to 4.10.3
+)
+
+conda_create(
+    name = "my_env",
+    timeout = 600,  # each execute action can take up to 600 seconds
+    clean = False,  # use True if you want to clean conda cache (less space taken, but slower subsequent builds)
+    environment = "@//:environment.yml",  # label pointing to environment.yml file
+    quiet = False,  # use True to hide conda output
+)
+
+register_toolchain(
+    py3_env = "my_env",
+)
+```
+
+This will download `conda`, create your environment and register it so that all Python targets can use it by default.
+
+Now if you configured everything correctly, you can run:
+
+```sh
+bazel run main
+```
+
+This will run `main.py` inside the created environment.
+
+If the environment configuration doesn't change, subsequent runs will simply reuse the environment.
+Otherwise the environment will be recreated from scratch, so that it always reflects the configuration.
+However, if you set the `clean` flag to `False` in `conda_create`, the downloaded package data will be reused so you don't need to download everything every time.
+
+Also see [here](https://github.com/spietras/rules_conda/tree/main/example) for a complete example with all the code available.
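Note on `docs/docs/usage/example.md`: the page lists `main.py` in the project layout but never shows it. A minimal sketch of what such a file could contain follows. It only reports which interpreter is running, which may be a quick way to check that `bazel run main` picked up the interpreter from the registered environment. The function name `describe_interpreter` is hypothetical and not part of `rules_conda`, and whether the reported paths point into the created environment is an assumption to verify on your own setup.

```python
"""Hypothetical main.py: report which Python interpreter is running."""
import sys


def describe_interpreter() -> dict:
    """Collect basic facts about the running interpreter."""
    return {
        # Path of the interpreter binary that is executing this script.
        "executable": sys.executable,
        # Installation prefix of that interpreter.
        "prefix": sys.prefix,
        # Version as "major.minor.micro".
        "version": "%d.%d.%d" % sys.version_info[:3],
    }


if __name__ == "__main__":
    for key, value in describe_interpreter().items():
        print(f"{key}: {value}")
```

If the toolchain registration works as described, the printed `executable` and `prefix` would be expected to differ from those of the system Python.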
diff --git a/docs/docs/usage/issues.md b/docs/docs/usage/issues.md
new file mode 100644
index 0000000..03b7a7c
--- /dev/null
+++ b/docs/docs/usage/issues.md
@@ -0,0 +1,31 @@
+# Issues
+
+## `PATH` issue
+
+With usual `conda` usage, you should `activate` your environment before doing anything. Activating an environment prepends some paths to the `PATH` variable. This is crucial on Windows, because some `conda` packages need to load DLLs, which are stored in `conda` environments, and the path to them must be in the `PATH` variable for Windows to properly load them. On Linux, it somehow works without having to modify `PATH`.
+
+But here comes the issue: at this moment, I'm not aware of any way to either `activate` an environment before launching Python targets or add anything to `PATH` automatically from Bazel.
+
+So the user has to do something to resolve the `PATH` issue. There are two ways:
+
+- Modify `PATH`
+
+    Before running the target, set the `PATH` to include the path to `your_env/Library/bin`. For example:
+
+    ```cmd
+    cmd /C "set PATH={full path to workspace}\bazel-{name}\external\{env_name}\{env_name}\Library\bin;%PATH%&& bazel run {target}"
+    ```
+
+- Use `CONDA_DLL_SEARCH_MODIFICATION_ENABLE`
+
+    It originally stems from another issue, but Python from `conda` has the ability to automatically insert the correct entries into `PATH`. This is controlled by setting `CONDA_DLL_SEARCH_MODIFICATION_ENABLE` to `1`.
+
+    So you can for example do:
+
+    ```cmd
+    cmd /C "set CONDA_DLL_SEARCH_MODIFICATION_ENABLE=1&& bazel run {target}"
+    ```
+
+    This method only works with newer Python builds. More information [here](https://docs.conda.io/projects/conda/en/latest/user-guide/troubleshooting.html#mkl-library).
+
+In the future I hope that either `conda` (or Python, or Windows DLL loading, whatever is responsible for that) will change to work without activation or it will be possible to set environmental variables inside Bazel.
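Note on `docs/docs/usage/issues.md`: both workarounds amount to adjusting environment variables before `bazel run` starts. As an alternative to the `cmd /C` one-liners, the same adjustment could be sketched in Python. The helper name `with_conda_dll_path` and its parameters are hypothetical and not part of `rules_conda`; the `Library\bin` location has to be supplied by the caller, as in the examples in the page.

```python
"""Hypothetical helper: prepare environment variables for `bazel run` on Windows."""
import os
from typing import Mapping, Optional


def with_conda_dll_path(
    base_env: Mapping[str, str],
    library_bin: Optional[str] = None,
    enable_dll_search: bool = False,
    pathsep: str = os.pathsep,
) -> dict:
    """Return a copy of base_env with one or both workarounds applied.

    library_bin: path to the environment's Library\\bin directory, which is
        prepended to PATH (the "Modify PATH" workaround).
    enable_dll_search: if True, set CONDA_DLL_SEARCH_MODIFICATION_ENABLE=1.
    """
    env = dict(base_env)  # copy, so the caller's mapping is left untouched
    if library_bin:
        old_path = env.get("PATH", "")
        env["PATH"] = library_bin + (pathsep + old_path if old_path else "")
    if enable_dll_search:
        env["CONDA_DLL_SEARCH_MODIFICATION_ENABLE"] = "1"
    return env
```

The resulting mapping could then be passed to the standard library, for example `subprocess.run(["bazel", "run", target], env=with_conda_dll_path(os.environ, enable_dll_search=True))`, where `target` is a placeholder for your own target label.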
diff --git a/docs/mkdocs.yml b/docs/mkdocs.yml
index aad6470..51bc6be 100644
--- a/docs/mkdocs.yml
+++ b/docs/mkdocs.yml
@@ -71,5 +71,12 @@ extra:
     - icon: material/github
       link: https://github.com/spietras/rules_conda
 
+extra_css:
+  - assets/extra.css
+
 nav:
   - Home: index.md
+  - Usage:
+    - Example: usage/example.md
+    - API: usage/api.md
+    - Issues: usage/issues.md