Updated docs (#17)
Sebastian Pietras authored Oct 16, 2021
1 parent 160985f commit b6dd5fe
Showing 7 changed files with 188 additions and 1 deletion.
2 changes: 2 additions & 0 deletions .gitignore
@@ -146,3 +146,5 @@ dmypy.json

# Cython debug symbols
cython_debug/

docs/site/
7 changes: 7 additions & 0 deletions docs/docs/assets/extra.css
@@ -0,0 +1,7 @@
.md-typeset__table {
  min-width: 100%;
}

.md-typeset table:not([class]) {
  display: table;
}
33 changes: 32 additions & 1 deletion docs/docs/index.md
@@ -2,4 +2,35 @@

Rules for creating conda environments in Bazel 💚

TODO
See [here](usage/example.md) for a usage example.

## Who should use this?

These rules allow you to download and install `conda`, create `conda` environments, and register Python toolchains from those environments.
This means you can achieve truly reproducible and hermetic local Python environments.

Pros:

- easy to use
- no existing `conda` installation necessary
- no global `conda` installation, no global `PATH` modifications
- virtually impossible to corrupt your environment by mistake as it always reflects your `environment.yml`
- all Python targets will implicitly have access to the whole environment (the one registered in the toolchain)

Cons:

- every time you update your environment configuration in `environment.yml`, the whole environment will be recreated from scratch (but cached package data can be reused)
- on Windows you need to add the environment location to `PATH` or set `CONDA_DLL_SEARCH_MODIFICATION_ENABLE=1` at runtime, so that DLLs can be loaded properly (more on that [here](usage/issues.md#path-issue))

So I think these rules suit you if:

- you want to use Bazel (e.g. you fell into the Python monorepo trap)
- you want to use `conda` for Python environment management
- you don't want to set up your Python environment manually or want your Python targets to _just work_ on clean systems
- you are okay with environments being recreated every time something changes

## Requirements

`rules_conda` doesn't have any strict requirements of its own.

Just make sure you are able to use [`conda`](https://docs.conda.io/en/latest/miniconda.html#system-requirements).
45 changes: 45 additions & 0 deletions docs/docs/usage/api.md
@@ -0,0 +1,45 @@
# API

## `load_conda`

!!! quote ""

Downloads `conda`.

**Parameters:**

| Name | Description | Default |
| ------------- | ------------------------------------------------------------------------------------------ | ---------------------- |
| `version` | Version of `conda` to download | `4.10.3` |
| `quiet` | `True` if `conda` output should be hidden | `True` |
| `timeout` | How many seconds each execute action can take | `3600` |

## `conda_create`

!!! quote ""

Creates a `conda` environment.

**Parameters:**

| Name | Description | Default |
| ------------- | ------------------------------------------------------------------------------------------ | ---------------------- |
| `environment` | Label pointing to environment configuration file (typically named `environment.yml`) | |
| `name` | Name of the environment | `my_env` |
| `quiet` | `True` if `conda` output should be hidden | `True` |
| `timeout` | How many seconds each execute action can take | `3600` |
| `clean` | `True` if `conda` cache should be cleaned (less space taken, but slower subsequent builds) | `False` |

## `register_toolchain`

!!! quote ""

Registers a Python toolchain from a `conda` environment for all Python targets to use.
The main environment is used in the Python 3 toolchain. Optionally, you can specify another one to use in the Python 2 toolchain.

**Parameters:**

| Name | Description | Default |
| ------------- | ------------------------------------------------------------------------------------------ | ---------------------- |
| `py3_env` | Name of the environment to use | |
| `py2_env` | Name of the python2 environment to use (optional) | `None` |
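
Based on the parameters listed above, registering both a Python 3 and a Python 2 environment might look like the following `WORKSPACE` sketch. The environment names and the two `.yml` file names are hypothetical, chosen only for illustration, and the sketch assumes `rules_conda` has already been fetched and `load_conda` called.

```python
# Hypothetical names: py3_env, py2_env and the two environment files.
conda_create(
    name = "py3_env",
    environment = "@//:py3_environment.yml",
)

conda_create(
    name = "py2_env",
    environment = "@//:py2_environment.yml",
)

# py3_env is required; py2_env is optional and defaults to None.
register_toolchain(
    py3_env = "py3_env",
    py2_env = "py2_env",
)
```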
64 changes: 64 additions & 0 deletions docs/docs/usage/example.md
@@ -0,0 +1,64 @@
# Example usage

Let's say you want to write some Python code.

The simplest structure would be something like this:

```
BUILD
environment.yml
main.py
WORKSPACE
```

First get familiar with [`rules_python`](https://github.com/bazelbuild/rules_python).
You should use these rules to configure your Python project to work with Bazel.
I recommend that you first set everything up so that it works with your local Python.
Once that works, you can move to using `rules_conda` to create environments automatically.
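
For the layout above, the `BUILD` file could be as small as the following sketch. It assumes `rules_python` is already set up in your `WORKSPACE` and that the target is called `main`, which is what the `bazel run main` command later on this page expects.

```python
# BUILD (sketch): a single Python binary built from main.py.
load("@rules_python//python:defs.bzl", "py_binary")

py_binary(
    name = "main",
    srcs = ["main.py"],
)
```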

To use `rules_conda`, add the following to your `WORKSPACE` file:

```python
load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive")

http_archive(
    name = "rules_conda",
    sha256 = "8298379474beb05f815afc33a42eb1732f8ebdab3aa639569473eae75e6e072b",
    url = "https://github.com/spietras/rules_conda/releases/download/0.0.5/rules_conda-0.0.5.zip",
)

load("@rules_conda//:defs.bzl", "conda_create", "load_conda", "register_toolchain")

load_conda(
    quiet = False,  # use True to hide conda output
    version = "4.10.3",  # optional, defaults to 4.10.3
)

conda_create(
    name = "my_env",
    timeout = 600,  # each execute action can take up to 600 seconds
    clean = False,  # use True if you want to clean conda cache (less space taken, but slower subsequent builds)
    environment = "@//:environment.yml",  # label pointing to environment.yml file
    quiet = False,  # use True to hide conda output
)

register_toolchain(
    py3_env = "my_env",
)
```

This will download `conda`, create your environment, and register it so that all Python targets can use it by default.

Now if you configured everything correctly, you can run:

```sh
bazel run main
```

This will run `main.py` inside the created environment.
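
To check that the target really runs inside the created environment, `main.py` could be something like this sketch. The function name `interpreter_info` is only illustrative; the idea is that the printed paths should point into the Bazel-managed environment rather than your system Python.

```python
import sys


def interpreter_info():
    """Return details about the interpreter that is running this script."""
    return {
        "executable": sys.executable,  # path to the Python binary in use
        "prefix": sys.prefix,  # root of the environment that owns it
        "version": "{}.{}.{}".format(*sys.version_info[:3]),
    }


if __name__ == "__main__":
    for key, value in interpreter_info().items():
        print(f"{key}: {value}")
```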

If the environment configuration doesn't change, subsequent runs will simply reuse the environment.
Otherwise the environment will be recreated from scratch, so that it always reflects the configuration.
However, if you leave the `clean` flag set to `False` in `conda_create`, the downloaded package data will be reused, so you don't need to download everything every time.
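
As a mental model of this behaviour, you can think of the environment as being keyed by the content of `environment.yml`. The sketch below is purely conceptual: the function names are made up, and it is not how Bazel or `rules_conda` implement the check internally.

```python
import hashlib


def config_digest(environment_yml_text):
    """Hash the configuration text so any change yields a different key."""
    return hashlib.sha256(environment_yml_text.encode("utf-8")).hexdigest()


def needs_recreate(previous_digest, environment_yml_text):
    """Return True when the configuration differs from the one last built."""
    return previous_digest != config_digest(environment_yml_text)
```

With an unchanged file the digest matches and the environment is reused; any edit changes the digest and triggers a full recreation.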

Also see [here](https://github.com/spietras/rules_conda/tree/main/example) for a complete example with all the code available.
31 changes: 31 additions & 0 deletions docs/docs/usage/issues.md
@@ -0,0 +1,31 @@
# Issues

## `PATH` issue

With usual `conda` usage, you should `activate` your environment before doing anything. Activating an environment prepends some paths to the `PATH` variable. This is crucial on Windows, because some `conda` packages need to load DLLs that are stored in the `conda` environment, and the path to them must be in `PATH` for Windows to load them properly. On Linux, it somehow works without having to modify `PATH`.

But here comes the issue: at the moment, I'm not aware of any way to either `activate` an environment before launching Python targets or have Bazel add anything to `PATH` automatically.

So the user has to do something to resolve the `PATH` issue. There are two ways:

- Modify `PATH`

    Before running the target, set `PATH` to include the path to `your_env/Library/bin`. For example:

    ```cmd
    cmd /C "set PATH={full path to workspace}\bazel-{name}\external\{env_name}\{env_name}\Library\bin;%PATH%&& bazel run {target}"
    ```

- Use `CONDA_DLL_SEARCH_MODIFICATION_ENABLE`

    This variable originally stems from another issue, but Python from `conda` has the ability to automatically insert the correct entries into `PATH`. This is controlled by setting `CONDA_DLL_SEARCH_MODIFICATION_ENABLE` to `1`.
    So you can, for example, do:

    ```cmd
    cmd /C "set CONDA_DLL_SEARCH_MODIFICATION_ENABLE=1&& bazel run {target}"
    ```

    This method only works with newer Python builds. More information [here](https://docs.conda.io/projects/conda/en/latest/user-guide/troubleshooting.html#mkl-library).

In the future I hope that either `conda` (or Python, or Windows DLL loading, whatever is responsible for that) will change to work without activation, or that it will be possible to set environment variables inside Bazel.
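
If you prefer a script over typing the `cmd` one-liners, both workarounds can be wrapped in a small Python launcher. This is only a sketch: the helper names `conda_run_env` and `bazel_run` are made up for illustration, and you still have to supply the real path to `Library\bin` yourself.

```python
import os
import subprocess


def conda_run_env(base_env, library_bin=None):
    """Return a copy of base_env adjusted for conda Python on Windows.

    If library_bin is given, prepend it to PATH (first workaround).
    Otherwise set CONDA_DLL_SEARCH_MODIFICATION_ENABLE=1 (second workaround).
    """
    env = dict(base_env)
    if library_bin is not None:
        current = env.get("PATH", "")
        # Prepend so the environment's DLLs are found first.
        env["PATH"] = library_bin + os.pathsep + current if current else library_bin
    else:
        env["CONDA_DLL_SEARCH_MODIFICATION_ENABLE"] = "1"
    return env


def bazel_run(target, library_bin=None):
    """Run `bazel run <target>` with the adjusted environment."""
    env = conda_run_env(os.environ, library_bin)
    return subprocess.run(["bazel", "run", target], env=env, check=False).returncode
```

Calling `bazel_run("main")` uses the `CONDA_DLL_SEARCH_MODIFICATION_ENABLE` approach, while passing `library_bin` switches to the `PATH` approach.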
7 changes: 7 additions & 0 deletions docs/mkdocs.yml
@@ -71,5 +71,12 @@ extra:
    - icon: material/github
      link: https://github.com/spietras/rules_conda

extra_css:
  - assets/extra.css

nav:
  - Home: index.md
  - Usage:
      - Example: usage/example.md
      - API: usage/api.md
      - Issues: usage/issues.md
