From 1b3f20847e4211a484c877e19dae273042734f5e Mon Sep 17 00:00:00 2001
From: Sebastian Pietras
Date: Sat, 16 Oct 2021 03:04:48 +0200
Subject: [PATCH] Updated README (#18)

---
 README.md         | 107 ++++++++++++++--------------------------------
 example/README.md |   8 ++--
 2 files changed, 37 insertions(+), 78 deletions(-)

diff --git a/README.md b/README.md
index 1f1e8a5..6c95d77 100644
--- a/README.md
+++ b/README.md
@@ -1,18 +1,29 @@
-# rules_conda
+rules_conda
 
-Rules for creating `conda` environments in Bazel :green_heart:
+
+
+[![Running tests](https://github.com/spietras/rules_conda/actions/workflows/test.yml/badge.svg)](https://github.com/spietras/rules_conda/actions/workflows/test.yml)
+[![Deploying docs](https://github.com/spietras/rules_conda/actions/workflows/docs.yml/badge.svg)](https://github.com/spietras/rules_conda/actions/workflows/docs.yml)
+
+
+
+---
+
+Rules for creating conda environments in Bazel 💚
+
+For more info see [the docs](https://spietras.github.io/rules_conda) or [the example](https://github.com/spietras/rules_conda/tree/main/example).
 
 ## Requirements
 
 `rules_conda` don't have any strict requirements by themselves.
 
-Remember that some packages (e.g. `dlib`) are actually being compiled during installation and sometimes they need your local tools to compile (e.g. `g++`).
+Just make sure you are able to use [`conda`](https://docs.conda.io/en/latest/miniconda.html#system-requirements).
 
-## Usage
+## Quickstart
 
 Add this to your `WORKSPACE` file:
 
-```Starlark
+```starlark
 load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive")
 
 http_archive(
@@ -21,36 +32,23 @@ http_archive(
     url = "https://github.com/spietras/rules_conda/releases/download/0.0.5/rules_conda-0.0.5.zip"
 )
 
-load("@rules_conda//:defs.bzl", "load_conda", "conda_create", "register_toolchain")
+load("@rules_conda//:defs.bzl", "conda_create", "load_conda", "register_toolchain")
 
-# download and install conda
 load_conda(
-    version = "4.8.4", # optional, defaults to 4.8.4
-    quiet = False # print output
+    quiet = False, # use True to hide conda output
+    version = "4.10.3", # optional, defaults to 4.10.3
 )
 
-# create environment with python2
 conda_create(
-    name = "py2_env",
-    environment = "@//third_party/conda:py2_environment.yml", # label pointing to environment.yml file
-    quiet = False,
-    clean = True,
-    timeout = 600 # each execute action can take up to 600 seconds
+    name = "my_env",
+    timeout = 600, # each execute action can take up to 600 seconds
+    clean = False, # use True if you want to clean conda cache (less space taken, but slower subsequent builds)
+    environment = "@//:environment.yml", # label pointing to environment.yml file
+    quiet = False, # use True to hide conda output
 )
 
-# create environment with python3
-conda_create(
-    name = "py3_env",
-    environment = "@//third_party/conda:py3_environment.yml", # label pointing to environment.yml file
-    quiet = False,
-    clean = True,
-    timeout = 600
-)
-
-# register pythons from environment as toolchain
 register_toolchain(
-    py2_env = "py2_env", # python2 is optional
-    py3_env = "py3_env"
+    py3_env = "my_env",
 )
 ```
 
@@ -59,63 +57,24 @@ After that, all Python targets will use the environments specified in `register_
 ## Who should use this?
 
 These rules allow you to download and install `conda`, create `conda` environments and register Python toolchain from environments.
+This means you can achieve truly reproducible and hermetic local python environments.
 
 Pros:
 
 - easy to use
-- no previous `conda` installation necessary
-- no dependencies on `conda` side
+- no existing `conda` installation necessary
 - no global `conda` installation, no global `PATH` modifications
-- you can install packages from `conda` and from `pip`
-- all Python targets will have access to the whole environment (the one registered in toolchain)
+- virtually impossible to corrupt your environment by mistake as it always reflects your `environment.yml`
+- all Python targets will implicitly have access to the whole environment (the one registered in toolchain)
 
 Cons:
 
-- every time you update your environment configuration in `environment.yml`, the whole environment will be recreated from scratch
-- on Windows you need to add environment location to `PATH` or set `CONDA_DLL_SEARCH_MODIFICATION_ENABLE=1` during runtime, so DLLs can be loaded properly (more on that below)
+- every time you update your environment configuration in `environment.yml`, the whole environment will be recreated from scratch (but cached package data can be reused)
+- on Windows you need to add environment location to `PATH` or set `CONDA_DLL_SEARCH_MODIFICATION_ENABLE=1` during runtime, so DLLs can be loaded properly (more on that [here](https://spietras.github.io/rules_conda/usage/issues/#path-issue))
 
 So I think these rules suit you if:
 
-- you want to use Bazel (e.g. for local package management)
-- you want to use `conda` for third-party Python package management
+- you want to use Bazel (e.g. you fell into Python monorepo trap)
+- you want to use `conda` for Python environment management
 - you don't want to set up your Python environment manually or want your Python targets to _just work_ on clean systems
-- you don't use a lot of third-party dependencies
 - you are okay with environments being recreated every time something changes
-
-## `PATH` issue
-
-With usual `conda` usage, you should `activate` you environment before doing anything. Activating an environment prepends some paths to `PATH` variable. This is crucial on Windows, because some `conda` packages need to load DLLs, which are stored in `conda` environments and the path to them must be in `PATH` variable for Windows to properly load them. On Linux, it somehow works without having to modify `PATH`.
-
-But here comes the issue: at this moment, I'm not aware of any way to either `activate` an environment before launching Python targets or adding anything to `PATH` automatically by Bazel.
-
-So the user has to do something to resolve the `PATH` issue. There are two ways:
-
-- Modify `PATH`
-
-  Before running the target, set the `PATH` to include the path to `your_env/Library/bin`. For example:
-
-  ```cmd
-  cmd /C "set PATH={full path to workspace}\bazel-{name}\external\{env_name}\{env_name}\Library\bin;%PATH%&& bazelw run {target}"
-  ```
-
-  Since we are running with `bazelw` you can instead put the variable setting there. You can also set the variable directly in your Python script.
-
-- Use `CONDA_DLL_SEARCH_MODIFICATION_ENABLE`
-
-  It originally stems from another issue, but Python from `conda` has the ability to automatically insert the correct entries to `PATH`. This is controlled by setting the `CONDA_DLL_SEARCH_MODIFICATION_ENABLE` to `1`.
-
-  So you can for example do:
-
-  ```cmd
-  cmd /C "set CONDA_DLL_SEARCH_MODIFICATION_ENABLE=1&& bazelw run {target}"
-  ```
-
-  Since we are running with `bazelw` you can instead put the variable setting there. That's how it's done in the example. You can also set the variable directly in your Python script.
-
-  This method only works with newer Python builds. More information [here](https://docs.conda.io/projects/conda/en/latest/user-guide/troubleshooting.html#mkl-library).
-
-In the future I hope that either `conda` (or Python, or Windows DLL loading, whatever is responsible for that) will change to work without activation or it will be possible to set environmetal variables inside Bazel.
-
-## TODO
-
-- don't recreate environments from scratch when configuration changes
diff --git a/example/README.md b/example/README.md
index cdaf9a8..c4538ad 100644
--- a/example/README.md
+++ b/example/README.md
@@ -6,13 +6,13 @@ Simple Python app demonstrating usage of `rules_conda`
 
 Linux:
 
-- `glibc`
-- any `python`
-- any C compiler (like `gcc`)
+- [`glibc`](https://stackoverflow.com/a/47191900/12861599)
+- [any `python`](https://github.com/bazelbuild/bazel/issues/544#issuecomment-495307020)
+- [any C compiler (like `gcc`)](https://github.com/bazelbuild/bazel/issues/8751)
 
 Windows:
 
-- as far as I'm concerned: nothing
+- not sure
 
 ## Usage
 