Enable remote nix builds #239

googleson78 · 2022-06-15T13:59:44Z

@layus said there are some new issues with this.

Also need to resolve all the TODOs and add tests.

Co-authored-by: Guillaume Maudoux <[email protected]>

layus · 2022-06-15T19:43:01Z

I would love feedback on this one. And maybe a test that uses nixbuild.net to ensure that it is at least usable.

As for the things that do not work, well:

- It is hard to specify the ENV_NIX_REMOTE_KEY_FILE when it is located in the repository. You need to expand %workspace%, and Bazel will not do that for environment variables, so you need a wrapper script AFAIK.
- The cache key needs special permissions for ssh to accept it, so that is another fix that needs to happen in the wrapper.
- The remote server is suddenly accessible to everyone, making it very insecure.
  - We can maybe drop the ssh-ing part for now, and accept that the paths may get garbage collected.
- This is mostly about remote nix execution, with the remote used as a cache. It may be beneficial to also set up a remote cache, but I do not know how these play together. And the commands we need here heavily depend on the architecture of the remote builders and the cache.
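
The wrapper mentioned in the first two points could look roughly like the sketch below. It is written in Python rather than Starlark so that it can be run on its own, and the function name `prepare_key_file` is hypothetical. It assumes the key path is given with a literal `%workspace%` placeholder, and it copies the key to a private temporary file so that the checked-in file keeps its mode.

```python
import os
import shutil
import stat
import tempfile


def prepare_key_file(raw_path, workspace_root):
    """Expand a %workspace% placeholder and return a private copy of the key.

    Hypothetical helper: neither the name nor the temp-file strategy
    comes from the PR; it only illustrates the two fix-ups listed above.
    """
    # Bazel does not expand %workspace% in environment variables,
    # so the wrapper has to do it itself.
    key_path = raw_path.replace("%workspace%", workspace_root)

    # Copy the key out of the checkout so the tracked file is left untouched.
    fd, private_copy = tempfile.mkstemp(prefix="nix-remote-key-")
    os.close(fd)
    shutil.copyfile(key_path, private_copy)

    # ssh rejects private keys that are readable by group or others.
    os.chmod(private_copy, stat.S_IRUSR | stat.S_IWUSR)
    return private_copy
```

The returned path is what would then be handed to ssh or to the remote store configuration.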

I guess the question is: is it okay to have this as an experimental feature? How could we mark it as such?

AleksanderGondek

I would like to suggest a shift in approach, which would take into consideration the hermeticity and configurability of the solution. I strongly feel that rules_nixpkgs should be a ‘poster child’ for such an approach and should shy away from hacks relying on host configuration, global environments, etc.

Perhaps nixpkgs_package should contain two additional attributes:

- Use_remote_nix_builder - a boolean, which indicates whether a remote nix builder is to be used
- Remote_nix_builder_config - a dictionary, containing all the configuration needed for the builder to run properly

In that way, we retain the composability of the solution, we do not break hermeticity, and we are not relying on global system state by default.

AleksanderGondek · 2022-06-15T19:20:05Z

core/nixpkgs.bzl

+        # are there issues with this?
+        # another idea is to have config_settings, and have attributes for this in nixpkgs_package
+        # but that seems clunkier (have to modify rules, introduce some non-building related logic to rules)
+        ENV_NIX_REMOTE = 'RULES_NIXPKGS_NIX_REMOTE'


This is de facto a hidden, implicit flow control.
Explicit is nearly always better than implicit, especially when dealing with Bazel rules, which should clearly define their inputs and outputs.

AleksanderGondek · 2022-06-15T19:43:46Z

core/nixpkgs.bzl

+            output_path = exec_result.stdout.splitlines()[-1]
+
+            # TODO[GL]: use nix provided ssh?
+            ssh_path = repository_ctx.which("ssh")


Perhaps the path to ssh should come from a configuration attribute and by default be taken from nix?
Or maybe we can drop ssh'ing altogether? Let the remote builder worry about caching (or lack thereof).

AleksanderGondek · 2022-06-15T19:44:27Z

core/nixpkgs.bzl

+            output_path = exec_result.stdout.splitlines()[-1]
+
+            # TODO[GL]: use nix provided ssh?
+            ssh_path = repository_ctx.which("ssh")


There should be a capability to control the certificates / root certificate / ssh config used for these operations (otherwise we are 'escaping to host', breaking hermeticity).

r2r-dev · 2022-06-15T20:51:47Z

core/nixpkgs.bzl

+            repository_ctx.report_progress("Remote-building Nix derivation")
+            exec_result = execute_or_fail(
+                repository_ctx,
+                [nix_build_path, "--store", nix_store] + expr_args,


IMHO it makes sense to add --eval-store auto.

> (...) which instructs Nix to use a specific Nix store during the evaluation phase (before starting the build). It should be possible to set the evaluation phase to a remote store too, but in practice it makes most sense to always set it to auto.
> (nixbuild.net documentation)

Definitely, figured this one out afterwards :-).
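
As an illustration of the suggestion in this thread, the argument list from the diff hunk could be extended as follows. This is a Python sketch of the list construction only, not the Starlark from the PR; the function name `remote_build_command` is hypothetical, and the parameter names are borrowed from the hunk above.

```python
def remote_build_command(nix_build_path, nix_store, expr_args):
    """Return the argv for building on a remote store while evaluating locally.

    Hypothetical helper mirroring the list built in the diff hunk, with
    `--eval-store auto` added as suggested in the review comment.
    """
    return [
        nix_build_path,
        "--store", nix_store,      # where the build itself runs
        "--eval-store", "auto",    # keep evaluation on the local store
    ] + list(expr_args)
```

Keeping the flags in one helper would also mean that every call site picks up the same set of options.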

r2r-dev · 2022-06-15T21:29:08Z

core/nixpkgs.bzl

+                    repository_ctx.attr.attribute_path,
+                ),
+                quiet = repository_ctx.attr.quiet,
+                timeout = 10000,


You're using the same timeout value in three places across this PR. I'd suggest either declaring a new variable or re-using the one which is already specified within this file.

layus · 2022-06-16T08:06:49Z

@AleksanderGondek What about a configurable nix toolchain? Or something similar adapted to external repositories? Ultimately this nix build sequence of actions should move to a proper wrapper script instead of this starlark spaghetti.

The current design tries to mimic the remote cache and execution setup, where the remote endpoints are defined in .bazelrc. With env vars, you can also set the remote nix endpoint in .bazelrc, which is pretty nice.

Having to specify it in each and every nixpkgs_package seems pretty bad to me. I know you can wrap the repo rule, but the wrapper will not be used everywhere. I think that global state is exactly what we want. Every call, nested inside nix_go setup, nix_python setup or deep inside dependencies of dependencies, should inherit the workspace defaults.

AleksanderGondek · 2022-06-17T13:22:47Z

@layus I think that nix toolchain is a solution long-due :) I really like the idea!

layus · 2022-06-17T15:04:28Z

But we cannot use toolchains in repository_rules, right ? Oh dear, mind is stuck on Friday.

aherrmann · 2022-06-20T15:31:37Z

core/nixpkgs.bzl

+        # are there issues with this?
+        # another idea is to have config_settings, and have attributes for this in nixpkgs_package
+        # but that seems clunkier (have to modify rules, introduce some non-building related logic to rules)
+        ENV_NIX_REMOTE = 'RULES_NIXPKGS_NIX_REMOTE'


If you use an env-var, then make sure to add it to the environ parameter of repository_rule so that Bazel knows when to invalidate and re-run the rule.
This is particularly relevant for this use-case, where failing to re-run the rule after enabling Nix-remote might cause missing store paths on the remote end.

Whether to use an env-var or parameters - we can consider the desired use-cases to see which is better suited.

Parameters provide explicit, checked-in configuration, that will be consistent for all contributors. But, they are difficult to change temporarily, e.g. through a command-line flag such as --config.

Parameters provide fine-grained, per repository rule control. But, they are difficult to set or override globally. E.g. if I need to work offline on transit and want to disable remote builds temporarily.

My understanding is that remote build configuration is something that should be consistent across all instances of nixpkgs_package, to avoid missing remote Nix store paths on individual packages where the parameter was forgotten (e.g. a transitive dependency). And that it is something that we may want to enable or disable via command-line flags or similar configuration, e.g. for temporary offline builds, or different developer and CI configuration.

Viewed that way, env-vars seem closer to what we want.

aherrmann · 2022-06-20T15:47:56Z

core/nixpkgs.bzl

+                timeout = 10000,
+            )
+
+            nix_path = repository_ctx.which("nix")


repository_ctx.which could return None, which should be handled to provide a reasonable error message. I think the lookup for nix-build already has some relevant logic around it.
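
The kind of handling meant here can be sketched as follows. Python's `shutil.which` also returns `None` when a tool is missing, so it stands in for the repository context lookup in this sketch; the helper name `require_tool` and the error wording are hypothetical and not taken from rules_nixpkgs.

```python
import shutil


def require_tool(name):
    """Look up an executable on PATH and fail with a readable message.

    Hypothetical helper: shutil.which stands in for the repository
    context lookup, which likewise yields None for a missing tool.
    """
    path = shutil.which(name)
    if path is None:
        raise RuntimeError(
            "Could not find `%s` on PATH; it is required for remote Nix builds." % name
        )
    return path
```

Failing at the lookup keeps the error close to its cause, instead of surfacing later as a confusing failure when `None` is passed to a command.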

aherrmann · 2022-06-20T16:01:28Z

@layus

> it is hard to specify the ENV_NIX_REMOTE_KEY_FILE when it is located in the repository. You need to expand %workspace% and bazel will not do that for environment variables, so you need a wrapper script AFAIK.

Indeed, this is a case where a parameter would serve better than an env-var.

> But we cannot use toolchains in repository_rules, right ? Oh dear, mind is stuck on Friday.

Indeed, there are no toolchains in repository rules. That said, one can emulate something like it, at least to a certain extent.
E.g. rules_haskell's use_stack can be used to globally override the stack binary used. Note, it's sensitive to ordering!
Or, rules_haskell's handling of stack update ensures that stack update is called exactly once across multiple repository rules (unless we only read from a lock file).

AleksanderGondek · 2022-06-20T19:16:12Z

@aherrmann @layus

First of all - thank you for taking the time to create the change and to listen to feedback. My comments come from a place of deep admiration and care for rules_nixpkgs, and I hope I am succeeding in conveying my concerns in a fault-free manner :)

Secondly - I am afraid I cannot quite grasp why environment variables are considered adequate in the context of this change. Allow me to expand upon my concerns / point of view (and if I am missing some information or am plainly wrong, please just say so):

- The main functionality introduced by the change is to allow rules_nixpkgs to offload nix build(s) to remote hosts via mechanisms which are native to nix, not Bazel.
- The distributed nix-build capability relies heavily on proper ssh configuration.
- As far as Bazel is concerned, the packages prepared by rules_nixpkgs running in “remote mode” are procured on the Bazel client host (RBE does not benefit directly from this feature, unless the runners have some special configuration).
- (Future) We might want to utilize the nix substituters mechanism to increase the efficiency of the solution.

Bazel - in principle and design - wants to describe everything a project needs to build itself. This has multiple advantages (and disadvantages ;) ), one of which is that onboarding new developers / CI executors / general ‘consumers’ of the build process is made significantly easier and less error-prone, as you have explicitly stated requirements in the WORKSPACE.
Although this frequently does not hold true for dependencies that traditionally come from the OS (python, glibc, etc.), it is an ideal that Bazel strives for, and rules_nixpkgs is a great boon to it (although not without its own issues).

Now to the part where I start to complain about the usage of environment variables for configuration of remote nix-build commands (or lack of them): it will introduce a significant surface for misconfiguration and lack of synchronization between users of the Bazel project.

At the moment, as long as you have nix installed, you can have your rules_nixpkgs configured within the realm of Bazel knowledge*. With the change, the same Bazel workspace, with nix installed and with the same command being run, may run properly on one machine and fail miserably on another. The key to a proper build will be relegated to the knowledge that one has to set env variables and have some ssh keys, taken from somewhere and enabled somehow.
Therefore, we would be taking yet another step away from explicit dependencies in Bazel, towards implicit conditions that change the outcome of the build. I feel it is a step in the direction opposite to the one Bazel and nix are guiding us towards: a regression.

I also think that the very same argument that I have seen raised in favor of using environment variables could be made if we were implementing what is currently the repository attribute of the nixpkgs_package rule: why explicitly equip every nix package with a specific nixpkgs definition if, more often than not, we are using the same definition en masse? It is far easier to read it from an environment variable. It is easier to write code for it.
Alas, this is not the way it is implemented, and I think that decision obviously feels correct in the context of Bazel architecture.

Then there is also the matter of further configurability, extensibility and sensible feedback to the user: usage of environment variables makes those unwieldy within a very short span of features to implement or use-cases to cover.

Third - to give a constructive suggestion: perhaps a better alternative to using raw environment variables would be to:

- Introduce a new attribute to nixpkgs_pkgs, which points towards an external repository target with a default name (“@io_tweag.nix.remote//:cfg”)
- Said target can be a configuration file (or something even wilder) deciding if and how remote nix-build should be used
- Said target could be created during “rules_nixpkgs_dependencies” macro evaluation, but if and only if it does not already exist, defaulting to the “do not use remote nix building” setting.

Therefore we would have:

- A non-breaking change for any previous usage
- The capability to configure nix-build “en masse” for every nixpkgs_package
- The capability to configure nix-build on a per-nixpkgs basis
- Retaining the ability for Bazel to have the faintest idea what it needs
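
The resolution order implied by this proposal can be sketched as below. This is a Python illustration, not Starlark and not an implementation of the proposed target; `RemoteConfig`, its fields and `effective_config` are all hypothetical names. The assumption is that a per-package setting wins over the workspace-wide one, and that the absence of both means remote building is off.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RemoteConfig:
    """Hypothetical stand-in for the contents of the proposed config target."""
    enabled: bool = False
    store_uri: Optional[str] = None


def effective_config(workspace_default=None, package_override=None):
    """Pick the configuration for one package.

    Order assumed here: per-package override, then workspace default,
    then the "do not use remote nix building" fallback.
    """
    if package_override is not None:
        return package_override
    if workspace_default is not None:
        return workspace_default
    return RemoteConfig()
```

With no configuration at all the result is disabled, which is what would keep the change non-breaking for existing users.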

Please let me know your thoughts - I would be thrilled to prepare a PoC / do some pair-coding if that would help!

googleson78 · 2022-06-21T06:35:22Z

Chiming in as a newbie, I agree with the feelings and assessments that @AleksanderGondek has. I think it's also important to be able to switch remote builds on/off for everything at once for basically the reasons @aherrmann outlined.

The solution proposed by Aleksander seems like it's almost there - I can try to implement it. One thing that would be nice, but that I'm not sure how to do with that approach, is to be able to turn remote builds on/off easily through the CLI (or more commonly in .bazelrc.local). Perhaps selecting on a config_setting in the external repository's BUILD could work, but I'm fuzzy on the details/if it's actually possible before I've started to implement it.

aherrmann · 2022-06-21T08:22:30Z

I consider the question of env-var or not an implementation detail. What I consider important is the following:

- I'd like to be able to switch off remote builds temporarily, e.g. if I need to work offline, or need to disable all remote caching/execution to debug e.g. a reproducibility issue. To that end I'll need some global switch, e.g. a CLI flag, so that I don't need to go through all occurrences of nixpkgs_package to disable remote execution on each.
- The configuration needs to be consistent to avoid missing remote Nix store paths. To that end I'll need some shared, global configuration.

That said, at least for the temporary switch, an env-var paired with --repo_env seems like the most straightforward approach. As for the Nix remote URI and credentials configuration: sure, some kind of dedicated repository rule that nixpkgs_package depends on could do the trick. However, to ensure consistency it would have to be an implicit dependency.

@googleson78

> Perhaps selecting on a config_setting in the external repository's BUILD could work, but I'm fuzzy on the details/if it's actually possible before I've started to implement it.

Unfortunately, Bazel's configuration features, like select, user-defined build settings, etc., are unavailable in repository rules, because they are executed in the loading phase before such configuration is resolved. See here about phases; when select is available is pointed out here, for example.

Env-vars are a way to achieve command-line flag configuration with repository rules. Notably, these env-vars don't have to be set manually by the user in their environment. Instead, one can set them in .bazelrc. For example, daml did this for the configuration of the Scala version.

build:scala_2_12 --repo_env=DAML_SCALA_VERSION=2.12.14
fetch:scala_2_12 --repo_env=DAML_SCALA_VERSION=2.12.14
query:scala_2_12 --repo_env=DAML_SCALA_VERSION=2.12.14
sync:scala_2_12 --repo_env=DAML_SCALA_VERSION=2.12.14

To be used as bazel build --config scala_2_12 .... Yes, the duplication is very annoying. Maybe one can use common instead, i.e.

common:scala_2_12 --repo_env=DAML_SCALA_VERSION=2.12.14

in the Daml example.
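
On the consuming side, a rule would then read the variable and treat "unset" as "remote builds off". The sketch below shows that lookup in Python; in a repository rule the same lookup would go through the rule's environment mapping instead of `os.environ`. The variable name is the one from the diff in this PR. The function name `nix_remote_store`, and the assumption that the value is the remote store URI, are illustrative only.

```python
import os

# Name taken from the diff in this PR.
ENV_NIX_REMOTE = "RULES_NIXPKGS_NIX_REMOTE"


def nix_remote_store(environ=None):
    """Return the configured remote store, or None when remote builds are off.

    Assumption: the variable holds the remote store URI, and an unset
    or blank value means "build locally".
    """
    env = os.environ if environ is None else environ
    value = env.get(ENV_NIX_REMOTE, "").strip()
    return value or None
```

Under this reading, dropping the `--config` that sets the variable is enough to fall back to local builds, which is the temporary global switch asked for above.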

@AleksanderGondek

> I also think, that the very same argument that I have seen raised in proposition of using environment variables, could be made if we were implementing what is currently a repository attribute of a nixpkgs_package rule - why explicitly equip every nix package with a specific nixpkgs definition, if more often than not, we are doing the same definition en masse?

I don't agree with that view - I view the two as different kinds of configurations. The nixpkgs revision has meaning w.r.t. what to build and import, i.e. which version of which tool or library. If another contributor builds their project with a different nixpkgs revision, then I can't expect them to get the same results as I do.

But, the remote configuration has meaning w.r.t. how to build it, i.e. it's operational. In that regard it's similar to a remote cache configuration. In a properly configured, hermetic project I expect to get the same results no matter if I use remote cache or not, it'll just take longer without. Remote execution has a similar quality to it. Of course, it's a bit more complicated, as it can also provide access to more platforms. But, at least in the case of same platform builds, it should behave the same.

benradf · 2023-07-17T10:03:17Z

Closing as there have been no changes for over a year now. See Support remote execution with rules_nixpkgs for the latest on this topic.

googleson78 · 2023-07-17T10:57:20Z

Hey @benradf, thanks!

Is that link intended to be public, or is it intended entirely for tweagers? Currently it leads to a private Tweag repo, so it's a bit confusing for people who don't have access to it.

benradf · 2023-07-17T11:04:27Z

Hey @googleson78, good to hear from you again!

Thanks for letting me know the link is private - my mistake. I've changed it to link to the public issue in this repo.

Enable remote nix builds

8fd2975

Co-authored-by: Guillaume Maudoux <[email protected]>

googleson78 force-pushed the nix-remote-build branch from 72c177a to 8fd2975 Compare June 15, 2022 14:17

AleksanderGondek suggested changes Jun 15, 2022

r2r-dev reviewed Jun 15, 2022

aherrmann reviewed Jun 20, 2022

benradf closed this Jul 17, 2023
