Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enable remote nix builds #239

Closed
wants to merge 1 commit into from
Closed
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
76 changes: 73 additions & 3 deletions core/nixpkgs.bzl
Original file line number Diff line number Diff line change
Expand Up @@ -231,17 +231,86 @@ def _nixpkgs_package_impl(repository_ctx):
"nix-build",
extra_msg = "See: https://nixos.org/nix/",
)
nix_build = [nix_build_path] + expr_args

repository_ctx.report_progress("Building Nix derivation")

# Large enough integer that Bazel can still parse. We don't have
# access to MAX_INT and 0 is not a valid timeout so this is as good
# as we can do. The value shouldn't be too large to avoid errors on
# macOS, see https://github.com/tweag/rules_nixpkgs/issues/92.
timeout = 8640000
repository_ctx.report_progress("Building Nix derivation")

# TODO[GL]: should these be something other than env variables?
# it seems like it should be fine, so you can switch on the fly
# are there issues with this?
# another idea is to have config_settings, and have attributes for this in nixpkgs_package
# but that seems clunkier (have to modify rules, introduce some non-building related logic to rules)
ENV_NIX_REMOTE = 'RULES_NIXPKGS_NIX_REMOTE'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is de-facto a hidden, implict flow control.
Explicit is nearly always better than implicit, especially when dealing with Bazel rules which should clearly define it’s input and output.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you use an env-var, then make sure to add it to the environ parameter of repository_rule so that Bazel knows when to invalidate and re-run the rule.
This is particularly relevant for this use-case, where failing to re-run the rule after enabling Nix-remote might cause missing store paths on the remote end.


Whether to use an env-var or parameters - we can consider the desired use-cases to see which is better suited.

Parameters provide explicit, checked-in configuration, that will be consistent for all contributors. But, they are difficult to change temporarily, e.g. through a command-line flag such as --config.

Parameters provide fine-grained, per repository rule control. But, they are difficult to set or override globally. E.g. if I need to work offline on transit and want to disable remote builds temporarily.

My understanding is that remote build configuration is something that should be consistent across all instances of nixpkgs_package, to avoid missing remote Nix store paths on individual packages where the parameter was forgotten (e.g. a transitive dependency). And that it is something that we may want to enable or disable via command-line flags or similar configuration, e.g. for temporary offline builds, or different developer and CI configuration.

Viewed that way, env-vars seem closer to what we want.

ENV_NIX_REMOTE_KEY_FILE = 'RULES_NIXPKGS_NIX_REMOTE_KEY_FILE'
ENV_NIX_REMOTE_USER = 'RULES_NIXPKGS_NIX_REMOTE_USER'

nix_host = repository_ctx.os.environ.get(ENV_NIX_REMOTE)
if nix_host:
nix_store_user = repository_ctx.os.environ.get(ENV_NIX_REMOTE_USER)
if not nix_store_user:
fail("{} must be set when running rules_nixpkgs remote nix builds.".format(ENV_NIX_REMOTE_USER))

# TODO[GL]: should we copy this to the bazel workspace?
# it seems nice to have it there, but since this is expected to be set up
# on the builder where the action is happening, that's a controlled environment already
# perhaps there might be issues with remote bazel builds?
nix_store_key_file = repository_ctx.os.environ.get(ENV_NIX_REMOTE_KEY_FILE)
if not nix_store_key_file:
fail("{} must be set when running rules_nixpkgs remote nix builds.".format(ENV_NIX_REMOTE_KEY_FILE))

nix_store = "ssh-ng://{user}@{host}?max-connections=1&ssh-key={key}".format(
user = nix_store_user,
host = nix_host,
key = nix_store_key_file)

repository_ctx.report_progress("Remote-building Nix derivation")
exec_result = execute_or_fail(
repository_ctx,
[nix_build_path, "--store", nix_store] + expr_args,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

imho it makes sense to add --eval-store auto.

(...) which instructs Nix to use a specific Nix store during the evaluation phase (before starting the build). It should be possible to set the evaluation phase to a remote store too, but in practice it makes most sense to always set it to auto.
nixbuild.net documentation

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Definitely, figured out this one out afterwards :-).

failure_message = "Cannot build Nix attribute '{}'.".format(
repository_ctx.attr.attribute_path,
),
quiet = repository_ctx.attr.quiet,
timeout = timeout,
)
output_path = exec_result.stdout.splitlines()[-1]

# TODO[GL]: use nix provided ssh?
ssh_path = repository_ctx.which("ssh")
Copy link
Contributor

@AleksanderGondek AleksanderGondek Jun 15, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps path to ssh should come from configuration attribute and by default be taken from nix?
Or maybe we can drop ssh'ing all together? Let remote builder worry about caching (or lack of there of)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There should be capability to control certificates / root certificate / ssh config which is used for this operations (otherwise we are 'escaping to host', breaking hermetcity.

repository_ctx.report_progress("Creating remote store root")
exec_result = execute_or_fail(
repository_ctx,
[ssh_path]
+ ["-i", nix_store_key_file]
+ ["{}@{}".format(nix_store_user, nix_host),
"nix-store --add-root /nix/var/nix/gcroots/per-user/{user}/rules_nixpkgs_{root} -r {path}".format(
user = nix_store_user, root = output_path.split('/')[-1], path = output_path) ],
failure_message = "Cannot build Nix attribute '{}'.".format(
repository_ctx.attr.attribute_path,
),
quiet = repository_ctx.attr.quiet,
timeout = 10000,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're using the same timeout value in three places across this PR. I'd suggest either declaring a new variable or re-using the one which is already specified within this file.

)

nix_path = repository_ctx.which("nix")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

which could return None, which should be handled to provide a reasonable error message. I think the lookup for nix-build already has some relevant logic around it.

repository_ctx.report_progress("Downloading Nix derivation")
exec_result = repository_ctx.execute(
[nix_path, "copy", "--from", nix_store, output_path],
quiet = repository_ctx.attr.quiet,
timeout = 10000,
)

repository_ctx.report_progress("Creating local store root")

timeout = 10000
exec_result = execute_or_fail(
repository_ctx,
nix_build,
[nix_build_path] + expr_args,
failure_message = "Cannot build Nix attribute '{}'.".format(
repository_ctx.attr.attribute_path,
),
Expand All @@ -250,6 +319,7 @@ def _nixpkgs_package_impl(repository_ctx):
)
output_path = exec_result.stdout.splitlines()[-1]

repository_ctx.report_progress("Creating local folders")
# ensure that the output is a directory
test_path = repository_ctx.which("test")
execute_or_fail(
Expand Down