Remote Execution Client

This repository contains a client implementation of the Remote Execution API that works with the Remote Execution API SDKs.

Reclient integrates with an existing build system to enable remote execution and caching of build actions.

When used with a server implementation of the Remote Execution API, it helps reduce build times by applying 2 main techniques:

  1. Distribution of the load by executing individual build actions in parallel on separate remote workers instead of on one build machine so that the build actions that are executed in parallel don’t compete for the same local resources.
  2. RE Server instance-wide cache for build actions, inputs, and artifacts. As a consequence, the results of a build action that was already executed with exactly the same inputs on the same RE Server instance are fetched from the cache, even if the action was never executed on the local machine.

Most clients are expected to see a performance improvement in their builds after migrating from local to remote builds. However, builds with a high number of deterministic build actions that can be executed in parallel are expected to see the greatest improvement.

Reclient consists of the following main binaries:

  1. rewrapper - a wrapper that forwards build commands to RBE
  2. reproxy - a process that should be started at the beginning of the build and shut down at the end. It communicates with RBE to execute build actions remotely and/or fetch build artifacts from RE Server's CAS (Content Addressable Storage).
  3. bootstrap - starts and stops reproxy, and aggregates the metrics during the shutdown.
  4. scandeps_server - a standalone process for scanning includes of C(++) compile actions. Started and stopped automatically by reproxy.

Note

This is not an officially supported Google product.

Prerequisites

  • re-client builds with Bazel. We recommend using Bazelisk to use the version of Bazel currently supported by this code base.
  • re-client also requires gclient, which can be installed by checking out depot_tools and adding the depot_tools directory to your PATH.

Building

re-client currently builds and is supported on Linux / Mac / Windows.

Once you've installed Bazel, and are in the re-client repo:

Build the code

To build a complete set of binaries for reclient with a clangscandeps deps scanner:

$ bazelisk build --config=clangscandeps //:artifacts_tar
[...]
Target //:artifacts_tar up-to-date:
  bazel-bin/artifacts.tar

To build a complete set of binaries for reclient with a goma deps scanner:

$ bazelisk build --config=goma //:artifacts_tar
[...]
Target //:artifacts_tar up-to-date:
  bazel-bin/artifacts.tar

Install binaries (Linux and Mac only)

To install all binaries to $BINDIR:

$ bazelisk run --config=goma //:artifacts_install -- --destdir $BINDIR
[...]
INFO: Running command line: bazel-bin/artifacts_install --destdir $BINDIR

Run unit tests

$ bazelisk test //pkg/... //internal/...
[...]
INFO: Elapsed time: 77.166s, Critical Path: 30.24s
INFO: 472 processes: 472 linux-sandbox.
INFO: Build completed successfully, 504 total actions
//internal/pkg/cli:go_default_test                                       PASSED in 0.2s
//internal/pkg/deps:go_default_test                                      PASSED in 1.2s
//internal/pkg/inputprocessor/action/cppcompile:go_default_test          PASSED in 0.1s
//internal/pkg/inputprocessor/flagsparser:go_default_test                PASSED in 0.1s
//internal/pkg/inputprocessor/pathtranslator:go_default_test             PASSED in 0.1s
//internal/pkg/inputprocessor/toolchain:go_default_test                  PASSED in 0.2s
//internal/pkg/labels:go_default_test                                    PASSED in 0.1s
//internal/pkg/logger:go_default_test                                    PASSED in 0.2s
//internal/pkg/rbeflag:go_default_test                                   PASSED in 0.1s
//internal/pkg/reproxy:go_default_test                                   PASSED in 15.5s
//internal/pkg/rewrapper:go_default_test                                 PASSED in 0.2s
//internal/pkg/stats:go_default_test                                     PASSED in 0.1s
//pkg/cache:go_default_test                                              PASSED in 0.2s
//pkg/cache/singleflightcache:go_default_test                            PASSED in 0.1s
//pkg/filemetadata:go_default_test                                       PASSED in 2.1s
//pkg/inputprocessor:go_default_test                                     PASSED in 0.2s

Executed 16 out of 16 tests: 16 tests pass.

Reclient can be built to use Goma's input processor. Goma's input processor is 3x faster than clang-scan-deps for a typical compile action in Chrome. Build as follows:

bazelisk build //:artifacts_tar --config=goma

Versioning

There are four binaries that are built from this repository and used with Android Platform for build acceleration:

  • rewrapper
  • reproxy
  • dumpstats
  • bootstrap

These binaries must be stamped with an appropriate version number before they are dropped into Android source for consumption.

Versioning Guidelines

  1. We will maintain a consistent version across all of the binaries. That means, when there are changes to only one of the binaries, we will increment the version number for all of them.

  2. To be consistent with the Semantic Versioning scheme, the version has the form “X.Y.Z.SHA”, denoting “MAJOR.MINOR.PATCH.GIT_SHA”.

  3. Updating version numbers:

    MAJOR

    • Declare major version “1” when re-client is feature complete for caching and remote-execution capabilities.
    • Update major version post “1”, when there are breaking changes to interface / behavior of rewrapper tooling. Some examples of this are: changing any of the flag names passed to rewrapper, changing the name of rewrapper binary.

    MINOR - Update minor version when

    • New features are introduced in a backward compatible way. For example, when remote-execution capability is introduced.
    • Major implementation changes without changes to behavior / interface. For example, if the “.deps” file is changed to JSON format.

    PATCH - Update patch version

    • For all other bug fixes only. Feature additions (irrespective of how insignificant they are) should result in a MINOR version change.
    • Any new release to Android Platform of re-client tools should update the PATCH version at minimum.
  4. Release Frequency:

    • Kokoro release workflows can be triggered as often as necessary to generate new release artifacts.

How to update version numbers?

You can update the MAJOR/MINOR/PATCH version numbers by simply changing the version.bzl file present in the root of this repository.

Reclient releases

Reclient binaries are released to CIPD (Chrome Infrastructure Package Deployment) as separate packages for Linux, Mac (amd64 and arm64), and Windows. Each Reclient release provides 2 sets of binaries per platform, which differ in the include scanner used for C++ build actions: clang-scan-deps or goma. Binaries that use the goma include scanner have a version number ending with the “-gomaip” suffix; binaries that use clang-scan-deps have no suffix. Clients migrating from Goma should use the releases with the goma include scanner (with the -gomaip suffix).
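The naming scheme above can be checked mechanically. The following is a minimal sketch, not part of Reclient: the function names reclient_scanner and reclient_version_fields are hypothetical, and the logic assumes only the MAJOR.MINOR.PATCH.GIT_SHA format and the -gomaip suffix described in this README.

```shell
# Hypothetical helpers (not shipped with Reclient) that inspect a release
# version string of the form MAJOR.MINOR.PATCH.GIT_SHA[-gomaip].

# Print which include scanner a release uses, based on the version suffix.
reclient_scanner() {
  case "$1" in
    *-gomaip) echo "goma" ;;
    *)        echo "clang-scan-deps" ;;
  esac
}

# Print "MAJOR MINOR PATCH GIT_SHA" for a version string.
reclient_version_fields() {
  v=${1#re_client_version:}   # drop the optional CIPD tag prefix
  v=${v%-gomaip}              # drop the optional scanner suffix
  old_ifs=$IFS
  IFS=.
  set -- $v                   # split on dots into $1..$4
  IFS=$old_ifs
  echo "$1 $2 $3 $4"
}
```

For example, reclient_version_fields re_client_version:0.114.2.81e819b-gomaip prints 0 114 2 81e819b, and reclient_scanner reports goma for the same string.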

Downloading Reclient binaries

Reclient binaries can be downloaded using CIPD's Web UI, with a CLI client, or using gclient's configuration.

Downloading binaries with CIPD CLI client

To download Reclient with GomaIP dependency scanner (used for building Chromium):

echo 'infra/rbe/client/${platform}' $RECLIENT_VERSION > /tmp/reclient.ensure
cipd ensure --root $CHECKOUT_DIR --ensure-file /tmp/reclient.ensure

To use Reclient with Clangscandeps (used for Android builds) instead, add the -csd suffix to the CIPD package name:

echo 'infra/rbe/client/${platform}-csd' $RECLIENT_VERSION > /tmp/reclient.ensure

  • $RECLIENT_VERSION - the version of Reclient. It can be set to one of the following:

    • A fixed version of Reclient. For example re_client_version:0.114.2.81e819b-gomaip (for Reclient with GomaIP dependency scanner) or re_client_version:0.114.2.81e819b (for Reclient with Clangscandeps).
    • latest - the latest released Reclient version.
    • stable - the latest stable Reclient version. The stable version is usually 1-2 weeks behind the latest, as Reclient needs to run for a few days on test and staging environments without issues or degradations before it's considered stable.
  • $CHECKOUT_DIR - the location where Reclient should be downloaded.
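The two echo commands above differ only in the package suffix, so they can be folded into one helper. This is a minimal sketch under that assumption; write_reclient_ensure and its "gomaip"/"csd" variant argument are hypothetical names introduced here, not part of CIPD or Reclient.

```shell
# Hypothetical helper: write a CIPD ensure file for Reclient.
#   $1 - Reclient version (a fixed re_client_version:... tag, latest, or stable)
#   $2 - scanner variant: "csd" for Clangscandeps, anything else for the
#        default (GomaIP) package
#   $3 - path of the ensure file to write
write_reclient_ensure() {
  version=$1
  variant=$2
  ensure_file=$3
  # ${platform} must stay literal: CIPD expands it, not the shell.
  package='infra/rbe/client/${platform}'
  if [ "$variant" = "csd" ]; then
    package="${package}-csd"
  fi
  printf '%s %s\n' "$package" "$version" > "$ensure_file"
}

# Example (commented out because it needs the cipd client and network access):
#   write_reclient_ensure stable gomaip /tmp/reclient.ensure
#   cipd ensure --root "$CHECKOUT_DIR" --ensure-file /tmp/reclient.ensure
```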

Downloading binaries with gclient

You can configure gclient to download Reclient binaries during the gclient sync phase. Gclient expects a DEPS file in the repository’s root directory. The file contains components that will be checked out during the sync phase. To check out Reclient, the file should have a similar entry to:

vars = {
    ...
    'reclient_version': '<version>',
    ...
}

deps = {
    ...
    '<checkout-directory>': {
        'packages': [
            {
                'package': 'infra/rbe/client/${{platform}}',
                'version': Var('reclient_version'),
            }
        ],
        'dep_type': 'cipd',
    },
}

This will instruct gclient to check out <version> of Reclient from /infra/rbe/client/<platform> CIPD package into <checkout-directory> (example). Extracting a version to a variable (as in an example above) is optional, but provides a benefit of being able to override the default value through gclient’s custom variables.

Note: The snippet above will instruct gclient to download Reclient with GomaIP dependency processor. If you prefer Reclient with Clangscandeps, you'd need to set package to infra/rbe/client/${{platform}}-csd.

Using Reclient

Starting and stopping reproxy

Reclient requires reproxy to be started at the beginning of the build and stopped at the end. This is done through the bootstrap binary by executing the following commands:

Start:

bootstrap -re_proxy=$reproxy_location [-cfg=$reproxy_config_location]

Stop:

bootstrap -re_proxy=$reproxy_location -shutdown
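Because reproxy should be stopped even when the build fails, the start and stop commands are often wrapped around the build in a single script. The following is a minimal sketch of that pattern: run_with_reproxy and the BOOTSTRAP variable are hypothetical names, and only the -re_proxy, -cfg, and -shutdown flags shown above are used.

```shell
# Hypothetical wrapper: start reproxy, run a build command, always shut down.
#   $1     - path to the reproxy binary
#   $2     - path to the reproxy config file
#   $3...  - the build command to run (for example: ninja -C out/Default)
# BOOTSTRAP may hold the bootstrap binary's path; it defaults to "bootstrap".
run_with_reproxy() {
  reproxy_bin=$1
  reproxy_cfg=$2
  shift 2
  "${BOOTSTRAP:-bootstrap}" -re_proxy="$reproxy_bin" -cfg="$reproxy_cfg" || return 1
  "$@"
  build_status=$?
  # Shut reproxy down even when the build failed, then report the build's status.
  "${BOOTSTRAP:-bootstrap}" -re_proxy="$reproxy_bin" -shutdown
  return "$build_status"
}
```

The function returns the exit status of the build command rather than that of the shutdown step, so a failed build is still reported as a failure to the caller.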

Configuration

Each of Reclient’s binaries can be configured with command line flags, environment variables, config files, or any combination of these (for example, some flags provided on the command line and others in the config file or as environment variables). If the same flag is defined in more than one place, the order of precedence is as follows (from lowest to highest priority):

  1. Config file
  2. Environment variable
  3. Command line argument

To use a configuration file, specify it with the -cfg=$config_file_location flag. The config file is a list of flag_name=flag_value pairs, each on a new line. Example below:

service=$RE_SERVER_ADDRESS
instance=$RE_SERVER_INSTANCE
server_address=unix:///tmp/reproxy.sock
log_dir=/tmp
output_dir=/tmp
proxy_log_dir=/tmp
depsscanner_address=$scandeps_server_location #distributed with Reclient
use_gce_credentials=true

To configure Reclient with environment variables, the variables should be prefixed with RBE_ (e.g. the value of RBE_service environment variable is used to set the service flag).
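To make the precedence order concrete, the following sketch emulates it for a single flag. It is an illustration only and not Reclient's actual flag-parsing code: effective_flag is a hypothetical function that reads a flag_name=flag_value config file, then the RBE_-prefixed environment variable, then an optional command line argument, with later sources overriding earlier ones.

```shell
# Hypothetical illustration of the precedence rules; this is not Reclient code.
# Resolve the effective value of one flag from three sources:
#   $1 - flag name (for example: exec_strategy)
#   $2 - config file of flag_name=flag_value lines
#   $3 - optional command line argument of the form -flag_name=value
effective_flag() {
  name=$1
  cfg_file=$2
  cli_arg=${3:-}
  value=""
  # 1. Lowest priority: the config file.
  if [ -f "$cfg_file" ]; then
    while IFS='=' read -r key val || [ -n "$key" ]; do
      if [ "$key" = "$name" ]; then value=$val; fi
    done < "$cfg_file"
  fi
  # 2. The RBE_-prefixed environment variable overrides the config file.
  env_value=$(printenv "RBE_${name}" || true)
  if [ -n "$env_value" ]; then value=$env_value; fi
  # 3. Highest priority: the command line argument.
  case "$cli_arg" in
    "-${name}="*) value=${cli_arg#*=} ;;
  esac
  echo "$value"
}
```

With exec_strategy=local in the config file and RBE_exec_strategy=remote exported, effective_flag exec_strategy "$cfg" prints remote; adding -exec_strategy=racing as a third argument prints racing.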

Rewrapper

Full list of rewrapper config flags can be found in docs/cmd-line-flags.md. A few of the most commonly used flags are:

  • platform - Comma-separated key value pairs in the form key=value. This is used to identify remote platform settings like the docker image to use to run the command. The list of supported keys depends on RE Server implementation. A detailed lexicon can be found here
  • server_address - The address reproxy is running on. It needs to be set to the same value as reproxy’s server_address flag so that rewrapper and reproxy can communicate with each other. This value should be UDS on Linux/Mac (e.g. unix:///tmp/reproxy.sock) and a named pipe on Windows (e.g. pipe://reproxy.ipc). Depot_tools has a helper choosing the address based on the platform (here).
  • labels - Identifies the type of command to help the proxy make decisions regarding remote execution. Labels consist of comma-separated key-value pairs in form key=value where key is one of the following: type, compiler, lang, tool, and toolname. Some examples of valid labels are:
    • type=compile,compiler=clang,lang=cpp - clang compile actions
    • type=compile,compiler=clang-cl,lang=cpp - clang-cl compile actions
    • type=compile,compiler=nacl,lang=cpp - nacl compile actions
    • type=compile,compiler=javac,lang=java - java compile actions
    • type=link,tool=clang - link actions
    • type=tool - generic action that doesn’t require any action specific input processing
  • exec_strategy - One of local, remote, remote_local_fallback, racing. It is recommended to set it to remote_local_fallback or racing. With remote_local_fallback, it tries to execute the action remotely and falls back to local execution if the remote execution fails. With racing, it tries both remote and local execution and picks whichever finishes first.
  • env_var_allowlist - List of environment variables allowed to pass to the proxy. If the build action depends on local environment variables, they should be set here, so they're reproduced on the remote worker.

If you are experiencing sporadic timeouts when dialing reproxy, you might consider adding:

  • dial_timeout - Defaults to 3m if the flag is not set. For some projects, increasing it up to 10m has proved beneficial in eliminating the timeouts.
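Put together, the flags above might appear in a rewrapper config file like the one written by the sketch below. The helper name write_rewrapper_cfg is hypothetical, and the values are placeholders: in particular, the container-image platform key and the example.invalid image are assumptions, since the supported platform keys depend on your RE Server, and PATH is only an example of an allowlisted variable.

```shell
# Hypothetical helper: write a small rewrapper config file to the given path.
# The platform value is a placeholder; supported keys depend on your RE Server.
write_rewrapper_cfg() {
  cat > "$1" <<'EOF'
platform=container-image=docker://example.invalid/build-image
server_address=unix:///tmp/reproxy.sock
labels=type=compile,compiler=clang,lang=cpp
exec_strategy=remote_local_fallback
env_var_allowlist=PATH
dial_timeout=10m
EOF
}
```

The resulting file is passed to rewrapper with the -cfg flag. The server_address value must match the one reproxy was started with.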

Reproxy

Full list of reproxy flags can be found in docs/cmd-line-flags.md. A few of the most commonly used flags are:

  • service - The remote execution service to dial when calling via gRPC, including port, such as localhost:8790
  • instance - If a server supports multiple instances of the execution system (with their own workers, storage, caches, etc.), the field instructs the server which instance of the execution system to operate against. If the server does not support different instances, the field can be omitted.
  • server_address - An address reproxy should start its gRPC server on and listen for incoming communication from rewrapper (should be set to the same value as rewrapper's server_address parameter)
  • depsscanner_address - The address of the dependency scanner service. To use the scandeps_server distributed with Reclient, set the value to exec://$absolute_path_to_reclient_dir/scandeps_server. For instance, if Reclient is checked out to /home/$user/chromium/src/buildtools/reclient/, the value should be exec:///home/$user/chromium/src/buildtools/reclient/scandeps_server
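Since depsscanner_address expects an absolute path after the exec:// scheme, the value can be derived from the Reclient directory. A minimal sketch, assuming the scandeps_server binary sits directly in that directory; the function name depsscanner_address is a hypothetical helper introduced here.

```shell
# Hypothetical helper: build the depsscanner_address value for the
# scandeps_server binary that sits in the given Reclient directory.
depsscanner_address() {
  # Resolve to an absolute path, since the flag expects one.
  abs_dir=$(cd "$1" && pwd) || return 1
  printf 'exec://%s/scandeps_server\n' "$abs_dir"
}
```

Because an absolute path starts with a slash, the result always begins with exec:/// (three slashes), as in the example above.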

Authentication flags

If your RE Server implementation does not use RPC authentication then use one of:

  • service_no_auth - If true, do not authenticate with the service (implied by -service_no_security).
  • use_rpc_credentials - If false, no per-RPC credentials will be used. Disables --credential_file, -use_application_default_credentials, and -use_gce_credentials. (default true).

If your RE Server uses RPC authentication then use one of the following flags:

  • use_gce_credentials - If true (and --use_application_default_credentials is false), use the default GCE credentials to authenticate with remote execution (https://cloud.google.com/docs/authentication/provide-credentials-adc#attached-sa).

  • use_application_default_credentials - If true, use application default credentials to connect to remote execution. See https://cloud.google.com/sdk/gcloud/reference/auth/application-default/login

  • credential_file - The name of a file that contains service account credentials to use when calling remote execution. Used only if -use_application_default_credentials and -use_gce_credentials are false.

  • credentials_helper - Path to the credentials helper binary. If given execrel://, looks for the credshelper binary in the same folder as bootstrap/reproxy

  • credentials_helper_args - Arguments for the credentials helper, separated by space

    reproxy is typically started via bootstrap, so it is recommended to avoid configuring it through command line flags. Instead, use either a configuration file passed to bootstrap with the -cfg flag, or environment variables set before starting bootstrap (example).

Auxiliary Metadata flag

If you want to collect backend workers' auxiliary metadata (CPU and memory usage per action), you can generate a .pb (or .proto.bin) file that contains the descriptor information reproxy uses at runtime to decode the auxiliary metadata, which is a proto message of type google.protobuf.Any.

Once you have customized the auxiliary_metadata.proto file to match your backend worker's specification, compile it to a .pb or .proto.bin file with protoc, and pass the file path to reproxy via the --auxiliary_metadata_path flag or the RBE_auxiliary_metadata_path environment variable. At runtime, reproxy then uses this file to parse your backend worker's auxiliary metadata and logs the data to the reproxy logs.

cd api/auxiliary_metadata # or where you have the customized `.proto` file

protoc \
--proto_path=. \
--descriptor_set_out=auxiliary_metadata.pb \
auxiliary_metadata.proto

export RBE_auxiliary_metadata_path=~/Workspace/re-client/api/auxiliary_metadata/auxiliary_metadata.pb

# then continue with your regular build with reproxy

or

cd api/auxiliary_metadata # or where you have the customized `.proto` file

protoc \
--proto_path=. \
--descriptor_set_out=auxiliary_metadata.proto.bin \
auxiliary_metadata.proto

export RBE_auxiliary_metadata_path=~/Workspace/re-client/api/auxiliary_metadata/auxiliary_metadata.proto.bin

# then continue with your regular build with reproxy

It's worth noting that the backend can give this proto message any name; however, the client-side proto message must be named AuxiliaryMetadata to receive it. For example, in the unit test, the backend sends the proto message with the name WorkerAuxiliaryMetadata, and the client receives it as AuxiliaryMetadata.

Integration with the build system

To execute your build actions remotely through Reclient, the build command should be prepended with:

 $rewrapper [-cfg=$config-file] -exec_root=$checkout-dir --

where:

  • $rewrapper - path of the rewrapper binary
  • $config-file - path of the rewrapper config file (assuming that rewrapper was configured with config file)
  • $checkout-dir - root directory of the source repository
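In a hand-written build script, the prefix above can be captured in a small function so that each build command is wrapped the same way. This is a minimal sketch; remote_exec and the REWRAPPER, REWRAPPER_CFG, and EXEC_ROOT variables are hypothetical names standing in for $rewrapper, $config-file, and $checkout-dir.

```shell
# Hypothetical helper: run a build command through rewrapper.
# Assumes REWRAPPER, REWRAPPER_CFG and EXEC_ROOT are set by the caller.
remote_exec() {
  "$REWRAPPER" -cfg="$REWRAPPER_CFG" -exec_root="$EXEC_ROOT" -- "$@"
}

# Example (commented out because it needs a running reproxy):
#   REWRAPPER=/path/to/rewrapper
#   REWRAPPER_CFG=rewrapper.cfg
#   EXEC_ROOT=$PWD
#   remote_exec clang -c hello.c -o hello.o
```

Everything after the -- separator is passed through unchanged as the build command.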

When rewrapper is executed, it passes the build command to a running instance of reproxy that:

  • Determines dependencies either from command line flags (e.g. clang/gcc’s -I flag) or from the content of input files (during input processing phase)
  • Uploads toolchain, the inputs, and their dependencies to RBE
  • Executes the command remotely
  • Downloads artifacts locally

Before the build is executed, reproxy needs to be started by bootstrap and shut down at the end of the build.

During the run, reproxy writes its application-level logs to the directory specified by the log_dir flag, and logs records about the executed build actions to an RPL file in the directory specified by the proxy_log_dir flag. During reproxy shutdown, bootstrap dumps Reclient-related build metrics to an rbe_metrics.txt file saved at the location specified by bootstrap's output_path flag.

GN integration

GN is a meta-build system that generates build files for Ninja. Its configuration files are written in a simple, dynamically typed language. Reclient can be integrated with the build by modifying the GN config files. Because GN's language is flexible, the integration method depends on the project, but it usually involves adding a rewrapper prefix (example) that's controlled by a gn argument (example), and starting and stopping reproxy before and after the build. The latter can be done by a helper script that wraps the ninja call with reproxy start and stop steps (example).

CMake integration

You can integrate CMake with Reclient by using the <LANG>_COMPILER_LAUNCHER property. This property is initialized with the value of the CMAKE_<LANG>_COMPILER_LAUNCHER variable if it is set when a target is created. For instance, to use Reclient for C/C++ compile actions, set both CMAKE_C_COMPILER_LAUNCHER and CMAKE_CXX_COMPILER_LAUNCHER to $rewrapper;-cfg=$config-file;-exec_root=$checkout-dir (the property accepts a semicolon-separated list as the launcher command).

Please note that CMake operates on absolute paths, so you need to ensure that the RE Server executes the action on a remote worker in the same directory as on the local build machine (the method depends on your RE Server implementation). Moreover, be aware that rewrapper's canonicalize_working_dir flag alters the input paths of build actions, and thus should be disabled for build actions generated by CMake.
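The launcher value can also be passed on the cmake command line as cache entries. The following is a minimal sketch of building those arguments; cmake_launcher_args is a hypothetical helper, and it assumes that none of the three paths contains spaces.

```shell
# Hypothetical helper: print the CMake cache arguments that route C and C++
# compile actions through rewrapper.
#   $1 - rewrapper binary, $2 - rewrapper config file, $3 - exec root
cmake_launcher_args() {
  # CMake reads the launcher as a semicolon-separated list.
  launcher="$1;-cfg=$2;-exec_root=$3"
  printf '%s\n' "-DCMAKE_C_COMPILER_LAUNCHER=$launcher"
  printf '%s\n' "-DCMAKE_CXX_COMPILER_LAUNCHER=$launcher"
}

# Example (commented out because it needs cmake and a source tree):
#   cmake -S . -B build $(cmake_launcher_args "$rewrapper" "$config_file" "$PWD")
```

The command substitution is left unquoted on purpose so that each printed line becomes a separate cmake argument.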

An example of CMake build integration with Reclient can be found here.