docs: update air-gapped docs #7160

Merged
merged 10 commits into from
Aug 9, 2024
191 changes: 87 additions & 104 deletions docs/docs/advanced/air-gap.md
I have an idea about re-structuring this page, but we can do that in another PR.

Original file line number Diff line number Diff line change
@@ -1,142 +1,125 @@
# Air-Gapped Environment
Trivy needs to connect to the internet to download databases. If you are running Trivy in an air-gapped environment, or a tightly controlled network, this document explains your options.
In an air-gapped environment it is your responsibility to update the Trivy databases on a regular basis, so that the scanner can detect newly disclosed vulnerabilities.

Trivy can be used in air-gapped environments. Note that an allowlist is [here][allowlist].
## Network requirements
Trivy's databases are distributed as OCI images via GitHub Container Registry (GHCR):

## Air-Gapped Environment for vulnerabilities
- <https://ghcr.io/aquasecurity/trivy-db>
- <https://ghcr.io/aquasecurity/trivy-java-db>
- <https://ghcr.io/aquasecurity/trivy-checks>

### Download the vulnerability database
At first, you need to download the vulnerability database for use in air-gapped environments.
If Trivy is running behind a firewall, you'll need to add the following hosts to your allowlist:

=== "Trivy"
- `ghcr.io`
- `pkg-containers.githubusercontent.com`

```
TRIVY_TEMP_DIR=$(mktemp -d)
trivy --cache-dir $TRIVY_TEMP_DIR image --download-db-only
tar -cf ./db.tar.gz -C $TRIVY_TEMP_DIR/db metadata.json trivy.db
rm -rf $TRIVY_TEMP_DIR
```
Trivy pulls the databases using the [OCI Distribution](https://github.com/opencontainers/distribution-spec) specification, which runs over plain HTTPS.
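
One way to confirm that the allowlist is effective is to probe both hosts over HTTPS from the machine that runs Trivy. The following is a minimal sketch, assuming `curl` is installed; `check_trivy_hosts` is a hypothetical helper name, not part of Trivy. A successful connection only shows that the host is reachable; it does not prove that a registry pull will succeed.

```shell
# Hypothetical helper: report whether each allowlisted host answers over HTTPS.
# Assumes curl is installed. Any HTTP response counts as reachable, because
# registries commonly answer unauthenticated requests with an error status.
check_trivy_hosts() {
  status=0
  for host in ghcr.io pkg-containers.githubusercontent.com; do
    if curl --silent --head --max-time 10 --output /dev/null "https://${host}/"; then
      echo "reachable: ${host}"
    else
      echo "unreachable: ${host}"
      status=1
    fi
  done
  return "$status"
}

# Usage: check_trivy_hosts || echo "update the firewall allowlist"
```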

=== "oras >= v0.13.0"
Please follow [oras installation instruction][oras].
## Running Trivy in air-gapped environment
In an air-gapped environment, you must tell Trivy on every scan not to attempt to download the latest database files; otherwise the scan will fail. The following flags are relevant:

Download `db.tar.gz`:
- `--skip-db-update` to skip updating the main vulnerability database.
- `--skip-java-db-update` to skip updating the Java vulnerability database.
- `--offline-scan` to scan Java applications without issuing API requests.
- `--skip-check-update` to skip updating the misconfiguration database.

```
$ oras pull ghcr.io/aquasecurity/trivy-db:2
```

=== "oras < v0.13.0"
Please follow [oras installation instruction][oras].
```shell
trivy image --skip-db-update --skip-java-db-update --offline-scan --skip-check-update myimage
```
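
Because these flags must be repeated on every scan, it may be convenient to wrap them in a small shell function. The sketch below is one possible approach, not something Trivy ships: `trivy_offline` is a hypothetical name, and it assumes the chosen subcommand accepts all four flags, as `trivy image` does in the example above.

```shell
# Hypothetical wrapper: run a Trivy subcommand with the four offline flags
# from this page always applied. The first argument is the subcommand
# (for example "image"); the remaining arguments are passed through unchanged.
trivy_offline() {
  subcommand="$1"
  shift
  trivy "$subcommand" \
    --skip-db-update \
    --skip-java-db-update \
    --offline-scan \
    --skip-check-update \
    "$@"
}

# Usage: trivy_offline image myimage
```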

Download `db.tar.gz`:
## Self-Hosting
You can also host the databases on your own OCI registry, so that Trivy does not have to reach outside your network.

```
$ oras pull -a ghcr.io/aquasecurity/trivy-db:2
```
First, make a copy of the databases in a container registry that is accessible to Trivy. The databases are in:
- `ghcr.io/aquasecurity/trivy-db:2`
- `ghcr.io/aquasecurity/trivy-java-db:1`
- `ghcr.io/aquasecurity/trivy-checks:0`

### Download the Java index database[^1]
Java users also need to download the Java index database for use in air-gapped environments.
Then, tell Trivy to use the private images:

!!! note
Your container image may contain JAR files even though you don't use Java directly.
In that case, you also need to download the Java index database.
```shell
trivy image \
--db-repository myregistry.local/trivy-db \
--java-db-repository myregistry.local/trivy-java-db \
--offline-scan \
--checks-bundle-repository myregistry.local/trivy-checks \
myimage
```
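
The page does not prescribe a tool for making the copy. As a hedged sketch, the three images could be mirrored with ORAS, assuming an ORAS release that provides `oras cp` and that you are already logged in to the target registry. `mirror_trivy_dbs` is a hypothetical helper name, and `myregistry.local` is the placeholder registry from the example above.

```shell
# Hypothetical helper: copy the three database images listed above from GHCR
# into a private registry, keeping the same repository names and tags.
# Assumes the `oras` CLI is installed and provides the `cp` subcommand.
mirror_trivy_dbs() {
  registry="$1"
  for image in trivy-db:2 trivy-java-db:1 trivy-checks:0; do
    # Stop at the first failed copy so a partial mirror is noticed.
    oras cp "ghcr.io/aquasecurity/${image}" "${registry}/${image}" || return 1
  done
}

# Usage: mirror_trivy_dbs myregistry.local
```

The upstream databases are rebuilt regularly, so a mirror like this would need to be refreshed on a schedule to stay useful.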

=== "Trivy"
### Authentication

```
TRIVY_TEMP_DIR=$(mktemp -d)
trivy --cache-dir $TRIVY_TEMP_DIR image --download-java-db-only
tar -cf ./javadb.tar.gz -C $TRIVY_TEMP_DIR/java-db metadata.json trivy-java.db
rm -rf $TRIVY_TEMP_DIR
```
=== "oras >= v0.13.0"
Please follow [oras installation instruction][oras].
For Trivy DB, configure it in the [same way as for private images](../advanced/private-registries/index.md).

Download `javadb.tar.gz`:
For Java DB, you need to run `docker login YOUR_REGISTRY`. Currently, specifying a username and password is not supported.

```
$ oras pull ghcr.io/aquasecurity/trivy-java-db:1
```
## Manual cache population
You can also download the database files manually and place them directly in the Trivy cache directory.

=== "oras < v0.13.0"
Please follow [oras installation instruction][oras].
### Downloading the DB files
On a machine with internet access, pull the database container archive from the registry into your local workspace:

Download `javadb.tar.gz`:
Note that these examples operate in the current working directory.

```
$ oras pull -a ghcr.io/aquasecurity/trivy-java-db:1
```
=== "Using ORAS"
This example uses [ORAS](https://oras.land), but you can use any other container registry manipulation tool.

```shell
oras pull ghcr.io/aquasecurity/trivy-db:2
```

### Transfer the DB files into the air-gapped environment
The way of transfer depends on the environment.
You should now have a file called `db.tar.gz`. Next, extract it to reveal the db files:

=== "Vulnerability db"
```
$ rsync -av -e ssh /path/to/db.tar.gz [user]@[host]:dst
```
```shell
tar -xzf db.tar.gz
```

=== "Java index db[^1]"
```
$ rsync -av -e ssh /path/to/javadb.tar.gz [user]@[host]:dst
```
You should now have 2 new files, `metadata.json` and `trivy.db`. These are the Trivy DB files.

### Put the DB files in Trivy's cache directory
You have to know where to put the DB files. The following command shows the default cache directory.
=== "Using Trivy"
This example uses Trivy to pull the database container archive. The `--cache-dir` flag makes Trivy download the database files into our current working directory. The `--download-db-only` flag tells Trivy to only download the database files, not to scan any images.

```shell
trivy --cache-dir . image --download-db-only
```
$ ssh user@host
$ trivy -h | grep cache
--cache-dir value cache directory (default: "/home/myuser/.cache/trivy") [$TRIVY_CACHE_DIR]

You should now have 2 new files, `metadata.json` and `trivy.db`. These are the Trivy DB files.

### Populating the Trivy Cache
Once you have obtained the Trivy DB files (`metadata.json` and `trivy.db`), copy them over to the air-gapped environment.

In order to populate the cache, you need to identify the location of the cache directory. If it is under the default location, you can run the following command to find it:

```shell
trivy -h | grep cache
```
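
In a script, the same lookup can be approximated without parsing the help text. This is a minimal sketch under two assumptions: that the `TRIVY_CACHE_DIR` environment variable, when set, holds the cache location, and that the fallback is `$HOME/.cache/trivy` as in the example paths on this page. The real default may differ by platform, so treat the fallback as a guess. `resolve_trivy_cache_dir` is a hypothetical name.

```shell
# Hypothetical helper: print the cache directory a script should populate.
# Prefers TRIVY_CACHE_DIR when it is set and non-empty; otherwise falls back
# to $HOME/.cache/trivy, which is an assumption and not a guaranteed default.
resolve_trivy_cache_dir() {
  echo "${TRIVY_CACHE_DIR:-${HOME}/.cache/trivy}"
}

# Usage: cache_dir=$(resolve_trivy_cache_dir)
```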
=== "Vulnerability db"
Put the DB file in the cache directory + `/db`.

```
$ mkdir -p /home/myuser/.cache/trivy/db
$ cd /home/myuser/.cache/trivy/db
$ tar xvf /path/to/db.tar.gz -C /home/myuser/.cache/trivy/db
x trivy.db
x metadata.json
$ rm /path/to/db.tar.gz
```

=== "Java index db[^1]"
Put the DB file in the cache directory + `/java-db`.

```
$ mkdir -p /home/myuser/.cache/trivy/java-db
$ cd /home/myuser/.cache/trivy/java-db
$ tar xvf /path/to/javadb.tar.gz -C /home/myuser/.cache/trivy/java-db
x trivy-java.db
x metadata.json
$ rm /path/to/javadb.tar.gz
```



In an air-gapped environment it is your responsibility to update the Trivy databases on a regular basis, so that the scanner can detect recently-identified vulnerabilities.

### Run Trivy with the specific flags.
In an air-gapped environment, you have to specify `--skip-db-update` and `--skip-java-db-update`[^1] so that Trivy doesn't attempt to download the latest database files.
In addition, if you want to scan `pom.xml` dependencies, you need to specify `--offline-scan` since Trivy tries to issue API requests for scanning Java applications by default.

For the example, we will assume the `TRIVY_CACHE_DIR` variable holds the cache location:

```shell
TRIVY_CACHE_DIR=/home/user/.cache/trivy
```
$ trivy image --skip-db-update --skip-java-db-update --offline-scan alpine:3.12

Put the Trivy DB files in the Trivy cache directory under a `db` subdirectory:

```shell
# ensure cache db directory exists
mkdir -p ${TRIVY_CACHE_DIR}/db
# copy the db files
cp /path/to/trivy.db /path/to/metadata.json ${TRIVY_CACHE_DIR}/db/
```
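
The copy step can also be wrapped so that it fails early when one of the two files is missing, which helps catch an incomplete transfer. The function below is a sketch only; `populate_trivy_cache` is a hypothetical name, and the source directory is whichever location you copied the files to.

```shell
# Hypothetical helper: copy trivy.db and metadata.json from a source directory
# into <cache-dir>/db, refusing to touch the cache if either file is missing.
populate_trivy_cache() {
  src_dir="$1"
  cache_dir="$2"
  for f in trivy.db metadata.json; do
    if [ ! -f "${src_dir}/${f}" ]; then
      echo "missing ${src_dir}/${f}" >&2
      return 1
    fi
  done
  # ensure the db subdirectory exists, then copy both files
  mkdir -p "${cache_dir}/db"
  cp "${src_dir}/trivy.db" "${src_dir}/metadata.json" "${cache_dir}/db/"
}

# Usage: populate_trivy_cache /path/to/files "${TRIVY_CACHE_DIR}"
```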

## Air-Gapped Environment for misconfigurations
### Java DB

No special measures are required to detect misconfigurations in an air-gapped environment.
For Java DB the process is the same, except for the following:
1. Image location is `ghcr.io/aquasecurity/trivy-java-db:1`
2. Archive file name is `javadb.tar.gz`
3. DB file name is `trivy-java.db`
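
To keep the two procedures in one script, the differences can be captured in a small lookup. This is an illustrative sketch: `trivy_db_layout` is a hypothetical name, and the `java-db` cache subdirectory is taken from the previous version of this page rather than from the list above.

```shell
# Hypothetical helper: print "image archive db-file cache-subdir" for a
# database kind, so one script can handle both the main DB and the Java DB.
trivy_db_layout() {
  case "$1" in
    vuln) echo "ghcr.io/aquasecurity/trivy-db:2 db.tar.gz trivy.db db" ;;
    java) echo "ghcr.io/aquasecurity/trivy-java-db:1 javadb.tar.gz trivy-java.db java-db" ;;
    *)
      echo "unknown database kind: $1" >&2
      return 1
      ;;
  esac
}

# Usage: read -r image archive db_file subdir <<EOF
# $(trivy_db_layout java)
# EOF
```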

### Run Trivy with `--skip-check-update` option
In an air-gapped environment, specify `--skip-check-update` so that Trivy doesn't attempt to download the latest misconfiguration checks.
## Misconfigurations scanning

```
$ trivy conf --skip-policy-update /path/to/conf
```
Note that the misconfigurations database is also embedded in the Trivy binary (at build time), and will be used as a fallback if the external database is not available. This means that you can still scan for misconfigurations in an air-gapped environment using the Checks from the time of the Trivy release you are using.

[allowlist]: ../references/troubleshooting.md
[allowlist]: ../references/troubleshooting.md#error-downloading-vulnerability-db
[oras]: https://oras.land/docs/installation

[^1]: This is only required to scan `jar` files. More information about `Java index db` [here](../coverage/language/java.md)
6 changes: 2 additions & 4 deletions docs/docs/references/troubleshooting.md
@@ -203,10 +203,7 @@ Trivy v0.23.0 or later requires Trivy DB v2. Please update your local database o
!!! error
FATAL failed to download vulnerability DB

If Trivy is running behind a corporate firewall, you have to add the following URLs to your allowlist.

- ghcr.io
- pkg-containers.githubusercontent.com
If Trivy is running behind a corporate firewall, refer to the connectivity requirements described [here][network].

### Denied

@@ -271,4 +268,5 @@ $ trivy clean --all
```

[air-gapped]: ../advanced/air-gap.md
[network]: ../advanced/air-gap.md#network-requirements
[redis-cache]: ../../vulnerability/examples/cache/#cache-backend
25 changes: 13 additions & 12 deletions docs/docs/scanner/misconfiguration/check/builtin.md
Original file line number Diff line number Diff line change
@@ -1,21 +1,22 @@
# Built-in Checks

## Check Sources
Built-in checks are mainly written in [Rego][rego] and Go.
Those checks are managed under [trivy-checks repository][trivy-checks].
## Checks Sources
Trivy has an extensive library of misconfiguration checks that is maintained at <https://github.com/aquasecurity/trivy-checks>.
Trivy checks are mainly written in [Rego][rego], while some checks are written in Go.
See [here](../../../coverage/iac/index.md) for the list of supported config types.

For suggestions or issues regarding policy content, please open an issue under the [trivy-checks][trivy-checks] repository.
## Checks Bundle
When performing a misconfiguration scan, Trivy automatically downloads the relevant Checks bundle. The bundle is cached locally, and Trivy reuses it for subsequent scans on the same machine. Trivy takes care of updating the cache automatically, so normally you can be oblivious to it.

## Check Distribution
Trivy checks are distributed as an OPA bundle on [GitHub Container Registry][ghcr] (GHCR).
When misconfiguration detection is enabled, Trivy pulls the OPA bundle from GHCR as an OCI artifact and stores it in the cache.
Those checks are then loaded into Trivy OPA engine and used for detecting misconfigurations.
If Trivy is unable to pull down newer checks, it will use the embedded set of checks as a fallback. This is also the case in air-gap environments where `--skip-policy-update` might be passed.
For CLI flags related to the database, please refer to [this page](../configuration/db.md).

## Update Interval
## Checks Distribution
Trivy checks are distributed as an [OPA bundle][opa-bundle] hosted on GitHub Container Registry (GHCR): <https://ghcr.io/aquasecurity/trivy-checks>.
Trivy checks GHCR for updates to the OPA bundle every 24 hours and pulls the bundle if it has changed.

### External connectivity
Trivy needs to connect to the internet to download the bundle. If you are running Trivy in an air-gapped environment, or a tightly controlled network, please refer to the [air-gapped documentation](../advanced/air-gap.md).
The Checks bundle is also embedded in the Trivy binary (at build time), and will be used as a fallback if Trivy is unable to download the bundle. This means that you can still scan for misconfigurations in an air-gapped environment using the Checks from the time of the Trivy release you are using.

[rego]: https://www.openpolicyagent.org/docs/latest/policy-language/
[trivy-checks]: https://github.com/aquasecurity/trivy-checks
[ghcr]: https://github.com/aquasecurity/trivy-checks/pkgs/container/trivy-checks
[opa-bundle]: https://www.openpolicyagent.org/docs/latest/management-bundles/
43 changes: 10 additions & 33 deletions docs/docs/scanner/vulnerability.md
@@ -158,45 +158,22 @@ Trivy can detect vulnerabilities in Kubernetes clusters and components by scanni

[^1]: Some manual triage and correction has been made.

## Database
Trivy downloads [the vulnerability database](https://github.com/aquasecurity/trivy-db) every 6 hours.
Trivy uses two types of databases for vulnerability detection:

- Vulnerability Database
- Java Index Database

This page provides detailed information about these databases.

### Vulnerability Database
Trivy utilizes a database containing vulnerability information.
This database is built every six hours on [GitHub](https://github.com/aquasecurity/trivy-db) and is distributed via [GitHub Container registry (GHCR)](https://ghcr.io/aquasecurity/trivy-db).
The database is cached and updated as needed.
As Trivy updates the database automatically during execution, users don't need to be concerned about it.
## Databases
Trivy utilizes several databases containing information relevant for vulnerability scanning.
When performing a vulnerability scan, Trivy automatically downloads the relevant databases. The databases are cached locally, and Trivy reuses them for subsequent scans on the same machine. Trivy takes care of updating the database cache automatically, so normally you can be oblivious to it.

For CLI flags related to the database, please refer to [this page](../configuration/db.md).

#### Private Hosting
If you host the database on your own OCI registry, you can specify a different repository with the `--db-repository` flag.
The default is `ghcr.io/aquasecurity/trivy-db`.

```shell
$ trivy image --db-repository YOUR_REPO YOUR_IMAGE
```

If authentication is required, it can be configured in the same way as for private images.
Please refer to [the documentation](../advanced/private-registries/index.md) for more details.
### Vulnerability Database
This is Trivy's main database, which contains vulnerability information collected from the data sources mentioned above.
It is built every six hours on [GitHub](https://github.com/aquasecurity/trivy-db).

### Java Index Database
This database is only downloaded when scanning JAR files so that Trivy can identify the groupId, artifactId, and version of JAR files.
It is built once a day on [GitHub](https://github.com/aquasecurity/trivy-java-db) and distributed via [GitHub Container registry (GHCR)](https://ghcr.io/aquasecurity/trivy-java-db).
Like the vulnerability database, it is automatically downloaded and updated when needed, so users don't need to worry about it.

#### Private Hosting
If you host the database on your own OCI registry, you can specify a different repository with the `--java-db-repository` flag.
The default is `ghcr.io/aquasecurity/trivy-java-db`.
When scanning JAR files, Trivy relies on a dedicated database for identifying the groupId, artifactId, and version of the scanned JAR files. This database is only used when scanning JAR files; however, your scanned artifacts might contain JAR files that you're not aware of.
This database is built once a day on [GitHub](https://github.com/aquasecurity/trivy-java-db).

If authentication is required, you need to run `docker login YOUR_REGISTRY`.
Currently, specifying a username and password is not supported.
### External connectivity
Trivy needs to connect to the internet to download the databases. If you are running Trivy in an air-gapped environment, or a tightly controlled network, please refer to the [air-gapped documentation](../advanced/air-gap.md).

## Configuration
This section describes vulnerability-specific configuration.