Add assesment docs #323

Draft
wants to merge 7 commits into
base: main
5 changes: 5 additions & 0 deletions content/en/docs/Architecture/trustedboot.md
Original file line number Diff line number Diff line change
Expand Up @@ -78,12 +78,17 @@ The keys are used to sign the UKI file, and to generate a PCR policy keypair req

Check the relevant documentation on how to [extend the system with system extensions]({{%relref "/docs/advanced/sys-extensions" %}})

### Trusted Boot Boot Assessment

See the [Trusted Boot Boot Assessment]({{%relref "/docs/examples/boot_assessment_trusted_boot" %}}) documentation for more information on how to enable automatic boot assessment with Trusted Boot and how to make your own services participate in the boot assessment process.

### Considerations

#### Booting command lines

The UKI file's signature also covers the kernel command line, so any change to the kernel command line requires generating a new UKI file and rebuilding the installer image. This means that you cannot change the boot options once the system is installed (doing so would leave the system unable to access the encrypted data).


### References

- [UEFI specification](https://uefi.org/sites/default/files/resources/UEFI_Spec_2_8_final.pdf)
107 changes: 107 additions & 0 deletions content/en/docs/Examples/boot_assessment_trusted_boot.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,107 @@



# Enabling Automatic Boot Assessment with Trusted Boot

In this tutorial, we will walk through how to configure Kairos to enable **automatic boot assessment**, where the boot loader can determine the health of a boot entry. Specifically, we'll configure systemd services to trigger the `boot-complete.target` to mark boot entries as *good* or *bad*. We'll also cover how to implement automatic reboots when a service fails, allowing retries of a boot entry until success or exhaustion of attempts.
Member

@mudler mudler Dec 5, 2024

After reading it in full, something is still not clear to me: here we say that the example shows how to enable automatic boot assessment, but later we talk about making an arbitrary service failure affect the boot assessment. I think we should be explicit here and answer the following questions:

  • What are the system defaults? Is boot assessment enabled?
  • Does Kairos have a default boot assessment strategy?
  • If Kairos does have a default, how does it behave and what are its limitations?
  • If it doesn't, why don't we have a default for it?
  • Finally, how can we extend the boot assessment so that our own custom services affect the system state check?

Member Author

We currently don't have any assessment strategy... yet.

The idea was to write the docs first and then start enabling it ourselves, plus adding e2e tests for it.

The current boot assessment is the default one, i.e. if `multi-user.target` is reached, the system is considered good and is marked as such.

But I get what you mean. I will rework this to first introduce the current state of boot assessment in Kairos, and then keep this part as a guide on how it can be extended to support more things.

Member Author

Expanded!


---


## Overview of Trusted Boot Automatic Boot Assessment

`systemd-boot` manages boot entries and provides a mechanism to automatically assess their success or failure. The key features are:

1. **Boot Entry Marking**:
- A boot entry is marked as *good* when `boot-complete.target` is reached during startup.
- If the system fails to reach `boot-complete.target`, the boot entry is not marked as *good*; once its retries are exhausted, it is treated as *bad*.

2. **Retries**:
- By default, each boot entry has **3 retries**. Each boot that fails to reach `boot-complete.target` reduces the retry count. Once the retries are exhausted, another boot entry is chosen, if available.

3. **Service-Level Controls**:
- Services can be configured to participate in boot entry assessment and to trigger a reboot on failure.
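
As background for the retry count above: `systemd-boot` keeps the counter in the boot entry's file name, using a `+LEFT` or `+LEFT-DONE` suffix before the extension (tries left, and attempts already made). The following is a minimal sketch of how such a name can be read. The function name and the `entry+3.efi` style file names are illustrative assumptions, not names Kairos is known to use.

```bash
# Parse the boot counter out of a boot entry file name that follows the
# systemd-boot convention NAME+LEFT[-DONE].EXT (for example "entry+2-1.efi").
# Prints "left=<n> done=<n>", or "no-counter" when the name carries no counter.
# NOTE: parse_boot_counter is a hypothetical helper written for this example.
parse_boot_counter() {
  name=${1%.*}                       # strip the extension
  case "$name" in
    *+*) counter=${name##*+} ;;      # text after the last "+"
    *) echo "no-counter"; return 0 ;;
  esac
  left=${counter%%-*}                # tries left
  case "$counter" in
    *-*) done_count=${counter#*-} ;; # attempts already made
    *) done_count=0 ;;
  esac
  case "$left$done_count" in
    ''|*[!0-9]*) echo "no-counter"; return 0 ;;  # not a numeric counter
  esac
  echo "left=$left done=$done_count"
}

parse_boot_counter "entry+3.efi"     # fresh entry, no attempt made yet
parse_boot_counter "entry+2-1.efi"   # one attempt made, two tries left
```

An entry whose tries left reach `0` is the one treated as *bad*, while an entry that was marked *good* has the counter removed from its name.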

---

## Step 1: Configuring a Service to Trigger `boot-complete.target`

To ensure a service's success or failure impacts the boot assessment, modify its service file to interact with `boot-complete.target`:

1. **Edit the Service File**:
Override the service configuration using:
```bash
sudo systemctl edit <service-name>
```

2. **Add Dependencies and Order to the Service File**:
Append the following to the override file:
```ini
[Unit]
# Ensure this unit starts after default system targets
After=default.target graphical.target multi-user.target
# Ensure this unit completes before boot-complete.target
Before=boot-complete.target

[Install]
# Make this service a hard dependency of boot-complete.target
RequiredBy=boot-complete.target
```

3. **Reload Systemd and Enable the Service**:
```bash
sudo systemctl daemon-reload
sudo systemctl enable <service-name>
```

4. **Explanation**:
- The service runs after critical system targets (e.g., `default.target`) to ensure the system is operational.
- The service must complete successfully to allow `boot-complete.target` to be reached.
- If the service fails, the boot entry is not marked as good.
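
As an alternative to overriding an existing service, the same dependencies can live in a small dedicated check unit. The sketch below prints such a unit; the function name, the unit name `my-health-check.service`, and the check script path are hypothetical and only serve the example.

```bash
# Print a oneshot unit that gates boot-complete.target on a health check.
# $1: absolute path of the check command (hypothetical in this example).
# NOTE: render_health_check_unit is an illustrative helper, not a Kairos tool.
render_health_check_unit() {
  check_cmd=$1
  cat <<EOF
[Unit]
Description=Boot assessment health check (example)
# Run once the regular boot targets have been reached
After=default.target graphical.target multi-user.target
# Finish before the boot is declared complete
Before=boot-complete.target

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=$check_cmd

[Install]
# boot-complete.target is only reached if this unit succeeds
RequiredBy=boot-complete.target
EOF
}

# Example (as root), assuming the check script exists:
#   render_health_check_unit /usr/local/bin/my-health-check \
#     > /etc/systemd/system/my-health-check.service
#   systemctl daemon-reload && systemctl enable my-health-check.service
```

One reason to consider a separate unit: a service that is already `WantedBy=multi-user.target` can end up in an ordering cycle if it is also ordered `After=multi-user.target`, because the target is itself ordered after the units it wants.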


## Step 2: Adding Automatic Reboot to a Service

To configure a service to automatically reboot the system upon failure:

1. **Edit the Service File**:
Override the service configuration using:
```bash
sudo systemctl edit <service-name>
```
2. **Add the Reboot Action**:
In the `[Unit]` section, add:
```ini
[Unit]
FailureAction=reboot
```
3. **Reload Systemd**:
```bash
sudo systemctl daemon-reload
```
4. **Explanation**:
- On failure, `FailureAction=reboot` instructs systemd to reboot the system.
- This causes the boot entry to retry until success or retries are exhausted.
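
When the override has to be created from a script rather than interactively, a drop-in file can be written directly. This is a minimal sketch: the function name and the drop-in file name `10-reboot-on-failure.conf` are assumptions made for the example, and `/etc/systemd/system` is used as the default location for unit overrides.

```bash
# Write a drop-in that adds FailureAction=reboot to a unit.
# $1: unit name (e.g. my-health-check.service, a hypothetical name)
# $2: systemd configuration root (defaults to /etc/systemd/system)
# NOTE: write_reboot_dropin is an illustrative helper, not a Kairos tool.
write_reboot_dropin() {
  unit=$1
  root=${2:-/etc/systemd/system}
  dropin_dir="$root/$unit.d"
  mkdir -p "$dropin_dir" || return 1
  cat > "$dropin_dir/10-reboot-on-failure.conf" <<'EOF'
[Unit]
# Reboot the machine when this unit enters the failed state
FailureAction=reboot
EOF
}

# Example (as root):
#   write_reboot_dropin my-health-check.service
#   systemctl daemon-reload
```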


## Step 3: Combining Both Approaches

While the above configurations are independent, combining them can create a robust system:

1. **Trigger `boot-complete.target`**:
Configure services as described in Step 1 to impact boot assessment.

2. **Enable Automatic Reboot**:
Add `FailureAction=reboot` to relevant services as described in Step 2.

3. **Behavior**:
- On a service failure, the system reboots (`FailureAction=reboot`).
- During the retry, if `boot-complete.target` is not reached, the boot entry is not marked as good, and retries continue.
- If retries are exhausted, the system attempts the next available boot entry.
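
The combined behavior can be pictured with a small simulation. This is only a toy model of the counting logic described above, not code that talks to the boot loader; the function name and the `ok`/`fail` outcome labels are invented for the example.

```bash
# Toy model of boot counting for a single boot entry.
# $1: initial number of tries; remaining arguments: outcome of each boot,
# "ok" when boot-complete.target is reached, anything else for a failed boot.
# NOTE: simulate_boot_attempts is a hypothetical helper for illustration only.
simulate_boot_attempts() {
  tries_left=$1; shift
  attempt=0
  for outcome in "$@"; do
    [ "$tries_left" -gt 0 ] || break    # counter exhausted, entry is skipped
    tries_left=$((tries_left - 1))      # one try is consumed by each boot
    attempt=$((attempt + 1))
    if [ "$outcome" = ok ]; then
      echo "good after $attempt attempt(s)"
      return 0
    fi
  done
  if [ "$tries_left" -le 0 ]; then
    echo "bad after $attempt attempt(s)"
  else
    echo "indeterminate with $tries_left tries left"
  fi
}

simulate_boot_attempts 3 fail fail ok     # service recovers on the third boot
simulate_boot_attempts 3 fail fail fail   # retries exhausted, next entry is tried
```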

## Notes

- Services are started on both the active and the passive boot entries. If a service fails on active for a reason unrelated to the OS image, it will fail on passive as well. The system can then reboot on passive just as it did on active, and end up booting into recovery.
- We recommend using this feature with caution, as a misconfigured service can put the system into a boot loop.
- Because upgrades are applied to the active image, we recommend having two services, one for the active and one for the passive boot entry, so that the passive boot entry does not reboot automatically and remains a safe fallback for the active one. This can be achieved with the `ConditionPathExists` directive in the service file, which checks whether the service is running on the active or the passive boot entry (marked by the files `/run/cos/active_mode` and `/run/cos/passive_mode`), so that the service that reboots automatically is started only on the active boot entry.
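
The active/passive split from the last note could look roughly like the sketch below, which prints a drop-in for either mode. The function name is hypothetical; the marker file paths are the ones named above. A unit whose condition is not met is skipped rather than marked as failed, so the guard itself does not trigger a reboot.

```bash
# Print a [Unit] drop-in that limits a service to one boot mode.
# $1: "active" or "passive"
# Only the active variant reboots automatically on failure, so the passive
# entry stays available as a fallback.
# NOTE: render_mode_dropin is an illustrative helper, not a Kairos tool.
render_mode_dropin() {
  mode=$1
  case "$mode" in
    active|passive) ;;
    *) echo "usage: render_mode_dropin active|passive" >&2; return 1 ;;
  esac
  echo "[Unit]"
  echo "# Start this unit only when the matching marker file exists"
  echo "ConditionPathExists=/run/cos/${mode}_mode"
  if [ "$mode" = active ]; then
    echo "# Only the active entry reboots automatically on failure"
    echo "FailureAction=reboot"
  fi
}

render_mode_dropin active
render_mode_dropin passive
```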