Add boot assesment for install and bootentry #604

Merged: 7 commits into main from bootassesment, Nov 27, 2024

Itxaka (Member) commented Nov 20, 2024

Part of kairos-io/kairos#2864

The way it works:

  • On install, it goes over all the config entries and, for any entry without a boot assessment, adds one by renaming the file to append +3 to its name
  • On upgrade, it does the same
  • On reset, it does the same
  • On bootentry, it strips the assessment, if any, from the display name for presentation purposes. Then, before selecting the entry, it reads the assessment number from the file and writes it back if needed. This avoids losing retries that were already used (so if passive was tried twice, those tries are kept instead of being restarted)
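The install/upgrade/reset step above can be sketched roughly as follows. This is only an illustration of the renaming rule, not the code from this PR: `withAssessment`, `assessmentRe` and `defaultTries` are hypothetical names, and the sketch assumes the usual boot counting filename form `<name>+<tries left>[-<tries done>].conf`.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// assessmentRe matches a boot counting suffix such as "+3" or "+2-1"
// sitting directly before the ".conf" extension.
var assessmentRe = regexp.MustCompile(`\+\d+(-\d+)?\.conf$`)

// defaultTries mirrors the "+3" mentioned in the PR description.
const defaultTries = 3

// withAssessment (hypothetical helper) returns the file name with "+<tries>"
// appended to its stem. Names that already carry a counter, or that are not
// .conf files, are returned unchanged.
func withAssessment(name string, tries int) string {
	if !strings.HasSuffix(name, ".conf") || assessmentRe.MatchString(name) {
		return name
	}
	return fmt.Sprintf("%s+%d.conf", strings.TrimSuffix(name, ".conf"), tries)
}

func main() {
	for _, n := range []string{"active.conf", "passive+3.conf", "recovery+2-1.conf"} {
		fmt.Printf("%s -> %s\n", n, withAssessment(n, defaultTries))
	}
}
```

Because an existing counter is left alone, running the same step again on upgrade or reset does not restart tries that were already consumed.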

@Itxaka Itxaka requested a review from a team November 20, 2024 15:27
Itxaka (Member, Author) commented Nov 20, 2024

I need to check upgrade, as it's not clear what it does.

codecov bot commented Nov 20, 2024

Codecov Report

Attention: Patch coverage is 74.69880% with 21 lines in your changes missing coverage. Please review.

Project coverage is 48.47%. Comparing base (a6dd348) to head (8df5ffe).
Report is 2 commits behind head on main.

Files with missing lines     Patch %   Lines
pkg/action/bootentries.go    70.58%    4 Missing and 1 partial ⚠️
pkg/uki/install.go           0.00%     4 Missing ⚠️
pkg/uki/reset.go             0.00%     4 Missing ⚠️
pkg/uki/upgrade.go           0.00%     4 Missing ⚠️
pkg/utils/common.go          92.59%    3 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #604      +/-   ##
==========================================
+ Coverage   48.14%   48.47%   +0.32%     
==========================================
  Files          48       48              
  Lines        6023     6100      +77     
==========================================
+ Hits         2900     2957      +57     
- Misses       2844     2862      +18     
- Partials      279      281       +2     


Itxaka (Member, Author) commented Nov 25, 2024

Seems to work.

During install:

  • inits the entries with a +3

During boot:

  • bless-boot triggers and changes the conf name once the system is fully up
  • other entries are untouched
  • bootentry selects the proper conf (with or without assessment) and shows the nicer entries
  • bootentry --select selects the proper conf (with or without assessment)
  • can't select ACTIVE or PASSIVE, it only allows COS or FALLBACK (check if the current selector does the same)
root@kairos-o654:~# ls /efi/loader/entries/
active.conf  passive+3.conf  recovery+3.conf  statereset+3.conf
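Given a listing like the one above, selecting "the proper conf (with or without assessment)" could look something like this. It is a sketch under assumptions: `displayName`, `resolve` and `entryRe` are hypothetical names that do not come from the PR, and the counter is assumed to have the form `+<left>[-<done>]` before `.conf`.

```go
package main

import (
	"fmt"
	"regexp"
)

// entryRe splits "<base>[+<left>[-<done>]].conf" into the base name
// (group 1) and the optional boot counting suffix (group 2).
var entryRe = regexp.MustCompile(`^(.+?)(\+\d+(?:-\d+)?)?\.conf$`)

// displayName (hypothetical) strips the counter and extension so the entry
// can be shown to the user without the "+3" noise.
func displayName(file string) string {
	m := entryRe.FindStringSubmatch(file)
	if m == nil {
		return file // not a .conf entry, show it untouched
	}
	return m[1]
}

// resolve (hypothetical) finds the file on disk for a base name, whether or
// not it currently carries a counter. Returning the existing file name means
// the remaining tries are preserved rather than reset.
func resolve(files []string, base string) (string, bool) {
	for _, f := range files {
		if m := entryRe.FindStringSubmatch(f); m != nil && m[1] == base {
			return f, true
		}
	}
	return "", false
}

func main() {
	files := []string{"active.conf", "passive+3.conf", "recovery+3.conf", "statereset+3.conf"}
	for _, f := range files {
		fmt.Printf("%-20s shown as %s\n", f, displayName(f))
	}
	if f, ok := resolve(files, "passive"); ok {
		fmt.Println("selecting", f)
	}
}
```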

On upgrade and reset:

  • Updates any entries to have boot assessment if they don't have it already (active on upgrade, both on reset as they are recreated)
2024-11-25T17:10:36Z DBG Conf file /efi/loader/entries/active.conf has values map[string]string{
  "efi": "/EFI/kairos/recovery.efi",
  "title": "Kairos recovery",
}
2024-11-25T17:10:36Z DBG Conf file /efi/loader/entries/active.conf new values map[string]string{
  "efi": "/EFI/kairos/active.efi",
  "title": "Kairos recovery",
}
2024-11-25T17:10:36Z DBG Enabling boot assessment from=/efi/loader/entries/active.conf to=/efi/loader/entries/active+3.conf
2024-11-25T17:10:36Z DBG Boot assessment already present in file file=/efi/loader/entries/passive+3.conf
2024-11-25T17:10:36Z DBG Enabling boot assessment from=/efi/loader/entries/recovery.conf to=/efi/loader/entries/recovery+3.conf
2024-11-25T17:10:36Z DBG Boot assessment already present in file file=/efi/loader/entries/statereset+3.conf

jimmykarily (Contributor) commented

This output is strange:

2024-11-25T17:10:36Z DBG Conf file /efi/loader/entries/active.conf has values map[string]string{
  "efi": "/EFI/kairos/recovery.efi",
  "title": "Kairos recovery",
}
2024-11-25T17:10:36Z DBG Conf file /efi/loader/entries/active.conf new values map[string]string{
  "efi": "/EFI/kairos/active.efi",
  "title": "Kairos recovery",
}

How did active.conf end up with the title "Kairos recovery" (did you run a reset before)? When we reset from recovery, shouldn't we replace active.efi with recovery.efi? If we just point active.conf to recovery.efi, then anyone who upgrades the recovery image will accidentally replace what they boot when they boot active.

I'm not sure how the above state was reached so I may be confusing things.

Itxaka (Member, Author) commented Nov 26, 2024

Yes, this was a reset from recovery, which I guess makes sense somehow? Note that this was after 32 runs, so it may not be the usual state, as I was testing different things all the time.

In any case, this PR does not change that code, nor does it modify the contents of the entries themselves. It only renames the files to append +3, and it does so after that output, because the boot assessment is triggered afterwards. So if there is an issue there, it happens earlier and is not caused by this PR... I think.

}

// Read the directory.
entries, err := fs.ReadDir(dir)
Contributor

This assumes the glob character is in the file part of the path. It will not work for patterns like "/mydir/*/myfile". I guess we don't care, but maybe pattern should be called filePattern instead, with another argument for the directory (which would not support globbing).

Member Author

Hmm, good point, it won't.
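The split suggested in this thread (a plain directory plus a file-only pattern) might look roughly like the sketch below. `matchFiles` and `filePattern` are illustrative names, not the signature that was merged. To keep the example runnable without touching the filesystem, it takes the names that something like `fs.ReadDir(dir)` would return and matches each one with `filepath.Match`.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// matchFiles (hypothetical) returns the names from a single directory
// listing that match filePattern. The directory itself is never globbed,
// so a pattern such as "/mydir/*/myfile" is out of scope by construction.
func matchFiles(names []string, filePattern string) ([]string, error) {
	var out []string
	for _, n := range names {
		ok, err := filepath.Match(filePattern, n)
		if err != nil {
			return nil, err // malformed pattern
		}
		if ok {
			out = append(out, n)
		}
	}
	return out, nil
}

func main() {
	names := []string{"active.conf", "passive+3.conf", "notes.txt"}
	matches, err := matchFiles(names, "*.conf")
	if err != nil {
		fmt.Println("bad pattern:", err)
		return
	}
	fmt.Println(matches)
}
```

Naming the argument after what it matches makes the limitation visible at the call site instead of leaving it as an implicit assumption.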

})
It("fails to write the boot assessment in non existing dir", func() {
err := utils.AddBootAssessment(fs, "/fake", logger)
Expect(err).To(HaveOccurred())
Contributor

Better to use MatchError to match the specific error, in case something else fails in the future instead of the expected error.
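The idea behind this suggestion can be shown with only the standard library. The thread does not say which error `utils.AddBootAssessment` returns for a missing directory, so `fs.ErrNotExist` is an assumption here, and `listEntries` and `failsWithNotExist` are hypothetical stand-ins. As far as I know, Gomega's `MatchError` performs a comparable `errors.Is` check when it is given an error value.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
)

// listEntries is a hypothetical stand-in for the function under test: it
// reads a directory and wraps any failure with extra context.
func listEntries(dir string) error {
	if _, err := os.ReadDir(dir); err != nil {
		return fmt.Errorf("reading entries dir %s: %w", dir, err)
	}
	return nil
}

// failsWithNotExist reports whether err is specifically a "does not exist"
// failure, rather than just any non-nil error.
func failsWithNotExist(err error) bool {
	return errors.Is(err, fs.ErrNotExist)
}

func main() {
	err := listEntries("/fake")
	fmt.Println("error:", err)
	fmt.Println("is not-exist:", failsWithNotExist(err))
}
```

A test that only checks for a non-nil error keeps passing if the call later starts failing for an unrelated reason, while the specific check would catch that.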

jimmykarily (Contributor) commented

I wonder if it would be possible to have an e2e test for this. Unit testing doesn't really test that we indeed boot into fallback automatically. Setting this up will definitely be tricky...

jimmykarily (Contributor) commented

I tried this branch and it seems to do the right thing with regard to naming the conf files. What is still missing:

  • automatic rebooting (or maybe it already works in some cases like kernel panics?)
  • sorting the boot entries in a deterministic way so that if "active" fails, we boot into "passive". Currently the next one picked was "state reset" when I ran:
mount -o remount,rw /efi
/usr/lib/systemd/systemd-bless-boot bad
reboot

(From the FAQ here)

We probably need to play with sort-key like we discussed in Slack (https://uapi-group.org/specifications/specs/boot_loader_specification/#version-order)
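To make the ordering concern concrete, here is a deliberately simplified sketch of the effect a sort-key could have. It is not the full ordering from the Boot Loader Specification: it only assumes that entries whose tries are exhausted go last and that the remaining ones are ordered by ascending sort-key. The `entry` type, `bootOrder`, and the sort-key values are all made up for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// entry is a hypothetical, simplified view of one loader entry.
type entry struct {
	File      string // file name under /efi/loader/entries
	SortKey   string // value of the "sort-key" field (illustrative values only)
	TriesLeft int    // counter parsed from the file name; -1 means no counter
}

// bootOrder returns the file names in the order they would be tried under
// the simplified rule: exhausted entries last, otherwise ascending sort-key.
func bootOrder(entries []entry) []string {
	sorted := append([]entry(nil), entries...) // leave the caller's slice alone
	sort.SliceStable(sorted, func(i, j int) bool {
		badI, badJ := sorted[i].TriesLeft == 0, sorted[j].TriesLeft == 0
		if badI != badJ {
			return badJ // the entry that still has tries goes first
		}
		return sorted[i].SortKey < sorted[j].SortKey
	})
	names := make([]string, 0, len(sorted))
	for _, e := range sorted {
		names = append(names, e.File)
	}
	return names
}

func main() {
	entries := []entry{
		{File: "statereset+3.conf", SortKey: "0004-statereset", TriesLeft: 3},
		{File: "active+0-3.conf", SortKey: "0001-active", TriesLeft: 0},
		{File: "recovery+3.conf", SortKey: "0003-recovery", TriesLeft: 3},
		{File: "passive+3.conf", SortKey: "0002-passive", TriesLeft: 3},
	}
	fmt.Println(bootOrder(entries))
}
```

With keys chosen like this, "passive" would be the first candidate once "active" has run out of tries, which is the behaviour asked for above.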

Merge and create tickets for the additional stuff?

@Itxaka Itxaka merged commit 7be897c into main Nov 27, 2024
14 checks passed
@Itxaka Itxaka deleted the bootassesment branch November 27, 2024 10:16