Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cloud image in 2.4.1 with EFI missing regexp grub module #1947

Closed
jvantillo opened this issue Oct 24, 2023 · 4 comments · Fixed by kairos-io/kairos-agent#171
Closed

Cloud image in 2.4.1 with EFI missing regexp grub module #1947

jvantillo opened this issue Oct 24, 2023 · 4 comments · Fixed by kairos-io/kairos-agent#171
Labels
bug Something isn't working

Comments

@jvantillo
Copy link

Kairos version:

PRETTY_NAME="Ubuntu 22.04.3 LTS"
NAME="Ubuntu"
VERSION_ID="22.04"
VERSION="22.04.3 LTS (Jammy Jellyfish)"
VERSION_CODENAME=jammy
ID=ubuntu
ID_LIKE=debian
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
UBUNTU_CODENAME=jammy
KAIROS_NAME="kairos-kairos-ubuntu-22-lts"
KAIROS_VERSION="v2.4.1-k3sv1.27.3+k3s1"
KAIROS_ID="kairos"
KAIROS_ID_LIKE="kairos-kairos-ubuntu-22-lts"
KAIROS_VERSION_ID="v2.4.1-k3sv1.27.3+k3s1"
KAIROS_PRETTY_NAME="kairos-kairos-ubuntu-22-lts v2.4.1-k3sv1.27.3+k3s1"
KAIROS_BUG_REPORT_URL="https://github.com/kairos-io/kairos/issues"
KAIROS_HOME_URL="https://github.com/kairos-io/kairos"
KAIROS_IMAGE_REPO="quay.io/kairos/kairos-ubuntu-22-lts"
KAIROS_IMAGE_LABEL=""
KAIROS_GITHUB_REPO="kairos-io/provider-kairos"
KAIROS_VARIANT="kairos"
KAIROS_FLAVOR="ubuntu-22-lts"

CPU architecture
AMD64

Describe the bug
Used OSBuilder to build a cloud image, intended to run in Azure. This works fine, able to enter recovery mode and reset with kairos-agent. After reboot, grub fail with the error error: ../../grub-core/script/function.c:119:can't find command 'regexp'

To Reproduce

  • Build VHD file using the OSBuilder
  • Spin up new generation 2 Azure VM with the disk
  • Perform kairos-agent reset through serial console, followed by reset

Expected behavior
Get through Grub without error messages

Analysis and potential cause
The error is fairly clear and is most likely caused by the fact that the regexp module of grub is not copied over into the grub directory of the active image. The kairos-agent debug output has the following few lines:

�[37mDEBU�[0m[2023-10-23T19:38:30Z] Syncing data...                              
�[36mINFO�[0m[2023-10-23T19:38:35Z] Finished syncing                             
�[36mINFO�[0m[2023-10-23T19:38:35Z] Finished copying /run/rootfsbase into /run/cos/active 
�[36mINFO�[0m[2023-10-23T19:38:35Z] Using grub config dir /run/cos/active/etc/cos/grub.cfg 
�[36mINFO�[0m[2023-10-23T19:38:35Z] Copying grub contents from /run/cos/active/etc/cos/grub.cfg to /run/cos/state/grub2/grub.cfg 
�[36mINFO�[0m[2023-10-23T19:38:35Z] Generating grub files for efi on /dev/sda    
�[37mDEBU�[0m[2023-10-23T19:38:35Z] Copying /run/cos/active/usr/lib/grub/x86_64-efi/loopback.mod to /run/cos/state/grub2/x86_64-efi/loopback.mod 
�[37mDEBU�[0m[2023-10-23T19:38:35Z] Copying /run/cos/active/usr/share/grub2/x86_64-efi/loopback.mod to /run/cos/state/grub2/x86_64-efi/loopback.mod 
�[37mDEBU�[0m[2023-10-23T19:38:35Z] Copying /run/cos/active/usr/lib/grub/x86_64-efi/squash4.mod to /run/cos/state/grub2/x86_64-efi/squash4.mod 
�[37mDEBU�[0m[2023-10-23T19:38:35Z] Copying /run/cos/active/usr/share/grub2/x86_64-efi/squash4.mod to /run/cos/state/grub2/x86_64-efi/squash4.mod 
�[37mDEBU�[0m[2023-10-23T19:38:35Z] Copying /run/cos/active/usr/lib/grub/x86_64-efi/xzio.mod to /run/cos/state/grub2/x86_64-efi/xzio.mod 
�[37mDEBU�[0m[2023-10-23T19:38:35Z] Copying /run/cos/active/usr/share/grub2/x86_64-efi/xzio.mod to /run/cos/state/grub2/x86_64-efi/xzio.mod 
�[37mDEBU�[0m[2023-10-23T19:38:35Z] Copying /run/cos/active/usr/lib/grub/x86_64-efi/gzio.mod to /run/cos/state/grub2/x86_64-efi/gzio.mod 
�[37mDEBU�[0m[2023-10-23T19:38:35Z] Copying /run/cos/active/usr/share/grub2/x86_64-efi/gzio.mod to /run/cos/state/grub2/x86_64-efi/gzio.mod 
�[33mWARN�[0m[2023-10-23T19:38:35Z] did not find grub font ascii.pf2 under /run/cos/active 
�[33mWARN�[0m[2023-10-23T19:38:36Z] did not find grub font euro.pf2 under /run/cos/active 
�[33mWARN�[0m[2023-10-23T19:38:36Z] did not find grub font unicode.pf2 under /run/cos/active 
�[37mDEBU�[0m[2023-10-23T19:38:36Z] Copying /usr/share/efi/x86_64/grub.efi to /run/cos/efi/EFI/boot/grub.efi 
�[37mDEBU�[0m[2023-10-23T19:38:36Z] Copying /usr/share/efi/x86_64/shim.efi to /run/cos/efi/EFI/boot/shim.efi 
�[37mDEBU�[0m[2023-10-23T19:38:36Z] No files relabelling as SELinux utilities are not found 

In #1824 we see the unified bootargs.cfg being introduced, which uses the regexp command. When the image first boots into recovery mode, it will have the regexp module in the live grub directory and thus has no problem. But when kairos-agent is resetting the system, if running on an UEFI system it will only copy over a specific list of modules. Source here.
If this analysis happens to be correct, we may also need to insmod the regexp module in the grub.cfg in packages?

We managed to take the disk into a local Gen2 HyperV instance and were able to reproduce the same issue as well. When loaded into a Gen1 instance, it will not exhibit this behavior.

@jvantillo jvantillo added the bug Something isn't working label Oct 24, 2023
@jimmykarily jimmykarily moved this from Todo 🖊 to In Progress 🏃 in 🧙Issue tracking board Oct 30, 2023
@jimmykarily jimmykarily moved this from In Progress 🏃 to Todo 🖊 in 🧙Issue tracking board Oct 30, 2023
@Itxaka
Copy link
Member

Itxaka commented Oct 30, 2023

@Itxaka Itxaka linked a pull request Oct 30, 2023 that will close this issue
@Itxaka
Copy link
Member

Itxaka commented Oct 30, 2023

can reproduce locally on a uefi machine but it still boots after hitting enter

@Itxaka
Copy link
Member

Itxaka commented Oct 30, 2023

it may be that the regexp is not working and the cmdline defaults are wrong so there is no serial output

@jvantillo
Copy link
Author

it may be that the regexp is not working and the cmdline defaults are wrong so there is no serial output

That was my observation as well, after I posted the issue. I think booting will continue just fine, but it may have missed some config depending on the distro. In my original test case, Ubuntu on Azure, I had to set console=ttyS0,115200n8 earlyprintk=ttyS0,115200 rootdelay=300 quiet splash and this yielded better serial output. I've since switched to OpenSuse, which out of the box seems mostly fine, and have applied a small patch to my image to remove the regexp from the bootargs.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
Archived in project
Development

Successfully merging a pull request may close this issue.

3 participants