Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Broken kernel.efi does not reboot automatically #36

Open
anonymouse64 opened this issue Feb 14, 2020 · 2 comments
Open

Broken kernel.efi does not reboot automatically #36

anonymouse64 opened this issue Feb 14, 2020 · 2 comments

Comments

@anonymouse64
Copy link
Contributor

If I intentionally break the kernel.efi[1] in ubuntu-boot on a UC20 system, then grub does not reboot the system and instead hangs with the following message:

double free at 0x8c6f7d60
Aborted. Press any key to exit.

Additionally, pressing a key here (at least in QEMU) will drop me to the UEFI menu, where I would then have to manually reboot.

It would be great if grub could reboot automatically in this situation, because this may happen in the wild with a bad kernel snap update (where try-kernel.efi is corrupted like this) and then, if the system were rebooted, the grub.cfg on ubuntu-boot would trigger a rollback to the previous kernel.efi as without user intervention.

[1] I ran if=/dev/zero of=kernel.efi count=1

@alfonsosanchezbeato
Copy link
Member

alfonsosanchezbeato commented Nov 22, 2021

@anonymouse64 I have taken a look at this, but couldn't reproduce exactly. If I create a kernel image with dd if=/dev/zero of=kernel.efi count=1 (512 null bytes), the error I get is

 Failed to boot both default and fallback entries.
 
Press any key to continue...

and then I get to the grub menu, where I need to force a reboot to move forward and get the kernel reverted. To me, the double free seems like a grub bug while doing something with kernel.efi, that would need to be solved first, and then we would probably get the "Failed to boot both..." error. It would be nice if you can share the kernel.efi that triggered the issue.

Anyway, I tried grub's fallback feature with this grub configuration:
https://gist.github.com/alfonsosanchezbeato/8ad42596cd06a5986a9aea16c815f841
which seems to work. When trying the kernel.efi created by dd, it is able to boot the old kernel without interactions with the user (grub menu is not shown), and the new kernel snap is reverted as expected. I think we can have this change in snapd without backwards compatibility problems. More sophisticated things that do not apply only when updating the kernel would need a bit more of thought, as you mentioned.

Probably we should go through some cases with bad kernel images and give that some testing in the spread tests:

  1. Empty kernel.efi
  2. Corrupted kernel.efi (bad PE+ header)
  3. kernel.efi with good header but bad systemd stub
  4. kernel.efi with good systemd stub but corrupted/empty kernel section
  5. Unsigned kernel.efi

and probably some more...

@anonymouse64
Copy link
Contributor Author

@alfonsosanchezbeato thanks for testing that, the grub.cfg you posted looks good to me.

I don't have the original kernel.efi which triggered that double free anymore, I don't recall what I did to cause that, I may have just copied a junk file to the kernel.efi rather than all zeros like the dd command.

All of those cases would be good, we do have this existing spread test which could be modified to handle these additional cases I think: https://github.com/snapcore/snapd/blob/master/tests/nested/core/core20-kernel-failover/task.yaml

Though it is worth wondering whether the full test situations should live in snapd repo or somewhere else... as it seems all those other variants are more about grub's behavior than snapd's, but then again we do have grub.cfg inside snapd now so I suppose it makes sense.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants