
Install RAID Creation Fails: sgdisk Failing Successfully #638

Open
lethedata opened this issue Feb 15, 2024 · 4 comments
Comments

@lethedata

During RAID creation in the installer there were no failure messages, but the RAID device never appeared and the selected disks were still listed as individual disks. Dropping to a console, I was able to create the RAID manually, but after wiping it, rebooting, and letting the installer handle it, creation still failed.
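For reference, the manual creation from the console looked roughly like this (a sketch from memory, assuming a two-disk mirror; the device names are examples, not the installer's actual values):

    # assemble the selected disks into a mirror by hand
    mdadm --create /dev/md127 --level=1 --raid-devices=2 /dev/sda /dev/sdb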

Looking at the host-installer, specifically create_raid in diskutil.py, I ran each command manually. Doing this I could see that sgdisk --zap-all was "failing successfully": it completed without error, but it didn't actually wipe the GPT tables, so the system kept auto-recovering them. I'm not entirely clear why this broke the rest of the RAID creation, but after manually wiping with gdisk's zap I had no further issues.
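The gdisk wipe that worked for me was the expert-menu zap, roughly as follows (from memory; /dev/sdX is a placeholder):

    gdisk /dev/sdX
      x    # switch to the expert menu
      z    # zap (destroy) the GPT data structures and exit
      y    # confirm wiping the GPT
      y    # also blank out the protective MBR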

Searching online I found Ubuntu gdisk bug 1303903, which suggests this can be caused by how sgdisk handles MBR disks, i.e. it incorrectly assumes the disk is MBR. Following their workaround, adding the --mbrtogpt --clear flags might prevent this. I could not reproduce the problem after my gdisk wipe, so I can't verify that.
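If I read that workaround correctly, the call in create_raid would become something like the following (untested on my side, since I can no longer reproduce the problem):

    # sketch of the proposed workaround: convert any MBR to GPT first,
    # then clear all partition data, instead of relying on --zap-all alone
    sgdisk --mbrtogpt --clear /dev/sdX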

@olivierlambert
Member

Hi! Thanks for the report.

@ydirson is it related to the PR we made a while ago that Citrix/XS never wanted to merge?

@ydirson
Contributor

ydirson commented Feb 15, 2024

@olivierlambert Could be. The failing command is in our RAID-creation code that XS will not merge, more specifically xcp-ng/host-installer#7, which added the sgdisk --zap-all call.
It could be that xenserver/host-installer#38 would help, notably by stopping the OS from auto-assembling pre-existing RAID volumes, as sketched below.
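For context, that kind of cleanup would look something like this (a sketch of the general idea, not the actual PR code; device names are examples):

    # stop an auto-assembled array and clear RAID metadata from its members
    # before the installer repartitions the disks
    mdadm --stop /dev/md127
    mdadm --zero-superblock /dev/sda /dev/sdb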

@lethedata I'm interested in the logs for this "failing successfully" if you still have them; maybe we can improve the behavior here.

@lethedata
Author

lethedata commented Feb 16, 2024

> @lethedata I'm interested in the logs for this "failing successfully" if you still have them; maybe we can improve the behavior here.

@ydirson Unfortunately I didn't think to grab any output until after I had fixed things. For future reference, if the ISO keeps logs, where are they written when the install hasn't completed?

I messed around trying to reproduce the error but was only able to get sgdisk to trigger a GPT restore once. The problem is that I don't know what was written in the original backup GPT sectors that led sgdisk to keep detecting MBR after the restore. The process below also doesn't seem to affect mdadm when run through the installer.

  1. Clean the disk completely
  2. Create an MBR disk: fdisk /dev/DISK (options: o, n, p, default, default, w)
  3. Back up the MBR: dd if=/dev/DISK of=/PATH/MBR.backup bs=512 count=1
  4. Create a GPT table: fdisk /dev/DISK (options: g, w)
  5. Delete the front GPT: dd if=/dev/zero of=/dev/DISK bs=512 count=34
  6. Restore the MBR: dd if=/PATH/MBR.backup of=/dev/DISK bs=512 count=1
  7. Zap the drive: sgdisk --zap-all /dev/DISK

My hunch is that whatever was written at the end of the drive formed a "perfect sequence" that led sgdisk not to wipe and mdadm to fail, but that's just a hunch; no matter what I tried I couldn't reproduce it. At least now I know that with odd disk issues it's a good idea to back up the partition tables before wiping anything, as sketched below.
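A minimal sketch of the kind of backup I mean, assuming a 512-byte-sector disk (the device and paths are placeholders):

    # sgdisk's own backup of the GPT headers and partition entries
    sgdisk --backup=/PATH/gpt.backup /dev/DISK
    # raw copy of the protective MBR plus primary GPT (first 34 sectors)
    dd if=/dev/DISK of=/PATH/gpt-front.backup bs=512 count=34
    # raw copy of the backup GPT at the end of the disk (last 33 sectors)
    dd if=/dev/DISK of=/PATH/gpt-end.backup bs=512 skip=$(( $(blockdev --getsz /dev/DISK) - 33 )) count=33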

@ydirson
Contributor

ydirson commented Feb 19, 2024

> For future reference, if the ISO keeps logs, where are they written when the install hasn't completed?

During the installation it logs essentially to /tmp/install-log. Afterwards you'll find the installer logs on the installed host in /var/log/installer/.
