console=ttyS0 is too slow and useless #48
Comments
Or for example only enable and run console-conf on it, and not make kernel/journald slowly push messages to serial console delaying the whole boot. |
Note that changes to the kernel command line should probably not be made until snapd can read the kernel command line from the gadget.yaml or elsewhere, because right now the kernel command line that we seal the TPM against is hard-coded in the snapd snap, so changing it will break FDE. I believe @bboozzoo was working on the feature to read the kernel command line from the gadget; do you have a status update on that? |
@anonymouse64 i am aware of the current duplication / disconnect of the gadget vs sealing code, so yeah will not push out an update to this uncoordinated. |
all x86 IoT gateways i have touched yet (as well as most servers) do default to using a serial console ... dropping it completely doesn't smell like a good plan ... |
The current reference target for the pc gadget is Intel NUC, which does not have serial console by default. Ubuntu Core does not target servers. Can you please elaborate on the "IoT gateways" => can they run the stock PC gadget, or have custom ones? I thought they all have custom gadgets and do not use the reference PC gadget. |
Also clouds may or may not have serial console, but they should be forking their own gadget anyway. |
@anonymouse64 @bboozzoo if it helps, we can turn console=* values into a variable in either the stock grubenv file or a custom grubenv file i.e. |
My 2¢ here is that we should probably:
I agree with @ogra1, it might be the case that the Intel NUC doesn't have a serial, but for example another IoT amd64 edge gateway we enabled UC16 for was the Dell gateways which do have physical serial ports. |
well, the dell gateways (which admittedly currently come with a custom gadget) would be an example, all advantech ones i have touched yet. but also 90% of other Industrial PCs that might "just install" x86 focused images we provide on cdimage. after all the typical IoT or industrial PC is often headless, yet an x86 base often means you can use an uncustomized image on them, unlike with arm devices where you can not have a generic image easily due to HW specific bootloaders. EDIT: i mentioned servers simply because IoT GWs are typically a cut down server, not a cut down desktop ... |
This is precisely why I think we should leave serial on by default in the pc gadget so that folks can "test-drive" UC on their IoT devices by just flashing a released default image and login with console-conf via serial without needing to build their own gadget snap/image. |
That will cost us a lot of boot time out of the box. Even "as fast as possible" is very slow: 30s+ of additional boot time. Note, this is about dropping "console=" from the kernel command line, to stop forcing the kernel to slow down its boot to the speed at which it can push kmsg to the serial console. This is not about stopping or preventing console-conf from running on serial consoles. By default it is spawned on them all. |
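To make concrete what dropping "console=" from a command line means, here is a minimal sketch. The function name `drop_console_args` and the sample command line are assumptions for illustration only; this is not code from snapd or the gadget, and it assumes arguments are separated by plain whitespace with no quoting.

```python
from typing import Optional


def drop_console_args(cmdline: str, device: Optional[str] = None) -> str:
    """Remove console= arguments from a kernel command line string.

    If device is given (for example "ttyS0"), only console= arguments that
    target that device are removed; otherwise every console= argument goes.
    Hypothetical helper, assuming whitespace-separated arguments.
    """
    kept = []
    for arg in cmdline.split():
        if arg.startswith("console="):
            # "console=ttyS0,115200n8" targets device "ttyS0"
            target = arg[len("console="):].split(",", 1)[0]
            if device is None or target == device:
                continue
        kept.append(arg)
    return " ".join(kept)


if __name__ == "__main__":
    # Sample command line, assumed for illustration
    sample = "snapd_recovery_mode=run console=ttyS0 console=tty1 panic=-1"
    print(drop_console_args(sample, device="ttyS0"))
```

Only the kernel's own message output is affected by such a change; whether a login or console-conf is spawned on the serial port is decided separately in userspace.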
It's an embedded platform. Neither desktop nor server. Because for something to be called a server, I expect 1TB of RAM, 1PB of NVMe storage, RAID, InfiniBand, etc. |
while console-conf will indeed still come up, are there not menu bits at the initrd level now that would also use the defined console= ? indeed, if it is just kernel boot messages we lose, that's completely negligible and i'd fully agree with the removal, but AFAIK there are potentially interactive bits before systemd kicks in as well |
So w/o console=ttyS0 in the kernel command line for run mode, what would the user experience be like? They plug in their device, look at a blank serial console for ... however many minutes, and then magically at some point console-conf shows up? |
Good question. Need to double check experimentally, I can record some videos. Somehow it still feels wrong to have both enabled by default on any hardware. It almost feels more appropriate to detect console in grub, and if it is serial pass serial console to the kernel, if it's video pass video to the kernel. |
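The detection idea can be sketched as a small decision function. This is only an illustration of the logic, written in Python rather than as a grub script; the function name, the two boolean inputs, and the choice of `ttyS0` and `tty1` as device names are assumptions, and the sketch says nothing about how the detection itself would be done.

```python
from typing import List


def console_args(serial_present: bool, video_present: bool) -> List[str]:
    """Pick console= arguments based on which consoles were detected.

    Hypothetical helper: the detection step is assumed to happen elsewhere
    (for example in the bootloader) and is passed in as two booleans.
    """
    args = []
    if serial_present:
        args.append("console=ttyS0")
    if video_present:
        args.append("console=tty1")
    return args


if __name__ == "__main__":
    # A NUC-like machine with video but no serial port (assumed example)
    print(" ".join(console_args(serial_present=False, video_present=True)))
```

Any scheme like this produces a command line that varies per device, which is the difficulty for sealing against the TPM.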
We know that today, the experience is of 30s+ hang with no output from the kernel, when waiting for serial to show up that does not exist. Because we force the kernel to look for one, when there isn't one. |
Arguably this is a regression from UC18 -> UC20 in that there is a 30s+ hang with no output from the kernel on non-serial TTYs because the kernel is stuck trying to write to a non-existent serial TTY. I'd hate to introduce what appears to be a hang on serial TTYs just because we don't want what appears to be a hang on non-serial TTYs.
This would be great but I don't know how we can do that while still enabling automatic FDE by sealing the kernel command-line against the TPM, unless both snapd + grub somehow learn to check if there are serial TTY's on the system, etc. Maybe there's a simpler solution I'm not aware of. |
Not a regression, UC18 also hangs in the same way.
As per original London sprint design, snapd must seal against the install-time dynamic cmdline and persist that through modes/kernel updates and resealings. That was the requirement of the original design. Currently snapd doesn't do resealing as far as I can tell, but it must support that. |
Is there any movement or update on this ? I'm particularly interested, as this causes an artificially long boot time on NUCs, and that eats into our Service Level budget on updates that require a reboot. Also, now that snapd seems to be in control of the grub config, what is the recommended way to change the linux commandline ? Is it even possible from the gadget ? Thanks. |
I can't speak to the original London sprint design as I wasn't there and joined the project later, but the new plan is to have snapd dynamically generate the kernel command line that is to be used with sealing using the following things:
The last bit is what we are currently missing from snapd, which is a way for a gadget snap to specify additional kernel command line parameters. We have a rough plan and will implement it soon.
Currently there is not a way to configure the command line without recompiling snapd. As mentioned, we will be working on a way to do this soon. |
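As a rough illustration of what "dynamically generating" a command line from several inputs could look like, here is a minimal sketch. It is not snapd's implementation: the list of inputs in the comment above did not survive in this thread, so the three inputs used here (mode arguments, built-in defaults, and gadget-supplied arguments), the function name, and the rule that a gadget may either extend or fully replace the defaults are all assumptions.

```python
from typing import Optional, Sequence


def compose_cmdline(mode_args: Sequence[str],
                    default_args: Sequence[str],
                    gadget_extra: Optional[Sequence[str]] = None,
                    gadget_full: Optional[Sequence[str]] = None) -> str:
    """Build one command line string from several sources.

    Assumed rules (hypothetical, for illustration):
    - mode_args always come first and cannot be overridden;
    - gadget_extra is appended after the built-in defaults;
    - gadget_full replaces the built-in defaults entirely;
    - supplying both gadget_extra and gadget_full is an error.
    """
    if gadget_extra is not None and gadget_full is not None:
        raise ValueError("gadget may supply extra or full arguments, not both")
    if gadget_full is not None:
        tail = list(gadget_full)
    else:
        tail = list(default_args) + list(gadget_extra or [])
    return " ".join(list(mode_args) + tail)


if __name__ == "__main__":
    defaults = ["console=ttyS0", "console=tty1", "panic=-1"]  # assumed values
    print(compose_cmdline(["snapd_recovery_mode=run"], defaults,
                          gadget_full=["console=tty1", "panic=-1"]))
```

The point of composing the string in one place is that the same function can feed both the bootloader configuration and the sealing step, so the two cannot drift apart.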
Thanks for the info. Sounds like, if the only static config is |
Ah yes sorry I forgot to explain that too, what will happen is that currently actually |
Perfect - sounds good. |
I have outstanding tasks to experiment with master serial console options, and/or speeding up the kernel's serial console. |
Hi. Just wondering, seeing as snapd |
@jocado the feature enabling gadget specified kernel command line options will not be in 2.48, it is still under very active development, but is getting much closer, for example see canonical/snapd#9724 and canonical/snapd#9719 which are getting us closer and closer to the final bits needed for this. It is unclear if we will backport those changes to 2.48 to be available in i.e. 2.48.1 or if the feature will just go into 2.49. |
Just checking in here to see if we are able yet, or have a good idea of when, to be able to disable the serial console args in the kernel commandline via the gadget. Is it supported in Thanks! |
@jocado unfortunately no, 2.49 does not have the full set of changes yet, we will keep you updated on when the feature is enabled. Thanks for your patience. |
It looks like we are very close now :) https://forum.snapcraft.io/t/customising-uc20-kernel-command-line-arguments/24370 You will see a comment from me there. I have tested it and it's working for me with the current edge revision. @anonymouse64 Is there any rough release date for |
@jocado snapd 2.50 is being released to stable as we speak. It is released in phases, so not every device will get it at the same time, but by the current looks of it I think it should be phased to 100% of devices within the next 24 hours |
@anonymouse64 which revision will it be though ? As the feature only seemed to be working for me in the current edge channel.
Candidate |
Revision 11841 is being released, can you detail in the forum post how the candidate channel didn't work for you? |
It was simply that the I did try and look around in logs etc, but I didn't see anything useful or obvious clues. I can add the contents of the |
It'd be interesting to see debug level logs. You can add Perhaps it's also useful to take a look at the spread test we have: https://github.com/snapcore/snapd/blob/master/tests/nested/manual/core20-custom-kernel-commandline/task.yaml the test repacks |
BTW. have you installed the device from scratch maybe? snapd 2.50 carries an update to the boot script which supports cmdline.full, however, we decided to not bump the boot config version number, thus your current boot script will not get automatically updated. |
I did install it from scratch, yes. That is one of our common use cases currently. What should I expect in that situation though ? It doesn't work from system bootstrap, but then works at some point in the future, the next time the gadget is updated perhaps ? |
I'm looking into it right now. Looks like there's some mixup with what was cherry picked for 2.50. Some bits made it, but ones that glue everything together did not. I need to double check with @mvo5 but we may need to do 2.50.1. In the meantime, can you try edge branch? |
It worked 100% for me with the edge revision yesterday. |
That's good. When I have a branch for 2.50 ready, I'll add a link to it here. We build artifacts with the snapd snap as part of the workflow, you'll be able to grab it from there and verify. |
Great - thank you 👍 |
The branch is up canonical/snapd#10265 AFAIK we haven't decided yet whether this will be in 2.50. |
ok - thanks 🤞 - we are very keen for this feature 🙂 |
The tests have finished, and relevant ones were successful. When you click on the test workflow details, you should be able to access artifacts, which is a zip file with the snapd snap from that branch inside. |
Hi. Sorry for the delayed response. Just to confirm, the artifact above seemed to work for me. |
Just following on from last week, and the current revision that made it to Not looking for absolutes, but is there any kind of rough ETA for that ? Are we talking weeks or months ? |
Hi. Can anyone confirm if we are looking at |
2.50.1 has the fix and should be in stable now, but yes 2.51 also has the full fix and should be headed to stable next week hopefully. |
console=ttyS0
is specified in the gadget by default, in UC20, for all modes: recovery, install, and run mode. However, on hardware that does not have a serial console (the majority of real x86 hardware) this option significantly delays the boot, as the kernel is polling for the serial console to appear, delaying the boot by 90s.
Furthermore if the serial console is present, the baud rate is not set to be high enough, resulting in painfully slow boots still.
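A back-of-the-envelope calculation shows why the baud rate matters. The sketch below assumes 8N1 framing (10 bits on the wire per byte) and a hypothetical 32 KiB of boot messages; neither number comes from a measurement in this issue. With those assumptions, the log takes about 34 seconds at 9600 baud and under 3 seconds at 115200 baud.

```python
def serial_seconds(num_bytes: int, baud: int, bits_per_byte: int = 10) -> float:
    """Seconds needed to push num_bytes over a serial line.

    bits_per_byte defaults to 10, which corresponds to 8N1 framing:
    1 start bit + 8 data bits + 1 stop bit.
    """
    return num_bytes * bits_per_byte / baud


if __name__ == "__main__":
    log_size = 32 * 1024  # hypothetical amount of kernel/journal output, in bytes
    for baud in (9600, 115200):
        print(f"{baud} baud: {serial_seconds(log_size, baud):.1f} s")
```

A higher rate can be requested in the console argument itself, for example `console=ttyS0,115200n8`, provided the device on the other end is configured to match.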
I would like to drop the serial console option from the pc gadget.
If not completely, I can see the value of keeping it for the recover mode.
Alternatively I think we should publish a separate serial pc gadget, that specifies only the serial console with a high baud rate.
Could we make console a grubenv parameter, such that ubuntu-image / snap-prepare-image can modify it, and it would persist from install mode, to sealed secrets, to run/recover modes?
Also see https://bugs.launchpad.net/ubuntu/+source/snapd/+bug/1879290