[v2.5] batch flash nodes #201

svenrademakers · 2024-05-13T10:45:13Z

Is your feature request related to a problem? Please describe.
The Turing Pi can flash an OS image to a given (supported) module. The firmware loads a USB plug onto the module, which, in turn, exposes an API used to write the new OS images. On v2.4 boards, only one module can be switched to the BMC at a time. On v2.5, we replaced these muxes with a USB hub, which opens up the possibility of flashing multiple nodes simultaneously.

Describe the solution you'd like
We want to be able to write an image to a selection of nodes. The dropdown in the "flash" tab of the UI gets replaced with checkboxes, which the user can use to select the nodes to flash simultaneously. Flashing different images concurrently to different nodes is out of scope. Keep it simple!

If an error occurs on one of the nodes, all other tasks are aborted as well.
Error messages need to be altered so that it's clear to the user which node caused the error.
All flashing features should be extended to the other nodes as well: SHA-256 checking, the skip-CRC bool, and xz decompression. Be mindful that we are extremely limited on memory; for instance, don't decompress the same OS image multiple times.
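
The three requirements above (abort everything on the first failure, name the failing node, decode the image only once) can be sketched together. This is a rough illustration in TypeScript, not BMCD's actual implementation; `NodeWriter`, `fanOut` and the pull-style `nextChunk` source are hypothetical names, and the chunks are assumed to be already decompressed.

```typescript
// Hypothetical per-node sink: accepts one decompressed chunk, throws on failure.
interface NodeWriter {
  node: number; // 0-indexed node ID, as in the flash API
  write: (chunk: Uint8Array) => void;
}

// Pull each chunk from the (single) decoder exactly once and hand it to every
// selected node. The first write failure stops the whole batch, and the error
// message is prefixed with the ID of the node that caused it.
function fanOut(nextChunk: () => Uint8Array | null, writers: NodeWriter[]): number {
  let total = 0;
  let chunk = nextChunk();
  while (chunk !== null) {
    for (const writer of writers) {
      try {
        writer.write(chunk);
      } catch (err) {
        const reason = err instanceof Error ? err.message : String(err);
        throw new Error(`node ${writer.node}: ${reason}`);
      }
    }
    total += chunk.length;
    chunk = nextChunk();
  }
  return total; // bytes consumed from the source, not bytes written overall
}
```

Because the source is pulled once per chunk regardless of how many nodes are selected, memory use stays at roughly one chunk rather than one decoded image per node.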

Additional information
We expect changes in the following 2 repos:

BMC-UI
BMCD

barrenechea · 2024-05-13T12:26:56Z

I can prepare the UI so we're able to support this use case, I'll be playing with options there 😄

barrenechea · 2024-05-13T12:30:27Z

To confirm: the UI should behave differently depending on whether the board is <= v2.4 or >= v2.5, am I right? Could I get the board revision to render options conditionally? I think a good endpoint would be the one currently providing data for the About tab.

I think that a good option would be for v2.4 users only to be able to pick one option (and automatically disable the user from picking more than one choice), and if the board is >=2.5, for it to not have that "disabled after one". That way, the experience would be similar for all users, and v2.5 boards could pick many nodes.

svenrademakers · 2024-05-14T06:45:16Z

You brought up a good point. Of course, this behavior should only occur when a 2.5+ board is detected. You're also right that we need an endpoint to detect which of the 2 versions needs to be loaded. I would prefer to have a field encoded in the actual flashing endpoint that specifies something like:

{
 can_do_bulk_flashing: true
 ...
 }

Making the code dependent on the firmware version is a less clean option as we make ourselves dependent on this specific hardware when in theory, it doesn't matter on which hardware it runs.

I think that a good option would be for v2.4 users only to be able to pick one option (and automatically disable the user from picking more than one choice),

That sounds good to me. It will keep things consistent!
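
A minimal sketch of how the UI could combine the two ideas, assuming the flash endpoint really does report a `can_do_bulk_flashing` field (the name is only the suggestion from this comment, and `FlashCapabilities` and `isCheckboxDisabled` are hypothetical):

```typescript
// Assumed shape of the capability data returned by the flashing endpoint.
interface FlashCapabilities {
  can_do_bulk_flashing: boolean;
}

// Decide whether the checkbox for `node` should be greyed out, given the nodes
// that are already ticked. Without bulk support, ticking one node disables
// every other checkbox; the ticked one stays enabled so it can be unticked.
function isCheckboxDisabled(
  node: number,
  selected: number[],
  caps: FlashCapabilities
): boolean {
  if (caps.can_do_bulk_flashing) {
    return false;
  }
  return selected.length > 0 && selected.indexOf(node) === -1;
}
```

Keying the behavior off a capability flag rather than a board revision keeps the UI free of any hardware version checks.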

barrenechea · 2024-05-15T13:36:12Z

I wonder if there is a chance to make the multiple selection of nodes work on 2.4... It may not be possible to flash them all simultaneously, but if we could flash them in sequence, the UI would work for both boards (just that v2.5 would be up to four times faster).

I could do a workaround on the frontend (to "send" flashing requests in sequence after one finishes), but if the backend could handle it, we could handle all the flashing sequence with a single image upload.

MPC-GH · 2024-05-15T13:55:26Z

The BMC itself doesn't have a lot of storage or RAM, so you would be reliant on there being an SD card of sufficient size in place if you were flashing sequentially without re-streaming the image over the network repeatedly. Seems complex to do nicely in the web interface.

Would we want to consider caching before flashing anyway if there's a suitably large SD card in place from a reliability perspective? I can certainly see some use cases (remote or hard to physically access setups) where you may not want to risk a network drop mid-flash. For my use cases, I probably wouldn't be using the GUI at that point if I'm honest, but a locally saved image and the command line tooling.

barrenechea · 2024-05-17T02:14:02Z

@MPC-GH Yeah you're right, it probably streams the uploaded file directly to the target node(s). Better to keep it simple for now so we don't delay the main feature.

@svenrademakers a question regarding the /api/bmc?opt=set&type=flash call. Currently, it expects something like:
/api/bmc?opt=set&type=flash&file=ubuntu.img&length=55345150&node=0 (node being 0-indexed)

Would it make sense for this to send a comma-separated list in the node field for bulk flashing? Something like:
/api/bmc?opt=set&type=flash&file=ubuntu.img&length=55345150&node=0,1,2,3

svenrademakers · 2024-05-23T08:31:09Z

@barrenechea, I would like to keep the API backward compatible as much as possible. Therefore, it would be better if we introduced an additional key (it's not more elegant by any means). Maybe copy or batch is the right word?

e.g.
/api/bmc?opt=set&type=flash&file=ubuntu.img&length=55345150&node=0&batch=1,2,3
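
To illustrate why the extra key stays backward compatible, here is a rough sketch of how a receiver could interpret `node` plus an optional `batch`. It is written in TypeScript purely for illustration; `parseFlashTargets` is a hypothetical helper, not BMCD code, and the limit of four nodes (IDs 0 to 3) is an assumption based on the board.

```typescript
// Extract the 0-indexed target nodes from a flash request URL or query string.
// A request without `batch` yields exactly one node, as before.
function parseFlashTargets(query: string): number[] {
  const params: { [key: string]: string } = {};
  // Drop everything up to and including the '?', then split key=value pairs.
  for (const part of query.replace(/^.*\?/, "").split("&")) {
    const eq = part.indexOf("=");
    if (eq > 0) {
      params[part.slice(0, eq)] = decodeURIComponent(part.slice(eq + 1));
    }
  }

  const node = params["node"];
  const batch = params["batch"];
  if (node === undefined) {
    throw new Error("missing node parameter");
  }

  const ids = [node].concat(batch ? batch.split(",") : []);
  const targets: number[] = [];
  for (const id of ids) {
    if (!/^[0-3]$/.test(id)) {
      throw new Error(`invalid node id: ${id}`);
    }
    const n = Number(id);
    if (targets.indexOf(n) === -1) {
      targets.push(n); // ignore duplicates such as node=0&batch=0
    }
  }
  return targets;
}
```

An older client that only sends `node=0` is handled exactly as today, while `node=0&batch=1,2,3` expands to all four nodes.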

barrenechea · 2024-05-23T22:13:32Z

e.g.
/api/bmc?opt=set&type=flash&file=ubuntu.img&length=55345150&node=0&batch=1,2,3

I like batch! I followed it to the letter 😄 my draft PR is currently handling it with the following cases:

For all v2.4 boards (and v2.5 clicking a single node): node=0 (no batch parameter)
[v2.5 only] Nodes 1,2,3,4 clicked: node=0&batch=1,2,3
[v2.5 only] Nodes 1,3,4 clicked: node=0&batch=2,3
[v2.5 only] Nodes 2,4 clicked: node=1&batch=3

Note that I'm ordering the node IDs on the client, meaning:

If the user clicks first on Node 4 and second on Node 1, the payload will be:
node=0&batch=3

And not in the order the user clicked, like:
node=3&batch=0 <- This will not happen

It's just an Array.sort I'm doing before sending the request. If irrelevant, I could clean it up and save some CPU cycles on the front end 🤣
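
For reference, the mapping described above could look roughly like this. It is a sketch, not the code from the draft PR; `buildFlashTargetParams` is a hypothetical name, and it assumes the UI labels nodes 1 to 4 while the API uses 0-indexed IDs.

```typescript
// Turn the clicked node labels (1-based, in click order) into the
// `node` / `batch` query fragment, with IDs sorted ascending.
function buildFlashTargetParams(clickedLabels: number[]): string {
  if (clickedLabels.length === 0) {
    throw new Error("select at least one node");
  }
  // Copy before sorting so the caller's array keeps its click order.
  // The numeric comparator avoids Array.sort's default string ordering.
  const ids = clickedLabels
    .map((label) => label - 1)
    .sort((a, b) => a - b);
  const node = ids[0];
  const batch = ids.slice(1);
  return batch.length > 0
    ? `node=${node}&batch=${batch.join(",")}`
    : `node=${node}`;
}
```

With this helper, clicking Node 4 and then Node 1 gives `node=0&batch=3`, and a single click on Node 1 gives `node=0` with no `batch` key at all.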

We'll see how it goes, but we have something to play with!

barrenechea mentioned this issue May 23, 2024

feat(2.5): Batch Flash Nodes turing-machines/BMC-UI#8

Draft
