Heads up: Kubernetes script adapter! #434
Well, this sounds like an interesting development! So, I do have a few comments/questions on future tasks that may impact this, and more general questions on what you might need for this script adapter. And apologies in advance for any dumb questions about containers in general as I've not had much experience with them yet.
|
I'm running out the door for a quick run, but here are some quick answers (I can follow up with more detail where needed):
For scheduling to kubernetes, what we generally take are cores, tasks, gpus, the basic high-level stuff (for now). We can also support memory (which flux cannot). This goes into a resource request for the kubelet. So for now, just scope it to the same things that flux might use.
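For illustration only, here is a minimal sketch of what such a resource request could look like with the official kubernetes Python client; the specific quantities and the choice of client here are assumptions, not part of the adapter.

```python
from kubernetes import client

# Sketch of a per-container resource request handed to the kubelet.
# The quantities are placeholders; an adapter would fill them in from
# the step's cores/tasks/gpu/memory settings.
resources = client.V1ResourceRequirements(
    requests={"cpu": "4", "memory": "8Gi", "nvidia.com/gpu": "1"},
    limits={"memory": "8Gi", "nvidia.com/gpu": "1"},
)
```

Unlike a flux submission, memory shows up here as a first-class field in the request.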
It comes down to the shell provided in the container - e.g., if there is no bash in the container, the entrypoint would need to use /bin/sh. As for the script contents, we write them on the fly (just a bash script), so they can be whatever. Right now I'm writing a config map entrypoint that gets mounted read only and executed with
I'm assuming one script == one kubernetes abstraction. That could be a single node / pod, or it could be an entire flux operator cluster. I might ask for more detail on this because I'm not sure what a "step" means here.
If you mean to have the inner logic of the script further run maestro, I'd stay away from that design for now. Snakemake does something similar and it works pretty well for controlled environments, but for kubernetes I've found it adds a ton of complexity because (for the most part) I don't want to build a container with my application and snakemake (or maestro). OK - I'm off! 🏃
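To make the config map entrypoint pattern above concrete, here is a rough, hypothetical sketch with the kubernetes Python client; the names, paths, and image are made up and the real adapter may differ:

```python
from kubernetes import client

# ConfigMap holding the step script generated on the fly (name/key are placeholders).
entrypoint_cm = client.V1ConfigMap(
    metadata=client.V1ObjectMeta(name="step-entrypoint"),
    data={"entrypoint.sh": "#!/bin/sh\necho 'step script written on the fly'\n"},
)

# Mount the ConfigMap read-only and execute it via /bin/sh, which also works
# in containers that do not ship bash.
volume = client.V1Volume(
    name="entrypoint",
    config_map=client.V1ConfigMapVolumeSource(name="step-entrypoint", default_mode=0o755),
)
mount = client.V1VolumeMount(name="entrypoint", mount_path="/entrypoint", read_only=True)
container = client.V1Container(
    name="step",
    image="busybox",  # placeholder image
    command=["/bin/sh", "/entrypoint/entrypoint.sh"],
    volume_mounts=[mount],
)
```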
Sounds nice and easy there!
So, bundling these two as they're closely related. First, just to clarify and make sure we're talking about the same thing: I was really asking what the 'cmd' block in the maestro spec looks like in this mode. Currently they're all bash, so the question was really whether the step definition in the maestro spec changes appreciably for this mode, or whether the only real difference is that the step's cmd/script executes in a container rather than via an HPC scheduler. And then the question becomes where that container is defined, and how, since defining containers on a per-step basis seems like a requirement for this mode versus the one-scheduler-per-study mode it has now.
Not quite; this wouldn't be maestro executing itself inside a step (though I'm sure there'll be requests for something like that at some point).
It doesn't really change; it's still a lump of bash script stuff that just gets executed when the container starts. From what I've seen, there would usually be a flux submit or similar in that block, and you wouldn't have that here. Once that script is running, you are already in an isolated job (the batch/v1 Job that has one or more pods with container(s)).
I'm defining the container at the level of the job step. You are correct that for different abstractions (e.g., JobSet) where there could be two containers deployed for one job, you'd have variation there.
We would not want that in a container. mpirun, yes, but not a flux submit, unless it's the flux operator, in which case that is handled for you.
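For a concrete picture of the one step == one batch/v1 Job shape being described, here is a self-contained, hypothetical sketch using the kubernetes Python client (names, image, and namespace are assumptions, not the adapter's code):

```python
from kubernetes import client, config

# One step maps to one batch/v1 Job with a single container that runs the
# generated script; everything below is a placeholder for illustration.
step_container = client.V1Container(
    name="step",
    image="ubuntu:22.04",  # placeholder image
    command=["/bin/sh", "-c", "echo 'run the step script here'"],
)

job = client.V1Job(
    metadata=client.V1ObjectMeta(name="maestro-step"),
    spec=client.V1JobSpec(
        backoff_limit=0,
        template=client.V1PodTemplateSpec(
            spec=client.V1PodSpec(restart_policy="Never", containers=[step_container]),
        ),
    ),
)

config.load_kube_config()  # assumes a local kubeconfig is available
client.BatchV1Api().create_namespaced_job(namespace="default", body=job)
```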
hey maestro maintainers! 👋
Heads up that we wrote a script adapter (following maestro's design) for Kubernetes - it creates a batch Job and an associated config map with an entrypoint to run the step as a Job. Instead of using the factory (which would require the code to live here), I just instantiate it directly while we are developing. It's currently working pretty nicely for me locally (though it has only existed for a few hours, so likely more work TBA), and I wanted to give you a heads up in case there is any development detail I should know about (e.g., next release date, a change in the design of something, etc.). I also added an OCI registry handle to our mini mummi component that knows how to push/pull artifacts shared between steps, and we likely want to imagine something like that here too (I haven't thought about it yet). I'm opening this issue for any discussion we might have, and for when the PR is opened, to track potential features or needs.
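Purely as a hypothetical sketch of the "instantiate it directly instead of registering with the factory" approach: the module, class, and constructor arguments below are invented for illustration and are not the adapter's actual interface.

```python
# Hypothetical prototype usage; none of these names come from maestro or the adapter.
from k8s_adapter_prototype import KubernetesScriptAdapter  # assumed module/class

adapter = KubernetesScriptAdapter(
    namespace="mini-mummi",               # assumed namespace
    image="ghcr.io/example/step:latest",  # assumed per-step container image
)
# During prototyping, this object is handed to the study where a factory-created
# scheduler adapter would normally be used.
```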
We are prototyping it for a mini-mummi (and eventually more scheduling experiments I suspect) and I'd like to submit a PR when we are further along with that.
Ping @tpatki and @milroy to stay in the loop. 🌀