auto partition. #778

Open
joaander opened this issue Oct 30, 2023 · 5 comments

Comments

@joaander
Member

Feature description

I would find it more convenient to use flow if the partition were automatically selected based on the job resource request. Many clusters have separate CPU and GPU partitions, or separate shared and whole node partitions. In a workflow with mixed CPU/GPU jobs (and/or jobs of different sizes), the user must manually run (e.g.):

project.py submit -o '.*gpu' --partition=gpu
project.py submit -o '.*small' --partition=shared
project.py submit -o '.*large' --partition=wholenode

Some operations may auto-scale depending on the number of jobs left to execute. Until the user runs the submission command, they don't know whether shared or wholenode is the appropriate partition.

Proposed solution

The user should be able to make one submission:

project.py submit --partition=auto

Additional context

auto would select one of the "standard" partitions (e.g. not the debug or high-memory partitions) based on the job request:

  • If GPUs are requested, choose the gpu partition.
  • If more than one node is requested, choose the wholenode partition.
  • If less than one node is requested, choose the shared partition.

Any partition will remain settable explicitly on request.
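
A minimal sketch of the selection rules above, not an actual flow API. The function name, its parameters, and the partition names are hypothetical and taken from the examples in this issue. The list does not say what happens when exactly one node is requested; the sketch assumes that case goes to the whole node partition.

```python
# Hypothetical partition names, copied from the examples in this issue.
GPU_PARTITION = "gpu"
WHOLE_NODE_PARTITION = "wholenode"
SHARED_PARTITION = "shared"


def select_partition(num_gpus, cores, cores_per_node, requested="auto"):
    """Pick a partition from a job's resource request (illustrative only)."""
    # An explicitly requested partition always takes precedence.
    if requested != "auto":
        return requested
    # Any GPU request goes to the GPU partition.
    if num_gpus > 0:
        return GPU_PARTITION
    # Assumption: a request that fills at least one node is a whole node job.
    if cores >= cores_per_node:
        return WHOLE_NODE_PARTITION
    # Anything smaller than one node shares a node with other jobs.
    return SHARED_PARTITION
```

With this, a 4-core CPU job on a 128-core node would map to shared, a 256-core job to wholenode, and passing requested="debug" would bypass the rules entirely.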

@joaander joaander self-assigned this Oct 30, 2023
@b-butler
Member

This should be an easy feature. I would support its addition. We need the appropriate underscored attributes in the environment classes where we set it to None by default. Perhaps something like

_default_partitions = {"gpu-shared": "gpu",
                       "cpu-shared": "shared",
                       "cpu": "standard",
                       "gpu": "gpu"}
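
One way this could look, as a rough sketch only: a base class that defaults the attribute to None and a lookup that builds the key from the request. The class names and the auto_partition method are hypothetical and are not existing flow classes; only the _default_partitions mapping comes from the suggestion above.

```python
class BaseEnvironment:
    """Hypothetical stand-in for an environment base class."""

    # None means this environment does not support automatic selection.
    _default_partitions = None

    @classmethod
    def auto_partition(cls, uses_gpu, whole_node):
        """Look up the default partition for a request (illustrative only)."""
        if cls._default_partitions is None:
            raise ValueError(
                f"{cls.__name__} does not define default partitions."
            )
        # Build a key such as "gpu", "gpu-shared", "cpu", or "cpu-shared".
        key = ("gpu" if uses_gpu else "cpu") + ("" if whole_node else "-shared")
        return cls._default_partitions[key]


class ExampleCluster(BaseEnvironment):
    """Hypothetical cluster using the mapping suggested above."""

    _default_partitions = {
        "gpu-shared": "gpu",
        "cpu-shared": "shared",
        "cpu": "standard",
        "gpu": "gpu",
    }
```

Keeping the lookup in the base class means each cluster only has to supply the dictionary.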

@joaander
Member Author

Yes, with that it may be possible to implement the auto selection in the base class.

@joaander
Member Author

Some systems use separate accounts for CPU and GPU: #703. These would not be able to use the auto partition.

@tcmoore3
Member

> Some systems use separate accounts for CPU and GPU: #703. These would not be able to use the auto partition.

Could we make that a config option, where users can set a default account and a GPU account?
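
A sketch of what such a lookup might do, assuming the configuration is available as a plain dictionary. The keys account and gpu_account and the function name are hypothetical, not existing configuration options.

```python
def select_account(config, uses_gpu):
    """Return the account to charge for a job (illustrative only).

    `config` is assumed to be a dict with the hypothetical keys
    "account" (the default) and "gpu_account" (optional).
    """
    # Use the GPU account only when GPUs are requested and one is configured.
    if uses_gpu and config.get("gpu_account"):
        return config["gpu_account"]
    # Otherwise fall back to the default account (None if unset).
    return config.get("account")
```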

@b-butler
Member

@tcmoore3 theoretically yes, but then I wonder if we are getting too niche with that. I would rather have something more future-proof, or with less logic on our side, such as an account argument to an operation decorator, or perhaps a separate decorator (I like the second less, as it is not really a resource). We could likewise allow specifying a partition, which would make two more keyword arguments.
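
For illustration, the separate decorator variant could look roughly like the sketch below. The decorator name submission_options and the attribute it sets are hypothetical and do not exist in flow; the account and partition values are placeholders.

```python
def submission_options(account=None, partition=None):
    """Attach per-operation submission settings to a function (illustrative)."""

    def decorator(func):
        # Store the settings on the function so a submission step could read them.
        func._submission_options = {"account": account, "partition": partition}
        return func

    return decorator


@submission_options(account="gpu-allocation", partition="gpu")
def run_on_gpu(job):
    """Placeholder operation body."""
    return job
```

A submission step could then read run_on_gpu._submission_options and fall back to the command-line values when an entry is None.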

@joaander joaander removed their assignment Nov 2, 2023

3 participants