-
Notifications
You must be signed in to change notification settings - Fork 9
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use num_machines * num_mpiprocs_per_machines to set num_cpus #33
Use num_machines * num_mpiprocs_per_machines to set num_cpus #33
Conversation
e46d949
to
faa0ca0
Compare
for more information, see https://pre-commit.ci
dd54854
to
e97ac6a
Compare
77f7de1
to
bdcd861
Compare
322e412
to
4d779ee
Compare
for more information, see https://pre-commit.ci
…-for-backward-compatibility
@superstar54 can you have a look at this PR? this is needed by the qeapp to set num_cpu through the default num_procs as we discussed. |
…-for-backward-compatibility
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @unkcpz thanks for the work!
# The `num_mpiprocs_per_machine` sometime can be read from "Default #procs/machine" of computer setup but if it is not exist | ||
# the builder can not be properly get without passing `option` to builder generator. | ||
# It is anyway a workaround for backward compatibility so this default is implemented despite it is quite specific for the qe plugin. | ||
resources.num_cpus = kwargs.pop("num_machines") * kwargs.pop( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I remember that this method is only used to validate the resource, and the return resources
is not used to override the actual resources, thus setting num_cpus
here will not help.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the qeapp yes, you are correct, it will then be override by the change in here https://github.com/aiidalab/aiidalab-qe/pull/795/files. But this setting is not only for qeapp but also for the calculation that didn't set num_cpus
it will directly use num_machines
and num_mpiprocs_per_machine
(with the warning raised below) not fail the calculation.
"Please set `num_cpus` and `memory_mb`." | ||
|
||
message = f"{message} (this will be removed in aiida-hyperqueue v1.0)" | ||
warnings.warn(message, AiiDAHypereQueueDeprecationWarning, stacklevel=3) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If my previous comment is ture, here we need to raise the error message instead of warnning.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No the raise is in the KeyError
conditional branch. Here the warning is from else branch which warns when the num_cpus
is get from num_mpiprocs_per_machine
and num_machines
instead of directly set.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the clarification! One final comment—feel free to update or not.
# TODO: I only setting the default value as 1 for `num_mpiprocs_per_machine` because aiida-quantumespresso override | ||
# resources default with `num_machines` set to 1 and then get builder with such setting. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
aiida-hyperqueue should be independent from aiida-quantumespresso, could you rewrite this comment to make it more general?
Is the TODO
something that we need to work on in the future? Could you open an issue?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is a good point, in fact I don't have strong idea, the aiida-quantumespresso is actually also quite rely on scheduler is slurm-like.
I personally want to make hyperequeue an independent scheduler not inherent or rely on anything that is assumed as "base scheduler" mode. So the issue is kind of this one #8 but I have no good overview on how to having a good design on aiida-core side to solve this.
For backward compatibility where
num_machines
andnum_mpiprocs_per_machine
are setting.I only setting the default value as 1 for
num_mpiprocs_per_machine
becauseaiida-quantumespresso
overrideresources default with
num_machines
set to 1 and then get builder with such setting.The
num_mpiprocs_per_machine
sometime can be read from "Default #procs/machine" of computer setup but if it is not exist the builder can not be properly get without passingoption
to builder generator.It is anyway a workaround for backward compatibility so this default is implemented despite it is quite specific for the qe plugin.