Slow Green function checkpointing on large setups risks unusable gf file #73
Comments
I think the problem occurs when the job crashes while it is writing the Green functions.
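For illustration, one common way to keep a checkpoint usable when a job is killed mid-write is to write to a temporary file and rename it over the old checkpoint only once the write is complete. The following is a minimal Python sketch of that idea, not tandem's actual implementation; the function name `write_checkpoint_atomically` and the file layout are hypothetical.

```python
import os
import tempfile


def write_checkpoint_atomically(path: str, payload: bytes) -> None:
    """Write payload to path so that a crash never leaves a partial file.

    Hypothetical helper: illustrates the write-then-rename pattern only.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file in the same directory so that the final
    # rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".gf_tmp_")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(payload)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the data to disk before renaming
        # Replace the old checkpoint in a single step (atomic on POSIX).
        os.replace(tmp_path, path)
    except BaseException:
        # Remove the temporary file; the previous checkpoint is untouched.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

With this pattern, a job killed during the write leaves the previous checkpoint in place, so the next job can restart from it instead of failing on a truncated file.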
E.g. of timing:
OK, I guess the problem is that the full Green function (including the zeros) needs to be written at each call.
Here is an example of BP5 with the default mesh.
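As a side note on the cost of rewriting the full Green function at each call: an alternative is an append-only checkpoint, where each call writes only the newly computed column and a restart reads back whatever complete columns are on disk. The Python sketch below is purely illustrative; the record format and the names `append_column` and `load_columns` are assumptions and do not correspond to tandem's or PETSc's file format.

```python
import os
import struct

# Hypothetical record layout: a 4-byte length followed by float64 values.
HEADER = struct.Struct("<I")


def append_column(path, column):
    """Append one column to the checkpoint; only the new data is written."""
    with open(path, "ab") as f:
        f.write(HEADER.pack(len(column)))
        f.write(struct.pack(f"<{len(column)}d", *column))
        f.flush()
        os.fsync(f.fileno())


def load_columns(path):
    """Return all complete columns, ignoring a truncated trailing record."""
    columns = []
    if not os.path.exists(path):
        return columns
    with open(path, "rb") as f:
        data = f.read()
    offset = 0
    while offset + HEADER.size <= len(data):
        (n,) = HEADER.unpack_from(data, offset)
        end = offset + HEADER.size + 8 * n
        if end > len(data):
            break  # partial record left by an interrupted write
        values = struct.unpack_from(f"<{n}d", data, offset + HEADER.size)
        columns.append(list(values))
        offset = end
    return columns
```

The cost of each checkpoint then scales with the size of one column rather than with the whole matrix, and a record cut short by a killed job is skipped on restart, so only that one column has to be recomputed.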
OK, it seems I fixed one of the problems with this simple commit: Now checkpointing is much faster!
and with cfd7a25
Describe the bug
I'm running BP5.toml on the branch from #72 (at commit ee87ac9), which is a few commits on top of #59.
I changed res_f to 5 to have a very small mesh to test.
In BP5.toml, I added:
so that the Green functions are checkpointed after each newly computed Green function.
Generally it works.
But several times the job was unable to restart.
E.g. job killed during generation of GF:
Next job failing:
I noticed similar issues on kernelpanic.
Expected behavior
The Green function generation should have started again.
To Reproduce
Steps to reproduce the behavior:
tandem was installed with spack on SuperMUC-NG using:
spack install -j 30 tandem@tscp polynomial_degree=2 domain_dimension=3
Here is a list of tandem's dependencies and their specs: