Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Problem on running dcore DMFT loop #128

Open
shunhongzhang opened this issue Jul 15, 2022 · 18 comments
Open

Problem on running dcore DMFT loop #128

shunhongzhang opened this issue Jul 15, 2022 · 18 comments

Comments

@shunhongzhang
Copy link

Dear developers,

I am a beginner of dcore and I came across the following problem.
First I install dcore, triqs, triqs_dft_tools properly. (No error prompt and all modules can be imported)
Then I copy the example of 2D Hubbard model from the documentation web (dmft_square_pomerol.ini)
No problem in running
dcore_pre dmft_square_pomerol.ini
But when I run the True DMFT loop with
dcore dmft_square_pomerol.ini --np 1
it stops with the prompt


raise RuntimeError("Error occurred while executing MPI program! Output messages may be found in {}!".format(os.path.abspath(output_file.name)))

RuntimeError: Error occurred while executing MPI program! Output messages may be found in /public/home/workspace/many-body/dcore/work/sumkdft_Gloc/output!

I saw the following message in the output file


File "/public/home/anaconda3/envs/wanbe/lib/python3.8/site-packages/triqs/gf/gf.py", line 135, in delegate
assert isinstance(indices, (type(None), list, GfIndices)), "Type of indices incorrect : should be None, Gfindices, list of str, or list of list of str"
AssertionError: Type of indices incorrect : should be None, Gfindices, list of str, or list of list of str
Unexpected error: Type of indices incorrect : should be None, Gfindices, list of str, or list of list of str

Could you help me figure out what does this message mean and how can I solve it?
Thank you very much in advance!

Best regards,
SH

@shinaoka
Copy link
Contributor

shinaoka commented Jul 15, 2022

Hi, what is the version of TRIQS? Can I have a look at the full error message?

@gpwahl
Copy link

gpwahl commented Sep 24, 2022

Hi, I encountered the same error message when trying to run the main loop of dcore. Is there a recommendation how to resolve this issue? This is using triqs-3.1.x and running the example of srvo3 (using the alps/ct-hyb solver). Happy to send log files if useful.

The error message is the same as above (only last python error quoted here):
File "/.../triqs/lib/python3.6/site-packages/triqs/gf/gf.py", line 135, in delegate
assert isinstance(indices, (type(None), list, GfIndices)), "Type of indices incorrect : should be None, Gfindices, list of str, or list of list of str"
AssertionError: Type of indices incorrect : should be None, Gfindices, list of str, or list of list of str

@shinaoka
Copy link
Contributor

Thank you for the report.
I will work on it.
the alps/ct-hyb solver means this one?
https://github.com/ALPSCore/CT-HYB

@gpwahl
Copy link

gpwahl commented Sep 24, 2022

yes, that's correct. It's initialised in the .ini file as follows:
[impurity_solver]
name = ALPS/cthyb
timelimit{int} = 90
exec_path{str} = /.../ct-hyb/bin/hybmat
compiling the impurity solver and running its tests worked without any problems.

@shinaoka
Copy link
Contributor

shinaoka commented Sep 27, 2022

I somehow failed to reproduce the issue:

>>> from triqs.version import *
>>> show_version()

You are using the TRIQS library version 3.1.1

>>> show_git_hash()

You are using the git hash 1da1cb4ed217dea552a9b4147e4ecdc67d0fd94c

>>> import dcore
>>> dcore.__version__
'v3.3.1'

I'd appreciate if you describe how you installed triqs and DCore.
There may be something to do with Anaconda, I guess.

Could you install a development version and set the following environment variable?
This forces DCore to use the internally bundled version of TRIQS.

> pip install git+https://github.com/issp-center-dev/DCore.git@develop
> export DCORE_TRIQS_COMPAT=1
> dcore...

@gpwahl
Copy link

gpwahl commented Oct 3, 2022

Thanks a lot, installing the development version with the environment variable worked fine. Is there any disadvantage in using it like that?

In case it is of any use, if I don't set DCORE_TRIQS_COMPAT to 1, I get the following error messages in the output file (replicated once for each MPI task), which is the same as above:
Traceback (most recent call last):
File "/home/.../.local/lib/python3.6/site-packages/dcore/sumkdft_workers/mpi_main.py", line 20, in
runner.run()
File "/home/.../.local/lib/python3.6/site-packages/dcore/sumkdft_workers/gloc_worker.py", line 38, in run
Gloc = sk.extract_G_loc(with_dc=with_dc)
File "/home/.../.local/lib/python3.6/site-packages/dcore/sumkdft_opt.py", line 269, in extract_G_loc
make_copies=False) for ish in range(self.n_inequiv_shells)]
File "/home/.../.local/lib/python3.6/site-packages/dcore/sumkdft_opt.py", line 269, in
make_copies=False) for ish in range(self.n_inequiv_shells)]
File "/home/.../.local/lib/python3.6/site-packages/dcore/sumkdft_opt.py", line 268, in
G_loc_inequiv = [BlockGf(name_block_generator=[(block, GfImFreq(indices=inner, mesh=G_loc[0].mesh)) for block, inner in self.gf_struct_solver[ish].items()],
File "/.../triqs/lib/python3.6/site-packages/triqs/gf/backwd_compat/gf_imfreq.py", line 73, in init
delegate(self, **kw)
File "/.../triqs/lib/python3.6/site-packages/triqs/gf/backwd_compat/gf_imfreq.py", line 71, in delegate
name = name)
File "/.../triqs/lib/python3.6/site-packages/triqs/gf/gf.py", line 192, in init
delegate(self, **kw)
File "/.../triqs/lib/python3.6/site-packages/triqs/gf/gf.py", line 135, in delegate
assert isinstance(indices, (type(None), list, GfIndices)), "Type of indices incorrect : should be None, Gfindices, list of str, or list of list of str"
AssertionError: Type of indices incorrect : should be None, Gfindices, list of str, or list of list of str

For the triqs installation, this is pulling it from the GitHub repository and compiling it myself. The compilation goes through fine and passes all tests for triqs (i.e. make test shows no failed tests).

@shinaoka
Copy link
Contributor

shinaoka commented Oct 4, 2022

Is there any disadvantage in using it like that?

As far as I tested myself, there is no disadvantage.

For the triqs installation, this is pulling it from the GitHub repository and compiling it myself. The compilation goes through fine and passes all tests for triqs (i.e. make test shows no failed tests).

This is weird... I have no idea.

@an-karpov
Copy link

an-karpov commented Oct 27, 2022

Hi. I have same problem with examples here:
https://issp-center-dev.github.io/DCore/master/impuritysolvers/triqs_cthyb/cthyb.html

but this doesn't help me

> pip install git+https://github.com/issp-center-dev/DCore.git@develop
> export DCORE_TRIQS_COMPAT=1
> dcore...

The text of error:
Traceback (most recent call last):
File "/usr/local/lib/python3.8/dist-packages/dcore/sumkdft_workers/mpi_main.py", line 20, in
runner.run()
File "/usr/local/lib/python3.8/dist-packages/dcore/sumkdft_workers/gloc_worker.py", line 38, in run
Gloc = sk.extract_G_loc(with_dc=with_dc)
File "/usr/local/lib/python3.8/dist-packages/dcore/sumkdft_opt.py", line 268, in extract_G_loc
G_loc_inequiv = [BlockGf(name_block_generator=[(block, GfImFreq(indices=inner, mesh=G_loc[0].mesh)) for block, inner in self.gf_struct_solver[ish].items()],
File "/usr/local/lib/python3.8/dist-packages/dcore/sumkdft_opt.py", line 268, in
G_loc_inequiv = [BlockGf(name_block_generator=[(block, GfImFreq(indices=inner, mesh=G_loc[0].mesh)) for block, inner in self.gf_struct_solver[ish].items()],
File "/usr/local/lib/python3.8/dist-packages/dcore/sumkdft_opt.py", line 268, in
G_loc_inequiv = [BlockGf(name_block_generator=[(block, GfImFreq(indices=inner, mesh=G_loc[0].mesh)) for block, inner in self.gf_struct_solver[ish].items()],
File "/usr/lib/python3.8/dist-packages/triqs/gf/backwd_compat/gf_imfreq.py", line 73, in init
delegate(self, **kw)
File "/usr/lib/python3.8/dist-packages/triqs/gf/backwd_compat/gf_imfreq.py", line 66, in delegate
super(GfImFreq, self).init(
File "/usr/lib/python3.8/dist-packages/triqs/gf/gf.py", line 192, in init
delegate(self, **kw)
File "/usr/lib/python3.8/dist-packages/triqs/gf/gf.py", line 135, in delegate
assert isinstance(indices, (type(None), list, GfIndices)), "Type of indices incorrect : should be None, Gfindices, list of str, or list of list of str"
AssertionError: Type of indices incorrect : should be None, Gfindices, list of str, or list of list of str

My versions of triqs and dcore

>>> show_version()

You are using the TRIQS library version 3.1.1

>>> show_git_hash()

You are using the git hash a8924f99d1805de71e3d9b778f1b8aebba03f2d7

>>> import dcore
>>> dcore.__version__
'v3.4.0+1.gb1ea136'```

@shinaoka
Copy link
Contributor

shinaoka commented Oct 27, 2022

@an-karpov
Hi, how did you build the triqs/cthyb solver?
The solver must be built with -DLocal_hamiltonian_is_complex=ON -DHybridisation_is_complex=ON.
https://issp-center-dev.github.io/DCore/master/impuritysolvers/triqs_cthyb/cthyb.html

@an-karpov
Copy link

@an-karpov Hi, how did you build the triqs/cthyb solver? The solver must be built with -DLocal_hamiltonian_is_complex=ON -DHybridisation_is_complex=ON. https://issp-center-dev.github.io/DCore/master/impuritysolvers/triqs_cthyb/cthyb.html

I used the following command:
sudo apt-get install -y triqs_cthyb

does it have to be installed via the sources?

@shinaoka
Copy link
Contributor

shinaoka commented Oct 27, 2022

does it have to be installed via the sources?

I think so. I am using my own ALPS/CT-HYB regularly.

@Hanxiang-XU
Copy link

Dear developers,

I try to run a DCore calculation on ISSP supercomputer System B (ohtaka) by MPI job using following impurity solver 'ALPS/cthyb'

[impurity_solver]
name = ALPS/cthyb
exec_path{str} = /home/issp/materiapps/intel/alps_cthyb/alps_cthyb-51cf6b6715ced29cb1f9c1266556691490a035b7-gcc/bin/hybmat
timelimit{int} = 300

For the example of 2D Hubbard model, it works well. But when I apply it on a real material, the calculation is stopped by an error 'Thrown exception Acceptance_rate_global_shift was measured on only some of the MPI processes.' in 2nd iteration, which can be checked in dcore.log or output file in work/ directory.
And in dcore.err, there is an error

File "/home/issp/materiapps/intel/triqs/triqs-3.0.0-0/lib/python3.8/site-packages/h5/archive.py", line 211, in getitem1
raise KeyError("Key %s does not exist."%key)
KeyError: 'Key Sign does not exist.'

Operators of ISSP said they found no problems with System B and suggested me to report the error in here.
Is it possible to solve this problem?
If you need more information or error messages, please let me know.

@shinaoka
Copy link
Contributor

shinaoka commented Dec 2, 2022

The error message indicates that the total simulation time is too short or one Monte Carlo step took loo long to accumulate Monte Carlo results. Do you have more log messages from the solver?

@Hanxiang-XU
Copy link

Yes, there is, but I could not find the difference of log messages comparing with successful calculation.

In work/imp_shell0_ite2/output :

[LOG_CAT_MCAST] VMC Failed to join multicast, is_root 1. Unexpected event was received: event=13, str=RDMA_CM_EVENT_MULTICAST_ERROR, status=-22
[LOG_CAT_MCAST] MCAST rank=7: Error in initializing vmc communicator
[LOG_CAT_P2P] Failed to create MCAST comm
[LOG_CAT_MCAST] MCAST rank=2: Error in initializing vmc communicator
[LOG_CAT_P2P] Failed to create MCAST comm
[LOG_CAT_MCAST] MCAST rank=4: Error in initializing vmc communicator
[LOG_CAT_P2P] Failed to create MCAST comm
[LOG_CAT_MCAST] MCAST rank=5: Error in initializing vmc communicator
[LOG_CAT_P2P] Failed to create MCAST comm
[LOG_CAT_MCAST] MCAST rank=3: Error in initializing vmc communicator
[LOG_CAT_P2P] Failed to create MCAST comm
[LOG_CAT_MCAST] MCAST rank=6: Error in initializing vmc communicator
[LOG_CAT_P2P] Failed to create MCAST comm
[LOG_CAT_MCAST] MCAST rank=1: Error in initializing vmc communicator
[LOG_CAT_P2P] Failed to create MCAST comm
[LOG_CAT_MCAST] MCAST rank=0: Error in initializing vmc communicator
[LOG_CAT_P2P] Failed to create MCAST comm
Reading ./Uijkl.txt...
Number of non-zero elements in U tensor is 140
Opening ./basis.txt...
dim of Hilbert space 1024

of blocks 36

of sectors 36

Sector 0 : dim = 1, min energy = -2.0418, max energy = -2.0418
...
Sector 35 : dim = 1, min energy = 0, max energy = 0
Max eigen energy = 0
Min eigen energy = -26.0209
Throwing away high energy states...
Sector 0 : dim = 1, min energy = -2.0418, max energy = -2.0418
...
Sector 35 : dim = 1, min energy = 0, max energy = 0
Max eigen energy = 0
Min eigen energy = -26.0209
Reference energy -26.0209
Dim of ket: sector 0 inner 1 outer 1
...
Dim of ket: sector 35 inner 1 outer 1
The following swap updates will be performed.
Update #0 generated from template #0
flavor 0 to flavor 1
flavor 1 to flavor 0
flavor 2 to flavor 3
flavor 3 to flavor 2
flavor 4 to flavor 5
flavor 5 to flavor 4
flavor 6 to flavor 7
flavor 7 to flavor 6
flavor 8 to flavor 9
flavor 9 to flavor 8
Checking if the simulation is finished: 2 sec passed.
Checking if the simulation is finished: 11 sec passed.
Checking if the simulation is finished: 23 sec passed.
Checking if the simulation is finished: 43 sec passed.
Thermalization process done after 40 steps.
The number of segments for sliding window update is 1.
Perturbation orders (averaged over processes) are the following:
flavor 0 0
...
flavor 9 0
Checking if the simulation is finished: 68 sec passed.
...
Checking if the simulation is finished: 301 sec passed.
Thrown exception Acceptance_rate_global_shift was measured on only some of the MPI processes.

Some logs are omitted by '...'

In dcore.err, there are all logs:

Traceback (most recent call last):
File "/home/k0395/k039504/.local/bin/dcore", line 33, in
sys.exit(load_entry_point('dcore', 'console_scripts', 'dcore')())
File "/home/k0395/k039504/dcore.src/src/dcore/dcore.py", line 111, in run
dcore(args.path_input_files, int(args.np))
File "/home/k0395/k039504/dcore.src/src/dcore/dcore.py", line 58, in dcore
solver.do_steps(max_step=params["control"]["max_step"])
File "/home/k0395/k039504/dcore.src/src/dcore/dmft_core.py", line 842, in do_steps
new_Sigma_iw, new_Gimp_iw = self.solve_impurity_models(Gloc_iw_sh, iteration_number)
File "/home/k0395/k039504/dcore.src/src/dcore/dmft_core.py", line 666, in solve_impurity_models
r = solve_impurity_model(solver_name, self._solver_params, self._mpirun_command,
File "/home/k0395/k039504/dcore.src/src/dcore/dmft_core.py", line 215, in solve_impurity_model
sol.solve(rot, mpirun_command, s_params)
File "/home/k0395/k039504/dcore.src/src/dcore/impurity_solvers/alps_cthyb.py", line 126, in solve
self._solve_impl(rot, mpirun_command, None, params_kw)
File "/home/k0395/k039504/dcore.src/src/dcore/impurity_solvers/alps_cthyb.py", line 291, in _solve_impl
sign = f['Sign']
File "/home/issp/materiapps/intel/triqs/triqs-3.0.0-0/lib/python3.8/site-packages/h5/archive.py", line 205, in getitem
return self.getitem1(key,self._reconstruct_python_objects)
File "/home/issp/materiapps/intel/triqs/triqs-3.0.0-0/lib/python3.8/site-packages/h5/archive.py", line 211, in getitem1
raise KeyError("Key %s does not exist."%key)
KeyError: 'Key Sign does not exist.'

@shinaoka
Copy link
Contributor

shinaoka commented Dec 7, 2022

Thank you. I need to check the omitted message. Could you upload the full log message from the QMC solver as a zip file?

@Hanxiang-XU
Copy link

Thank you very much!
I just notice that I can upload files in the comments.
These are files that were used to report error to ISSP including input files and output log files.
I hope their are useful.

error_DCore_hybmat.zip

@shinaoka
Copy link
Contributor

shinaoka commented Dec 9, 2022

The simulation time seems to be too short. Could you increase the total simulation time?

Perturbation orders just before and after measurement steps are 26.4284 and 45.2751.

These two numbers should match roughly.

@Hanxiang-XU
Copy link

Thank you very much!
It worked for a low precision, cheap calculation. I am trying another a little heavy calculation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants