dump computer/codes in YAML format #3521
Comments
Very good point - it does not exist and would be very useful (and easy to implement).
This would actually make it possible to provide a library of ready-to-use configurations for known HPC systems and codes. That could be interesting, as this part is actually quite tedious for end users.
This is already possible: simply set up the computers and codes and create an AiiDA export file.
Fixes aiidateam#3521. Adds a `verdi code export` command to export a code from the command line as a YAML file. This is mentioned as a usability improvement, as is having a command to export the code and computer setup. The keys of the YAML file are read from the CLI options of the corresponding code class.
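To illustrate the idea, here is a minimal sketch of what such an export could produce. The key names and values below are illustrative assumptions (loosely modelled on the options of code setup), not the actual aiida-core API; since the setup data is a flat mapping of scalars, the YAML can even be emitted without a YAML library:

```python
# Hypothetical example of the data a `verdi code export` command could emit.
# All keys and values here are illustrative, not actual aiida-core output.
code_config = {
    "label": "pw",
    "description": "Quantum ESPRESSO pw.x",
    "computer": "localhost",
    "filepath_executable": "/usr/bin/pw.x",
}

# For a flat mapping of plain scalars, YAML is just "key: value" lines.
yaml_text = "".join(f"{key}: {value}\n" for key, value in code_config.items())
print(yaml_text)
```

A file written this way could then be fed back into the interactive code setup to recreate the code in another profile.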
I committed the changes from my local fork of the PR directly here now, as I think that will make working on it more straightforward. Hope that's OK for everybody.

Based on @qiaojunfeng's original implementation, there's a recursive function that traverses the `WorkChain` hierarchy. Now, as actual file I/O is only done for the `CalcJobNode`s involved in the workchain, I factored that out into its own function `_calcjob_dump`. I also moved these implementations into `tools/dumping/processes` so that they are available via the normal Python API. In `cmd_workchain` and `cmd_calcjob`, the commands `verdi workchain dump` and `verdi calcjob dump` are only thin wrappers around the respective functions.

By default, `_calcjob_dump` dumps the `node.base.repository` and `node.outputs.retrieved` using `copy_tree`, as well as the `input_nodes` of the `CalcJobNode` if they are of type `SinglefileData` or `FolderData`. I started working on the `--use-prepare-for-submission` option, but as previously indicated by @sphuber (also see [here](https://aiida.discourse.group/t/obtain-a-calcjob-instance-from-a-corresponding-calcjobnode-or-builder/300)), it's not straightforward to make it work as intended, so I put that on hold for now and added a warning.

For each `WorkChainNode` and `CalcJobNode`, a selected set of the node's `@property`s is dumped to YAML files. `extras` and `attributes` can also be included via the CLI. I initially had a function for this, but realized that I was just passing these arguments all the way through, so I encapsulated that in the `ProcessNodeYamlDumper` class. Now, an instance is created when the respective CLI command is run, and the arguments are set as instance variables, with the instance being passed through rather than the original arguments. The `dump_yaml` method is then called at the relevant positions with the `process_node` and `output_path` arguments.
Regarding relevant positions: to get the node YAML file for every `ProcessNode` involved, it's called in `cmd_workchain` and `cmd_calcjob` for the parent node, and subsequently for all outgoing links. Maybe there's a better way to handle that? The other commands initially mentioned by @qiaojunfeng also seem very interesting and could probably easily be implemented based on his original implementation, though we should agree on the overall API/relevant namespace first.

A few more notes:

- The CLI options for `verdi workchain dump` and `verdi calcjob dump` are basically the same and the code is duplicated. We could avoid that by merging them into one command, e.g. under `verdi node dump`; however, as a user, I would find `verdi workchain dump` and `verdi calcjob dump` more intuitive. Also, `verdi node repo dump` uses a former implementation of `copy_tree`, so I'd go ahead and update that, but in a separate PR (related: do we also want to implement, or change it to, just `verdi node dump`?).
- Regarding the YAML dumping, the `ProcessNodeYamlDumper` class is quite specific and just a little helper to get the job done here. Should we generalize such functionality, e.g. to resolve [issue aiidateam#3521](aiidateam#3521) by allowing dumping of computers/codes to YAML, or keep these things separate?
- Currently, the pseudo directories are called `pseudos__<X>`, and I thought about splitting on the double underscore to just have one `pseudos` directory with a subdirectory for each element. Personally, I'd find that nicer than having a bunch of `pseudos__<X>` directories, but I'm not sure if the double underscore is based on general AiiDA name mangling, or again specific to `aiida-quantumespresso`.
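For reference, the recursive traversal described above can be sketched in a self-contained way. The stand-in class and helper below are illustrative only (the real implementation walks the outgoing links of the actual AiiDA ORM nodes and calls `_calcjob_dump` and `ProcessNodeYamlDumper.dump_yaml`); directory names follow the `<index>-<label>` convention visible in the trees below:

```python
# Self-contained sketch of the recursive process-dumping logic, with a stub
# class standing in for the aiida-core ORM. Illustrative only.
from dataclasses import dataclass, field
from pathlib import Path
from tempfile import mkdtemp


@dataclass
class StubProcessNode:
    """Stand-in for a WorkChainNode/CalcJobNode; `called` mimics outgoing links."""
    label: str
    is_calcjob: bool = False
    called: list = field(default_factory=list)


def dump_process(node: StubProcessNode, output_path: Path) -> None:
    output_path.mkdir(parents=True, exist_ok=True)
    # Stand-in for ProcessNodeYamlDumper.dump_yaml(process_node, output_path).
    (output_path / "aiida_node_metadata.yaml").write_text(f"label: {node.label}\n")
    if node.is_calcjob:
        # The real _calcjob_dump would copy raw_inputs/raw_outputs/node_inputs here.
        return
    for index, child in enumerate(node.called, start=1):
        dump_process(child, output_path / f"{index:02d}-{child.label}")


# Tiny example hierarchy: a workchain calling a sub-workchain with one calcjob.
workchain = StubProcessNode(
    "PwBandsWorkChain",
    called=[
        StubProcessNode(
            "PwBaseWorkChain",
            called=[StubProcessNode("PwCalculation", is_calcjob=True)],
        )
    ],
)
root = Path(mkdtemp()) / "dump-example"
dump_process(workchain, root)
```

Running this produces a nested `dump-example/01-PwBaseWorkChain/01-PwCalculation/` directory tree with one metadata YAML file per process node, mirroring the structure of the real command.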
Lastly, examples of the default structure obtained from `verdi workchain dump`:

```shell
dump-462
├── 01-relax-PwRelaxWorkChain
│   ├── 01-PwBaseWorkChain
│   │   ├── 01-PwCalculation
│   │   │   ├── aiida_node_metadata.yaml
│   │   │   ├── node_inputs
│   │   │   │   └── pseudos__Si
│   │   │   │       └── Si.pbesol-n-rrkjus_psl.1.0.0.UPF
│   │   │   ├── raw_inputs
│   │   │   │   ├── .aiida
│   │   │   │   │   ├── calcinfo.json
│   │   │   │   │   └── job_tmpl.json
│   │   │   │   ├── _aiidasubmit.sh
│   │   │   │   └── aiida.in
│   │   │   └── raw_outputs
│   │   │       ├── _scheduler-stderr.txt
│   │   │       ├── _scheduler-stdout.txt
│   │   │       ├── aiida.out
│   │   │       └── data-file-schema.xml
│   │   └── aiida_node_metadata.yaml
│   ├── 02-PwBaseWorkChain
│   │   ├── 01-PwCalculation
│   │   │   ├── aiida_node_metadata.yaml
│   │   │   ├── node_inputs
│   │   │   │   └── pseudos__Si
│   │   │   │       └── Si.pbesol-n-rrkjus_psl.1.0.0.UPF
│   │   │   ├── raw_inputs
│   │   │   │   ├── .aiida
│   │   │   │   │   ├── calcinfo.json
│   │   │   │   │   └── job_tmpl.json
│   │   │   │   ├── _aiidasubmit.sh
│   │   │   │   └── aiida.in
│   │   │   └── raw_outputs
│   │   │       ├── _scheduler-stderr.txt
│   │   │       ├── _scheduler-stdout.txt
│   │   │       ├── aiida.out
│   │   │       └── data-file-schema.xml
│   │   └── aiida_node_metadata.yaml
│   └── aiida_node_metadata.yaml
├── 02-scf-PwBaseWorkChain
│   ├── 01-PwCalculation
│   │   ├── aiida_node_metadata.yaml
│   │   ├── node_inputs
│   │   │   └── pseudos__Si
│   │   │       └── Si.pbesol-n-rrkjus_psl.1.0.0.UPF
│   │   ├── raw_inputs
│   │   │   ├── .aiida
│   │   │   │   ├── calcinfo.json
│   │   │   │   └── job_tmpl.json
│   │   │   ├── _aiidasubmit.sh
│   │   │   └── aiida.in
│   │   └── raw_outputs
│   │       ├── _scheduler-stderr.txt
│   │       ├── _scheduler-stdout.txt
│   │       ├── aiida.out
│   │       └── data-file-schema.xml
│   └── aiida_node_metadata.yaml
├── 03-bands-PwBaseWorkChain
│   ├── 01-PwCalculation
│   │   ├── aiida_node_metadata.yaml
│   │   ├── node_inputs
│   │   │   └── pseudos__Si
│   │   │       └── Si.pbesol-n-rrkjus_psl.1.0.0.UPF
│   │   ├── raw_inputs
│   │   │   ├── .aiida
│   │   │   │   ├── calcinfo.json
│   │   │   │   └── job_tmpl.json
│   │   │   ├── _aiidasubmit.sh
│   │   │   └── aiida.in
│   │   └── raw_outputs
│   │       ├── _scheduler-stderr.txt
│   │       ├── _scheduler-stdout.txt
│   │       ├── aiida.out
│   │       └── data-file-schema.xml
│   └── aiida_node_metadata.yaml
└── aiida_node_metadata.yaml
```

and `verdi calcjob dump`:

```shell
dump-530
├── aiida_node_metadata.yaml
├── node_inputs
│   └── pseudos__Si
│       └── Si.pbesol-n-rrkjus_psl.1.0.0.UPF
├── raw_inputs
│   ├── .aiida
│   │   ├── calcinfo.json
│   │   └── job_tmpl.json
│   ├── _aiidasubmit.sh
│   └── aiida.in
└── raw_outputs
    ├── _scheduler-stderr.txt
    ├── _scheduler-stdout.txt
    ├── aiida.out
    └── data-file-schema.xml
```

for a `PwBandsWorkChain` and one of its involved `CalcJobNode`s.
Sometimes, when using another environment or setting up AiiDA for a new user, we have to set up the same computers/codes over and over again. For this, loading from YAML is quite practical.

But I can't find a way to generate these YAML files directly from an existing installation.

The `verdi computer show` output is quite close to YAML and can be converted rather simply, but a dedicated dump command, or a `--yaml` switch for `show`, would be useful (maybe I missed something and it's already possible; apologies in that case).
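As a stopgap, the conversion mentioned above can be done with a few lines of scripting. This is only a rough sketch: the sample text and key names below are illustrative assumptions, not the exact output of the real command, which may format its fields differently:

```python
# Rough sketch: turn `verdi computer show`-style "Key: value" lines into
# YAML-like "key_name: value" lines for a setup file. The sample output and
# field names are illustrative assumptions, not real aiida-core output.
show_output = """\
Label:           localhost
Hostname:        localhost
Transport type:  core.local
Scheduler type:  core.direct
Work directory:  /scratch/{username}/aiida/
"""

config_lines = []
for line in show_output.splitlines():
    key, _, value = line.partition(":")
    # Normalise "Transport type" -> "transport_type", matching the snake_case
    # keys typically used in computer setup YAML files.
    config_lines.append(f"{key.strip().lower().replace(' ', '_')}: {value.strip()}")

yaml_text = "\n".join(config_lines) + "\n"
print(yaml_text)
```

The resulting text could then be saved and passed to the computer setup step; a built-in dump command would of course make this unnecessary.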