Add `--also-remote` option to `verdi process dump` #6578
base: main
@@ -576,6 +576,13 @@
     show_default=True,
     help='Include extras in the `.aiida_node_metadata.yaml` written for every `ProcessNode`.',
 )
+@click.option(
+    '--also-remote',
+    is_flag=True,
+    default=False,
+    show_default=True,
+    help='If set, also try to obtain intermediate files on the remote computer that were not initially retrieved.',
+)
 @click.option(
     '-f',
     '--flat',
@@ -591,19 +598,21 @@
     include_outputs,
     include_attributes,
     include_extras,
+    also_remote,
     flat,
 ) -> None:
     """Dump process input and output files to disk.

     Child calculations/workflows (also called `CalcJob`s/`CalcFunction`s and `WorkChain`s/`WorkFunction`s in AiiDA
     jargon) run by the parent workflow are contained in the directory tree as sub-folders and are sorted by their
     creation time. The directory tree thus mirrors the logical execution of the workflow, which can also be queried by
     running `verdi process status <pk>` on the command line.

     By default, input and output files of each calculation can be found in the corresponding "inputs" and
     "outputs" directories (the former also contains the hidden ".aiida" folder with machine-readable job execution
     settings). Additional input and output files (depending on the type of calculation) are placed in the "node_inputs"
-    and "node_outputs", respectively.
+    and "node_outputs", respectively. Using the `--also-remote` flag, additional files of the `remote_workdir` on the
+    Computer where the CalcJobs were run can be retrieved (if they still exist on the remote).

     Lastly, every folder also contains a hidden, human-readable `.aiida_node_metadata.yaml` file with the relevant AiiDA
     node data for further inspection.
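For example, running `verdi process dump 1234 --also-remote` (with 1234 a hypothetical PK of a finished process) would produce the usual directory tree and, in addition, try to fetch the remaining files from each calculation's remote working directory; according to the dumper changes further below, these are placed in a "remote_files" subfolder of the corresponding calculation directory.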
@@ -618,8 +627,17 @@
         include_extras=include_extras,
         overwrite=overwrite,
         flat=flat,
+        also_remote=also_remote,
     )

+    if also_remote:
+        echo.echo_report(
+            '`--also-remote` set to True. Will try to retrieve additional files from the `workdir` of the remote Computer.'  # noqa: E501
Review comment: Not really the workdir of the Computer (which is the base directory of the scratch) but rather of the calcjob itself.
+        )
+        echo.echo_report(
+            'If files are non-existent, they will be skipped silently. Check if the output files are what you expect.'
Review comment: Is this a good idea? It will be difficult to tell whether files weren't copied because they no longer exist or because there was a problem.
+        )
+
     try:
         dump_path = process_dumper.dump(process_node=process, output_path=path)
     except FileExistsError:
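The same behaviour is also reachable programmatically through the `ProcessDumper` class changed in the second file of this diff. A minimal sketch, assuming the import path `aiida.tools.dumping.processes` (the file names are not shown in this diff) and a hypothetical PK 1234 of a terminated process:

from pathlib import Path

from aiida import load_profile, orm
from aiida.tools.dumping.processes import ProcessDumper  # assumed module path, not shown in the diff

load_profile()

process_node = orm.load_node(1234)  # hypothetical PK of a terminated workflow or calculation
process_dumper = ProcessDumper(also_remote=True)  # the new flag introduced by this PR
dump_path = process_dumper.dump(process_node=process_node, output_path=Path('dump-1234'))
print(f'Dumped to {dump_path}; remote files land in "remote_files" subfolders of each CalcJob directory.')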
@@ -10,6 +10,7 @@
 from __future__ import annotations

+import itertools as it
 import logging
 from pathlib import Path
 from types import SimpleNamespace
@@ -18,7 +19,7 @@
 import yaml

 from aiida.common import LinkType
-from aiida.common.exceptions import NotExistentAttributeError
+from aiida.common.exceptions import ConfigurationError, NotExistentAttributeError
 from aiida.orm import (
     CalcFunctionNode,
     CalcJobNode,
@@ -30,6 +31,7 @@
     WorkFunctionNode,
 )
 from aiida.orm.utils import LinkTriple
+from aiida.orm.utils.remote import get_calcjob_remote_paths

 LOGGER = logging.getLogger(__name__)
@@ -42,13 +44,15 @@
         include_attributes: bool = True,
         include_extras: bool = True,
         overwrite: bool = False,
+        also_remote: bool = False,
         flat: bool = False,
     ) -> None:
         self.include_inputs = include_inputs
         self.include_outputs = include_outputs
         self.include_attributes = include_attributes
         self.include_extras = include_extras
         self.overwrite = overwrite
+        self.also_remote = also_remote
         self.flat = flat

     @staticmethod
@@ -285,6 +289,53 @@
             link_triples=output_links,
         )

+        # Additional remote file retrieval should only apply for CalcJobNodes, not CalcFunctionNodes
+        if self.also_remote and isinstance(calculation_node, CalcJobNode):
+            self._dump_calculation_remote_files(calcjob_node=calculation_node, output_path=output_path)
+
+    def _dump_calculation_remote_files(self, calcjob_node: CalcJobNode, output_path: Path) -> None:
+        """Dump the additional remote files attached to a `CalcJobNode` to a specified output path.
+
+        :param calcjob_node: The `CalcJobNode` to be dumped.
+        :param output_path: The path where the files will be dumped.
+        """
+
+        remote_workdir = calcjob_node.get_remote_workdir()
+        if remote_workdir is None:
+            raise NotExistentAttributeError(f"CalcJobNode <{calcjob_node.pk}> doesn't have a `remote_workdir`.")
Review comment: This exception is not caught; do you really want to let it fail the entire dump? Perhaps just log a warning and move on.
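A minimal sketch of the alternative the reviewer suggests, reusing the module-level LOGGER from the imports hunk above (illustrative only, not part of the PR): inside `_dump_calculation_remote_files`, warn and return instead of raising, so a single node without a `remote_workdir` does not abort the whole dump.

        remote_workdir = calcjob_node.get_remote_workdir()
        if remote_workdir is None:
            # Hypothetical alternative to raising: log a warning and skip remote retrieval for this node
            LOGGER.warning('CalcJobNode <%s> has no `remote_workdir`; skipping remote file retrieval.', calcjob_node.pk)
            return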
+        # Exclude the objects that were already dumped, as they were either originally retrieved via `retrieve_list`
+        # or are already part of the file repository of the CalcJobNode, e.g. the `aiida.in` and `_aiidasubmit.sh`
+        retrieve_list = list(calcjob_node.get_retrieve_list())  # type: ignore[arg-type]
+        repository_list = calcjob_node.base.repository.list_object_names()
+        exclude_list = retrieve_list + repository_list
Review comment on lines +309 to +311: Few things here: …
+        # Obtain a flattened list of the `RemoteData` objects.
+        # The computer UUIDs that are the keys of the returned dictionary of `get_calcjob_remote_paths` aren't
+        # needed, as we only run the function for a single CalcJobNode using its associated transport to get the files
+        calcjob_remote_paths = get_calcjob_remote_paths(pks=[calcjob_node.pk])  # type: ignore[list-item]
+        calcjob_remote_datas = list(it.chain.from_iterable(calcjob_remote_paths.values()))  # type: ignore[union-attr]
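For clarity, the flattening pattern used above, shown on a plain dictionary with made-up values:

import itertools as it

paths_by_computer = {'computer-uuid-a': ['remote_1', 'remote_2'], 'computer-uuid-b': ['remote_3']}  # hypothetical
flattened = list(it.chain.from_iterable(paths_by_computer.values()))
print(flattened)  # ['remote_1', 'remote_2', 'remote_3']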
+        # Unlike for the `retrieve_files_from_list` in `execmanager.py`, we only dump the files to disk, rather than
+        # also storing them in the repository via `FolderData`
+        try:
+            with calcjob_node.get_transport() as transport:
+                for calcjob_remote_data in calcjob_remote_datas:
+                    # Obtain all objects from each of the RemoteDatas associated with the CalcJobNode
+                    # (could this even be more than one?)
+                    retrieve_objects = [_ for _ in calcjob_remote_data.listdir() if _ not in exclude_list]
+                    remote_paths = [(Path(remote_workdir) / _).resolve() for _ in retrieve_objects]
+                    local_paths = [(output_path / 'remote_files' / _).resolve() for _ in retrieve_objects]
+
+                    # Transport.get() works for files and folders, so we don't need to make a distinction
+                    for rem, loc in zip(remote_paths, local_paths):
+                        transport.get(str(rem), str(loc), ignore_nonexisting=True)
Review comment: I think … A few other points, also mentioned by others:
Review reply: Yeah, good point! In principle, dumping has an overwrite argument as well. Though, I'm getting a bit tired of populating this setting down the hierarchy to the different sub-WorkChains and CalcJobs. So I've been thinking of either making it global, for just the parent directory of the dumping, or possibly making it more fine-grained. The remote files are a case where the latter could make sense. Currently working on the possibility to provide an explicit list of files or glob patterns to include/exclude! 🚀
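A minimal sketch of the glob-based include/exclude filtering mentioned in the reply above (hypothetical; neither the helper name nor the parameters are part of this PR), using only the standard library:

import fnmatch

def filter_remote_objects(names, include_patterns=None, exclude_patterns=None):
    """Keep names matching any include pattern (if given) and no exclude pattern (if given)."""
    selected = []
    for name in names:
        if include_patterns and not any(fnmatch.fnmatch(name, pattern) for pattern in include_patterns):
            continue
        if exclude_patterns and any(fnmatch.fnmatch(name, pattern) for pattern in exclude_patterns):
            continue
        selected.append(name)
    return selected

print(filter_remote_objects(['aiida.out', 'CHGCAR', 'WAVECAR'], exclude_patterns=['WAVECAR', '*.tmp']))
# -> ['aiida.out', 'CHGCAR']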
+
+        # If getting the transport fails, propagate the exception outwards
+        # (could just remove the except, but being explicit might make it clearer here)
+        except ConfigurationError:

Review comment: wait, a …

+            raise
+
     def _dump_calculation_io(self, parent_path: Path, link_triples: LinkManager | List[LinkTriple]):
         """Small helper function to dump linked input/output nodes of a `CalculationNode`.
Review comment: Perhaps give a bit more detail here, e.g. mention that this requires opening a connection to the remote computer and that it possibly cannot retrieve the files if the remote folder has been altered or deleted entirely in the meantime.
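One possible wording for the extra detail requested above, e.g. as an expanded docstring of the new method (a sketch only, not part of the PR):

    def _dump_calculation_remote_files(self, calcjob_node: CalcJobNode, output_path: Path) -> None:
        """Dump the additional remote files attached to a `CalcJobNode` to a specified output path.

        Note that this opens a connection to the remote computer via the node's transport. Files may no
        longer be retrievable if the remote working directory has been altered or deleted in the meantime;
        such files are skipped.

        :param calcjob_node: The `CalcJobNode` to be dumped.
        :param output_path: The path where the files will be dumped.
        """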