Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add --also-remote option to verdi process dump #6578

Open
wants to merge 1 commit into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
22 changes: 20 additions & 2 deletions src/aiida/cmdline/commands/cmd_process.py
Original file line number Diff line number Diff line change
Expand Up @@ -576,6 +576,13 @@
show_default=True,
help='Include extras in the `.aiida_node_metadata.yaml` written for every `ProcessNode`.',
)
@click.option(
'--also-remote',
is_flag=True,
default=False,
show_default=True,
help='If true, try to obtain also intermediate files on the remote computer that were not initially retrieved.',
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps give a bit more detail here, e.g., mention that this requires opening a connection to the remote computer and that it possible cannot retrieve the files if the remote folder has been altered or deleted entirely in the meantime

)
@click.option(
'-f',
'--flat',
Expand All @@ -591,19 +598,21 @@
include_outputs,
include_attributes,
include_extras,
also_remote,
flat,
) -> None:
"""Dump process input and output files to disk.

Child calculations/workflows (also called `CalcJob`s/`CalcFunction`s and `WorkChain`s/`WorkFunction`s in AiiDA
jargon) run by the parent workflow are contained in the directory tree as sub-folders and are sorted by their
creation time. The directory tree thus mirrors the logical execution of the workflow, which can also be queried by
creation time. The directory tree thus mirrors the logical execution of the workflow, which can also be queried by
running `verdi process status <pk>` on the command line.

By default, input and output files of each calculation can be found in the corresponding "inputs" and
"outputs" directories (the former also contains the hidden ".aiida" folder with machine-readable job execution
settings). Additional input and output files (depending on the type of calculation) are placed in the "node_inputs"
and "node_outputs", respectively.
and "node_outputs", respectively. Using the `--also-remote` flag, additional files of the `remote_workdir` on the
Computer where the CalcJobs were run can be retrieved (if they still exist on the remote).

Lastly, every folder also contains a hidden, human-readable `.aiida_node_metadata.yaml` file with the relevant AiiDA
node data for further inspection.
Expand All @@ -618,8 +627,17 @@
include_extras=include_extras,
overwrite=overwrite,
flat=flat,
also_remote=also_remote,
)

if also_remote:
echo.echo_report(

Check warning on line 634 in src/aiida/cmdline/commands/cmd_process.py

View check run for this annotation

Codecov / codecov/patch

src/aiida/cmdline/commands/cmd_process.py#L634

Added line #L634 was not covered by tests
'`--also-remote` set to True. Will try to retrieve additional files from the `workdir` of the remote Computer.' # noqa: E501
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not really the workdir of the Computer (which is the base directory of the scratch) but rather of the calcjob itself

)
echo.echo_report(

Check warning on line 637 in src/aiida/cmdline/commands/cmd_process.py

View check run for this annotation

Codecov / codecov/patch

src/aiida/cmdline/commands/cmd_process.py#L637

Added line #L637 was not covered by tests
'If files are non-existent, they will be skipped silently. Check if the output files are what you expect.'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this a good idea? will be difficult to tell if they werent copied because they no longer exist or because there was a problem

)

try:
dump_path = process_dumper.dump(process_node=process, output_path=path)
except FileExistsError:
Expand Down
53 changes: 52 additions & 1 deletion src/aiida/tools/dumping/processes.py
Original file line number Diff line number Diff line change
Expand Up @@ -10,6 +10,7 @@

from __future__ import annotations

import itertools as it
import logging
from pathlib import Path
from types import SimpleNamespace
Expand All @@ -18,7 +19,7 @@
import yaml

from aiida.common import LinkType
from aiida.common.exceptions import NotExistentAttributeError
from aiida.common.exceptions import ConfigurationError, NotExistentAttributeError
from aiida.orm import (
CalcFunctionNode,
CalcJobNode,
Expand All @@ -30,6 +31,7 @@
WorkFunctionNode,
)
from aiida.orm.utils import LinkTriple
from aiida.orm.utils.remote import get_calcjob_remote_paths

LOGGER = logging.getLogger(__name__)

Expand All @@ -42,13 +44,15 @@
include_attributes: bool = True,
include_extras: bool = True,
overwrite: bool = False,
also_remote: bool = False,
flat: bool = False,
) -> None:
self.include_inputs = include_inputs
self.include_outputs = include_outputs
self.include_attributes = include_attributes
self.include_extras = include_extras
self.overwrite = overwrite
self.also_remote = also_remote
self.flat = flat

@staticmethod
Expand Down Expand Up @@ -285,6 +289,53 @@
link_triples=output_links,
)

# Additional remote file retrieval should only apply for CalcJobNodes, not CalcFunctionNodes
if self.also_remote and isinstance(calculation_node, CalcJobNode):
self._dump_calculation_remote_files(calcjob_node=calculation_node, output_path=output_path)

Check warning on line 294 in src/aiida/tools/dumping/processes.py

View check run for this annotation

Codecov / codecov/patch

src/aiida/tools/dumping/processes.py#L294

Added line #L294 was not covered by tests

def _dump_calculation_remote_files(self, calcjob_node: CalcJobNode, output_path: Path) -> None:
"""Dump the additional remote files attached to a `CalcJobNode` to a specified output path.

:param calcjob_node: The `CalcJobNode` to be dumped.
:param output_path: The path where the files will be dumped.
"""

remote_workdir = calcjob_node.get_remote_workdir()
if remote_workdir is None:
raise NotExistentAttributeError(f"CalcJobNode <{calcjob_node.pk}> doesn't have a `remote_workdir`.")

Check warning on line 305 in src/aiida/tools/dumping/processes.py

View check run for this annotation

Codecov / codecov/patch

src/aiida/tools/dumping/processes.py#L303-L305

Added lines #L303 - L305 were not covered by tests
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not caught, do you really want to let this let the entire dump fail? Perhaps just log a warning and move on


# Exclude the objects that were already dumped, as they were either originally retrieved via `retrieve_list`
# or are already part of the file repository of the CalcJobNode, e.g. the `aiida.in` and `_aiidasubmit.sh`
retrieve_list = list(calcjob_node.get_retrieve_list()) # type: ignore[arg-type]
repository_list = calcjob_node.base.repository.list_object_names()
exclude_list = retrieve_list + repository_list

Check warning on line 311 in src/aiida/tools/dumping/processes.py

View check run for this annotation

Codecov / codecov/patch

src/aiida/tools/dumping/processes.py#L309-L311

Added lines #L309 - L311 were not covered by tests
Comment on lines +309 to +311
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Few things here:

  1. Shouldn't everything that it is in the retrieve list also be in the repository list? Maybe there may be more in the retrieve list, but that would then correspond to a file that doesn't exist, so it will presumable still not be there on the remote, so it can be ignored.
  2. a bigger problem though I think is that the retrieve_list can contain wildcards and I don't think you currently handle them in the code below right? If you just rely on the contents of the repository, this problem disappears.


# Obtain a flattened list of the `RemoteData` objects.
# The computer UUIDs that are the keys of the returned dictionary of `get_calcjob_remote_paths` aren't
# needed, as we only run the function for a single CalcJobNode using its associated transport to get the files
calcjob_remote_paths = get_calcjob_remote_paths(pks=[calcjob_node.pk]) # type: ignore[list-item]
calcjob_remote_datas = list(it.chain.from_iterable(calcjob_remote_paths.values())) # type: ignore[union-attr]

Check warning on line 317 in src/aiida/tools/dumping/processes.py

View check run for this annotation

Codecov / codecov/patch

src/aiida/tools/dumping/processes.py#L316-L317

Added lines #L316 - L317 were not covered by tests

# Unlike for the `retrieve_files_from_list` in `execmanager.py`, we only dump the files to disk, rather than
# also storing them in the repository via `FolderData`
try:
with calcjob_node.get_transport() as transport:
for calcjob_remote_data in calcjob_remote_datas:

Check warning on line 323 in src/aiida/tools/dumping/processes.py

View check run for this annotation

Codecov / codecov/patch

src/aiida/tools/dumping/processes.py#L321-L323

Added lines #L321 - L323 were not covered by tests
# Obtain all objects from each of the RemoteDatas associated with the CalcJobNode
# (could this even be more than one?)
retrieve_objects = [_ for _ in calcjob_remote_data.listdir() if _ not in exclude_list]
remote_paths = [(Path(remote_workdir) / _).resolve() for _ in retrieve_objects]
local_paths = [(output_path / 'remote_files' / _).resolve() for _ in retrieve_objects]

Check warning on line 328 in src/aiida/tools/dumping/processes.py

View check run for this annotation

Codecov / codecov/patch

src/aiida/tools/dumping/processes.py#L326-L328

Added lines #L326 - L328 were not covered by tests

# Transport.get() works for files and folders, so we don't need to make a distinction
for rem, loc in zip(remote_paths, local_paths):
transport.get(str(rem), str(loc), ignore_nonexisting=True)

Check warning on line 332 in src/aiida/tools/dumping/processes.py

View check run for this annotation

Codecov / codecov/patch

src/aiida/tools/dumping/processes.py#L331-L332

Added lines #L331 - L332 were not covered by tests
Copy link
Contributor

@khsrali khsrali Oct 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think transport.get() has overwrite=True for default.
Maybe in this case it makes sense to issue a warning (that I'm overwriting your local files)

A one other points, also mentioned by others:

  • maybe one can pass glob patterns (e.g. all *.log files)

Copy link
Contributor Author

@GeigerJ2 GeigerJ2 Oct 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, good point! In principle, dumping has an overwrite argument, as well. Though, I'm getting a bit tired of populating this setting down the hierarchy to the different sub-WorkChains and CalcJobs. So I've been thinking of either making it global, for just the parent directory of the dumping, or possibly making it more fine-grained. The remote files are a case where the latter case could make sense.

Currently working on the possibility to provide explicit list of files or glob patterns to include/exclude! 🚀


# Getting the transport fails, propagate exception outwards
# (could just remove the except, but being explicit might make it clearer here)
except ConfigurationError:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wait, a ConfigurationError? Where is that being thrown? Looks like a different problem.

raise

Check warning on line 337 in src/aiida/tools/dumping/processes.py

View check run for this annotation

Codecov / codecov/patch

src/aiida/tools/dumping/processes.py#L336-L337

Added lines #L336 - L337 were not covered by tests

def _dump_calculation_io(self, parent_path: Path, link_triples: LinkManager | List[LinkTriple]):
"""Small helper function to dump linked input/output nodes of a `CalculationNode`.

Expand Down
Loading