deploy: 44ac327
bernstei committed Oct 30, 2023
1 parent 3da4c0b commit 8d0c704
Showing 8 changed files with 101 additions and 10 deletions.
8 changes: 8 additions & 0 deletions _sources/overview.configset.rst
Expand Up @@ -18,6 +18,14 @@ Input and output of atomic structures

``OutputSpec`` works as the output layer, used for writing results during iterations, but the actual writing is not guaranteed to happen until the operation is closed with ``OutputSpec.close()``. It is possible to map a different output file to each input file, which will result in the outputs corresponding to each input file ending up in a different output file.

.. warning::
    To efficiently restart interrupted operations, if the ``OutputSpec`` object specifies storing the output
    data in a file, autoparallelized workflow operations will use the existing file instead of redoing the calculation.
    If the workflow code (or any function that it calls, directly or indirectly) is changed, this will not
    be detected, and the previous, perhaps no longer correct, output will still be used.
    The user must manually delete the output files of operations that have been changed to force
    the calculation to be redone.

Users should consult the simple example in :doc:`first_example`, or the documentation of the two classes at
:meth:`wfl.configset.ConfigSet` and :meth:`wfl.configset.OutputSpec`

5 changes: 5 additions & 0 deletions _sources/overview.parallelisation.rst
Expand Up @@ -13,6 +13,11 @@ Much of the pipeline, including the input/output facilitated by ``ConfigSet``/``
job submitted to a local or remote queuing system. The job can then use python
subprocess parallelization itself. [remote jobs not documented here yet]

.. warning::
    Autoparallelized operations will use cached output files. Even if the code that is executed by
    the operation has changed, the previous and perhaps wrong output will be used.
    See the warning in :doc:`overview.configset`.

*****************************************************
Programming scripts that use parallelized operations
*****************************************************
35 changes: 32 additions & 3 deletions _sources/overview.queued.md
Expand Up @@ -9,6 +9,12 @@ should be executed this way. Any remote machine to be used requires that the `w
module be installed. If needed, commands needed to make this module available (e.g. setting `PYTHONPATH`)
can be set on a per-machine basis in the `config.json` file mentioned below.

```{warning}
To facilitate restarts of interrupted operations, submitted jobs are cached. If the code
executed by the job is changed, this may result in cached but incorrect output being used.
See [discussion below](sec:example:restarts).
```

In addition, `wfl.fit.gap_simple`, `wfl.fit.gap_multistage`, and `wfl.fit.ace` have been wrapped, as a single
job each. The GAP simple fit is controlled by the `WFL_GAP_SIMPLE_FIT_REMOTEINFO` env var. Setting
this variable will also lead to the multistage fit submitting each simple fit as its own job.
Expand All @@ -17,15 +23,18 @@ with the `WFL_GAP_MULTISTAGE_FIT_REMOTEINFO` env var. In principle, doing each
as its own job could enable running committee fits in parallel, but that is not supported right now.
The env var `WFL_ACE_FIT_REMOTEINFO` is used for ACE fits.

[NOTE: now that the multistage fit does very little other than the repeated simple fitting, does
it need its own level of remote job execution]
```{note}
Now that the multistage fit does very little other than the repeated simple fitting, does
it need its own level of remote job execution?
```

The `*REMOTEINFO` and `WFL_EXPYRE_INFO` environment variables allow flexible control over which parts of
a (likely long and multi-file) fitting script are executed remotely, and with what resources, without any need
to change the script itself. For simpler scripts, a `RemoteInfo` python object
may instead be passed to the function that is to be submitted remotely.


(sec:example)=
## Example

The workflow (`do_workflow.py`) is essentially identical to what you'd otherwise construct:
Expand Down Expand Up @@ -81,6 +90,9 @@ the initial `_`, not `.`, so it is more visible) can optionally be created at
the directory hierarchy level that indicates the scope of the project,
to separate the jobs database from any other project.

(sec:example:restarts)=
### Restarts

Restarts are supposed to be handled automatically - if the workflow script is
interrupted, just rerun it. If the entire `autoparallelize` call is complete,
the default behavior of `OutputSpec` will allow
Expand All @@ -95,10 +107,27 @@ argument (obviously only if ignoring it for the purpose of detecting
duplicate submission is indeed correct). All functions already ignore the
`outputs` `OutputSpec` argument.
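
To make the idea of ignoring an argument when detecting duplicate submissions concrete, here is a minimal sketch. It is not the `ExPyRe` implementation: the helper `submission_hash`, its `hash_ignore` parameter, and the function and file names in the calls are all hypothetical, chosen only to illustrate hashing a call by its name and arguments.

```python
import hashlib
import json


def submission_hash(func_name, kwargs, hash_ignore=()):
    """Return a digest identifying a call by function name and arguments only.

    Hypothetical helper for illustration; not part of wfl or ExPyRe.
    """
    # Drop arguments that should not distinguish one submission from another.
    relevant = {k: v for k, v in kwargs.items() if k not in hash_ignore}
    # Serialize deterministically; values that are not JSON types fall back to repr().
    payload = json.dumps([func_name, relevant], sort_keys=True, default=repr)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


first = submission_hash("minim.run", {"steps": 100, "outputs": "a.xyz"}, hash_ignore=["outputs"])
second = submission_hash("minim.run", {"steps": 100, "outputs": "b.xyz"}, hash_ignore=["outputs"])
third = submission_hash("minim.run", {"steps": 200, "outputs": "a.xyz"}, hash_ignore=["outputs"])

print(first == second)  # True: only an ignored argument differs
print(first == third)   # False: a hashed argument differs
```

In a scheme like this the body of the function never enters the digest, so editing the function leaves the digest unchanged.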

```{warning}
The hashing mechanism is only designed for interrupted runs, and does
not detect changes to the called function (or to any functions that
function calls). If the code is being modified, the user should erase the
`ExPyRe` staged job directories, and clean up the `sqlite` database file,
before rerunning. Using a per-project `_expyre` directory makes this
easier, since the database file can simply be erased; otherwise the `xpr` command
line tool needs to be used to delete the previously created jobs.
Note that this is only relevant to incomplete autoparallelized
operations, since any completed operation (once all the remote job outputs have
been gathered into the location specified in the `OutputSpec`) no longer depends on
anything `ExPyRe`-related. See also the warning in the
`OutputSpec` [documentation](overview.configset).
```
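
The manual cleanup described in the warning above could be scripted along the following lines. This is only a sketch under stated assumptions: the helper `clear_cached_state` is hypothetical, and it assumes that the jobs database and the staged job directories sit directly inside the per-project `_expyre` directory, that only `config.json` there needs to survive, and that no jobs from the project are still queued or running.

```python
import shutil
from pathlib import Path


def clear_cached_state(output_files, expyre_dir="_expyre", keep=("config.json",)):
    """Delete stale outputs and cached job state; return the removed paths.

    Hypothetical helper for illustration; not part of wfl or ExPyRe.
    """
    removed = []
    # Stale output files would otherwise be reused instead of being recomputed.
    for name in output_files:
        path = Path(name)
        if path.is_file():
            path.unlink()
            removed.append(path)
    # Assumption: everything in the project directory except the kept files
    # is cached job state (database file, staged job directories).
    root = Path(expyre_dir)
    if root.is_dir():
        for entry in sorted(root.iterdir()):
            if entry.name in keep:
                continue
            if entry.is_dir():
                shutil.rmtree(entry)
            else:
                entry.unlink()
            removed.append(entry)
    return removed
```

A call such as `clear_cached_state(["minimized.xyz"])`, with a hypothetical output file name, would be made before rerunning the modified workflow. If the `_expyre` directory is shared with other projects, the `xpr` tool mentioned above is the safer route.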

## WFL\_EXPYRE\_INFO syntax

The `WFL_EXPYRE_INFO` variable contains JSON text or the name of a file that contains JSON. The JSON encodes a dict with keys
indicating particular function calls, and values containing arguments for constructing [`RemoteInfo`](wfl.autoparallelize.RemoteInfo) objects.
indicating particular function calls, and values containing arguments for constructing
[`RemoteInfo`](wfl.autoparallelize.remoteinfo.RemoteInfo) objects.
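
As a rough illustration of the two forms the variable can take, the sketch below sets it from Python using only the standard library. The helper `set_expyre_info` is hypothetical, and the key `minim.py::run` and the value fields (`sys_name`, `job_name`, `resources`) are assumed placeholders rather than a verified configuration; consult the `RemoteInfo` documentation for the accepted arguments.

```python
import json
import os


def set_expyre_info(info, json_file=None):
    """Set WFL_EXPYRE_INFO to inline JSON text or to the name of a JSON file.

    Hypothetical helper for illustration; not part of wfl.
    """
    text = json.dumps(info, indent=2)
    if json_file is None:
        # Form 1: the variable holds the JSON text itself.
        os.environ["WFL_EXPYRE_INFO"] = text
    else:
        # Form 2: the variable holds the name of a file containing the JSON.
        with open(json_file, "w") as fout:
            fout.write(text)
        os.environ["WFL_EXPYRE_INFO"] = str(json_file)
    return os.environ["WFL_EXPYRE_INFO"]


# Assumed placeholder content: one function call mapped to RemoteInfo arguments.
expyre_info = {
    "minim.py::run": {
        "sys_name": "my_cluster",
        "job_name": "minim",
        "resources": {"max_time": "1h", "num_nodes": 1},
    }
}

set_expyre_info(expyre_info)
```

Keeping the JSON in a file is likely more convenient when several function calls are configured, since the file can be edited without touching the script or the shell environment.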


### Keys
Binary file modified objects.inv
Binary file not shown.
9 changes: 9 additions & 0 deletions overview.configset.html
Expand Up @@ -377,6 +377,15 @@ <h2> Contents </h2>
</div>
<p><code class="docutils literal notranslate"><span class="pre">ConfigSet</span></code> can encapsulate one or multiple lists of <code class="docutils literal notranslate"><span class="pre">ase.atoms.Atoms</span></code> objects, or reference to stored sets of configuration in files (ABCD databases are currently unsupported). It can function as an iterator over all configs in the input, or iterate over groups of them according to the input definition with the <code class="docutils literal notranslate"><span class="pre">ConfigSet().groups()</span></code> method. The <code class="docutils literal notranslate"><span class="pre">ConfigSet</span></code> must be initialized with its input configurations, files, or other <code class="docutils literal notranslate"><span class="pre">ConfigSet</span></code> objects.</p>
<p><code class="docutils literal notranslate"><span class="pre">OutputSpec</span></code> works as the output layer, used for writing results during iterations, but the actual writing is not guaranteed to happen until the operation is closed with <code class="docutils literal notranslate"><span class="pre">OutputSpec.close()</span></code>. It is possible to map a different output file to each input file, which will result in the outputs corresponding to each input file ending up in a different output file.</p>
<div class="admonition warning">
<p class="admonition-title">Warning</p>
<p>To efficiently restart interrupted operations, if the <code class="docutils literal notranslate"><span class="pre">OutputSpec</span></code> object specifies storing the output
data in a file, autoparallelized workflow operations will use the existing file instead of redoing the calculation.
If the workflow code (or any function that it calls, directly or indirectly) is changed, this will not
be detected, and the previous, perhaps no longer correct, output will still be used.
The user must manually delete the output files of operations that have been changed to force
the calculation to be redone.</p>
</div>
<p>Users should consult the simple example in <a class="reference internal" href="first_example.html"><span class="doc">First Example</span></a>, or the documentation of the two classes at
<a class="reference internal" href="wfl.html#wfl.configset.ConfigSet" title="wfl.configset.ConfigSet"><code class="xref py py-meth docutils literal notranslate"><span class="pre">wfl.configset.ConfigSet()</span></code></a> and <a class="reference internal" href="wfl.html#wfl.configset.OutputSpec" title="wfl.configset.OutputSpec"><code class="xref py py-meth docutils literal notranslate"><span class="pre">wfl.configset.OutputSpec()</span></code></a></p>
<div class="section" id="internals-for-developers">
6 changes: 6 additions & 0 deletions overview.parallelisation.html
Expand Up @@ -388,6 +388,12 @@ <h2> Contents </h2>
job submitted to a local or remote queuing system. The job can then use python
subprocess parallelization itself. [remote jobs not documented here yet]</p></li>
</ul>
<div class="admonition warning">
<p class="admonition-title">Warning</p>
<p>Autoparallelized operations will use cached output files. Even if the code that is executed by
the operation has changed, the previous and perhaps wrong output will be used.
See the warning in <a class="reference internal" href="overview.configset.html"><span class="doc">Input and output of atomic structures</span></a>.</p>
</div>
<div class="section" id="programming-script-that-use-parallelized-operations">
<h2>Programming scripts that use parallelized operations<a class="headerlink" href="#programming-script-that-use-parallelized-operations" title="Permalink to this heading">#</a></h2>
<p>Parallelized operations can be called from a python script, and have</p>
46 changes: 40 additions & 6 deletions overview.queued.html
Expand Up @@ -357,7 +357,10 @@ <h2> Contents </h2>
</div>
<nav aria-label="Page">
<ul class="visible nav section-nav flex-column">
<li class="toc-h2 nav-item toc-entry"><a class="reference internal nav-link" href="#example">Example</a></li>
<li class="toc-h2 nav-item toc-entry"><a class="reference internal nav-link" href="#example">Example</a><ul class="nav section-nav flex-column">
<li class="toc-h3 nav-item toc-entry"><a class="reference internal nav-link" href="#restarts">Restarts</a></li>
</ul>
</li>
<li class="toc-h2 nav-item toc-entry"><a class="reference internal nav-link" href="#wfl-expyre-info-syntax">WFL_EXPYRE_INFO syntax</a><ul class="nav section-nav flex-column">
<li class="toc-h3 nav-item toc-entry"><a class="reference internal nav-link" href="#keys">Keys</a></li>
<li class="toc-h3 nav-item toc-entry"><a class="reference internal nav-link" href="#values">Values</a></li>
Expand Down Expand Up @@ -388,21 +391,30 @@ <h1>Functions as independently queued jobs<a class="headerlink" href="#functions
should be executed this way. Any remote machine to be used requires that the <code class="docutils literal notranslate"><span class="pre">wfl</span></code> python
module be installed. If needed, commands needed to make this module available (e.g. setting <code class="docutils literal notranslate"><span class="pre">PYTHONPATH</span></code>)
can be set on a per-machine basis in the <code class="docutils literal notranslate"><span class="pre">config.json</span></code> file mentioned below.</p>
<div class="admonition warning">
<p class="admonition-title">Warning</p>
<p>To facilitate restarts of interrupted operations, submitted jobs are cached. If the code
executed by the job is changed, this may result in cached but incorrect output being used.
See <a class="reference internal" href="#sec-example-restarts"><span class="std std-ref">discussion below</span></a>.</p>
</div>
<p>In addition, <code class="docutils literal notranslate"><span class="pre">wfl.fit.gap_simple</span></code>, <code class="docutils literal notranslate"><span class="pre">wfl.fit.gap_multistage</span></code>, and <code class="docutils literal notranslate"><span class="pre">wfl.fit.ace</span></code> have been wrapped, as a single
job each. The GAP simple fit is controlled by the <code class="docutils literal notranslate"><span class="pre">WFL_GAP_SIMPLE_FIT_REMOTEINFO</span></code> env var. Setting
this variable will also lead to the multistage fit submitting each simple fit as its own job.
In addition, the multistage fit can be turned into a single job containing all the stages
with the <code class="docutils literal notranslate"><span class="pre">WFL_GAP_MULTISTAGE_FIT_REMOTEINFO</span></code> env var. In principle, doing each simple fit
as its own job could enable running committee fits in parallel, but that is not supported right now.
The env var <code class="docutils literal notranslate"><span class="pre">WFL_ACE_FIT_REMOTEINFO</span></code> is used for ACE fits.</p>
<p>[NOTE: now that the multistage fit does very little other than the repeated simple fitting, does
it need its own level of remote job execution]</p>
<div class="admonition note">
<p class="admonition-title">Note</p>
<p>Now that the multistage fit does very little other than the repeated simple fitting, does
it need its own level of remote job execution?</p>
</div>
<p>The <code class="docutils literal notranslate"><span class="pre">*REMOTEINFO</span></code> and <code class="docutils literal notranslate"><span class="pre">WFL_EXPYRE_INFO</span></code> environment variables allow flexible control over which parts of
a (likely long and multi-file) fitting script are executed remotely, and with what resources, without any need
to change the script itself. For simpler scripts, a <code class="docutils literal notranslate"><span class="pre">RemoteInfo</span></code> python object
may instead be passed to the function that is to be submitted remotely.</p>
<div class="section" id="example">
<h2>Example<a class="headerlink" href="#example" title="Permalink to this heading">#</a></h2>
<span id="sec-example"></span><h2>Example<a class="headerlink" href="#example" title="Permalink to this heading">#</a></h2>
<p>The workflow (<code class="docutils literal notranslate"><span class="pre">do_workflow.py</span></code>) is essentially identical to what you’d otherwise construct:</p>
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="kn">from</span> <span class="nn">wfl.configset</span> <span class="kn">import</span> <span class="n">ConfigSet</span><span class="p">,</span> <span class="n">OutputSpec</span>
<span class="kn">from</span> <span class="nn">wfl.generate_configs.minim</span> <span class="kn">import</span> <span class="n">run</span>
Expand Down Expand Up @@ -453,6 +465,8 @@ <h2>Example<a class="headerlink" href="#example" title="Permalink to this headin
the initial <code class="docutils literal notranslate"><span class="pre">_</span></code>, not <code class="docutils literal notranslate"><span class="pre">.</span></code>, so it is more visible) can optionally be created at
the directory hierarchy level that indicates the scope of the project,
to separate the jobs database from any other project.</p>
<div class="section" id="restarts">
<span id="sec-example-restarts"></span><h3>Restarts<a class="headerlink" href="#restarts" title="Permalink to this heading">#</a></h3>
<p>Restarts are supposed to be handled automatically - if the workflow script is
interrupted, just rerun it. If the entire <code class="docutils literal notranslate"><span class="pre">autoparallelize</span></code> call is complete,
the default behavior of <code class="docutils literal notranslate"><span class="pre">OutputSpec</span></code> will allow
Expand All @@ -466,11 +480,28 @@ <h2>Example<a class="headerlink" href="#example" title="Permalink to this headin
argument (obviously only if ignoring it for the purpose of detecting
duplicate submission is indeed correct). All functions already ignore the
<code class="docutils literal notranslate"><span class="pre">outputs</span></code> <code class="docutils literal notranslate"><span class="pre">OutputSpec</span></code> argument.</p>
<div class="admonition warning">
<p class="admonition-title">Warning</p>
<p>The hashing mechanism is only designed for interrupted runs, and does
not detect changes to the called function (or to any functions that
function calls). If the code is being modified, the user should erase the
<code class="docutils literal notranslate"><span class="pre">ExPyRe</span></code> staged job directories, and clean up the <code class="docutils literal notranslate"><span class="pre">sqlite</span></code> database file,
before rerunning. Using a per-project <code class="docutils literal notranslate"><span class="pre">_expyre</span></code> directory makes this
easier, since the database file can simply be erased; otherwise the <code class="docutils literal notranslate"><span class="pre">xpr</span></code> command
line tool needs to be used to delete the previously created jobs.</p>
<p>Note that this is only relevant to incomplete autoparallelized
operations, since any completed operation (once all the remote job outputs have
been gathered into the location specified in the <code class="docutils literal notranslate"><span class="pre">OutputSpec</span></code>) no longer depends on
anything <code class="docutils literal notranslate"><span class="pre">ExPyRe</span></code>-related. See also the warning in the
<code class="docutils literal notranslate"><span class="pre">OutputSpec</span></code> <a class="reference internal" href="overview.configset.html"><span class="doc std std-doc">documentation</span></a>.</p>
</div>
</div>
</div>
<div class="section" id="wfl-expyre-info-syntax">
<h2>WFL_EXPYRE_INFO syntax<a class="headerlink" href="#wfl-expyre-info-syntax" title="Permalink to this heading">#</a></h2>
<p>The <code class="docutils literal notranslate"><span class="pre">WFL_EXPYRE_INFO</span></code> variable contains JSON text or the name of a file that contains JSON. The JSON encodes a dict with keys
indicating particular function calls, and values containing arguments for constructing <a class="reference internal" href="wfl.autoparallelize.html#wfl.autoparallelize.RemoteInfo" title="wfl.autoparallelize.RemoteInfo"><span class="xref myst py py-class"><code class="docutils literal notranslate"><span class="pre">RemoteInfo</span></code></span></a> objects.</p>
indicating particular function calls, and values containing arguments for constructing
<a class="reference internal" href="wfl.autoparallelize.html#wfl.autoparallelize.remoteinfo.RemoteInfo" title="wfl.autoparallelize.remoteinfo.RemoteInfo"><span class="xref myst py py-class"><code class="docutils literal notranslate"><span class="pre">RemoteInfo</span></code></span></a> objects.</p>
<div class="section" id="keys">
<h3>Keys<a class="headerlink" href="#keys" title="Permalink to this heading">#</a></h3>
<p>Each key consists of a comma separated list of <code class="docutils literal notranslate"><span class="pre">remote_label</span></code> or <code class="docutils literal notranslate"><span class="pre">&quot;end_of_path_to_file::function_name&quot;</span></code>.</p>
Expand Down Expand Up @@ -568,7 +599,10 @@ <h3>Pytest with remote run example<a class="headerlink" href="#pytest-with-remot
</div>
<nav class="bd-toc-nav page-toc">
<ul class="visible nav section-nav flex-column">
<li class="toc-h2 nav-item toc-entry"><a class="reference internal nav-link" href="#example">Example</a></li>
<li class="toc-h2 nav-item toc-entry"><a class="reference internal nav-link" href="#example">Example</a><ul class="nav section-nav flex-column">
<li class="toc-h3 nav-item toc-entry"><a class="reference internal nav-link" href="#restarts">Restarts</a></li>
</ul>
</li>
<li class="toc-h2 nav-item toc-entry"><a class="reference internal nav-link" href="#wfl-expyre-info-syntax">WFL_EXPYRE_INFO syntax</a><ul class="nav section-nav flex-column">
<li class="toc-h3 nav-item toc-entry"><a class="reference internal nav-link" href="#keys">Keys</a></li>
<li class="toc-h3 nav-item toc-entry"><a class="reference internal nav-link" href="#values">Values</a></li>
2 changes: 1 addition & 1 deletion searchindex.js

Large diffs are not rendered by default.
