Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Asyncio rebase #4

Open
wants to merge 22 commits into
base: develop
Choose a base branch
from
Open

Asyncio rebase #4

wants to merge 22 commits into from

Conversation

unkcpz
Copy link
Owner

@unkcpz unkcpz commented Aug 22, 2020

No description provided.

@unkcpz unkcpz force-pushed the asyncio-rebase branch 7 times, most recently from f1f1646 to 5b40ead Compare August 28, 2020 09:45
@sphuber sphuber force-pushed the asyncio-rebase branch 5 times, most recently from 99d5c33 to c92d27d Compare September 5, 2020 12:37
@unkcpz unkcpz force-pushed the asyncio-rebase branch 2 times, most recently from 2ef0ecd to 4ed9ffe Compare September 19, 2020 04:07
@unkcpz unkcpz force-pushed the asyncio-rebase branch 3 times, most recently from fa5344c to 7e2ba0e Compare October 20, 2020 09:25
@sphuber sphuber force-pushed the asyncio-rebase branch 4 times, most recently from 42cb732 to bfd558a Compare November 14, 2020 08:38
dependabot bot and others added 3 commits November 16, 2020 08:16
)

Bumps `cryptography` from 2.8 to 3.2.

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: Sebastiaan Huber <[email protected]>
Citation for the latest paper on the engine is added to the README and
the documentation index page. The paper in `aiida/__init__.py` is also
updated which was still referencing the original publication of 2016.
When in non-interactive mode, do not ask whether to use existing user/database
greschd and others added 4 commits November 18, 2020 09:59
…ateam#4574)

The profile must be setup prior to starting the daemons to avoid an error.
This commit fixes a bug,
whereby click was using a version statically stored on install of the package.
This meant changes to `__version__` were not dynamically reflected.
The `verdi node delete` process fully loaded all ORM objects at multiple stages
during the process, which is highly inefficient.
This commit ensures the process now only loads the PKs when possible.
As an example, the time to delete 100 "empty" nodes (no attributes/objects)
is now reduced from ~32 seconds to ~5 seconds.
@sphuber sphuber force-pushed the asyncio-rebase branch 3 times, most recently from 21edfef to 8cfadb0 Compare November 18, 2020 16:59
sphuber and others added 3 commits November 19, 2020 10:09
…eam#4437)

This new option allows one to specify additional files to be retrieved
on a per-instance basis, in addition to the files that are already
defined by the plugin to be retrieved. This was often implemented by
plugin packages itself through a `settings` node that supported a key
that would allow a user to specify these additional files.

Since this is a common use case, we implement this functionality on
`aiida-core` instead to guarantee a consistent interface across plugins.
* Add options for transport tasks

When encountering failures during the execution of transport tasks, a runner
will wait for a time interval between transport task attempts. This time
interval between attempts is increased using an exponential backoff
mechanism, i.e. the time interval is equal to:

(TRANSPORT_TASK_RETRY_INITIAL_INTERVAL) * 2 ** (N_ATTEMPT - 1)

where N_ATTEMPT is the number of failed attempts. This mechanism is
interrupted once the TRANSPORT_TASK_MAXIMUM_ATTEMPTS is reached.

The initial interval and maximum attempts are currently fixed to 20
seconds and 5, respectively. This commit adds two configuration options
that use these defaults, but allow the user to adjust them using `verdi
config`.
Currently the transport options for the EBM are obtained by using the
get_config function, e.g.:

`initial_interval = get_config_option(RETRY_INTERVAL_OPTION)`

However, it seems that `get_config()` does not get you the current
configuration (see aiidateam#4586). 

Replacing `get_config().get_option()` with `get_config_option()` fixes this
issue for the EBM options.
@sphuber sphuber force-pushed the asyncio-rebase branch 2 times, most recently from a9e0102 to ee92f59 Compare November 25, 2020 16:10
sphuber and others added 10 commits November 27, 2020 13:40
This work around was added some time ago because this source for the
`apt` package manager was causing the install of system dependencies to
fail.
The main test workflow runs against a single version of RabbitMQ but
experience has shown that the code can break for different versions of
the RabbitMQ server. Here we add a new CI workflow that runs various
unit tests through pytest that simulate the typical interaction with the
RabbitMQ server in normal AiiDA operation. The difference is that these
are tested against the currently available versions of RabbitMQ.

The current setup, still only tests part of the functionality that AiiDA
uses, for example, the default credentials and virtual host are used.
Connections over TLS are also not tested. These options would require
the RabbitMQ service that is running in a docker container to be
configured differently. It is not clear how these various options can be
parametrized in concert with the actual unit tests.
The `plumpy` and `kiwipy` dependencies have already been migrated from
using `tornado` to the Python built-in module `asyncio` in the versions
`0.16.0` and `0.6.0`, respectively. This allows us to also rid AiiDA of
the `tornado` dependency, which has been giving requirement clashes with
other tools, specifically from the Jupyter and iPython world. The final
limitation was the `circus` library that is used to daemonize the daemon
workers, which as of `v0.17.1` also supports `tornado~=5`.

A summary of the changes:

 * Replace `tornado.ioloop` with `asyncio` event loop.
 * Coroutines are marked with `async` instead of decorated with the
   `tornado.gen.coroutine` decorator.
 * Replace `yield` with `await` when calling a coroutine.
 * Replace `raise tornado.gen.Return` with `return` when returning from
   a coroutine.
 * Replace `add_callback` call on event loop with `call_soon` when
   scheduling a callback.
 * Replace `add_callback` call on event loop with `create_task` when
   scheduling `process.step_until_terminated()`.
 * Replace `run_sync` call on event loop with `run_until_complete`.
 * Replace `pika` uses with `aio-pika` which is now used by the `plumpy`
   and `kiwipy` libraries.
 * Replace `concurrent.Future` with `asyncio.Future`.
 * Replace `yield tornado.gen.sleep` with `await asyncio.sleep`.

Additional changes:

 * Remove the `tornado` logger from the logging configuration.
 * Remove the `logging.tornado_loglevel` configuration option.
 * Turn the `TransportQueue.loop` attribute from method into property.
 * Call `Communicator.close()` instead of `Communicator.stop()` in the
   `Manager.close()` method. The `stop` method has been deprecated in
   `kiwipy==0.6.0`.
The result returned by `ProcessController.kill_process` that is called
in `Process.kill` for each of its children, if it has any, can itself be
a future, since the killing cannot always be performed directly, but
instead will be scheduled in the event loop. To resolve the future of
the main process, it will have to wait for the futures of all its
children to be resolved as well. Therefore an intermediate future needs
to be added that will be done once all child futures are resolved.
The commands of `verdi process` that perform an RPC on a live process
will do so through the `ProcessController`, which returns a future.
Currently, the process controller uses the `LoopCommunicator` as its
communicator which adds an additional layer of wrapping. Ideally, the
return type of the communicator should not change depending on the
specific implementation that is used, however, for now that is the case
and so the future needs to be unwrapped explicitly one additional time.
Once the `LoopCommunicator` is fixed to return the same future type as
the base `Communicator` class, this workaround can and should be
removed.
With the migration to `asyncio`, there is now only a single event loop
that is made reentrant through the `nest-asyncio` library, that monkey
patches `asyncio`'s built-in mechanism to prevent this. This means that
in the `Runner` constructor, we should simply get the global event loop
instead of creating a new one, if no explicit loop is passed into the
constructor. This also implies that the runner should never take charge
in closing the loop, because it no longer owns the global loop.

In addition, process functions now simply use the global runner instead
of creating a new runner. This used to be necessary because running in
the same runner, would mean running in the same loop and so the child
process would block the parent. However, with the new design on
`asyncio`, everything runs in a single reentrant loop and so child
processes no longer need to spawn their own independent nested runner.
When a daemon runner is started, the `SIGINT` and `SIGTERM` signals are
captured to shutdown the runner before exiting the interpreter. However,
the async tasks associated with the interpreter should be properly
canceled first.
The event loop implementation of `asyncio` does not allow to make the
event loop to be reentrant, which essentially means that event loops
cannot be nested. One event loop cannot be run within another event
loop. However, this concept is crucial for `plumpy`'s design to work and
was perfectly allowed by the previous event loop provider `tornado`.

To work around this, `plumpy` uses the library `nest_asyncio` to patch
the `asyncio` event loop and make it reentrant. The trick is that this
should be applied at the correct time. Here we update the `Runner` to
enable `plumpy`'s event loop policy, which will patch the default event
loop policy. This location is chosen since any process in `aiida-core`
*has* to be run by a `Runner` and only one runner instance will ever be
created in a Python interpreter. When the runner shuts down, the event
policy is reset to undo the patch.
RabbitMQ 3.6 changed the way integer values are interpreted for
connection parameters. This would cause certain integer values that used
to be perfectly acceptable, to all of suddent cause the declaration of
resources, such as channels and queues, to fail.

The library `pamqp`, that is used by `aiormq`, which in turn is used
ultimately by `kiwipy` to communicate with the RabbitMQ server, adapted
to these changes, but this would break code with RabbitMQ 3.5 that used
to work just fine. For example, the message TTL when declaring a queue
would now fail when `32767 < TTL < 655636` due to incorrect
interpretation of the integer type.

The library `pamqp` provides a way to enable compatibility with these
older versions. One should merely call the method:

    pamqp.encode.support_deprecated_rabbitmq()

This will enable the legacy integer conversion table and will restore
functionality for RabbitMQ 3.5.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants