AiiDA release roadmap

Leopold Talirz edited this page Jun 10, 2020 · 69 revisions

Where to look for information

For information on upcoming AiiDA releases and features in the pipeline, consult the following resources:

Please note: All of these resources are working drafts and subject to change.

Development roadmap

This is a short overview of themes under discussion for future AiiDA development. Most of these items require significant development effort, and will be achieved more quickly with dedicated efforts from external contributors.

Domain-agnostic AiiDA core [no timeline]

Most of AiiDA's API is already domain-agnostic, but some classes specific to materials science remain in aiida-core (e.g. StructureData, KpointsData, ...). To make aiida-core friendlier to other disciplines, we would like to move these classes out into separate packages (e.g. aiida-pseudopotentials, aiida-atomic, ...).

One important open question is how to handle database schema migrations for data types defined by plugins.

Solution for HPC centres with two-factor authentication

AiiDA communicates with HPC centres over SSH, authenticating with SSH keys. The SSH agent already allows AiiDA to deal with password-protected SSH keys, but so far there is no support for HPC centres that require two-factor authentication (2FA). With the cyberattack on European HPC centres in May 2020, this is becoming a pressing issue.

There are multiple routes to explore, including simpler installation on the login nodes of HPC centres and solutions for scoping SSH key access (e.g. restricting it to certain time periods).

For more details, see https://github.com/aiidateam/aiida-core/issues/3929

Efficient file repository for large numbers of files

AiiDA uses the computer's file system to store files that are too large to be stored in the database. While this is a simple solution, it can cause problems when the number of files grows large enough to reach the file system's inode limit. It also currently provides no way to compress files in order to save space.

See the enhancement proposal https://github.com/aiidateam/AEP/pull/11
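
To make the inode problem concrete, here is a minimal sketch of one possible remedy: appending many compressed objects to a single pack and keeping an index of their offsets, so that thousands of small files occupy one inode. This is an illustration only and not the design described in the enhancement proposal; the class `PackedStore` and its methods are hypothetical names, and the in-memory `BytesIO` stands in for a file on disk.

```python
import hashlib
import io
import zlib


class PackedStore:
    """Toy object store that appends compressed objects to one pack (hypothetical)."""

    def __init__(self, pack=None):
        # Any seekable binary file object works; BytesIO keeps the demo in memory.
        self._pack = pack if pack is not None else io.BytesIO()
        # Maps content hash -> (offset, compressed length) inside the pack.
        self._index = {}

    def add(self, content: bytes) -> str:
        """Store `content` and return the key under which it can be retrieved."""
        key = hashlib.sha256(content).hexdigest()
        if key not in self._index:  # identical content is stored only once
            compressed = zlib.compress(content)
            offset = self._pack.seek(0, io.SEEK_END)
            self._pack.write(compressed)
            self._index[key] = (offset, len(compressed))
        return key

    def get(self, key: str) -> bytes:
        """Read the object back by seeking to its offset and decompressing it."""
        offset, length = self._index[key]
        self._pack.seek(offset)
        return zlib.decompress(self._pack.read(length))
```

In this sketch the index lives in memory; a real implementation would need to persist it (for instance in a database) and deal with concurrent writers and deletion, none of which is addressed here.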

Support object store API for static file repositories

When serving static AiiDA provenance graphs (such as on materialscloud.org), it can be convenient to have a web API for accessing files from AiiDA file repositories, so that multiple servers can access the same repository. The natural solution would seem to be to support specifying an object store as the location of the AiiDA file repository.

One open question is whether explicit support in AiiDA is needed, given that there are adapters like the swift virtual file system that allow mounting an object store container on disk.

Automatic integration tests for plugins registered on AiiDA registry

The AiiDA registry could run specific tests of AiiDA plugins to check whether they work with particular Python and AiiDA versions.

To add: link to early implementation + discussion during meeting

Support for python-based simulation codes

AiiDA has two classes of processes: process functions, which run locally, and calculation jobs, which can run on remote machines. AiiDA treats simulation codes written in Python just like any other executable with input files and output files. While you can use Python packages inside work functions and calculation functions that run locally on your computer, for codes running on remote machines AiiDA supports only file-based interaction.

Support for containerized simulation codes

See the Google Summer of Code project

Simplify generation of CLI/GUI for ORM classes

Creating a command-line interface (CLI) or a graphical user interface (GUI) that collects the information needed to create instances of AiiDA ORM classes, e.g. a Computer, a Code or even an Int, currently involves boilerplate code. It would be useful if the Computer/Code/... classes could be annotated in a way that makes the creation of a CLI/GUI automatic.

One open question in this respect is how to deal with validation: most of the validation currently occurs at the click level. To avoid duplicating validation checks, these would need to move into the ORM classes themselves.
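
As a rough illustration of the idea, the sketch below derives a command-line parser from the annotated fields of a class and keeps validation inside the class itself. It uses the standard library's `dataclasses` and `argparse` rather than click, purely to stay dependency-free. `ComputerSpec`, `build_parser` and `from_cli` are hypothetical names and not part of AiiDA's API.

```python
import argparse
from dataclasses import MISSING, dataclass, fields


@dataclass
class ComputerSpec:
    """Hypothetical stand-in for an ORM class; this is not AiiDA's Computer."""

    label: str
    hostname: str
    mpiprocs_per_machine: int = 1

    def __post_init__(self):
        # Validation lives in the class, so CLI, GUI and API callers share it.
        if not self.label:
            raise ValueError("label must not be empty")
        if self.mpiprocs_per_machine < 1:
            raise ValueError("mpiprocs_per_machine must be >= 1")


def build_parser(cls) -> argparse.ArgumentParser:
    """Derive one command-line option per dataclass field."""
    parser = argparse.ArgumentParser(prog=cls.__name__)
    for field in fields(cls):
        options = {"type": field.type}
        if field.default is MISSING:
            options["required"] = True  # fields without a default are mandatory
        else:
            options["default"] = field.default
        parser.add_argument("--" + field.name.replace("_", "-"), **options)
    return parser


def from_cli(cls, argv):
    """Parse `argv` and construct an instance, which triggers class validation."""
    return cls(**vars(build_parser(cls).parse_args(argv)))
```

For example, `from_cli(ComputerSpec, ["--label", "daint", "--hostname", "example.org"])` returns a `ComputerSpec` with the default number of MPI processes, while passing `--mpiprocs-per-machine 0` raises the `ValueError` defined in the class rather than in the CLI layer.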

Task farming: Disguise multiple jobs as one

AiiDA originated from the ab initio electronic structure community, where individual calculation jobs usually occupy a full compute node. As users from neighboring disciplines start using AiiDA (e.g. for force-field calculations), this is no longer the case, and one node may need to be shared between multiple AiiDA jobs.

Many HPC centers have queues that allow multiple serial jobs to share a node, but there are centers that allow only one job per node. For such cases, it would be useful if AiiDA could pack multiple jobs into one.
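
One simple way to picture such packing is a generated shell script that launches several serial commands in the background inside a single scheduler job. The sketch below is an assumption about how this could look, not an existing AiiDA feature; `farm_script` is a hypothetical helper, and scheduler directives (e.g. resource requests) are deliberately left out.

```python
def farm_script(commands, max_parallel):
    """Return a bash script running `commands`, at most `max_parallel` at once."""
    if max_parallel < 1:
        raise ValueError("max_parallel must be >= 1")
    lines = ["#!/bin/bash"]
    for start in range(0, len(commands), max_parallel):
        batch = commands[start:start + max_parallel]
        for index, command in enumerate(batch, start=start):
            # Each task runs in its own subdirectory, in the background.
            lines.append(f"(mkdir -p task_{index} && cd task_{index} && {command}) &")
        lines.append("wait")  # block until the whole batch has finished
    return "\n".join(lines) + "\n"
```

Running tasks in fixed batches is the simplest scheme; it leaves cores idle while the slowest task of a batch finishes, so a real solution would more likely start a new task as soon as a slot frees up.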

See section "10. Task farming" in the report of the 2020 AiiDA hackathon for more details.

Contacts: @giovannipizzi, @pzarabadip

Cross-computer scheduling

While AiiDA makes it easy to submit jobs to different computers, it is the user's responsibility to decide where each job goes. It would be useful if AiiDA included some basic cross-computer scheduling features, such as setting a maximum number of jobs running on one computer at any given time.
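
A per-computer job cap could work along the lines of the following sketch, which greedily assigns each job to the computer with the most free slots and queues whatever does not fit. The function `assign_jobs` and its arguments are hypothetical and only illustrate the idea; AiiDA does not currently provide such a function.

```python
from collections import Counter


def assign_jobs(jobs, limits, running=None):
    """Greedily assign each job to the computer with the most free slots.

    `limits` maps computer name -> maximum number of concurrent jobs;
    `running` maps computer name -> number of jobs already running there.
    Returns (assignment, queued): a job -> computer dict and the leftover jobs.
    """
    if not limits:
        return {}, list(jobs)
    load = Counter(running or {})
    assignment, queued = {}, []
    for job in jobs:
        free = {name: limit - load[name] for name, limit in limits.items()}
        best = max(free, key=free.get)
        if free[best] <= 0:
            queued.append(job)  # every computer is at its cap
            continue
        assignment[job] = best
        load[best] += 1
    return assignment, queued
```

With `limits = {"cluster_a": 2, "cluster_b": 1}` and four jobs, three are assigned and the fourth stays queued until a slot becomes free.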

"git push/pull" for exchanging AiiDA graphs

AiiDA provides export files as a means of exchanging AiiDA graphs. It would be useful if AiiDA also supported pushing changes in AiiDA graphs to collaborators and pulling changes from them.

To add: link to early implementation by szoupanos; presentation from AiiDA coding week