diff --git a/README.md b/README.md
index f549cd03..94664d36 100644
--- a/README.md
+++ b/README.md
@@ -114,6 +114,7 @@ janus phonons
 janus eos
 janus train
 janus descriptors
+janus preprocess
 ```
 
 For example, a single point calcuation (using the [MACE-MP](https://github.com/ACEsuit/mace-mp) "small" force-field) can be performed by running:
diff --git a/docs/source/apidoc/janus_core.rst b/docs/source/apidoc/janus_core.rst
index e560c537..1a65f026 100644
--- a/docs/source/apidoc/janus_core.rst
+++ b/docs/source/apidoc/janus_core.rst
@@ -146,6 +146,16 @@ janus\_core.cli.phonons module
    :undoc-members:
    :show-inheritance:
 
+janus\_core.cli.preprocess module
+---------------------------------
+
+.. automodule:: janus_core.cli.preprocess
+   :members:
+   :special-members:
+   :private-members:
+   :undoc-members:
+   :show-inheritance:
+
 janus\_core.cli.singlepoint module
 ----------------------------------
 
@@ -289,6 +299,16 @@ janus\_core.processing.symmetry module
    :undoc-members:
    :show-inheritance:
 
+janus\_core.training.preprocess module
+--------------------------------------
+
+.. automodule:: janus_core.training.preprocess
+   :members:
+   :special-members:
+   :private-members:
+   :undoc-members:
+   :show-inheritance:
+
 janus\_core.training.train module
 ---------------------------------
 
diff --git a/docs/source/user_guide/command_line.rst b/docs/source/user_guide/command_line.rst
index 03ee0e50..2220597e 100644
--- a/docs/source/user_guide/command_line.rst
+++ b/docs/source/user_guide/command_line.rst
@@ -346,7 +346,7 @@ Training and fine-tuning MLIPs
 ------------------------------
 
 .. note::
-    Currently only MACE models are supported. See the `MACE CLI `_ for further configuration details
+    Currently only MACE models are supported. See the `MACE run_train CLI `_ for further configuration details
 
 Models can be trained by passing a configuration file to the MLIP's command line interface:
 
@@ -364,6 +364,27 @@ Foundational models can also be fine-tuned, by including the ``foundation_model`
 
     janus train --mlip-config /path/to/fine/tuning/config.yml --fine-tune
 
+Preprocessing training data
+----------------------------
+
+.. note::
+    Currently only MACE models are supported. See the `MACE preprocess_data CLI `_ for further configuration details
+
+Large datasets, which may not fit into GPU memory, can be preprocessed,
+converting xyz training, test, and validation files into HDF5 files that can then be used for on-line data loading.
+
+This can be done by passing a configuration file to the MLIP's command line interface:
+
+.. code-block:: bash
+
+    janus preprocess --mlip-config /path/to/preprocessing/config.yml
+
+For MACE, this will create separate folders for ``train``, ``val`` and ``test`` HDF5 data files, when relevant,
+as well as saving the statistics of your data in ``statistics.json``, if requested.
+
+Additionally, a log file, ``preprocess-log.yml``, and summary file, ``preprocess-summary.yml``, will be generated.
+
+
 Calculate descriptors
 ---------------------
 