docs: document PyTorch backend (#3193)

Fix #3121.

There are TODOs:
1. PyTorch-backend specific features and arguments;
2. Python interface installation. Currently, the TensorFlow backend is always installed, and I am considering rewriting the logic;
3. Unsupported features - write docs when implemented.

Signed-off-by: Jinzhe Zeng <[email protected]>
deepmodeling · Jan 29, 2024 · 8eadd3e
1 parent 1e51a88

Showing 18 changed files with 189 additions and 50 deletions.
diff --git a/CITATIONS.bib b/CITATIONS.bib
@@ -105,6 +105,25 @@ @misc{Zhang_2022_DPA1
     doi = {10.48550/arXiv.2208.08236},
 }
 
+@misc{Zhang_2023_DPA2,
+    annote = {DPA-2},
+    author = {Duo Zhang and Xinzijian Liu and Xiangyu Zhang and Chengqian Zhang and
+              Chun Cai and Hangrui Bi and Yiming Du and Xuejian Qin and Jiameng Huang
+              and Bowen Li and Yifan Shan and Jinzhe Zeng and Yuzhi Zhang and Siyuan
+              Liu and Yifan Li and Junhan Chang and Xinyan Wang and Shuo Zhou and
+              Jianchuan Liu and Xiaoshan Luo and Zhenyu Wang and Wanrun Jiang and Jing
+              Wu and Yudi Yang and Jiyuan Yang and Manyi Yang and Fu-Qiang Gong and
+              Linshuang Zhang and Mengchao Shi and Fu-Zhi Dai and Darrin M. York and
+              Shi Liu and Tong Zhu and Zhicheng Zhong and Jian Lv and Jun Cheng and
+              Weile Jia and Mohan Chen and Guolin Ke and Weinan E and Linfeng Zhang
+              and Han Wang},
+    title = {{DPA-2: Towards a universal large atomic model for molecular and material
+              simulation}},
+    publisher = {arXiv},
+    year = {2023},
+    doi = {10.48550/arXiv.2312.15492},
+}
+
 @article{Zhang_PhysPlasmas_2020_v27_p122704,
     annote = {frame-specific parameters (e.g. electronic temperature)},
     author = {Zhang, Yuzhi and Gao, Chang and Liu, Qianrui and Zhang, Linfeng and Wang, Han and Chen, Mohan},

diff --git a/backend/dynamic_metadata.py b/backend/dynamic_metadata.py
@@ -46,6 +46,7 @@ def dynamic_metadata(
                 "sphinx_markdown_tables",
                 "myst-nb>=1.0.0rc0",
                 "myst-parser>=0.19.2",
+                "sphinx-design",
                 "breathe",
                 "exhale",
                 "numpydoc",

diff --git a/doc/backend.md b/doc/backend.md
@@ -0,0 +1,28 @@
+# Backend
+
+## Supported backends
+
+DeePMD-kit supports multiple backends: TensorFlow and PyTorch.
+To use DeePMD-kit, you must install at least one backend.
+Not every backend supports every feature.
+In the documentation, TensorFlow {{ tensorflow_icon }} and PyTorch {{ pytorch_icon }} icons are used to mark whether a backend supports a feature.
+
+### TensorFlow {{ tensorflow_icon }}
+
+TensorFlow 2.2 or above is required.
+DeePMD-kit does not use the TensorFlow v2 API but uses the TensorFlow v1 API (`tf.compat.v1`) in the graph mode.
+
+### PyTorch {{ pytorch_icon }}
+
+PyTorch 2.0 or above is required.
+
+## Switch the backend
+
+### Training
+
+When training and freezing a model, you can use `dp --tf` or `dp --pt` in the command line to switch the backend.
+
+### Inference
+
+When doing inference, DeePMD-kit detects the backend from the model filename.
+For example, when the model filename ends with `.pb` (a ProtoBuf file), DeePMD-kit treats it as a TensorFlow model and uses the TensorFlow backend.
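
The filename-based detection described in `doc/backend.md` can be pictured with a minimal sketch. This is not DeePMD-kit's actual implementation: the function name `detect_backend` and the mapping table are hypothetical, and only the `.pb` to TensorFlow pairing comes from the documentation above. The `.pth` suffix for PyTorch is an assumption made for illustration.

```python
from pathlib import Path

# Hypothetical mapping for illustration only.
# ".pb" -> TensorFlow is stated in doc/backend.md;
# ".pth" -> PyTorch is an assumption, not confirmed by this commit.
SUFFIX_TO_BACKEND = {
    ".pb": "tensorflow",
    ".pth": "pytorch",
}


def detect_backend(model_file: str) -> str:
    """Return the backend name implied by the model file's suffix."""
    suffix = Path(model_file).suffix.lower()
    try:
        return SUFFIX_TO_BACKEND[suffix]
    except KeyError:
        # Unknown suffix: refuse to guess rather than pick a default backend.
        raise ValueError(f"cannot infer backend from {model_file!r}") from None
```

Raising on an unknown suffix, instead of silently falling back to one backend, keeps a misnamed model file from being loaded by the wrong framework.
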
diff --git a/doc/conf.py b/doc/conf.py
@@ -94,6 +94,7 @@ def setup(app):
     "breathe",
     "exhale",
     "sphinxcontrib.bibtex",
+    "sphinx_design",
 ]
 
 # breathe_domain_by_extension = {

diff --git a/doc/credits.rst b/doc/credits.rst
@@ -49,6 +49,13 @@ Cite DeePMD-kit and methods
 
    Zhang_2022_DPA1
 
+- If DPA-2 descriptor (`dpa2`) is used,
+
+.. bibliography::
+   :filter: False
+
+   Zhang_2023_DPA2
+
 - If frame-specific parameters (`fparam`, e.g. electronic temperature) is used,
 
 .. bibliography::

diff --git a/doc/index.rst b/doc/index.rst
@@ -34,6 +34,7 @@ DeePMD-kit is a package written in Python/C++, designed to minimize the effort r
    :numbered:
    :caption: Advanced
 
+   backend
    install/index
    data/index
    model/index

diff --git a/doc/inference/python.md b/doc/inference/python.md
@@ -26,9 +26,14 @@ graphs = [DP("graph.000.pb"), DP("graph.001.pb")]
 model_devi = calc_model_devi(coord, cell, atype, graphs)
 ```
 
-Note that if the model inference or model deviation is performed cyclically, one should avoid calling the same model multiple times. Otherwise, tensorFlow will never release the memory and this may lead to an out-of-memory (OOM) error.
+Note that if the model inference or model deviation is performed cyclically, one should avoid calling the same model multiple times.
+Otherwise, TensorFlow or PyTorch will never release the memory, and this may lead to an out-of-memory (OOM) error.
 
-## External neighbor list algorithm
+## External neighbor list algorithm {{ tensorflow_icon }}
+
+:::{note}
+**Supported backends**: TensorFlow {{ tensorflow_icon }}
+:::
 
 The native neighbor list algorithm of the DeePMD-kit is in $O(N^2)$ complexity ($N$ is the number of atoms).
 While this is not a problem for small systems that quantum methods can afford, the large systems for molecular dynamics have slow performance.

diff --git a/doc/install/easy-install-dev.md b/doc/install/easy-install-dev.md
@@ -19,12 +19,16 @@ For CUDA 11.8 support, use the `devel_cu11` tag.
 Below is a one-line shell command to download the [artifact](https://nightly.link/deepmodeling/deepmd-kit/workflows/build_wheel/devel/artifact.zip) containing wheels and install it with `pip`:
 
 ```sh
-pip install -U --pre deepmd-kit[gpu,cu12,lmp] --extra-index-url https://deepmodeling.github.io/deepmd-kit/simple
+pip install -U --pre deepmd-kit[gpu,cu12,lmp,torch] --extra-index-url https://deepmodeling.github.io/deepmd-kit/simple
 ```
 
 `cu12` and `lmp` are optional, which is the same as the stable version.
 
-## Download pre-compiled C Library
+## Download pre-compiled C Library {{ tensorflow_icon }}
+
+:::{note}
+**Supported backends**: TensorFlow {{ tensorflow_icon }}
+:::
 
 The [pre-compiled C library](./install-from-c-library.md) can be downloaded from [here](https://nightly.link/deepmodeling/deepmd-kit/workflows/package_c/devel/libdeepmd_c-0-libdeepmd_c.tar.gz.zip), or via a shell command:
 

diff --git a/doc/install/easy-install.md b/doc/install/easy-install.md
@@ -19,9 +19,9 @@ Python 3.8 or above is required for Python interface.
 
 
 ## Install off-line packages
-Both CPU and GPU version offline packages are available in [the Releases page](https://github.com/deepmodeling/deepmd-kit/releases).
+Both CPU and GPU version offline packages are available on [the Releases page](https://github.com/deepmodeling/deepmd-kit/releases).
 
-Some packages are splited into two files due to size limit of GitHub. One may merge them into one after downloading:
+Some packages are split into two files due to the size limit of GitHub. One may merge them into one after downloading:
 ```bash
 cat deepmd-kit-2.1.1-cuda11.6_gpu-Linux-x86_64.sh.0 deepmd-kit-2.1.1-cuda11.6_gpu-Linux-x86_64.sh.1 > deepmd-kit-2.1.1-cuda11.6_gpu-Linux-x86_64.sh
 ```
@@ -73,50 +73,47 @@ A docker for installing the DeePMD-kit is available [here](https://github.com/or
 
 To pull the CPU version:
 ```bash
-docker pull ghcr.io/deepmodeling/deepmd-kit:2.1.1_cpu
+docker pull ghcr.io/deepmodeling/deepmd-kit:2.2.8_cpu
 ```
 
 To pull the GPU version:
 ```bash
-docker pull ghcr.io/deepmodeling/deepmd-kit:2.1.1_cuda11.6_gpu
-```
-
-To pull the ROCm version:
-```bash
-docker pull deepmodeling/dpmdkit-rocm:dp2.0.3-rocm4.5.2-tf2.6-lmp29Sep2021
+docker pull ghcr.io/deepmodeling/deepmd-kit:2.2.8_cuda12.0_gpu
 ```
 
 ## Install Python interface with pip
 
 If you have no existing TensorFlow installed, you can use `pip` to install the pre-built package of the Python interface with CUDA 12 supported:
 
 ```bash
-pip install deepmd-kit[gpu,cu12]
+pip install deepmd-kit[gpu,cu12,torch]
 ```
 
 `cu12` is required only when CUDA Toolkit and cuDNN were not installed.
 
 To install the package built against CUDA 11.8, use
 
 ```bash
+pip install torch --index-url https://download.pytorch.org/whl/cu118
 pip install deepmd-kit-cu11[gpu,cu11]
 ```
 
 Or install the CPU version without CUDA supported:
 ```bash
+pip install torch --index-url https://download.pytorch.org/whl/cpu
 pip install deepmd-kit[cpu]
 ```
 
-[The LAMMPS module](../third-party/lammps-command.md) and [the i-Pi driver](../third-party/ipi.md) are only provided on Linux and macOS. To install LAMMPS and/or i-Pi, add `lmp` and/or `ipi` to extras:
+[The LAMMPS module](../third-party/lammps-command.md) and [the i-Pi driver](../third-party/ipi.md) are only provided on Linux and macOS for the TensorFlow backend. To install LAMMPS and/or i-Pi, add `lmp` and/or `ipi` to extras:
 ```bash
-pip install deepmd-kit[gpu,cu12,lmp,ipi]
+pip install deepmd-kit[gpu,cu12,torch,lmp,ipi]
 ```
 MPICH is required for parallel running. (The macOS arm64 package doesn't support MPI yet.)
 
 It is suggested to install the package into an isolated environment.
 The supported platform includes Linux x86-64 and aarch64 with GNU C Library 2.28 or above, macOS x86-64 and arm64, and Windows x86-64.
-A specific version of TensorFlow which is compatible with DeePMD-kit will be also installed.
+Specific versions of TensorFlow and PyTorch that are compatible with DeePMD-kit will also be installed.
 
 :::{Warning}
-If your platform is not supported, or want to build against the installed TensorFlow, or want to enable ROCM support, please [build from source](install-from-source.md).
+If your platform is not supported, or you want to build against the installed TensorFlow, or you want to enable ROCM support, please [build from source](install-from-source.md).
 :::
diff --git a/doc/install/install-from-c-library.md b/doc/install/install-from-c-library.md
@@ -1,4 +1,8 @@
-# Install from pre-compiled C library
+# Install from pre-compiled C library {{ tensorflow_icon }}
+
+:::{note}
+**Supported backends**: TensorFlow {{ tensorflow_icon }}
+:::
 
 DeePMD-kit provides pre-compiled C library package (`libdeepmd_c.tar.gz`) in each [release](https://github.com/deepmodeling/deepmd-kit/releases). It can be used to build the [LAMMPS plugin](./install-lammps.md) and [GROMACS patch](./install-gromacs.md), as well as many [third-party software packages](../third-party/out-of-deepmd-kit.md), without building TensorFlow and DeePMD-kit on one's own.
 It can be downloaded via the shell command:

diff --git a/doc/install/install-from-source.md b/doc/install/install-from-source.md
@@ -14,45 +14,74 @@ cd deepmd-kit
 deepmd_source_dir=`pwd`
 ```
 
-## Install the python interface
-### Install Tensorflow's python interface
-First, check the python version on your machine.
+## Install the Python interface
+### Install the backend's Python interface
+First, check the Python version on your machine.
 Python 3.8 or above is required.
 ```bash
 python --version
 ```
 
-We follow the virtual environment approach to install TensorFlow's Python interface. The full instruction can be found on the official [TensorFlow website](https://www.tensorflow.org/install/pip). TensorFlow 2.2 or later is supported. Now we assume that the Python interface will be installed to the virtual environment directory `$tensorflow_venv`
+We follow the virtual environment approach to install the backend's Python interface.
+Now we assume that the Python interface will be installed in the virtual environment directory `$deepmd_venv`:
+
 ```bash
-virtualenv -p python3 $tensorflow_venv
-source $tensorflow_venv/bin/activate
+virtualenv -p python3 $deepmd_venv
+source $deepmd_venv/bin/activate
 pip install --upgrade pip
+```
+
+::::{tab-set}
+
+:::{tab-item} TensorFlow {{ tensorflow_icon }}
+
+Full instructions for installing TensorFlow can be found on the official [TensorFlow website](https://www.tensorflow.org/install/pip). TensorFlow 2.2 or later is supported.
+```bash
 pip install --upgrade tensorflow
 ```
-It is important that every time a new shell is started and one wants to use `DeePMD-kit`, the virtual environment should be activated by
+
+If one does not need the GPU support of DeePMD-kit and is concerned about package size, the CPU-only version of TensorFlow should be installed by
 ```bash
-source $tensorflow_venv/bin/activate
+pip install --upgrade tensorflow-cpu
 ```
-if one wants to skip out of the virtual environment, he/she can do
+
+To verify the installation, run
 ```bash
-deactivate
+python -c "import tensorflow as tf;print(tf.reduce_sum(tf.random.normal([1000, 1000])))"
 ```
-If one has multiple python interpreters named something like python3.x, it can be specified by, for example
+
+One can also [build the TensorFlow Python interface from source](https://www.tensorflow.org/install/source) for customized hardware optimization, such as CUDA, ROCM, or OneDNN support.
+
+:::
+
+:::{tab-item} PyTorch {{ pytorch_icon }}
+
+To install PyTorch, run
+
+```sh
+pip install torch
+```
+
+Follow [PyTorch documentation](https://pytorch.org/get-started/locally/) to install PyTorch built against different CUDA versions or without CUDA.
+
+:::
+
+::::
+
+Every time a new shell is started, the virtual environment must be activated before using DeePMD-kit:
 ```bash
-virtualenv -p python3.8 $tensorflow_venv
+source $deepmd_venv/bin/activate
 ```
-If one does not need the GPU support of DeePMD-kit and is concerned about package size, the CPU-only version of TensorFlow should be installed by
+To leave the virtual environment, run
 ```bash
-pip install --upgrade tensorflow-cpu
+deactivate
 ```
-To verify the installation, run
+If multiple Python interpreters are installed (named something like `python3.x`), a specific one can be selected, for example
 ```bash
-python -c "import tensorflow as tf;print(tf.reduce_sum(tf.random.normal([1000, 1000])))"
+virtualenv -p python3.8 $deepmd_venv
 ```
 One should remember to activate the virtual environment every time he/she uses DeePMD-kit.
 
-One can also [build the TensorFlow Python interface from source](https://www.tensorflow.org/install/source) for custom hardware optimization, such as CUDA, ROCM, or OneDNN support.
-
 ### Install the DeePMD-kit's python interface
 
 Check the compiler version on your machine
@@ -106,7 +135,7 @@ Valid subcommands:
     test               test the model
 ```
 
-### Install horovod and mpi4py
+### Install horovod and mpi4py {{ tensorflow_icon }}
 
 [Horovod](https://github.com/horovod/horovod) and [mpi4py](https://github.com/mpi4py/mpi4py) are used for parallel training. For better performance on GPU, please follow the tuning steps in [Horovod on GPU](https://github.com/horovod/horovod/blob/master/docs/gpus.rst).
 ```bash
@@ -152,14 +181,29 @@ If you don't install Horovod, DeePMD-kit will fall back to serial mode.
 
 If one does not need to use DeePMD-kit with Lammps or I-Pi, then the python interface installed in the previous section does everything and he/she can safely skip this section.
 
-### Install Tensorflow's C++ interface (optional)
+### Install the backends' C++ interface (optional)
+
+::::{tab-set}
+
+:::{tab-item} TensorFlow {{ tensorflow_icon }}
 
 Since TensorFlow 2.12, TensorFlow C++ library (`libtensorflow_cc`) is packaged inside the Python library. Thus, you can skip building TensorFlow C++ library manually. If that does not work for you, you can still build it manually.
 
 The C++ interface of DeePMD-kit was tested with compiler GCC >= 4.8. It is noticed that the I-Pi support is only compiled with GCC >= 4.8. Note that TensorFlow may have specific requirements for the compiler version.
 
 First, the C++ interface of Tensorflow should be installed. It is noted that the version of Tensorflow should be consistent with the python interface. You may follow [the instruction](install-tf.2.12.md) or run the script `$deepmd_source_dir/source/install/build_tf.py` to install the corresponding C++ interface.
 
+:::
+
+:::{tab-item} PyTorch {{ pytorch_icon }}
+
+If you have installed PyTorch using pip, you can use the libtorch library shipped inside the PyTorch Python package.
+You can also download the prebuilt libtorch library from the [PyTorch website](https://pytorch.org/get-started/locally/).
+
+:::
+
+::::
+
 ### Install DeePMD-kit's C++ interface
 
 Now go to the source code directory of DeePMD-kit and make a building place.
@@ -175,25 +219,46 @@ The installation requires CMake 3.16 or later for the CPU version, CMake 3.23 or
 pip install -U cmake
 ```
 
+You must enable at least one backend.
+If you enable two or more backends, these backend libraries must be built in a compatible way, e.g. using the same `_GLIBCXX_USE_CXX11_ABI` flag.
+
+::::{tab-set}
+
+:::{tab-item} TensorFlow {{ tensorflow_icon }}
+
 I assume you have activated the TensorFlow Python environment and want to install DeePMD-kit into path `$deepmd_root`, then execute CMake
 ```bash
-cmake -DUSE_TF_PYTHON_LIBS=TRUE -DCMAKE_INSTALL_PREFIX=$deepmd_root ..
+cmake -DENABLE_TENSORFLOW=TRUE -DUSE_TF_PYTHON_LIBS=TRUE -DCMAKE_INSTALL_PREFIX=$deepmd_root ..
 ```
 
 If you specify `-DUSE_TF_PYTHON_LIBS=FALSE`, you need to give the location where TensorFlow's C++ interface is installed to `-DTENSORFLOW_ROOT=${tensorflow_root}`.
 
+:::
+
+:::{tab-item} PyTorch {{ pytorch_icon }}
+
+Assuming you have installed PyTorch (either the Python or the C++ interface) in `$torch_root`, execute CMake
+```bash
+cmake -DENABLE_PYTORCH=TRUE -DCMAKE_PREFIX_PATH=$torch_root -DCMAKE_INSTALL_PREFIX=$deepmd_root ..
+```
+:::
+
+::::
+
 One may add the following arguments to `cmake`:
 
 | CMake Arguments          | Allowed value       | Default value | Usage                   |
 | ------------------------ | ------------------- | ------------- | ------------------------|
-| -DTENSORFLOW_ROOT=&lt;value&gt;  | Path              | -             | The Path to TensorFlow's C++ interface. |
+| -DENABLE_TENSORFLOW=&lt;value&gt;  | `TRUE` or `FALSE` | `FALSE`     | {{ tensorflow_icon }} Whether to build the TensorFlow backend. |
+| -DENABLE_PYTORCH=&lt;value&gt;  | `TRUE` or `FALSE` | `FALSE`     | {{ pytorch_icon }} Whether to build the PyTorch backend. |
+| -DTENSORFLOW_ROOT=&lt;value&gt;  | Path              | -             | {{ tensorflow_icon }} The Path to TensorFlow's C++ interface. |
 | -DCMAKE_INSTALL_PREFIX=&lt;value&gt; | Path          | -             | The Path where DeePMD-kit will be installed. |
 | -DUSE_CUDA_TOOLKIT=&lt;value&gt; | `TRUE` or `FALSE` | `FALSE`       | If `TRUE`, Build GPU support with CUDA toolkit. |
 | -DCUDAToolkit_ROOT=&lt;value&gt; | Path         | Detected automatically | The path to the CUDA toolkit directory. CUDA 9.0 or later is supported. NVCC is required. |
 | -DUSE_ROCM_TOOLKIT=&lt;value&gt; | `TRUE` or `FALSE` | `FALSE`       | If `TRUE`, Build GPU support with ROCM toolkit. |
 | -DCMAKE_HIP_COMPILER_ROCM_ROOT=&lt;value&gt; | Path         | Detected automatically | The path to the ROCM toolkit directory. |
 | -DLAMMPS_SOURCE_ROOT=&lt;value&gt; | Path         | - | Only necessary for LAMMPS plugin mode. The path to the [LAMMPS source code](install-lammps.md). LAMMPS 8Apr2021 or later is supported. If not assigned, the plugin mode will not be enabled. |
-| -DUSE_TF_PYTHON_LIBS=&lt;value&gt; | `TRUE` or `FALSE` | `FALSE`       | If `TRUE`, Build C++ interface with TensorFlow's Python libraries(TensorFlow's Python Interface is required). And there's no need for building TensorFlow's C++ interface.|
+| -DUSE_TF_PYTHON_LIBS=&lt;value&gt; | `TRUE` or `FALSE` | `FALSE`       | {{ tensorflow_icon }} If `TRUE`, Build C++ interface with TensorFlow's Python libraries (TensorFlow's Python Interface is required). And there's no need for building TensorFlow's C++ interface.|
 | -DENABLE_NATIVE_OPTIMIZATION=&lt;value&gt;       | `TRUE` or `FALSE` | `FALSE`       | Enable compilation optimization for the native machine's CPU type. Do not enable it if generated code will run on different CPUs. |
 | -DCMAKE_&lt;LANG&gt;_FLAGS=&lt;value&gt; (`<LANG>`=`CXX`, `CUDA` or `HIP`)   | str            | -             | Default compilation flags to be used when compiling `<LANG>` files. See [CMake documentation](https://cmake.org/cmake/help/latest/variable/CMAKE_LANG_FLAGS.html). |
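
The rule that at least one backend must be enabled, together with the per-backend CMake flags in the table above, can be sketched as a small helper that assembles the command line. The function `build_cmake_args` and its parameters are hypothetical and not part of DeePMD-kit. Only the `-D` flags themselves are taken from the documentation, and requiring `torch_root` for the PyTorch backend is a choice made for this sketch.

```python
from typing import List, Optional


def build_cmake_args(
    install_prefix: str,
    enable_tensorflow: bool = False,
    enable_pytorch: bool = False,
    torch_root: Optional[str] = None,
) -> List[str]:
    """Assemble a cmake command line from the documented flags (hypothetical helper)."""
    if not (enable_tensorflow or enable_pytorch):
        # The documentation requires at least one backend to be enabled.
        raise ValueError("at least one backend must be enabled")
    args = ["cmake", f"-DCMAKE_INSTALL_PREFIX={install_prefix}"]
    if enable_tensorflow:
        # Reuse the libraries shipped with the TensorFlow Python package.
        args += ["-DENABLE_TENSORFLOW=TRUE", "-DUSE_TF_PYTHON_LIBS=TRUE"]
    if enable_pytorch:
        if torch_root is None:
            # Assumption of this sketch: the caller always states where PyTorch lives.
            raise ValueError("torch_root is required for the PyTorch backend")
        args += ["-DENABLE_PYTORCH=TRUE", f"-DCMAKE_PREFIX_PATH={torch_root}"]
    args.append("..")  # source directory, relative to the build directory
    return args


if __name__ == "__main__":
    # Example paths are placeholders.
    print(" ".join(build_cmake_args("/opt/deepmd", enable_pytorch=True, torch_root="/opt/libtorch")))
```

Building the argument list in one place makes it harder to forget a backend switch when both backends are enabled; the ABI compatibility requirement between the two backend libraries still has to be checked separately.
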
 

diff --git a/doc/model/dpa2.md b/doc/model/dpa2.md
@@ -0,0 +1,5 @@
+# Descriptor DPA-2 {{ pytorch_icon }}
+
+:::{note}
+**Supported backends**: PyTorch {{ pytorch_icon }}
+:::