pd: support paddle backend and water/se_e2_a (deepmodeling#4302)
Split <deepmodeling#4157> into
several pull requests.

1. Add core modules of the Paddle backend (`deepmd.pd.*`) and related backend
module unit tests.
2. Support training/testing/freezing (C++ inference will be supported in a
subsequent pull request) for the example water/se_e2_a.
3. Add se_e2_a-related unit tests.

Related PR to be merged:

- [x] <PaddlePaddle/Paddle#69139>

## Accuracy test

### PyTorch

![image](https://github.com/user-attachments/assets/cea8f313-4a57-4575-b55a-b6cf577654a2)

### Paddle
``` log
deepmd.utils.batch_size                       Adjust batch size from 1024 to 2048
deepmd.utils.batch_size                       Adjust batch size from 2048 to 4096
deepmd.entrypoints.test                       # number of test data : 30 ,
deepmd.entrypoints.test                       Energy MAE         : 7.467160e-02 eV
deepmd.entrypoints.test                       Energy RMSE        : 8.981154e-02 eV
deepmd.entrypoints.test                       Energy MAE/Natoms  : 3.889146e-04 eV
deepmd.entrypoints.test                       Energy RMSE/Natoms : 4.677685e-04 eV
deepmd.entrypoints.test                       Force  MAE         : 4.495974e-02 eV/A
deepmd.entrypoints.test                       Force  RMSE        : 5.883696e-02 eV/A
deepmd.entrypoints.test                       Virial MAE         : 4.683873e+00 eV
deepmd.entrypoints.test                       Virial RMSE        : 6.298489e+00 eV
deepmd.entrypoints.test                       Virial MAE/Natoms  : 2.439517e-02 eV
deepmd.entrypoints.test                       Virial RMSE/Natoms : 3.280463e-02 eV
```
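
The per-atom figures in the log are the totals divided by the number of atoms per frame. The sketch below only redoes that arithmetic with values copied from the log; the atom count it recovers (192) is inferred from the ratios and is not stated anywhere in the log itself.

```python
# Values copied from the Paddle test log above (all in eV).
energy_mae = 7.467160e-02
energy_mae_per_atom = 3.889146e-04
virial_rmse = 6.298489e+00
virial_rmse_per_atom = 3.280463e-02

# Dividing a total by its per-atom counterpart recovers the atom count.
natoms_from_energy = energy_mae / energy_mae_per_atom
natoms_from_virial = virial_rmse / virial_rmse_per_atom

# Both ratios should agree up to the rounding of the logged digits.
natoms = round(natoms_from_energy)
print(natoms, round(natoms_from_virial))
```

Both ratios round to the same integer, so the energy and virial rows are consistent with each other.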
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

## Release Notes

- **New Features**
  - Introduced support for PaddlePaddle in the DeePMD framework, enhancing
    model training and evaluation capabilities.
  - Added new backend options and configuration files for multitask
    models.
  - Implemented new classes and methods for handling Paddle-specific
    functionalities, including descriptor calculations and model
    evaluations.
  - Enhanced the command-line interface to include Paddle as a backend
    option.
  - Expanded the functionality for managing Paddle dependencies and
    configurations in the testing framework.

- **Bug Fixes**
  - Improved error handling and robustness in various components across
    the framework.

- **Tests**
  - Expanded the test suite to include Paddle-specific tests, ensuring
    consistency and reliability across different backends.
  - Introduced unit tests for new functionalities related to Paddle,
    including model evaluations and descriptor calculations.
  - Added tests to validate force gradient calculations and smoothness
    properties in models.
  - Implemented tests for neighbor statistics and region transformations,
    ensuring accuracy in calculations.

- **Documentation**
  - Updated documentation across multiple modules to reflect new features
    and usage instructions.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Signed-off-by: HydrogenSulfate <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
HydrogenSulfate and pre-commit-ci[bot] authored Nov 27, 2024
1 parent 3cdf407 commit 4a45fe5
Showing 136 changed files with 21,039 additions and 25 deletions.
1 change: 1 addition & 0 deletions .github/workflows/test_cuda.yml
@@ -51,6 +51,7 @@ jobs:
- run: |
export PYTORCH_ROOT=$(python -c 'import torch;print(torch.__path__[0])')
export TENSORFLOW_ROOT=$(python -c 'import importlib,pathlib;print(pathlib.Path(importlib.util.find_spec("tensorflow").origin).parent)')
source/install/uv_with_retry.sh pip install --system --pre paddlepaddle-gpu -i https://www.paddlepaddle.org.cn/packages/nightly/cu123/
source/install/uv_with_retry.sh pip install --system -v -e .[gpu,test,lmp,cu12,torch,jax] mpi4py
env:
DP_VARIANT: cuda
1 change: 1 addition & 0 deletions .github/workflows/test_python.yml
@@ -31,6 +31,7 @@ jobs:
export PYTORCH_ROOT=$(python -c 'import torch;print(torch.__path__[0])')
source/install/uv_with_retry.sh pip install --system -e .[test,jax] mpi4py
source/install/uv_with_retry.sh pip install --system horovod --no-build-isolation
source/install/uv_with_retry.sh pip install --system --pre "paddlepaddle" -i https://www.paddlepaddle.org.cn/packages/nightly/cpu/
env:
# Please note that uv has some issues with finding
# existing TensorFlow package. Currently, it uses
133 changes: 133 additions & 0 deletions backend/find_paddle.py
@@ -0,0 +1,133 @@
# SPDX-License-Identifier: LGPL-3.0-or-later
import importlib
import os
import site
from functools import (
    lru_cache,
)
from importlib.machinery import (
    FileFinder,
)
from importlib.util import (
    find_spec,
)
from pathlib import (
    Path,
)
from sysconfig import (
    get_path,
)
from typing import (
    Optional,
    Union,
)


@lru_cache
def find_paddle() -> tuple[Optional[str], list[str]]:
    """Find PaddlePaddle library.

    Tries to find PaddlePaddle in the order of:

    1. Environment variable `PADDLE_ROOT` if set
    2. The current Python environment.
    3. user site packages directory if enabled
    4. system site packages directory (purelib)

    Considering the default PaddlePaddle package still uses old CXX11 ABI, we
    cannot install it automatically.

    Returns
    -------
    str, optional
        PaddlePaddle library path if found.
    list of str
        Paddle requirement if not found. Empty if found.
    """
    if os.environ.get("DP_ENABLE_PADDLE", "0") == "0":
        return None, []
    requires = []
    pd_spec = None

    if (pd_spec is None or not pd_spec) and os.environ.get("PADDLE_ROOT") is not None:
        site_packages = Path(os.environ.get("PADDLE_ROOT")).parent.absolute()
        pd_spec = FileFinder(str(site_packages)).find_spec("paddle")

    # get paddle spec
    # note: isolated build will not work for backend
    if pd_spec is None or not pd_spec:
        pd_spec = find_spec("paddle")

    if not pd_spec and site.ENABLE_USER_SITE:
        # first search Paddle from user site-packages before global site-packages
        site_packages = site.getusersitepackages()
        if site_packages:
            pd_spec = FileFinder(site_packages).find_spec("paddle")

    if not pd_spec:
        # purelib gets site-packages path
        site_packages = get_path("purelib")
        if site_packages:
            pd_spec = FileFinder(site_packages).find_spec("paddle")

    # get install dir from spec
    try:
        pd_install_dir = pd_spec.submodule_search_locations[0]  # type: ignore
        # AttributeError if pd_spec is None
        # TypeError if submodule_search_locations are None
        # IndexError if submodule_search_locations is an empty list
    except (AttributeError, TypeError, IndexError):
        pd_install_dir = None
        requires.extend(get_pd_requirement()["paddle"])
    return pd_install_dir, requires


@lru_cache
def get_pd_requirement(pd_version: str = "") -> dict:
    """Get PaddlePaddle requirement when Paddle is not installed.

    If pd_version is not given and the environment variable `PADDLE_VERSION` is set, use it as the requirement.

    Parameters
    ----------
    pd_version : str, optional
        Paddle version

    Returns
    -------
    dict
        PaddlePaddle requirement.
    """
    if pd_version is None:
        return {"paddle": []}
    if pd_version == "":
        pd_version = os.environ.get("PADDLE_VERSION", "")

    return {
        "paddle": [
            # NOTE: both branches are identical, so pd_version does not
            # currently change the resulting requirement.
            "paddlepaddle>=3.0.0b1" if pd_version != "" else "paddlepaddle>=3.0.0b1",
        ],
    }


@lru_cache
def get_pd_version(pd_path: Optional[Union[str, Path]]) -> str:
    """Get Paddle version from a Paddle Python library path.

    Parameters
    ----------
    pd_path : str or Path
        Paddle Python library path, e.g. "/python3.10/site-packages/paddle/"

    Returns
    -------
    str
        version
    """
    if pd_path is None or pd_path == "":
        return ""
    version_file = Path(pd_path) / "version" / "__init__.py"
    spec = importlib.util.spec_from_file_location("paddle.version", version_file)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module.full_version
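
The lookup order documented in `find_paddle` (explicit root, current environment, user site-packages, then `purelib`) can be tried out on any package. The following is a minimal sketch of that same fallback chain using only the standard library; `locate_package` and the `HYPOTHETICAL_ROOT` variable are illustrative names that do not exist in the repository, and the `DP_ENABLE_PADDLE` gate and requirement handling are left out.

```python
import os
import site
from importlib.machinery import FileFinder
from importlib.util import find_spec
from pathlib import Path
from sysconfig import get_path
from typing import Optional


def locate_package(name: str, root_env: str) -> Optional[str]:
    """Return the install directory of package `name`, or None if not found.

    Hypothetical helper mirroring the search order described above.
    """
    spec = None

    # 1. Explicit root: the variable points at the package directory,
    #    so its parent is searched as a site-packages directory.
    root = os.environ.get(root_env)
    if root:
        spec = FileFinder(str(Path(root).parent.absolute())).find_spec(name)

    # 2. The current Python environment.
    if spec is None:
        spec = find_spec(name)

    # 3. User site-packages, if enabled.
    if spec is None and site.ENABLE_USER_SITE:
        spec = FileFinder(site.getusersitepackages()).find_spec(name)

    # 4. System site-packages (purelib).
    if spec is None:
        spec = FileFinder(get_path("purelib")).find_spec(name)

    try:
        return spec.submodule_search_locations[0]
    except (AttributeError, TypeError, IndexError):
        # spec is None, or it has no usable search locations
        return None


# "json" is a standard-library package, so step 2 finds it.
print(locate_package("json", "HYPOTHETICAL_ROOT"))
```

Collapsing the three failure modes into one `except` clause keeps the caller's contract simple: either a directory string or `None`.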
124 changes: 124 additions & 0 deletions deepmd/backend/paddle.py
@@ -0,0 +1,124 @@
# SPDX-License-Identifier: LGPL-3.0-or-later
from importlib.util import (
    find_spec,
)
from typing import (
    TYPE_CHECKING,
    Callable,
    ClassVar,
)

from deepmd.backend.backend import (
    Backend,
)

if TYPE_CHECKING:
    from argparse import (
        Namespace,
    )

    from deepmd.infer.deep_eval import (
        DeepEvalBackend,
    )
    from deepmd.utils.neighbor_stat import (
        NeighborStat,
    )


@Backend.register("pd")
@Backend.register("paddle")
class PaddleBackend(Backend):
    """Paddle backend."""

    name = "Paddle"
    """The formal name of the backend."""
    features: ClassVar[Backend.Feature] = (
        Backend.Feature.ENTRY_POINT
        | Backend.Feature.DEEP_EVAL
        | Backend.Feature.NEIGHBOR_STAT
        | Backend.Feature.IO
    )
    """The features of the backend."""
    suffixes: ClassVar[list[str]] = [".json", ".pd"]
    """The suffixes of the backend."""

    def is_available(self) -> bool:
        """Check if the backend is available.

        Returns
        -------
        bool
            Whether the backend is available.
        """
        return find_spec("paddle") is not None

    @property
    def entry_point_hook(self) -> Callable[["Namespace"], None]:
        """The entry point hook of the backend.

        Returns
        -------
        Callable[[Namespace], None]
            The entry point hook of the backend.
        """
        from deepmd.pd.entrypoints.main import main as deepmd_main

        return deepmd_main

    @property
    def deep_eval(self) -> type["DeepEvalBackend"]:
        """The Deep Eval backend of the backend.

        Returns
        -------
        type[DeepEvalBackend]
            The Deep Eval backend of the backend.
        """
        from deepmd.pd.infer.deep_eval import DeepEval as DeepEvalPD

        return DeepEvalPD

    @property
    def neighbor_stat(self) -> type["NeighborStat"]:
        """The neighbor statistics of the backend.

        Returns
        -------
        type[NeighborStat]
            The neighbor statistics of the backend.
        """
        from deepmd.pd.utils.neighbor_stat import (
            NeighborStat,
        )

        return NeighborStat

    @property
    def serialize_hook(self) -> Callable[[str], dict]:
        """The serialize hook to convert the model file to a dictionary.

        Returns
        -------
        Callable[[str], dict]
            The serialize hook of the backend.
        """
        from deepmd.pd.utils.serialization import (
            serialize_from_file,
        )

        return serialize_from_file

    @property
    def deserialize_hook(self) -> Callable[[str, dict], None]:
        """The deserialize hook to convert the dictionary to a model file.

        Returns
        -------
        Callable[[str, dict], None]
            The deserialize hook of the backend.
        """
        from deepmd.pd.utils.serialization import (
            deserialize_to_file,
        )

        return deserialize_to_file
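
The stacked `@Backend.register` decorators make one class reachable under both the `pd` and `paddle` keys. How `Backend.register` is implemented is not part of this diff, so the following is only a minimal sketch of one common way such a decorator-based registry can work; `ToyRegistry` and `ToyPaddleBackend` are hypothetical names, not deepmd classes.

```python
from importlib.util import find_spec
from typing import Callable, ClassVar


class ToyRegistry:
    """Hypothetical registry that maps string keys to backend classes."""

    _backends: ClassVar[dict[str, type]] = {}

    @classmethod
    def register(cls, key: str) -> Callable[[type], type]:
        """Return a class decorator that stores the class under `key`."""

        def decorator(backend_cls: type) -> type:
            cls._backends[key] = backend_cls
            # Returning the class unchanged lets decorators be stacked.
            return backend_cls

        return decorator

    @classmethod
    def get(cls, key: str) -> type:
        """Look up the class registered under `key`."""
        return cls._backends[key]


@ToyRegistry.register("pd")
@ToyRegistry.register("paddle")
class ToyPaddleBackend:
    name = "Paddle"

    def is_available(self) -> bool:
        # Probe for the package without importing it.
        return find_spec("paddle") is not None


backend_cls = ToyRegistry.get("pd")
print(backend_cls.name, backend_cls().is_available())
```

Using `find_spec` for the availability check, and deferring the heavy imports into the property bodies as `PaddleBackend` does, means that registering the backend does not import Paddle.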
2 changes: 1 addition & 1 deletion deepmd/dpmodel/model/make_model.py
@@ -457,7 +457,7 @@ def format_nlist(
Returns
-------
-formated_nlist
+formatted_nlist
the formatted nlist.
"""
3 changes: 2 additions & 1 deletion deepmd/main.py
@@ -99,9 +99,10 @@ def main_parser() -> argparse.ArgumentParser:
formatter_class=RawTextArgumentDefaultsHelpFormatter,
epilog=textwrap.dedent(
"""\
-Use --tf or --pt to choose the backend:
+Use --tf, --pt or --pd to choose the backend:
dp --tf train input.json
dp --pt train input.json
+dp --pd train input.json
"""
),
)
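
The epilog advertises one flag per backend. The actual argument definitions in `main_parser` are outside this hunk, so the sketch below is only an illustration of how such flags could be wired with `argparse`; `build_parser`, the `dp-sketch` program name, and the choice to make the flags mutually exclusive are all assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser: one flag per backend, at most one may be given."""
    parser = argparse.ArgumentParser(prog="dp-sketch")
    group = parser.add_mutually_exclusive_group()
    for flag, backend in (
        ("--tf", "tensorflow"),
        ("--pt", "pytorch"),
        ("--pd", "paddle"),
    ):
        # Every flag writes its backend name into the same destination.
        group.add_argument(
            flag,
            action="store_const",
            const=backend,
            dest="backend",
            help=f"use the {backend} backend",
        )
    parser.add_argument("command", help="sub-command, e.g. train")
    parser.add_argument("input", help="input file, e.g. input.json")
    return parser


args = build_parser().parse_args(["--pd", "train", "input.json"])
print(args.backend, args.command, args.input)
```

In this sketch, `backend` stays `None` when no flag is passed, and passing two backend flags at once makes `argparse` exit with a usage error.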
11 changes: 11 additions & 0 deletions deepmd/pd/__init__.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,11 @@
# SPDX-License-Identifier: LGPL-3.0-or-later

# import customized OPs globally

from deepmd.utils.entry_point import (
    load_entry_point,
)

load_entry_point("deepmd.pd")

__all__ = []
1 change: 1 addition & 0 deletions deepmd/pd/entrypoints/__init__.py
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
# SPDX-License-Identifier: LGPL-3.0-or-later