-
Notifications
You must be signed in to change notification settings - Fork 520
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Integrate paddle backend in devel branch #4157
base: devel
Are you sure you want to change the base?
Changes from 105 commits
59b9af5
7f32618
217bf36
0ad720d
1d7b0d1
f6eeef6
ec021f7
55f71f6
ebdbed2
e0aeb73
0896c90
2e79d68
482d588
a3c4663
2b4832a
f1100e4
3068869
68a2d62
ff7e0ef
023ba53
8a1834f
396bd54
66734bc
ba02ae8
40157dd
8d53aec
e39d466
b97571e
50092c6
e02dd11
8343077
09c54f3
bc854b2
892fd80
04b9064
b3a6408
15a7e75
c792563
72241ea
8694476
e246a34
5ee8bcf
e3c1ceb
49ba5a5
f1cae59
a83fb63
f7f64b1
d5a313e
299548a
97828b3
d67e27c
87069e5
d6e3fdb
8004a52
8e951cd
3a0f700
73c0d17
8a59a53
7821997
ed51258
13c7f55
e013860
4c4568f
7525dac
312a3ef
5e4edd7
0146f24
834d512
9e90416
cfacca3
d492397
de24e27
264286f
fd6aff0
6651e87
39ca3b7
2de0d2b
11d0344
3650214
754b948
f007fb4
da4fb97
f8a4279
1643c6c
0d0662f
8b9ee50
d10a3f7
f6253cf
690dec2
ec47dd8
03084d2
10e4a0d
a1adc8a
5edc0ae
6d13376
9b1f322
2b34756
dad72c8
cbc9c65
afd4746
7b2476f
26047e9
d03702a
c611955
3b27c49
541dae6
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -29,6 +29,7 @@ jobs: | |
source/install/uv_with_retry.sh pip install --system torch -i https://download.pytorch.org/whl/cpu | ||
export PYTORCH_ROOT=$(python -c 'import torch;print(torch.__path__[0])') | ||
source/install/uv_with_retry.sh pip install --system --only-binary=horovod -e .[cpu,test,jax] horovod[tensorflow-cpu] mpi4py | ||
source/install/uv_with_retry.sh pip install --system --pre "paddlepaddle" -i https://www.paddlepaddle.org.cn/packages/nightly/cpu/ | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. 💡 Codebase verification PaddlePaddle index URL is inaccessible or improperly configured The custom PaddlePaddle index URL returns a successful HTTP status but does not list any PaddlePaddle packages and fails to retrieve package metadata. This could lead to installation failures in the CI pipeline.
🔗 Analysis chainConsider the trade-offs of using the nightly PaddlePaddle version The addition of PaddlePaddle installation is crucial for integrating and testing the new Paddle backend. However, using the nightly (pre-release) version in a CI workflow presents both advantages and potential risks: Pros:
Cons:
To mitigate potential issues, consider the following:
source/install/uv_with_retry.sh pip install --system --pre "paddlepaddle" -i https://www.paddlepaddle.org.cn/packages/nightly/cpu/ || \
+ source/install/uv_with_retry.sh pip install --system "paddlepaddle" -i https://pypi.tuna.tsinghua.edu.cn/simple This approach balances the need for testing with the latest features while maintaining CI stability. 🏁 Scripts executedThe following scripts were executed for the analysis: Script: #!/bin/bash
# Verify the PaddlePaddle index URL
curl -sS https://www.paddlepaddle.org.cn/packages/nightly/cpu/ | grep -q "PaddlePaddle"
if [ $? -eq 0 ]; then
echo "PaddlePaddle index URL is accessible and contains expected content."
else
echo "Warning: PaddlePaddle index URL may be unreliable or inaccessible."
fi
Length of output: 248 Script: #!/bin/bash
# Check HTTP status code of the PaddlePaddle index URL
STATUS_CODE=$(curl -o /dev/null -s -w "%{http_code}\n" https://www.paddlepaddle.org.cn/packages/nightly/cpu/)
echo "HTTP Status Code: $STATUS_CODE"
# Verify if PaddlePaddle packages are listed on the index page
PACKAGE_LIST=$(curl -sS https://www.paddlepaddle.org.cn/packages/nightly/cpu/ | grep -E "paddlepaddle[-_](\d+\.)+\d+\.whl")
if [ -n "$PACKAGE_LIST" ]; then
echo "PaddlePaddle packages are available on the index."
else
echo "No PaddlePaddle packages found on the index."
fi
# Retrieve PaddlePaddle package metadata
METADATA=$(curl -sS https://www.paddlepaddle.org.cn/packages/nightly/cpu/paddlepaddle | grep "Metadata")
if [ -n "$METADATA" ]; then
echo "PaddlePaddle package metadata retrieved successfully."
else
echo "Failed to retrieve PaddlePaddle package metadata."
fi
Length of output: 643 |
||
env: | ||
# Please note that uv has some issues with finding | ||
# existing TensorFlow package. Currently, it uses | ||
|
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The change to this file should be reverted There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I see. |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -7,6 +7,9 @@ | |
Optional, | ||
) | ||
|
||
from .find_paddle import ( | ||
get_pd_requirement, | ||
) | ||
from .find_pytorch import ( | ||
get_pt_requirement, | ||
) | ||
|
@@ -34,7 +37,7 @@ def dynamic_metadata( | |
settings: Optional[dict[str, object]] = None, | ||
): | ||
assert field in ["optional-dependencies", "entry-points", "scripts"] | ||
_, _, find_libpython_requires, extra_scripts, tf_version, pt_version = ( | ||
_, _, find_libpython_requires, extra_scripts, tf_version, pt_version, pd_version = ( | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. 💡 Codebase verification Unused The variable
Analysis chainAddress inconsistency and unused variable
To address these issues:
def dynamic_metadata(
field: str,
settings: Optional[Dict[str, object]] = None,
pd_version: Optional[str] = None,
):
elif field == "optional-dependencies":
optional_dependencies = pyproject["tool"]["deepmd_build_backend"][
"optional-dependencies"
]
optional_dependencies["lmp"].extend(find_libpython_requires)
optional_dependencies["ipi"].extend(find_libpython_requires)
return {
**optional_dependencies,
**get_tf_requirement(tf_version),
**get_pt_requirement(pt_version),
**get_pd_requirement(pd_version), # Add this line
}
To ensure This will help verify the usage of Scripts executedThe following scripts were executed for the analysis: Script: #!/bin/bash
# Description: Check for occurrences of pd_version and potential Paddle-related functions
# Search for pd_version
echo "Occurrences of pd_version:"
rg 'pd_version' --type python
# Search for potential Paddle-related functions
echo "\nPotential Paddle-related functions:"
rg 'get_pd_requirement|paddle' --type python
Length of output: 286 Script: #!/bin/bash
# Description: Check for occurrences of pd_version and potential Paddle-related functions
# Search for pd_version in Python files
echo "Occurrences of pd_version:"
rg 'pd_version' --type py
# Search for potential Paddle-related functions in Python files
echo "\nPotential Paddle-related functions:"
rg 'get_pd_requirement|paddle' --type py
Length of output: 167111 |
||
get_argument_from_env() | ||
) | ||
with Path("pyproject.toml").open("rb") as f: | ||
|
@@ -55,4 +58,5 @@ def dynamic_metadata( | |
**optional_dependencies, | ||
**get_tf_requirement(tf_version), | ||
**get_pt_requirement(pt_version), | ||
**get_pd_requirement(pd_version), | ||
} |
Original file line number | Diff line number | Diff line change | ||||||||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
@@ -0,0 +1,133 @@ | ||||||||||||||||||||||||||
# SPDX-License-Identifier: LGPL-3.0-or-later | ||||||||||||||||||||||||||
import importlib | ||||||||||||||||||||||||||
import os | ||||||||||||||||||||||||||
import site | ||||||||||||||||||||||||||
from functools import ( | ||||||||||||||||||||||||||
lru_cache, | ||||||||||||||||||||||||||
) | ||||||||||||||||||||||||||
from importlib.machinery import ( | ||||||||||||||||||||||||||
FileFinder, | ||||||||||||||||||||||||||
) | ||||||||||||||||||||||||||
from importlib.util import ( | ||||||||||||||||||||||||||
find_spec, | ||||||||||||||||||||||||||
) | ||||||||||||||||||||||||||
from pathlib import ( | ||||||||||||||||||||||||||
Path, | ||||||||||||||||||||||||||
) | ||||||||||||||||||||||||||
from sysconfig import ( | ||||||||||||||||||||||||||
get_path, | ||||||||||||||||||||||||||
) | ||||||||||||||||||||||||||
from typing import ( | ||||||||||||||||||||||||||
Optional, | ||||||||||||||||||||||||||
Union, | ||||||||||||||||||||||||||
) | ||||||||||||||||||||||||||
|
||||||||||||||||||||||||||
|
||||||||||||||||||||||||||
@lru_cache | ||||||||||||||||||||||||||
def find_paddle() -> tuple[Optional[str], list[str]]: | ||||||||||||||||||||||||||
"""Find PaddlePadle library. | ||||||||||||||||||||||||||
|
||||||||||||||||||||||||||
Tries to find PaddlePadle in the order of: | ||||||||||||||||||||||||||
|
||||||||||||||||||||||||||
1. Environment variable `PADDLE_ROOT` if set | ||||||||||||||||||||||||||
2. The current Python environment. | ||||||||||||||||||||||||||
3. user site packages directory if enabled | ||||||||||||||||||||||||||
4. system site packages directory (purelib) | ||||||||||||||||||||||||||
|
||||||||||||||||||||||||||
Considering the default PaddlePadle package still uses old CXX11 ABI, we | ||||||||||||||||||||||||||
cannot install it automatically. | ||||||||||||||||||||||||||
|
||||||||||||||||||||||||||
Returns | ||||||||||||||||||||||||||
------- | ||||||||||||||||||||||||||
str, optional | ||||||||||||||||||||||||||
PaddlePadle library path if found. | ||||||||||||||||||||||||||
list of str | ||||||||||||||||||||||||||
Paddle requirement if not found. Empty if found. | ||||||||||||||||||||||||||
""" | ||||||||||||||||||||||||||
if os.environ.get("DP_ENABLE_PADDLE", "0") == "0": | ||||||||||||||||||||||||||
return None, [] | ||||||||||||||||||||||||||
requires = [] | ||||||||||||||||||||||||||
pd_spec = None | ||||||||||||||||||||||||||
|
||||||||||||||||||||||||||
if (pd_spec is None or not pd_spec) and os.environ.get("PADDLE_ROOT") is not None: | ||||||||||||||||||||||||||
site_packages = Path(os.environ.get("PADDLE_ROOT")).parent.absolute() | ||||||||||||||||||||||||||
pd_spec = FileFinder(str(site_packages)).find_spec("paddle") | ||||||||||||||||||||||||||
|
||||||||||||||||||||||||||
# get paddle spec | ||||||||||||||||||||||||||
# note: isolated build will not work for backend | ||||||||||||||||||||||||||
if pd_spec is None or not pd_spec: | ||||||||||||||||||||||||||
pd_spec = find_spec("paddle") | ||||||||||||||||||||||||||
|
||||||||||||||||||||||||||
if not pd_spec and site.ENABLE_USER_SITE: | ||||||||||||||||||||||||||
# first search TF from user site-packages before global site-packages | ||||||||||||||||||||||||||
site_packages = site.getusersitepackages() | ||||||||||||||||||||||||||
if site_packages: | ||||||||||||||||||||||||||
pd_spec = FileFinder(site_packages).find_spec("paddle") | ||||||||||||||||||||||||||
|
||||||||||||||||||||||||||
if not pd_spec: | ||||||||||||||||||||||||||
# purelib gets site-packages path | ||||||||||||||||||||||||||
site_packages = get_path("purelib") | ||||||||||||||||||||||||||
if site_packages: | ||||||||||||||||||||||||||
pd_spec = FileFinder(site_packages).find_spec("paddle") | ||||||||||||||||||||||||||
|
||||||||||||||||||||||||||
# get install dir from spec | ||||||||||||||||||||||||||
try: | ||||||||||||||||||||||||||
pd_install_dir = pd_spec.submodule_search_locations[0] # type: ignore | ||||||||||||||||||||||||||
# AttributeError if ft_spec is None | ||||||||||||||||||||||||||
# TypeError if submodule_search_locations are None | ||||||||||||||||||||||||||
# IndexError if submodule_search_locations is an empty list | ||||||||||||||||||||||||||
except (AttributeError, TypeError, IndexError): | ||||||||||||||||||||||||||
pd_install_dir = None | ||||||||||||||||||||||||||
requires.extend(get_pd_requirement()["paddle"]) | ||||||||||||||||||||||||||
return pd_install_dir, requires | ||||||||||||||||||||||||||
|
||||||||||||||||||||||||||
|
||||||||||||||||||||||||||
@lru_cache | ||||||||||||||||||||||||||
def get_pd_requirement(pd_version: str = "") -> dict: | ||||||||||||||||||||||||||
"""Get PaddlePadle requirement when Paddle is not installed. | ||||||||||||||||||||||||||
|
||||||||||||||||||||||||||
If pd_version is not given and the environment variable `PADDLE_VERSION` is set, use it as the requirement. | ||||||||||||||||||||||||||
|
||||||||||||||||||||||||||
Parameters | ||||||||||||||||||||||||||
---------- | ||||||||||||||||||||||||||
pd_version : str, optional | ||||||||||||||||||||||||||
Paddle version | ||||||||||||||||||||||||||
|
||||||||||||||||||||||||||
Returns | ||||||||||||||||||||||||||
------- | ||||||||||||||||||||||||||
dict | ||||||||||||||||||||||||||
PaddlePadle requirement. | ||||||||||||||||||||||||||
""" | ||||||||||||||||||||||||||
if pd_version is None: | ||||||||||||||||||||||||||
return {"paddle": []} | ||||||||||||||||||||||||||
if pd_version == "": | ||||||||||||||||||||||||||
pd_version = os.environ.get("PADDLE_VERSION", "") | ||||||||||||||||||||||||||
Comment on lines
+86
to
+104
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Inconsistent handling of In Apply this diff to address the issue: If -def get_pd_requirement(pd_version: str = "") -> dict:
+def get_pd_requirement(pd_version: Optional[str] = "") -> dict: Alternatively, if - if pd_version is None:
- return {"paddle": []}
|
||||||||||||||||||||||||||
|
||||||||||||||||||||||||||
return { | ||||||||||||||||||||||||||
"paddle": [ | ||||||||||||||||||||||||||
"paddlepaddle" if pd_version != "" else "paddlepaddle", | ||||||||||||||||||||||||||
], | ||||||||||||||||||||||||||
} | ||||||||||||||||||||||||||
|
||||||||||||||||||||||||||
|
||||||||||||||||||||||||||
@lru_cache | ||||||||||||||||||||||||||
def get_pd_version(pd_path: Optional[Union[str, Path]]) -> str: | ||||||||||||||||||||||||||
"""Get Paddle version from a Paddle Python library path. | ||||||||||||||||||||||||||
|
||||||||||||||||||||||||||
Parameters | ||||||||||||||||||||||||||
---------- | ||||||||||||||||||||||||||
pd_path : str or Path | ||||||||||||||||||||||||||
Paddle Python library path, e.g. "/python3.10/site-packages/paddle/" | ||||||||||||||||||||||||||
|
||||||||||||||||||||||||||
Returns | ||||||||||||||||||||||||||
------- | ||||||||||||||||||||||||||
str | ||||||||||||||||||||||||||
version | ||||||||||||||||||||||||||
""" | ||||||||||||||||||||||||||
if pd_path is None or pd_path == "": | ||||||||||||||||||||||||||
return "" | ||||||||||||||||||||||||||
version_file = Path(pd_path) / "version" / "__init__.py" | ||||||||||||||||||||||||||
spec = importlib.util.spec_from_file_location("paddle.version", version_file) | ||||||||||||||||||||||||||
module = importlib.util.module_from_spec(spec) | ||||||||||||||||||||||||||
spec.loader.exec_module(module) | ||||||||||||||||||||||||||
Comment on lines
+130
to
+132
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Add error handling for potential The function Apply this diff to enhance error handling: version_file = Path(pd_path) / "version" / "__init__.py"
+ if not version_file.exists():
+ return ""
spec = importlib.util.spec_from_file_location("paddle.version", version_file)
+ if spec is None or spec.loader is None:
+ return ""
module = importlib.util.module_from_spec(spec)
spec.loader.exec_module(module)
return module.full_version 📝 Committable suggestion
Suggested change
|
||||||||||||||||||||||||||
return module.full_version |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -10,6 +10,10 @@ | |
Version, | ||
) | ||
|
||
from .find_paddle import ( | ||
find_paddle, | ||
get_pd_version, | ||
) | ||
from .find_pytorch import ( | ||
find_pytorch, | ||
get_pt_version, | ||
|
@@ -21,7 +25,7 @@ | |
|
||
|
||
@lru_cache | ||
def get_argument_from_env() -> tuple[str, list, list, dict, str, str]: | ||
def get_argument_from_env() -> tuple[str, list, list, dict, str, str, str]: | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. 🛠️ Refactor suggestion Consider refactoring return values into a data structure for better maintainability As the number of return values from You can define a from typing import NamedTuple
class BuildArguments(NamedTuple):
cmake_minimum_required_version: str
cmake_args: list
find_libpython_requires: list
extra_scripts: dict
tf_version: str
pt_version: str
pd_version: str Update the function signature: -def get_argument_from_env() -> tuple[str, list, list, dict, str, str, str]:
+def get_argument_from_env() -> BuildArguments: Update the return statement: -return (
+return BuildArguments(
cmake_minimum_required_version,
cmake_args,
find_libpython_requires,
extra_scripts,
tf_version,
pt_version,
pd_version,
) This approach enhances code clarity when unpacking values: -def set_scikit_build_env():
+def set_scikit_build_env():
"""Set scikit-build environment variables before executing scikit-build."""
- cmake_minimum_required_version, cmake_args, _, _, _, _, _ = get_argument_from_env()
+ build_args = get_argument_from_env()
os.environ["SKBUILD_CMAKE_MINIMUM_VERSION"] = build_args.cmake_minimum_required_version
os.environ["SKBUILD_CMAKE_ARGS"] = ";".join(build_args.cmake_args) Also applies to: 47-48, 150-150 |
||
"""Get the arguments from environment variables. | ||
|
||
The environment variables are assumed to be not changed during the build. | ||
|
@@ -40,6 +44,8 @@ def get_argument_from_env() -> tuple[str, list, list, dict, str, str]: | |
The TensorFlow version. | ||
str | ||
The PyTorch version. | ||
str | ||
The Paddle version. | ||
""" | ||
cmake_args = [] | ||
extra_scripts = {} | ||
|
@@ -117,6 +123,18 @@ def get_argument_from_env() -> tuple[str, list, list, dict, str, str]: | |
cmake_args.append("-DENABLE_PYTORCH=OFF") | ||
pt_version = None | ||
|
||
if os.environ.get("DP_ENABLE_PADDLE", "0") == "1": | ||
pd_install_dir, _ = find_paddle() | ||
pd_version = get_pd_version(pd_install_dir) | ||
cmake_args.extend( | ||
[ | ||
"-DENABLE_PADDLE=ON", | ||
] | ||
) | ||
else: | ||
cmake_args.append("-DENABLE_PADDLE=OFF") | ||
pd_version = None | ||
|
||
cmake_args = [ | ||
"-DBUILD_PY_IF:BOOL=TRUE", | ||
*cmake_args, | ||
|
@@ -128,11 +146,12 @@ def get_argument_from_env() -> tuple[str, list, list, dict, str, str]: | |
extra_scripts, | ||
tf_version, | ||
pt_version, | ||
pd_version, | ||
) | ||
|
||
|
||
def set_scikit_build_env(): | ||
"""Set scikit-build environment variables before executing scikit-build.""" | ||
cmake_minimum_required_version, cmake_args, _, _, _, _ = get_argument_from_env() | ||
cmake_minimum_required_version, cmake_args, _, _, _, _, _ = get_argument_from_env() | ||
os.environ["SKBUILD_CMAKE_MINIMUM_VERSION"] = cmake_minimum_required_version | ||
os.environ["SKBUILD_CMAKE_ARGS"] = ";".join(cmake_args) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Improve robustness of environment variable setup.
While the environment setup is functionally correct, it could be more maintainable and robust.
Consider consolidating the environment setup:
📝 Committable suggestion