Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

breaking: drop Python 3.8 support #4185

Merged
merged 5 commits into from
Oct 6, 2024
Merged

Conversation

njzjz
Copy link
Member

@njzjz njzjz commented Oct 5, 2024

Summary by CodeRabbit

Release Notes

  • New Features

    • Introduced new functionality for handling input data conversion between different versions.
    • Added EnvMatStatSe class for enhanced environmental matrix statistics calculations.
    • Implemented a mechanism to track the status of atoms (real vs. virtual) in BaseAtomicModel.
  • Bug Fixes

    • Updated Python version requirements across documentation and configuration files to Python 3.9 or above.
  • Documentation

    • Updated installation guides to reflect the new Python version requirement and clarified virtual environment setup instructions.
  • Chores

    • Refined dependency management in pyproject.toml to support newer Python versions and improve version control.

Copy link
Contributor

coderabbitai bot commented Oct 5, 2024

📝 Walkthrough
📝 Walkthrough
📝 Walkthrough

Walkthrough

The pull request introduces several changes across multiple files, primarily focusing on updating the minimum supported Python version from 3.8 to 3.9. This includes modifications to workflow configurations, import statements, class definitions, and method signatures. Additionally, new functionality is added for handling input data conversions and deprecations. The overall structure and logic of the code remain intact, but enhancements are made to improve compatibility with newer Python versions and streamline installation instructions.

Changes

File Path Change Summary
.github/workflows/test_python.yml Updated Python versions in job matrices from ["3.8", "3.12"] to ["3.9", "3.12"] for jobs testpython and update_durations.
deepmd/dpmodel/utils/neighbor_stat.py Updated import statements; added NeighborStatOP class; modified NeighborStat class and methods for neighbor statistics.
deepmd/pt/model/atomic_model/base_atomic_model.py Updated forward_common_atomic and atomic_output_def methods to include a new output key "mask".
deepmd/pt/utils/env_mat_stat.py Added EnvMatStatSe class with several methods; updated import statements.
deepmd/pt/utils/neighbor_stat.py Updated import statements; defined NeighborStatOP class; modified NeighborStat class.
deepmd/tf/entrypoints/transfer.py Updated import statements; changed shape parameter in convert_matrix to use Sequence[int].
deepmd/tf/infer/deep_eval.py Replaced lru_cache with cache decorator for properties in DeepEval class.
deepmd/tf/utils/neighbor_stat.py Updated import statements; minor adjustments to NeighborStat class constructor.
deepmd/tf/utils/parallel_op.py Updated import statements for Generator.
deepmd/utils/compat.py Updated import statements; added functions for input data conversion and deprecation handling.
deepmd/utils/data_system.py Updated caching strategy for default_mesh; added _make_auto_ts method; deprecated get_test method.
deepmd/utils/env_mat_stat.py Updated import statements; no significant changes to class functionality.
deepmd/utils/neighbor_stat.py Updated import statements; no significant changes to class functionality.
doc/development/coding-conventions.rst Updated minimum supported Python version from 3.8 to 3.9.
doc/install/easy-install.md Updated Python version requirement from 3.8 to 3.9.
doc/install/install-from-source.md Updated Python version requirement from 3.8 to 3.9; clarified virtual environment setup instructions.
pyproject.toml Updated requires-python from >=3.8 to >=3.9; added classifiers for Python 3.9 to 3.12.

Possibly related PRs

Suggested reviewers

  • wanghan-iapcm

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@github-actions github-actions bot added the Docs label Oct 5, 2024
@njzjz njzjz added breaking change Breaking changes that should notify users. and removed Docs labels Oct 5, 2024
@njzjz
Copy link
Member Author

njzjz commented Oct 5, 2024

pre-commit.ci autofix

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🧹 Outside diff range and nitpick comments (23)
deepmd/tf/utils/parallel_op.py (1)

Line range hint 68-69: Verified: Import change has no negative impact on existing code

The change from typing.Generator to collections.abc.Generator doesn't negatively impact the existing code. The Generator type is correctly used in the type hint for the feed parameter in the generate method.

For improved clarity and consistency with modern Python typing practices, consider updating the return type hint of the generate method:

    def generate(
        self, sess: tf.Session, feed: Generator[Dict[str, Any], None, None]
-   ) -> Generator[Tuple, None, None]:
+   ) -> Generator[Tuple[Any, ...], None, None]:

This change specifies that the generator yields tuples of any length containing any type, which is more precise than the current hint.

Also applies to: 70-71, 72-73, 74-75

.github/workflows/test_python.yml (1)

Line range hint 1-99: Consider the broader impact of dropping Python 3.8 support

The changes in this file consistently update the Python versions used in the CI workflow, which is a good start for dropping Python 3.8 support. However, consider the following points to ensure a smooth transition:

  1. Update project documentation, including README, installation instructions, and any version compatibility matrices.
  2. Review and update any version-specific code or dependencies throughout the project.
  3. Consider adding a note in the project's changelog or release notes about this breaking change.
  4. Ensure that all project dependencies are compatible with Python 3.9+.
  5. Update any setup files (e.g., setup.py, pyproject.toml) to reflect the new minimum Python version.

To help with this transition, you might want to run the following script to identify files that may need updating:

#!/bin/bash
# Description: Identify files that may need updating due to Python version change

# Search for files mentioning Python versions
rg -l 'python.*3\.[89]' --type-not yml

# Search for setup files
fd -e py -e toml '(setup|pyproject)'

# Search for documentation files
fd -e md -e rst '(README|INSTALL|CHANGELOG)'

This will help ensure that the entire project is consistent with the decision to drop Python 3.8 support.

doc/install/easy-install.md (2)

13-13: Approved: Python version requirement updated.

The Python version requirement has been correctly updated from 3.8 to 3.9, which aligns with the PR objective of dropping Python 3.8 support.

Consider adding a note about this being a breaking change, perhaps in a "What's New" or "Changelog" section, to clearly communicate this update to users. This could help users understand why they might need to update their Python environment.


Line range hint 1-128: Consider minor improvements for better user experience

The documentation is well-structured and informative. Consider the following suggestions to further enhance user experience:

  1. Add a brief explanation of what DeePMD-kit is at the beginning of the document for new users.
  2. Include a "Requirements" section at the top, summarizing all prerequisites (e.g., Python version, GNU C Library version, NVIDIA driver for GPU version).
  3. Add links to the official documentation or GitHub repository for each third-party tool mentioned (e.g., conda, docker).
  4. Consider adding a "Troubleshooting" section at the end with common installation issues and their solutions.

These additions could make the installation process smoother for users, especially those new to DeePMD-kit or less experienced with software installation.

deepmd/utils/env_mat_stat.py (1)

Line range hint 170-178: Verify: Limited impact of import change and suggestion for type hinting

The import change from typing to collections.abc for Iterator has a limited impact on the codebase. The only usage of Iterator is in the method signature of EnvMatStat.iter(), which remains correct after the change.

As a small improvement, consider updating the type hint for the data parameter in the iter method to use typing.List instead of List. This would make the import more explicit:

from typing import Dict, List as TypingList, Optional

# ...

def iter(self, data: TypingList[Dict[str, np.ndarray]]) -> Iterator[Dict[str, StatItem]]:

This change isn't strictly necessary but could improve code readability and import clarity.

deepmd/tf/utils/neighbor_stat.py (1)

Line range hint 141-144: Approve: Added default value for mixed_type parameter

The addition of a default value (False) for the mixed_type parameter in the NeighborStat constructor is a good improvement. It simplifies the instantiation of NeighborStat objects when mixed type behavior is not required, while maintaining backwards compatibility.

Consider updating the class docstring to reflect this change:

class NeighborStat(BaseNeighborStat):
    """Class for getting training data information.

    It loads data from DeepmdData object, and measures the data info, including neareest nbor distance between atoms, max nbor size of atoms and the output data range of the environment matrix.

    Parameters
    ----------
    ntypes : int
        The num of atom types
    rcut : float
        The cut-off radius
    mixed_type : bool, optional
        Treat all types as a single type. Default is False.
    """
pyproject.toml (3)

31-34: LGTM! Consider updating the Programming Language :: Python :: 3 :: Only classifier.

The changes to drop Python 3.8 support and add support for Python 3.9-3.12 are well implemented. The requires-python field has been correctly updated to ">=3.9".

Consider updating the existing classifier "Programming Language :: Python :: 3 :: Only" to "Programming Language :: Python :: 3.9 :: Only" to more accurately reflect the minimum supported Python version.

Also applies to: 57-57


Line range hint 258-259: LGTM! Consider documenting the PyTorch version constraint.

The addition of the PYTORCH_VERSION environment variable and the accompanying comment explaining the version constraint are good practices. This ensures compatibility between PyTorch and the project's CUDA requirements.

Consider adding a note in the project's documentation (e.g., README.md or a requirements file) about the PyTorch version constraint and the reason for it. This will help other developers understand the limitation without having to dig into the build configuration.


Add support for additional Python versions in cibuildwheel configuration

The current cibuildwheel setup only includes Python 3.11. To ensure compatibility with the required Python versions (3.9, 3.10, and 3.12), please update the build array in pyproject.toml accordingly.

  • Update build = ["cp311-*"] to include the additional versions:
    build = ["cp39-*", "cp310-*", "cp311-*", "cp312-*"]
🔗 Analysis chain

Line range hint 202-204: Verify cibuildwheel configuration compatibility

While there are no visible changes in the [tool.cibuildwheel] section, please ensure that the existing configuration, particularly the build = ["cp311-*"] line, is compatible with the new Python version requirements (3.9+). Consider if you need to add builds for Python 3.9, 3.10, and 3.12 as well.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Verify if cibuildwheel configuration is compatible with new Python requirements

# Check if there are any references to Python 3.8 in the cibuildwheel configuration
grep -n "cp38" pyproject.toml

# List all Python version references in the cibuildwheel configuration
grep -n "cp3" pyproject.toml

Length of output: 245

doc/install/install-from-source.md (1)

24-24: LGTM: Python version requirement updated correctly.

The change from Python 3.8 to 3.9 as the minimum required version is consistent with the PR objective of dropping Python 3.8 support.

Consider updating other parts of the documentation that might reference Python 3.8, such as example commands or compatibility notes, to ensure consistency throughout the document.

deepmd/tf/infer/deep_eval.py (1)

Line range hint 1-1230: Overall: Consistent modernization, but address potential issues

The changes in this file consistently update the caching mechanism from lru_cache to cache, aligning with the PR's objective of dropping Python 3.8 support. This modernization simplifies the code and leverages newer Python features.

However, there are several points to consider:

  1. Memory Management: The static analysis tool flagged potential memory leak issues with caching methods. Ensure that all cached properties don't depend on mutable state and that the number of unique instances is bounded.

  2. Session Handling: Pay special attention to the caching of the TensorFlow session. Implement proper cleanup procedures to prevent resource leaks.

  3. Code Maintenance: The updates to DeepEvalOld raise questions about code maintenance strategies. Consider deprecating or removing unused code to reduce technical debt.

  4. Documentation: Add comments explaining the caching strategy, cleanup procedures, and future plans for deprecated classes.

  5. Testing: Ensure that these changes are covered by comprehensive tests, especially focusing on memory usage and session management.

By addressing these points, you'll not only modernize the code but also improve its overall quality and maintainability.

deepmd/dpmodel/utils/neighbor_stat.py (2)

Line range hint 53-54: Typo in docstring: 'neareest' should be 'nearest'

In the docstring of the call method, the word "neareest" is misspelled. Please correct it to "nearest".

Apply this diff to fix the typo:

-        """Calculate the neareest neighbor distance between atoms, maximum nbor size of
+        """Calculate the nearest neighbor distance between atoms, maximum neighbor size of

Additionally, consider expanding "nbor" to "neighbor" for clarity.


Line range hint 127-129: Inconsistent parameter naming: mixed_type vs mixed_types

There is an inconsistency in the parameter naming between the NeighborStat and NeighborStatOP classes:

  • In NeighborStat.__init__, the parameter is named mixed_type (singular).
  • In NeighborStatOP.__init__, the parameter is named mixed_types (plural).

For consistency and to avoid confusion, please standardize the parameter name across both classes.

Apply this diff to standardize the parameter name to mixed_types:

In NeighborStat class:

 class NeighborStat(BaseNeighborStat):
     def __init__(
         self,
         ntypes: int,
         rcut: float,
-        mixed_type: bool = False,
+        mixed_types: bool = False,
     ) -> None:
-        super().__init__(ntypes, rcut, mixed_type)
-        self.op = NeighborStatOP(ntypes, rcut, mixed_type)
+        super().__init__(ntypes, rcut, mixed_types)
+        self.op = NeighborStatOP(ntypes, rcut, mixed_types)

Also, update the docstrings to reflect the parameter name change.

Also applies to: 39-42

deepmd/pt/utils/neighbor_stat.py (3)

Line range hint 94-95: Potential issue with masking diff tensor due to dimensionality mismatch

The masking operation on the diff tensor may not correctly exclude self-interactions. The mask created with torch.eye(nloc, nall, dtype=torch.bool, device=diff.device) has shape [nloc, nall], which may not align properly with diff of shape [nframes, nloc, nall, 3]. This can lead to incorrect masking or runtime errors.

Consider adjusting the mask to match the dimensions of diff by unsqueezing it along the frame dimension:

-mask = torch.eye(nloc, nall, dtype=torch.bool, device=diff.device)
+mask = torch.eye(nloc, nall, dtype=torch.bool, device=diff.device).unsqueeze(0)
 diff[mask] = torch.inf

Alternatively, you might use advanced indexing or broadcasting to apply the mask correctly across all frames.


Line range hint 104-106: Optimize neighbor count calculation by vectorizing over atom types

The loop over range(self.ntypes) to compute nnei can be a performance bottleneck for systems with many atom types. Vectorizing this operation can enhance performance by leveraging PyTorch's efficient tensor computations.

Refactor the code as follows to eliminate the Python loop:

 for ii in range(self.ntypes):
     nnei[:, :, ii] = torch.sum(
         mask & extend_atype.eq(ii)[:, None, :], dim=-1
     )
+else:
+    # Vectorized computation over atom types
+    type_mask = extend_atype[:, None, :, None] == torch.arange(self.ntypes, device=extend_atype.device)
+    nnei = torch.sum(mask[..., None] & type_mask, dim=-2)

This approach utilizes broadcasting and removes the explicit loop, improving scalability and performance.


Line range hint 134-139: Inconsistent parameter naming: mixed_type vs. mixed_types

The parameter mixed_type in NeighborStat is inconsistently named compared to mixed_types in NeighborStatOP. This mismatch can lead to confusion and potential errors.

To ensure consistency, rename mixed_type to mixed_types in NeighborStat:

 class NeighborStat(BaseNeighborStat):
     """Neighbor statistics using pure NumPy.

     def __init__(
         self,
         ntypes: int,
         rcut: float,
-        mixed_type: bool = False,
+        mixed_types: bool = False,
     ) -> None:
-        super().__init__(ntypes, rcut, mixed_type)
-        op = NeighborStatOP(ntypes, rcut, mixed_type)
+        super().__init__(ntypes, rcut, mixed_types)
+        op = NeighborStatOP(ntypes, rcut, mixed_types)
         self.op = torch.jit.script(op)
         self.auto_batch_size = AutoBatchSize()

Ensure all references to mixed_type within NeighborStat are updated accordingly.

deepmd/pt/utils/env_mat_stat.py (3)

Line range hint 22-24: Typo in Class Docstring

There's a typo in the class docstring of EnvMatStatSe:

"Environmental matrix statistics for the se_a/se_r environemntal matrix."

The word "environemntal" should be "environmental".


Line range hint 44-50: Typo in Exception Message

In the __init__ method's error handling, there's a typo in the exception message:

"last_dim should be 1 for raial-only or 4 for full descriptor."

It should be "radial-only".


Line range hint 56-66: Enhance Parameter Description in Docstring

In the iter method's docstring, the description for the data parameter is minimal:

data : List[Dict[str, Union[torch.Tensor, List[Tuple[int, int]]]]]

The data.

Consider providing a more detailed description of the expected format and contents of data to improve understandability for users of the method.

deepmd/utils/compat.py (4)

Line range hint 124-153: Ensure 'training' key exists in jdata before accessing

In the convert_input_v1_v2 function, accessing jdata["training"] without checking if the 'training' key exists may lead to a KeyError if 'training' is missing in jdata.

Consider adding a check to ensure 'training' exists:

 def convert_input_v1_v2(
     jdata: Dict[str, Any], warning: bool = True, dump: Optional[Union[str, Path]] = None
 ) -> Dict[str, Any]:
+    if "training" not in jdata:
+        raise KeyError("The 'training' key is missing in jdata.")
     tr_cfg = jdata["training"]

Line range hint 153-184: Handle missing 'training' key safely in deprecate_numb_test

Using jdata.get("training", {}) may inadvertently create a new empty dictionary if 'training' does not exist, and calling pop("numb_test") on it will raise a KeyError.

Modify the code to check for the existence of 'training' and 'numb_test' keys:

 def deprecate_numb_test(
     jdata: Dict[str, Any], warning: bool = True, dump: Optional[Union[str, Path]] = None
 ) -> Dict[str, Any]:
-    try:
-        jdata.get("training", {}).pop("numb_test")
-    except KeyError:
-        pass
+    training_cfg = jdata.get("training")
+    if training_cfg and "numb_test" in training_cfg:
+        training_cfg.pop("numb_test")
+        if warning:
+            warnings.warn(
+                "The argument training->numb_test has been deprecated since v2.0.0. "
+                "Use training->validation_data->batch_size instead."
+            )

Line range hint 185-198: Prevent KeyError when checking 'systems' in jdata["training"]

In the is_deepmd_v1_input function, accessing jdata["training"] without verifying its existence can cause a KeyError if 'training' is not a key in jdata.

Adjust the function to safely check for 'training':

 def is_deepmd_v1_input(jdata):
-    return "systems" in jdata["training"].keys()
+    return "training" in jdata and "systems" in jdata["training"]

Line range hint 185-198: Simplify conditional logic in update_deepmd_input

Refactoring the conditionals can enhance readability and maintainability.

You can restructure the update_deepmd_input function as follows:

 def update_deepmd_input(
     jdata: Dict[str, Any], warning: bool = True, dump: Optional[Union[str, Path]] = None
 ) -> Dict[str, Any]:
-    def is_deepmd_v0_input(jdata):
-        return "model" not in jdata.keys()
-
-    def is_deepmd_v1_input(jdata):
-        return "training" in jdata and "systems" in jdata["training"]
-
-    if is_deepmd_v0_input(jdata):
+    if "model" not in jdata:
         jdata = convert_input_v0_v1(jdata, warning, None)
-        jdata = convert_input_v1_v2(jdata, False, None)
-        jdata = deprecate_numb_test(jdata, False, dump)
-    elif is_deepmd_v1_input(jdata):
-        jdata = convert_input_v1_v2(jdata, warning, None)
-        jdata = deprecate_numb_test(jdata, False, dump)
+        warning = False
+    if "training" in jdata and "systems" in jdata["training"]:
+        jdata = convert_input_v1_v2(jdata, warning, None)
+        warning = False
     jdata = deprecate_numb_test(jdata, warning, dump)
     return jdata
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between 7ce5b03 and d619adc.

📒 Files selected for processing (17)
  • .github/workflows/test_python.yml (2 hunks)
  • deepmd/dpmodel/utils/neighbor_stat.py (1 hunks)
  • deepmd/pt/model/atomic_model/base_atomic_model.py (1 hunks)
  • deepmd/pt/utils/env_mat_stat.py (1 hunks)
  • deepmd/pt/utils/neighbor_stat.py (1 hunks)
  • deepmd/tf/entrypoints/transfer.py (1 hunks)
  • deepmd/tf/infer/deep_eval.py (7 hunks)
  • deepmd/tf/utils/neighbor_stat.py (1 hunks)
  • deepmd/tf/utils/parallel_op.py (1 hunks)
  • deepmd/utils/compat.py (1 hunks)
  • deepmd/utils/data_system.py (2 hunks)
  • deepmd/utils/env_mat_stat.py (1 hunks)
  • deepmd/utils/neighbor_stat.py (1 hunks)
  • doc/development/coding-conventions.rst (1 hunks)
  • doc/install/easy-install.md (1 hunks)
  • doc/install/install-from-source.md (2 hunks)
  • pyproject.toml (2 hunks)
✅ Files skipped from review due to trivial changes (1)
  • deepmd/utils/neighbor_stat.py
🧰 Additional context used
🪛 Ruff
deepmd/tf/infer/deep_eval.py

271-271: Use of functools.lru_cache or functools.cache on methods can lead to memory leaks

(B019)


296-296: Use of functools.lru_cache or functools.cache on methods can lead to memory leaks

(B019)


315-315: Use of functools.lru_cache or functools.cache on methods can lead to memory leaks

(B019)


1200-1200: Use of functools.lru_cache or functools.cache on methods can lead to memory leaks

(B019)


1211-1211: Use of functools.lru_cache or functools.cache on methods can lead to memory leaks

(B019)


1230-1230: Use of functools.lru_cache or functools.cache on methods can lead to memory leaks

(B019)

deepmd/utils/data_system.py

244-244: Use of functools.lru_cache or functools.cache on methods can lead to memory leaks

(B019)

🔇 Additional comments (28)
deepmd/tf/utils/parallel_op.py (1)

2-4: Approved: Import change improves Python version compatibility

The change from typing.Generator to collections.abc.Generator is a positive step towards better compatibility with newer Python versions. This modification aligns with the PR objective of dropping Python 3.8 support.

Key points:

  1. collections.abc.Generator is available since Python 3.9 and is the preferred way to import Generator in newer Python versions.
  2. This change doesn't affect the functionality of the code but improves its forward compatibility.
  3. It's a part of the natural evolution of the Python typing system, moving from the typing module to more specific ABC (Abstract Base Class) implementations.
.github/workflows/test_python.yml (1)

72-72: LGTM: Consistent Python version update in update_durations job.

The change from Python 3.8 to 3.9 in the update_durations job matrix is consistent with the previous update in the testpython job. This ensures that duration updates are performed for the same Python versions being tested, maintaining consistency throughout the workflow.

To ensure all parts of the workflow are updated consistently, please run the following script:

#!/bin/bash
# Description: Check for any remaining references to Python 3.8 in the workflow file

# Search for Python 3.8 references in this workflow file
rg -i 'python.*3\.8' .github/workflows/test_python.yml

# Search for other potential version-specific configurations
rg -i 'matrix\.python\s*==\s*' .github/workflows/test_python.yml

If any results are found, consider updating those references or configurations to align with the new minimum supported Python version.

doc/development/coding-conventions.rst (2)

Line range hint 1-33: Consider additional updates related to Python 3.9+

While the Python version update is correct, consider reviewing the following:

  1. Are there any coding conventions or best practices that could be updated to leverage Python 3.9+ features?
  2. Does the pre-commit configuration need updating to reflect the new minimum Python version?

To check the pre-commit configuration, please run:

#!/bin/bash
# Description: Check pre-commit configuration for Python version

echo "Checking pre-commit configuration:"
grep -Hn 'python_version' .pre-commit-config.yaml

This will help ensure that the pre-commit hooks are configured for Python 3.9+.

Also applies to: 34-149


33-33: LGTM: Python version update is correct.

The update to Python 3.9 as the minimum supported version is consistent with the PR objective of dropping Python 3.8 support.

To ensure consistency across the codebase, please run the following script:

This script will help identify any remaining references to Python 3.8 that may need updating.

doc/install/easy-install.md (1)

Line range hint 1-24: Verify versions in installation examples

The installation methods appear to be up-to-date. However, it would be beneficial to double-check the following:

  1. In the offline packages section, verify if 2.2.9 is still the latest version to reference.
  2. In the conda installation section, confirm if 2.2.9 is still the appropriate version to use in the example.
  3. In the docker installation section, check if 2.2.8 is the latest available version for both CPU and GPU images.

Keeping these version numbers current will help users install the most recent stable release.

Also applies to: 25-50, 51-76, 77-102, 103-128

deepmd/utils/env_mat_stat.py (2)

10-12: Approved: Import change improves Python version compatibility

The change from from typing import Iterator to from collections.abc import Iterator is a positive step. This modification:

  1. Aligns with modern Python practices (Python 3.9+).
  2. Maintains backwards compatibility.
  3. Prepares the codebase for future Python versions.

This change is consistent with the PR objective of dropping Python 3.8 support and doesn't affect the functionality of the code.


Line range hint 1-178: Verified: File is compatible with Python 3.9+

After reviewing the entire file, I can confirm that:

  1. The code is compatible with Python 3.9 and above.
  2. No Python 3.8-specific features or syntax are used.
  3. The type hinting used throughout the file is consistent with Python 3.9+ standards.

No further changes are necessary in this file for dropping Python 3.8 support.

To double-check for any Python 3.8-specific features, you can run:

This script searches for common Python 3.8-specific features. If it returns no results, it further confirms our manual review.

deepmd/tf/entrypoints/transfer.py (3)

Line range hint 41-43: Approve function signature update

The update to the convert_matrix function signature, specifying shape: Sequence[int], improves type clarity and is consistent with the import changes. This provides better type information for users of the function and potentially enables better static type checking.


Line range hint 1-283: Summary of changes and their impact

The changes in this file are focused on improving type annotations and aligning with modern Python practices, consistent with dropping support for Python 3.8:

  1. Updated import of Sequence from collections.abc instead of typing.
  2. Improved type annotation for the shape parameter in the convert_matrix function.

These changes enhance type clarity and potentially enable better static type checking without altering the functionality of the code. They represent a step towards more robust and future-proof code.

To ensure a smooth transition:

  1. Verify that all uses of Sequence throughout the codebase are compatible with collections.abc.Sequence.
  2. Update any documentation or developer guidelines to reflect the new minimum Python version requirement.
  3. Consider adding a note in the changelog about the dropped support for Python 3.8 and any potential actions users might need to take.

6-8: Approve import change and verify compatibility

The change from typing.Sequence to collections.abc.Sequence is appropriate, especially when dropping support for Python 3.8. This aligns with modern Python practices for better runtime type checking.

To ensure this change doesn't introduce any issues, please run the following verification script:

Please review the output to ensure all uses of Sequence are compatible with the new import.

✅ Verification successful

Verification Successful: Import Change Confirmed

All imports of Sequence from typing have been successfully replaced with collections.abc.Sequence. Additionally, all usages of Sequence in the codebase are now compatible with the updated import.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check for any remaining imports of Sequence from typing
# and verify that all uses of Sequence are compatible with collections.abc.Sequence

# Search for any remaining imports of Sequence from typing
echo "Checking for remaining imports of Sequence from typing:"
rg --type python "from\s+typing\s+import.*Sequence"

# Search for all uses of Sequence to verify compatibility
echo "Listing all uses of Sequence for manual verification:"
rg --type python "\bSequence\b"

Length of output: 387


Script:

#!/bin/bash
# Description: Check for any remaining imports of Sequence from typing
# and verify that all uses of Sequence are compatible with collections.abc.Sequence

# Search for any remaining imports of Sequence from typing in Python files
echo "Checking for remaining imports of Sequence from typing:"
rg "from\s+typing\s+import.*Sequence" --glob "*.py"

# Search for all uses of Sequence in Python files for manual verification
echo "Listing all uses of Sequence for manual verification:"
rg "\bSequence\b" --glob "*.py"

Length of output: 844

deepmd/tf/utils/neighbor_stat.py (2)

3-5: Approve: Updated import statement for Iterator

The change from typing.Iterator to collections.abc.Iterator is appropriate and aligns with the PR's objective of dropping Python 3.8 support. This update follows the evolution of Python's type hinting system, where collections.abc.Iterator became the standard in Python 3.9+.

The restructured import statement also improves readability and adheres to PEP 8 guidelines for line breaks in imports.


Line range hint 1-265: Summary: Changes align with PR objectives and improve code quality

The modifications in this file contribute to the PR's goal of dropping Python 3.8 support and enhance the overall code quality:

  1. The update of the Iterator import from typing to collections.abc aligns with Python 3.9+ standards.
  2. The addition of a default value for the mixed_type parameter in NeighborStat improves API usability.

These changes are well-implemented and do not introduce any apparent issues or backwards incompatibilities.

pyproject.toml (2)

Line range hint 1-368: Summary: Changes align with PR objectives, minor improvements suggested

The changes in pyproject.toml successfully implement the objective of dropping Python 3.8 support and adding support for newer Python versions up to 3.12. The requires-python field and classifiers have been updated appropriately.

Key points:

  1. The [project] section changes are well-implemented.
  2. The cibuildwheel configuration may need verification for compatibility with all supported Python versions.
  3. The PyTorch version constraint has been added with a helpful comment.
  4. Linting rules should be reviewed for compatibility with Python 3.9+.

Please address the suggestions in the previous comments to further improve the configuration. Once these are addressed, the changes look good to merge.


Line range hint 324-368: Verify linting rules compatibility

While there are no visible changes in the [tool.ruff.lint] section, it's important to ensure that the existing linting rules are compatible with the new minimum Python version (3.9+). Some linting rules may become obsolete or new ones may become relevant with newer Python versions.

Consider reviewing the output of these commands and updating the linting rules if necessary to align with Python 3.9+ best practices.

doc/install/install-from-source.md (3)

98-98: LGTM: Virtual environment setup updated correctly.

The virtual environment creation command has been appropriately updated to use Python 3.9, which is consistent with the new minimum version requirement.


Line range hint 1-24: Verify impact of dropping Python 3.8 support.

While updating the minimum Python version requirement is straightforward, it's important to consider and document any potential impacts on the project's dependencies or features that might result from dropping Python 3.8 support.

Please consider the following:

  1. Are there any dependencies that require updates due to this change?
  2. Are there any features that were specifically designed for Python 3.8 that need to be updated or removed?
  3. Does this change affect any other parts of the project documentation or code that aren't visible in this file?

It might be helpful to run a project-wide search for Python version-specific code or documentation:

#!/bin/bash
# Description: Search for Python version-specific code or documentation

# Test: Search for Python version checks in Python files. Expect: Review and update any found.
rg "sys.version_info\[0\] == 3 and sys.version_info\[1\] [<>=]= 8" --type py

# Test: Search for Python 3.8 mentions in documentation. Expect: Review and update any found.
rg "Python 3\.8" --type md

Line range hint 1-24: Verify consistency of Python version references throughout the document.

While the main changes to update the Python version requirement have been made, it's important to ensure that all references to Python versions throughout the document are consistent with the new requirement.

Please run the following script to check for any remaining references to Python 3.8:

Also applies to: 98-98

deepmd/tf/infer/deep_eval.py (4)

4-4: Approve: Updating to functools.cache

The change from lru_cache to cache is a good modernization step. The cache decorator, introduced in Python 3.9, is a simpler and generally more efficient memoization tool compared to lru_cache. This change aligns well with the PR's objective of dropping Python 3.8 support and leveraging newer Python features.


271-271: Approve: Updated caching decorator, but consider memory implications

The change from @lru_cache(maxsize=None) to @cache is consistent with the earlier import modification and simplifies the code. However, it's important to note that caching methods can potentially lead to memory leaks if not managed carefully.

To ensure this change doesn't introduce memory issues, please verify that:

  1. The model_type property doesn't depend on mutable instance state.
  2. The number of unique self instances calling this method is bounded.

You may want to add a comment explaining the caching strategy and any considerations for future maintainers.

🧰 Tools
🪛 Ruff

271-271: Use of functools.lru_cache or functools.cache on methods can lead to memory leaks

(B019)


296-296: Approve: Updated caching decorator, but verify memory safety

The change from @lru_cache(maxsize=None) to @cache for the model_version property is consistent with the previous modifications and simplifies the code. However, as with the model_type property, it's crucial to ensure this doesn't lead to potential memory leaks.

Please verify that:

  1. The model_version property doesn't depend on mutable instance state.
  2. The number of unique self instances calling this method is bounded.

Consider adding a comment explaining the caching strategy and any considerations for future maintainers.

🧰 Tools
🪛 Ruff

296-296: Use of functools.lru_cache or functools.cache on methods can lead to memory leaks

(B019)


315-315: Approve: Updated caching decorator for session, ensure proper management

The change from @lru_cache(maxsize=None) to @cache for the sess property is consistent with the previous modifications. Caching the TensorFlow session is crucial for performance and resource management.

Given the importance of proper session handling:

  1. Verify that the session is properly closed when no longer needed.
  2. Ensure that multiple instances of this class don't lead to resource contention or leaks.
  3. Consider adding a comment explaining the session caching strategy and any cleanup procedures.

It might be beneficial to implement a custom cleanup method or use context managers to ensure proper session closure.

🧰 Tools
🪛 Ruff

315-315: Use of functools.lru_cache or functools.cache on methods can lead to memory leaks

(B019)

deepmd/dpmodel/utils/neighbor_stat.py (1)

2-4: Update import of Iterator to collections.abc

Good job updating the import of Iterator to collections.abc. This change is appropriate for dropping Python 3.8 support and aligns with Python 3.9+, where Iterator is now imported from collections.abc.

deepmd/pt/utils/neighbor_stat.py (2)

2-5: Imports updated appropriately for Python 3.9+

The import of Iterator has been correctly updated to collections.abc.Iterator to reflect changes in Python 3.9 and above, aligning with the PR objective to drop Python 3.8 support.


Line range hint 52-115: Addition of NeighborStatOP enhances modularity and performance

The introduction of NeighborStatOP as a subclass of torch.nn.Module encapsulates the neighbor statistics computation effectively. The utilization of PyTorch tensors and operations leverages GPU acceleration, and scripting it with torch.jit.script can further optimize performance.

deepmd/pt/utils/env_mat_stat.py (1)

2-4: Update Import for Iterator

Importing Iterator from collections.abc is appropriate since Iterator has been moved from typing to collections.abc in Python 3.9. This change ensures compatibility with Python 3.9 and above, aligning with the decision to drop support for Python 3.8.

deepmd/utils/compat.py (1)

6-8: Update import of 'Sequence' to 'collections.abc'

Importing Sequence from collections.abc aligns with Python 3.9+ standards and is appropriate for dropping Python 3.8 support.

deepmd/pt/model/atomic_model/base_atomic_model.py (1)

483-485: Appropriate Use of Parentheses in with Statement

Now that Python 3.8 support is dropped, using parentheses in the with statement is acceptable and enhances readability. This syntax leverages features available in Python 3.9 and above.

deepmd/utils/data_system.py (1)

Line range hint 385-391: LGTM!

The _make_auto_ts method correctly calculates the test sizes based on the given percentage, enhancing flexibility for datasets with varying sizes and atom counts.

.github/workflows/test_python.yml Show resolved Hide resolved
deepmd/tf/infer/deep_eval.py Show resolved Hide resolved
deepmd/utils/data_system.py Show resolved Hide resolved
deepmd/utils/data_system.py Show resolved Hide resolved
njzjz added 2 commits October 5, 2024 18:07
Signed-off-by: Jinzhe Zeng <[email protected]>
Signed-off-by: Jinzhe Zeng <[email protected]>
@@ -12,10 +11,10 @@


@overload
def child_seed(seed: Union[int, List[int]], idx: int) -> List[int]: ...
def child_seed(seed: Union[int, list[int]], idx: int) -> list[int]: ...

Check notice

Code scanning / CodeQL

Statement has no effect Note

This statement has no effect.
@@ -29,7 +28,7 @@
real_f: Optional[np.ndarray] = None,
relative: Optional[float] = None,
atomic: Literal[False] = ...,
) -> Tuple[np.ndarray, np.ndarray, np.ndarray]: ...
) -> tuple[np.ndarray, np.ndarray, np.ndarray]: ...

Check notice

Code scanning / CodeQL

Statement has no effect Note

This statement has no effect.
@@ -38,7 +37,7 @@
real_f: Optional[np.ndarray] = None,
relative: Optional[float] = None,
atomic: Literal[True] = ...,
) -> Tuple[np.ndarray, np.ndarray, np.ndarray, np.ndarray]: ...
) -> tuple[np.ndarray, np.ndarray, np.ndarray, np.ndarray]: ...

Check notice

Code scanning / CodeQL

Statement has no effect Note

This statement has no effect.
@@ -47,15 +46,15 @@
real_f: Optional[np.ndarray] = None,
relative: Optional[float] = None,
atomic: bool = False,
) -> Tuple[np.ndarray, ...]: ...
) -> tuple[np.ndarray, ...]: ...

Check notice

Code scanning / CodeQL

Statement has no effect Note

This statement has no effect.
value: Any

TEST_DICT = Dict[str, DATA]
TEST_DICT = dict[str, DATA]

Check notice

Code scanning / CodeQL

Unused global variable Note test

The global variable 'TEST_DICT' is not used.
Copy link

codecov bot commented Oct 5, 2024

Codecov Report

Attention: Patch coverage is 98.14324% with 7 lines in your changes missing coverage. Please review.

Project coverage is 83.45%. Comparing base (7ce5b03) to head (aa8765d).
Report is 1 commits behind head on devel.

Files with missing lines Patch % Lines
deepmd/utils/spin.py 57.14% 3 Missing ⚠️
deepmd/tf/lmp.py 0.00% 2 Missing ⚠️
deepmd/tf/entrypoints/ipi.py 0.00% 1 Missing ⚠️
deepmd/tf/utils/parallel_op.py 0.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##            devel    #4185      +/-   ##
==========================================
- Coverage   83.46%   83.45%   -0.01%     
==========================================
  Files         537      537              
  Lines       52168    52145      -23     
  Branches     3046     3046              
==========================================
- Hits        43542    43519      -23     
  Misses       7678     7678              
  Partials      948      948              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@njzjz njzjz requested a review from wanghan-iapcm October 6, 2024 00:15
@wanghan-iapcm wanghan-iapcm added this pull request to the merge queue Oct 6, 2024
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to no response for status checks Oct 6, 2024
@njzjz njzjz added this pull request to the merge queue Oct 6, 2024
Merged via the queue into deepmodeling:devel with commit 192a97a Oct 6, 2024
61 checks passed
@njzjz njzjz deleted the drop-py38 branch October 6, 2024 19:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
breaking change Breaking changes that should notify users. Docs Python
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants