feat: dpmodel energy loss & consistent tests #4531

Merged
6 commits merged into deepmodeling:devel from the ener-loss branch on Jan 7, 2025

Conversation

njzjz (Member) commented Jan 5, 2025

Fix #4105. Fix #4429.

Summary by CodeRabbit

Release Notes

  • New Features

    • Introduced a new energy loss calculation framework with support for multiple machine learning backends.
    • Added serialization and deserialization capabilities for loss modules.
    • Added a new class EnergyLoss for computing energy-related loss metrics.
  • Documentation

    • Added SPDX license identifiers to multiple files.
    • Included docstrings for new classes and methods.
  • Tests

    • Implemented comprehensive test suite for energy loss functions across different platforms (TensorFlow, PyTorch, Paddle, JAX).
    • Introduced a new test class TestEner for evaluating energy loss functions.

@Copilot Copilot bot review requested due to automatic review settings January 5, 2025 10:40
@github-actions github-actions bot added the Python label Jan 5, 2025


Copilot reviewed 5 out of 10 changed files in this pull request and generated no comments.

Files not reviewed (5)
  • deepmd/dpmodel/loss/ener.py: Evaluated as low risk
  • deepmd/dpmodel/loss/loss.py: Evaluated as low risk
  • deepmd/pt/loss/ener.py: Evaluated as low risk
  • deepmd/pt/loss/loss.py: Evaluated as low risk
  • deepmd/tf/loss/ener.py: Evaluated as low risk
deepmd/dpmodel/loss/ener.py Fixed Show fixed Hide fixed
coderabbitai bot (Contributor) commented Jan 5, 2025

📝 Walkthrough

This pull request adds serialization and deserialization to the loss modules of the TensorFlow, PyTorch, and dpmodel backends. A new abstract base class Loss standardizes the loss interface in dpmodel, and a new EnergyLoss class implements it. Consistent test cases validate that loss calculations agree across the computational frameworks, and SPDX license identifiers are included in the relevant files.
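The interface described above can be sketched roughly as follows. This is a minimal illustration, not the code merged in this PR: `ToyEnergyLoss`, its single `start_pref_e` argument, and the simplified `call` signature are assumptions made for the example. Only the `Loss` base class idea, the `serialize`/`deserialize` pair, and the "@class"/"@version" keys come from the discussion.

```python
from abc import ABC, abstractmethod


class Loss(ABC):
    """Minimal stand-in for a backend-independent loss interface."""

    @abstractmethod
    def call(self, model_dict: dict, label_dict: dict) -> dict:
        """Return a dict of named loss terms."""

    @abstractmethod
    def serialize(self) -> dict:
        """Return a plain dict that fully describes this loss."""

    @classmethod
    @abstractmethod
    def deserialize(cls, data: dict) -> "Loss":
        """Rebuild a loss from the dict produced by serialize()."""


class ToyEnergyLoss(Loss):
    """Hypothetical subclass: squared energy error scaled by one prefactor."""

    def __init__(self, start_pref_e: float = 1.0) -> None:
        self.start_pref_e = start_pref_e

    def call(self, model_dict: dict, label_dict: dict) -> dict:
        diff = model_dict["energy"] - label_dict["energy"]
        return {"l2_ener_loss": self.start_pref_e * diff * diff}

    def serialize(self) -> dict:
        # Metadata keys follow the "@class" / "@version" convention from the review.
        return {
            "@class": "EnergyLoss",
            "@version": 1,
            "start_pref_e": self.start_pref_e,
        }

    @classmethod
    def deserialize(cls, data: dict) -> "ToyEnergyLoss":
        data = dict(data)  # copy so the caller's dict is not mutated
        data.pop("@class")
        data.pop("@version")
        return cls(**data)


# Round trip: the restored object carries the same configuration.
original = ToyEnergyLoss(start_pref_e=0.5)
restored = ToyEnergyLoss.deserialize(original.serialize())
```

Because the serialized form is a plain dict, any backend that understands the same keys could in principle rebuild an equivalent loss from it, which is what makes cross-backend consistency tests possible.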

Changes

| File | Change Summary |
|---|---|
| deepmd/dpmodel/loss/__init__.py | Added SPDX license identifier |
| deepmd/dpmodel/loss/ener.py | New EnergyLoss class with comprehensive energy loss calculation methods |
| deepmd/dpmodel/loss/loss.py | New abstract Loss base class defining loss calculation interface |
| deepmd/pt/loss/ener.py | Added serialize and deserialize methods to EnergyStdLoss |
| deepmd/tf/loss/ener.py | Added serialization and deserialization methods to EnerStdLoss |
| deepmd/tf/loss/loss.py | Added serialize, deserialize, and init_variables methods to Loss class |
| source/tests/consistent/loss/common.py | Added LossTest utility class |
| source/tests/consistent/loss/test_ener.py | New TestEner class for cross-backend loss function testing |
| source/tests/consistent/loss/__init__.py | Added SPDX license identifier and docstring |

Assessment against linked issues

Objective Addressed Explanation
Universal/consistent tests for loss modules [#4105]
Array API loss implementation [#4429]

Possibly related PRs

  • feat(pt): support complete form energy loss #3782: This PR introduces a new class EnergyLoss that extends the Loss class, which is relevant as both involve loss calculations in the deep learning context, specifically related to energy metrics.
  • feat pt : Support property fitting #3867: This PR adds support for property fitting, including Hessian calculations, which directly relates to the main PR's focus on Hessian loss and its integration into the loss framework.
  • feat(pt): train with energy Hessian #4169: This PR implements functionality for training with energy Hessians, which is directly related to the main PR's introduction of Hessian loss calculations and their application in model training.

Suggested Labels

Docs

Suggested Reviewers

  • iProzd
  • wanghan-iapcm


coderabbitai bot (Contributor) left a comment

Actionable comments posted: 1

🧹 Nitpick comments (8)
deepmd/dpmodel/loss/loss.py (2)

21-30: Consider expanding docstring details for the call method.

Currently, the docstring briefly states "Calculate loss from model results and labeled results," but it doesn't clarify the keys or shape expectations of the returned dictionary. Elaborating on the output structure and required dictionary keys can improve clarity for implementers.


37-55: Validate the method name to match its core purpose.

display_if_exist sets loss to NaN if a property is absent, which may not be immediately obvious from the function name. Consider renaming it to more clearly reflect its filtering behavior, such as mask_unavailable_loss.
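The behavior discussed in this comment can be sketched with a scalar helper. This is an assumed, simplified version for illustration: the actual method in the PR may be a static method and may operate on arrays rather than Python floats.

```python
import math


def display_if_exist(loss: float, find_property: float) -> float:
    """Return the loss for display, or NaN when the label was not provided.

    `find_property` is a 0/1 style flag: nonzero means the label exists.
    """
    return loss if bool(find_property) else math.nan


shown = display_if_exist(0.25, 1.0)   # label present: value passes through
hidden = display_if_exist(0.25, 0.0)  # label absent: reported as NaN
```

Reporting NaN rather than 0.0 keeps a missing label visibly distinct from a perfectly fitted one in training logs.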

deepmd/dpmodel/loss/ener.py (1)

283-283: Use boolean checks rather than numerical comparison.

self.has_gf is a boolean. Comparing it with > 0 works only because bool is a subclass of int in Python, and it obscures the intent; a plain truth test is clearer.

- if self.has_gf > 0:
+ if self.has_gf:
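A short sketch of why the two forms agree for booleans but can diverge once a flag is numeric. The helper names are hypothetical and exist only for this comparison.

```python
def branch_taken_numeric(flag) -> bool:
    """Mimics `if flag > 0:`."""
    return flag > 0


def branch_taken_truthy(flag) -> bool:
    """Mimics `if flag:`."""
    return bool(flag)


# For real booleans both forms take the same branch.
for flag in (True, False):
    assert branch_taken_numeric(flag) == branch_taken_truthy(flag)

# For a negative number they disagree, so the choice matters
# as soon as the flag stops being a strict bool.
negative_flag = -1
numeric_result = branch_taken_numeric(negative_flag)
truthy_result = branch_taken_truthy(negative_flag)
```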
deepmd/tf/loss/loss.py (2)

98-113: Reflect uniform serialization approach across backends.

Like in deepmd/dpmodel/loss/ener.py, consider adding version and class annotation keys, e.g., "@version" or "@class", to maintain uniform serialization across all frameworks. This consistency simplifies cross-backend model loading.


132-149: Clarify init_variables design.

Static analysis suggests making this method abstract or providing a minimal base implementation. If the plan is for derived classes to override it, consider using @abstractmethod. Otherwise, confirm that an empty default is intentional and that it shouldn't raise NotImplementedError.

🧰 Tools
🪛 Ruff (0.8.2)

132-149: Loss.init_variables is an empty method in an abstract base class, but has no abstract decorator

(B027)
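The two ways of resolving this warning can be sketched as below. The class names are hypothetical; only `init_variables` and the B027 rule come from the review. Which option fits depends on whether every subclass is expected to override the hook.

```python
from abc import ABC, abstractmethod


class LossRequiringHook(ABC):
    """Option 1: every subclass must implement init_variables."""

    @abstractmethod
    def init_variables(self, variables: dict) -> None:
        """Restore subclass state from saved variables."""


class LossWithOptionalHook(ABC):
    """Option 2: the hook is optional and the empty default is deliberate."""

    @abstractmethod
    def build(self) -> float:
        """Compute the loss (kept abstract so the class stays an ABC)."""

    def init_variables(self, variables: dict) -> None:  # noqa: B027
        """Intentionally a no-op; losses without state need not override it."""


class StatelessLoss(LossWithOptionalHook):
    def build(self) -> float:
        return 0.0


loss = StatelessLoss()
result = loss.init_variables({})  # inherited no-op
```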

deepmd/pt/loss/ener.py (2)

418-446: Use consistent class naming in serialized output
Currently, the serialize() method returns "@class": "EnergyLoss". Consider aligning this string identifier with the actual class name (EnergyStdLoss) to help avoid confusion across multiple codebases that implement serialization.

- "@class": "EnergyLoss",
+ "@class": "EnergyStdLoss",

447-465: Enhance error handling on deserialization
You might want to verify that all required fields exist in data before calling cls(**data). An explicit check can help detect missing fields at runtime.

+ required_fields = ["starter_learning_rate", "start_pref_e", "limit_pref_e", ...]
+ for field in required_fields:
+     if field not in data:
+         raise KeyError(f"Missing required field '{field}' in serialized data")
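A self-contained version of the suggested check might look like the sketch below. The field tuple holds only the three names spelled out in the suggestion; the elided remainder of the list is deliberately not filled in, and `validate_required` is a hypothetical helper name.

```python
# Only the fields named explicitly in the review; the real list is longer.
REQUIRED_FIELDS = ("starter_learning_rate", "start_pref_e", "limit_pref_e")


def validate_required(data: dict, required=REQUIRED_FIELDS) -> None:
    """Raise KeyError listing every required field missing from `data`."""
    missing = [name for name in required if name not in data]
    if missing:
        raise KeyError(f"Missing required field(s) in serialized data: {missing}")


complete = {"starter_learning_rate": 1e-3, "start_pref_e": 0.02, "limit_pref_e": 1.0}
validate_required(complete)  # passes silently

try:
    validate_required({"start_pref_e": 0.02})
    error_message = ""
except KeyError as err:
    error_message = str(err)
```

Collecting all missing names before raising reports every problem at once instead of one field per failed attempt.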
deepmd/tf/loss/ener.py (1)

408-441: Improve clarity in serialization
Serializing with "@class": "EnergyLoss" might be less descriptive than "EnerStdLoss" here, given this is the EnerStdLoss class. Consider updating the string for future maintainability.

- "@class": "EnergyLoss",
+ "@class": "EnerStdLoss",
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8d4c27b and d94010c.

📒 Files selected for processing (10)
  • deepmd/dpmodel/loss/__init__.py (1 hunks)
  • deepmd/dpmodel/loss/ener.py (1 hunks)
  • deepmd/dpmodel/loss/loss.py (1 hunks)
  • deepmd/pt/loss/ener.py (2 hunks)
  • deepmd/pt/loss/loss.py (1 hunks)
  • deepmd/tf/loss/ener.py (2 hunks)
  • deepmd/tf/loss/loss.py (1 hunks)
  • source/tests/consistent/loss/__init__.py (1 hunks)
  • source/tests/consistent/loss/common.py (1 hunks)
  • source/tests/consistent/loss/test_ener.py (1 hunks)
✅ Files skipped from review due to trivial changes (3)
  • source/tests/consistent/loss/common.py
  • deepmd/dpmodel/loss/__init__.py
  • source/tests/consistent/loss/__init__.py
🧰 Additional context used
🪛 Ruff (0.8.2)
deepmd/tf/loss/loss.py

132-149: Loss.init_variables is an empty method in an abstract base class, but has no abstract decorator

(B027)

🔇 Additional comments (7)
deepmd/dpmodel/loss/loss.py (1)

76-85: Ensure consistent serialization format across derived classes.

serialize is abstract here, and derived classes (e.g., EnergyLoss) implement it using keys like "@class" and "@version". For consistency, document or enforce the same structure in all subclasses, so third-party tools can reliably interpret the serialized data.

deepmd/dpmodel/loss/ener.py (2)

63-66: Confirm usage of RuntimeError.

Raising RuntimeError when numb_generalized_coord < 1 is explicit and correct for your domain. However, consider providing a more detailed error message to guide users on recommended input constraints or fallback solutions for generalized forces.


70-227: ⚠️ Potential issue

Prevent confusion around natoms usage.

natoms is typed as an integer but is treated like an array in lines 143-144 (natoms[0]). Make sure natoms is consistently passed as a list or a similar sequence. If multiple frames are expected, clarify this in the docstring and function arguments to avoid type mismatches or logic errors.

- force_reshape_nframes = xp.reshape(force, [-1, natoms[0] * 3])
- force_hat_reshape_nframes = xp.reshape(force_hat, [-1, natoms[0] * 3])
+ force_reshape_nframes = xp.reshape(force, [-1, natoms * 3])
+ force_hat_reshape_nframes = xp.reshape(force_hat, [-1, natoms * 3])

Likely invalid or redundant comment.
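The int-versus-sequence question raised here can be illustrated without any array library. In this sketch, plain Python lists stand in for the `xp.reshape(force, [-1, natoms * 3])` call, and both helper names are hypothetical.

```python
def atoms_in_frame(natoms) -> int:
    """Accept either an int or a sequence whose first entry is the atom count."""
    if isinstance(natoms, int):
        return natoms
    return int(natoms[0])


def rows_per_frame(flat_force: list, natoms) -> list:
    """Split a flat force list into one row of natoms * 3 values per frame."""
    width = atoms_in_frame(natoms) * 3
    if len(flat_force) % width != 0:
        raise ValueError("force length is not a multiple of natoms * 3")
    return [flat_force[i : i + width] for i in range(0, len(flat_force), width)]


flat = [float(i) for i in range(12)]   # 2 frames x 2 atoms x 3 components
from_int = rows_per_frame(flat, 2)
from_seq = rows_per_frame(flat, [2, 2])
```

Normalizing `natoms` in one place means indexing an int by mistake (which raises TypeError) cannot happen further down the calculation.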

deepmd/pt/loss/loss.py (1)

67-92: Implement or document default serialize and deserialize behavior.

Currently, both methods raise NotImplementedError. If this class is intended as an abstract base, mark them as abstract methods. Otherwise, clarify the intended usage by either providing a skeleton implementation or adding docstrings indicating how derived classes should implement serialization.

deepmd/pt/loss/ener.py (1)

21-23: Maintain consistent naming for version checks
The newly added import from deepmd.utils.version looks good, ensuring uniform version compatibility checks.

deepmd/tf/loss/ener.py (2)

19-21: Leverage version compatibility checks
The introduction of check_version_compatibility aligns this file with other modules, ensuring consistent handling of serialization versions.


442-462: 🛠️ Refactor suggestion

Verify presence of required data fields
Before passing data into the constructor, ensure it contains the necessary fields. A structured validation step can reduce runtime exceptions if users supply incomplete data.

+ required_fields = ["starter_learning_rate", "start_pref_e", "limit_pref_e", ...]
+ for field in required_fields:
+     if field not in data:
+         raise KeyError(f"Required field '{field}' missing in the serialized data")

source/tests/consistent/loss/test_ener.py Show resolved Hide resolved
njzjz added 2 commits January 5, 2025 20:11
Signed-off-by: Jinzhe Zeng <[email protected]>
Signed-off-by: Jinzhe Zeng <[email protected]>
deepmd/dpmodel/loss/ener.py Fixed Show fixed Hide fixed
coderabbitai bot (Contributor) left a comment

Actionable comments posted: 3

🧹 Nitpick comments (3)
source/tests/consistent/loss/test_ener.py (1)

57-86: Consider expanding test coverage for edge cases.

The test setup is well-structured, but could benefit from additional test cases:

  1. Test with zero or negative preference values
  2. Test with extreme learning rates
  3. Test with empty or invalid input arrays
deepmd/dpmodel/loss/ener.py (2)

20-67: Add input validation for preference values.

Consider adding validation to ensure that:

  1. All preference values are non-negative
  2. Learning rate is positive
  3. relative_f is positive when provided
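The three checks listed above could be sketched as a standalone validator. The function name and the dict-of-prefactors shape are assumptions for illustration; the merged constructor may organize its arguments differently.

```python
from typing import Optional


def validate_energy_loss_args(
    starter_learning_rate: float,
    prefactors: dict,
    relative_f: Optional[float] = None,
) -> None:
    """Raise ValueError if any argument violates the constraints in the review."""
    if starter_learning_rate <= 0:
        raise ValueError("starter_learning_rate must be positive")
    for name, value in prefactors.items():
        if value < 0:
            raise ValueError(f"{name} must be non-negative, got {value}")
    if relative_f is not None and relative_f <= 0:
        raise ValueError("relative_f must be positive when provided")


# Valid configuration: no exception.
validate_energy_loss_args(1e-3, {"start_pref_e": 0.02, "limit_pref_e": 1.0})

try:
    validate_energy_loss_args(1e-3, {"start_pref_f": -1.0})
    rejected = False
except ValueError:
    rejected = True
```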

291-337: Enhance error handling in deserialization.

Consider adding validation for required fields and type checking in the deserialization method to handle malformed input gracefully.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d94010c and 45f222b.

📒 Files selected for processing (2)
  • deepmd/dpmodel/loss/ener.py (1 hunks)
  • source/tests/consistent/loss/test_ener.py (1 hunks)
🔇 Additional comments (3)
source/tests/consistent/loss/test_ener.py (2)

87-120: Expand negative or edge case coverage.

Currently, the setup uses random arrays and sets find_* flags to 1.0. Add tests for scenarios where certain keys (e.g., "atom_pref", "find_energy") are missing or zero, ensuring the loss logic handles partially available labels gracefully.


230-241: LGTM!

The helper methods and properties are well-implemented with appropriate tolerance values for floating-point comparisons.

deepmd/dpmodel/loss/ener.py (1)

213-290: LGTM!

The label requirements are well-defined with appropriate flags for each data type.

source/tests/consistent/loss/test_ener.py Show resolved Hide resolved
deepmd/dpmodel/loss/ener.py Outdated Show resolved Hide resolved
deepmd/dpmodel/loss/ener.py Outdated Show resolved Hide resolved
njzjz added 3 commits January 5, 2025 21:34
Signed-off-by: Jinzhe Zeng <[email protected]>
Signed-off-by: Jinzhe Zeng <[email protected]>
Signed-off-by: Jinzhe Zeng <[email protected]>
coderabbitai bot (Contributor) left a comment

Actionable comments posted: 1

🧹 Nitpick comments (1)
deepmd/pd/loss/ener.py (1)

458-475: Consider validating the "@class" field in deserialize().
Currently, you remove "@class" after verifying the version, but there’s no check to ensure the serialized data belongs to the correct class. If you ever extend or share serialization logic across multiple classes, consider validating that "@class" matches "EnergyStdLoss" before popping it, to guard against user or developer error.
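A possible shape for that guard is sketched below. `pop_class_tag` is a hypothetical helper. The expected tag has to match whatever `serialize()` actually writes, and an earlier review comment notes that the PyTorch and TensorFlow implementations write "EnergyLoss" rather than their own class names, so that is the value used here.

```python
def pop_class_tag(data: dict, expected: str) -> dict:
    """Return a copy of `data` without "@class", after checking its value."""
    data = dict(data)  # leave the caller's dict untouched
    found = data.pop("@class", None)
    if found != expected:
        raise ValueError(f"expected @class {expected!r}, found {found!r}")
    return data


serialized = {"@class": "EnergyLoss", "start_pref_e": 0.02}
cleaned = pop_class_tag(serialized, "EnergyLoss")

try:
    pop_class_tag({"@class": "SomethingElse"}, "EnergyLoss")
    mismatch_detected = False
except ValueError:
    mismatch_detected = True
```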

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 219c417 and 8f32640.

📒 Files selected for processing (1)
  • deepmd/pd/loss/ener.py (2 hunks)
🔇 Additional comments (1)
deepmd/pd/loss/ener.py (1)

21-23: Ensure consistent version checks across the codebase.
These lines introduce check_version_compatibility. While this usage appears straightforward, be sure that any older references to version checks throughout the codebase also rely on the same mechanism for uniformity.

Run the following script to search for other version checks and confirm consistent practices:

✅ Verification successful

Version compatibility checks are consistently implemented across the codebase

The search results show that check_version_compatibility is used consistently throughout the codebase in deserialize methods to validate version compatibility. The implementation follows a standard pattern:

  • All modules use the same version checking mechanism from deepmd.utils.version
  • The function is called with consistent arguments format: current version, maximum supported version, and minimum supported version
  • The checks are performed at the start of deserialization before any data processing
  • The version is consistently popped from the data dictionary with the "@version" key
  • Default values and version ranges are appropriate for each module's needs
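The pattern summarized in this list can be sketched as follows. `check_version_in_range` is a hypothetical stand-in written for this example; it mirrors the argument order described above (current, maximum, minimum) but is not the `deepmd.utils.version` function itself.

```python
def check_version_in_range(current: int, maximum: int, minimum: int = 1) -> None:
    """Raise ValueError unless minimum <= current <= maximum."""
    if not minimum <= current <= maximum:
        raise ValueError(
            f"unsupported version {current}; supported range is {minimum} to {maximum}"
        )


def deserialize_config(data: dict) -> dict:
    """Validate the version first, then strip metadata keys."""
    data = dict(data)
    # The check runs before any other processing of the payload.
    check_version_in_range(data.pop("@version", 1), 1, 1)
    data.pop("@class", None)
    return data


config = deserialize_config({"@class": "EnergyLoss", "@version": 1, "start_pref_e": 0.02})

try:
    deserialize_config({"@class": "EnergyLoss", "@version": 2})
    too_new_rejected = False
except ValueError:
    too_new_rejected = True
```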
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify consistent usage of check_version_compatibility across the repo.

rg -A 5 "check_version_compatibility"

Length of output: 63039

deepmd/pd/loss/ener.py Show resolved Hide resolved
codecov bot commented Jan 5, 2025

Codecov Report

Attention: Patch coverage is 78.43137% with 44 lines in your changes missing coverage. Please review.

Project coverage is 84.55%. Comparing base (8d4c27b) to head (8f32640).
Report is 5 commits behind head on devel.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| deepmd/dpmodel/loss/ener.py | 72.85% | 38 Missing ⚠️ |
| deepmd/dpmodel/loss/loss.py | 92.00% | 2 Missing ⚠️ |
| deepmd/pt/loss/loss.py | 60.00% | 2 Missing ⚠️ |
| deepmd/tf/loss/loss.py | 71.42% | 2 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##            devel    #4531      +/-   ##
==========================================
- Coverage   84.57%   84.55%   -0.03%     
==========================================
  Files         675      677       +2     
  Lines       63695    63899     +204     
  Branches     3488     3486       -2     
==========================================
+ Hits        53872    54031     +159     
- Misses       8698     8742      +44     
- Partials     1125     1126       +1     
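The headline patch coverage can be reproduced from the numbers in this report. The sketch assumes that the +204 tracked lines in the diff table are the patch lines and that every line not listed as missing counts as covered; both are inferences from the report, not something Codecov states here.

```python
# Missing-line counts per file, copied from the report table.
missing_by_file = {
    "deepmd/dpmodel/loss/ener.py": 38,
    "deepmd/dpmodel/loss/loss.py": 2,
    "deepmd/pt/loss/loss.py": 2,
    "deepmd/tf/loss/loss.py": 2,
}

added_lines = 204                          # "Lines" delta in the coverage diff
missing = sum(missing_by_file.values())    # 44, matching the report
covered = added_lines - missing            # 160
patch_coverage = 100.0 * covered / added_lines

print(f"{patch_coverage:.5f}%")  # prints 78.43137%
```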


@njzjz njzjz added this pull request to the merge queue Jan 7, 2025
Merged via the queue into deepmodeling:devel with commit dbdb9b9 Jan 7, 2025
61 checks passed
@njzjz njzjz deleted the ener-loss branch January 7, 2025 11:20
Successfully merging this pull request may close these issues.

[Feature Request] Array API loss
Add universal/consistent tests for loss modules
3 participants