chore(ci): skip more tests on GPU CI #4200

Merged
merged 2 commits into deepmodeling:devel
Oct 11, 2024

Conversation

njzjz
Member

@njzjz njzjz commented Oct 10, 2024

Also, skip these GPU tests only on the CI. When testing locally, the tests are expected to run.

Summary by CodeRabbit

  • New Features

    • Introduced a global variable CI to enhance test execution control based on the continuous integration environment.
  • Bug Fixes

    • Updated test skipping conditions across multiple test classes so that the affected tests are skipped on non-CPU devices only when the CI environment is active.
  • Documentation

    • Enhanced clarity on test conditions by including the CI variable in relevant test decorators.

Signed-off-by: Jinzhe Zeng <[email protected]>
Contributor

coderabbitai bot commented Oct 10, 2024

📝 Walkthrough

The pull request introduces modifications across multiple test files, primarily focusing on the addition of a global variable CI to control the execution of tests based on the environment. Several test methods in the CommonTest and ModelTestCase classes have been updated to include decorators that conditionally skip tests if the TEST_DEVICE is not "cpu" and the CI variable is true. This change enhances the flexibility of test execution in different environments while maintaining the existing structure and logic of the test classes.
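A minimal sketch of the skip pattern described above, assuming the device and CI flags are passed in explicitly so the behaviour can be checked for any environment. The names `build_suite`, `count_skipped`, and `ExampleModelTest` are hypothetical and exist only for illustration; only the `skipIf` condition and the "Only test on CPU." message are taken from the review text.

```python
import unittest


def build_suite(test_device: str, ci: bool) -> unittest.TestSuite:
    """Build a tiny suite whose skip condition mirrors the one in the PR."""

    # Skip only when the device is not the CPU AND we are running in CI.
    @unittest.skipIf(test_device != "cpu" and ci, "Only test on CPU.")
    class ExampleModelTest(unittest.TestCase):  # hypothetical test class
        def test_forward(self):
            self.assertEqual(1 + 1, 2)

    return unittest.defaultTestLoader.loadTestsFromTestCase(ExampleModelTest)


def count_skipped(test_device: str, ci: bool) -> int:
    """Run the suite and report how many tests were skipped."""
    result = unittest.TestResult()
    build_suite(test_device, ci).run(result)
    return len(result.skipped)
```

Under these assumptions, `count_skipped("cuda", True)` reports one skipped test, while a local GPU run (`ci=False`) or any CPU run skips nothing.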

Changes

File Path | Change Summary
--- | ---
source/tests/consistent/common.py | Added unittest import and new test methods with conditional skipping based on TEST_DEVICE and CI.
source/tests/universal/common/cases/model/utils.py | Added CI import and updated @unittest.skipIf decorators for multiple test methods to include CI in the condition.
source/tests/universal/dpmodel/atomc_model/test_atomic_model.py | Updated @unittest.skipIf decorators across multiple test classes to include CI in the condition for skipping tests.
source/tests/universal/dpmodel/descriptor/test_descriptor.py | Added CI import and updated the skip condition for the TestDescriptorDP class to include CI.
source/tests/universal/dpmodel/fitting/test_fitting.py | Updated @unittest.skipIf decorator in TestFittingDP to include CI in the condition for skipping tests.
source/tests/universal/dpmodel/model/test_model.py | Updated @unittest.skipIf decorators in TestEnergyModelDP and TestSpinEnergyModelDP to include CI in the condition for skipping tests.
source/tests/universal/dpmodel/utils/test_type_embed.py | Updated @unittest.skipIf decorator in TestTypeEmbd to include CI in the condition for skipping tests.
source/tests/utils.py | Introduced a global variable CI that evaluates the environment variable for conditional test execution.

Possibly related PRs

  • test(pt/dp): add universal uts for all models #3873: The changes in this PR involve the addition of the CI variable in the source/tests/universal/common/cases/model/utils.py file, which is directly related to the introduction of the CI variable in the main PR. Both PRs modify test execution conditions based on the CI variable, enhancing control over test execution in different environments.

📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between d470805 and 6c28e45.

📒 Files selected for processing (1)
  • source/tests/utils.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • source/tests/utils.py

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Outside diff range and nitpick comments (7)
source/tests/universal/dpmodel/utils/test_type_embed.py (1)

20-20: Approve with suggestion: Update skip message for clarity.

The modification to the @unittest.skipIf decorator correctly implements the PR objective to skip more tests on GPU CI while allowing local runs on non-CPU devices. However, the skip message could be more accurate.

Consider updating the skip message to better reflect the new condition:

@unittest.skipIf(TEST_DEVICE != "cpu" and CI, "Skip on non-CPU devices in CI environment.")

This message more accurately describes when the test will be skipped.

source/tests/universal/dpmodel/fitting/test_fitting.py (1)

240-240: LGTM: Updated skip condition for CI environments

The modification to the @unittest.skipIf decorator is appropriate and aligns with the PR objective. The tests will now be skipped in CI environments when not running on CPU, which should help streamline the CI process.

For improved readability, consider extracting the skip condition into a descriptive variable:

skip_on_gpu_ci = TEST_DEVICE != "cpu" and CI
@unittest.skipIf(skip_on_gpu_ci, "Skip GPU tests in CI environment.")

This change would make the skip condition more self-explanatory and easier to maintain.

source/tests/universal/dpmodel/model/test_model.py (1)

116-116: LGTM: Modified test skip conditions.

The changes to the @unittest.skipIf decorators align with the PR objectives. Tests will now be skipped only when both the test device is not CPU and the CI environment is active. This allows the tests to run locally on non-CPU devices while being skipped in the CI environment.

For improved clarity, consider adding a comment explaining the skip condition:

# Skip test on non-CPU devices in CI environment
@unittest.skipIf(TEST_DEVICE != "cpu" and CI, "Only test on CPU.")

This comment would help future developers understand the intention behind the skip condition.

Also applies to: 204-204

source/tests/consistent/common.py (4)

Line range hint 349-367: LGTM: New test method for DP consistency with appropriate CI skip condition.

The test_dp_consistent_with_ref method is well-implemented and aligns with the PR objective. It correctly skips the test on GPU CI environments and thoroughly checks the consistency between DP and the reference backend.

Consider adding a brief comment explaining why this test is skipped on non-CPU devices in CI, to improve code readability and maintainability.


Line range hint 368-383: LGTM: New test method for DP self-consistency with appropriate CI skip condition.

The test_dp_self_consistent method is well-implemented and aligns with the PR objective. It correctly skips the test on GPU CI environments and thoroughly checks the self-consistency of DP.

For consistency with other test methods in this class, consider renaming the variables obj1 to obj2 in the second part of the test (lines 377-378). This would make it clearer that you're comparing two separate object instances.


Line range hint 458-491: LGTM: New test methods for array_api_strict consistency with appropriate CI skip conditions.

The test_array_api_strict_consistent_with_ref and test_array_api_strict_self_consistent methods are well-implemented and align with the PR objective. They correctly skip the tests on GPU CI environments and thoroughly check the consistency and self-consistency of the array_api_strict backend.

For consistency with the DP test methods, consider adding a brief comment explaining why these tests are skipped on non-CPU devices in CI, to improve code readability and maintainability.


Line range hint 1-591: Summary: Effective implementation of CI-specific test skipping.

The changes in this file successfully implement the PR objective of skipping more tests on GPU CI. The new test methods for DP and array_api_strict backends are well-structured and consistent with existing patterns in the CommonTest class. These modifications will help streamline the CI process by reducing the number of tests executed in GPU environments.

Consider creating a utility decorator that encapsulates the skip condition @unittest.skipIf(TEST_DEVICE != "cpu" and CI, "Only test on CPU."). This would improve code maintainability and reduce duplication across test methods.
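One way the suggested utility decorator could look, as a rough sketch rather than the project's actual helper. `make_skip_on_gpu_ci`, `skip_on_gpu_ci`, and `ExampleConsistencyTest` are hypothetical names, and the flags are hard-coded to simulate a GPU CI runner; in a real test suite they would presumably be bound once from `TEST_DEVICE` and `CI`.

```python
import unittest


def make_skip_on_gpu_ci(test_device: str, ci: bool, reason: str = "Only test on CPU."):
    """Return a decorator that skips a test on non-CPU devices in CI."""
    return unittest.skipIf(test_device != "cpu" and ci, reason)


# Assumption for illustration: pretend we are on a GPU runner inside CI.
skip_on_gpu_ci = make_skip_on_gpu_ci("cuda", True)


class ExampleConsistencyTest(unittest.TestCase):  # hypothetical test class
    @skip_on_gpu_ci
    def test_dp_self_consistent(self):
        self.assertTrue(True)

    def test_always_runs(self):
        self.assertTrue(True)


def run_example():
    """Run the example class; return (tests run, tests skipped, success flag)."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ExampleConsistencyTest)
    result = unittest.TestResult()
    suite.run(result)
    return result.testsRun, len(result.skipped), result.wasSuccessful()
```

Binding the condition once keeps the skip reason and the device check in a single place, so a later change to the policy touches one line instead of every decorated test.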

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between 3939786 and d470805.

📒 Files selected for processing (8)
  • source/tests/consistent/common.py (6 hunks)
  • source/tests/universal/common/cases/model/utils.py (7 hunks)
  • source/tests/universal/dpmodel/atomc_model/test_atomic_model.py (7 hunks)
  • source/tests/universal/dpmodel/descriptor/test_descriptor.py (2 hunks)
  • source/tests/universal/dpmodel/fitting/test_fitting.py (2 hunks)
  • source/tests/universal/dpmodel/model/test_model.py (3 hunks)
  • source/tests/universal/dpmodel/utils/test_type_embed.py (2 hunks)
  • source/tests/utils.py (1 hunks)
🧰 Additional context used
🔇 Additional comments (24)
source/tests/utils.py (2)

Line range hint 1-7: Consider enhancing the test device selection logic

While the current implementation works, it could benefit from more robust error handling and CUDA availability checks.

Consider applying the following improvements:

  1. Add explicit CUDA availability checking.
  2. Implement logging for better debugging.
  3. Handle potential exceptions.

Here's a suggested implementation:

import os
import logging
import torch

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

def get_test_device():
    if os.environ.get("CUDA_VISIBLE_DEVICES") is not None:
        if torch.cuda.is_available():
            logger.info("CUDA is available. Using 'cuda' as TEST_DEVICE.")
            return "cuda"
        else:
            logger.warning("CUDA_VISIBLE_DEVICES is set, but CUDA is not available. Falling back to 'cpu'.")
    else:
        logger.info("CUDA_VISIBLE_DEVICES is not set. Using 'cpu' as TEST_DEVICE.")
    return "cpu"

TEST_DEVICE = get_test_device()

This implementation:

  • Checks for actual CUDA availability using torch.cuda.is_available().
  • Adds logging for better visibility into the device selection process.
  • Gracefully handles the case where CUDA_VISIBLE_DEVICES is set but CUDA is not available.

To verify this change, you can run the following script:

#!/bin/bash
# Description: Verify the test device selection logic

# Test 1: Check if torch is imported and CUDA availability is checked
grep -n "import torch" source/tests/utils.py
grep -n "torch.cuda.is_available()" source/tests/utils.py

# Test 2: Check if logging is implemented
grep -n "import logging" source/tests/utils.py
grep -n "logging.basicConfig" source/tests/utils.py

# Test 3: Check if the get_test_device function is implemented
sed -n '/def get_test_device/,/^$/p' source/tests/utils.py

This script will check for the presence of the suggested improvements in the file.


8-10: ⚠️ Potential issue

Fix the CI environment check implementation

There's a critical error in the implementation of the CI variable. The current code is comparing the os.environ.get method itself to a string, which will always evaluate to False.

Please apply the following fix:

 # see https://docs.github.com/en/actions/writing-workflows/choosing-what-your-workflow-does/store-information-in-variables#default-environment-variables
-CI = os.environ.get == "true"
+CI = os.environ.get("CI", "false").lower() == "true"

This change does the following:

  1. Correctly calls os.environ.get() with "CI" as the key.
  2. Provides a default value of "false" if the "CI" environment variable is not set.
  3. Converts the result to lowercase before comparison to handle potential variations like "TRUE" or "True".
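The difference between the two forms can be illustrated with a small sketch. `buggy_ci_flag` and `ci_flag` are hypothetical helper names, and the optional `env` mapping is an assumption added here so the parsing can be exercised without touching the real environment.

```python
import os


def buggy_ci_flag() -> bool:
    # Compares the bound method object itself with a string,
    # so the result is False no matter what the environment contains.
    return os.environ.get == "true"


def ci_flag(env=None) -> bool:
    """Parse the CI variable as in the suggested fix."""
    env = os.environ if env is None else env
    # Default to "false" when unset; lower() tolerates "TRUE" or "True".
    return env.get("CI", "false").lower() == "true"
```

With these definitions, `ci_flag({"CI": "TRUE"})` evaluates to `True` and `ci_flag({})` to `False`, whereas `buggy_ci_flag()` is `False` in every environment.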

To verify this change, you can run the following script:

This script will output the result of the CI check and verify if the correct implementation is present in the file.

source/tests/universal/dpmodel/utils/test_type_embed.py (2)

9-9: LGTM: Import of CI is correct and consistent.

The import of CI from ....utils is correctly added to the existing import statement. This import is necessary for its use in the @unittest.skipIf decorator below.


Line range hint 1-27: Overall assessment: Changes successfully implement PR objectives.

The modifications in this file effectively implement the desired behavior of skipping more tests on GPU CI while allowing local runs on non-CPU devices. The changes are minimal, focused, and align well with the PR objectives. No major issues were found during the review.

source/tests/universal/dpmodel/fitting/test_fitting.py (2)

23-23: LGTM: Import CI flag for conditional test execution

The addition of the CI import from the utils module is appropriate. This aligns with the PR objective to conditionally skip certain tests in the CI environment.


Line range hint 1-258: Summary: Effective implementation of CI-specific test skipping

The changes in this file successfully implement the PR objective of skipping more tests on GPU CI environments. The addition of the CI import and the modification of the @unittest.skipIf decorator on the TestFittingDP class achieve this goal without affecting the underlying test logic.

These changes will help streamline the CI process by reducing the number of tests executed in GPU environments during CI runs, while still allowing all tests to run locally or on CPU environments. This approach maintains the integrity of the test suite while optimizing CI performance.

The rest of the file, including the various fitting parameter functions and the TestFittingDP class implementation, remains unchanged, which is appropriate given the focused nature of this PR.

source/tests/universal/dpmodel/model/test_model.py (2)

28-28: LGTM: Import of CI variable.

The addition of the CI import is consistent with the PR objectives and follows the existing import structure.


28-28: Summary: Successfully implemented CI-specific test skipping.

The changes in this file effectively implement the PR objectives:

  1. The CI variable is imported, allowing its use in test conditions.
  2. The skip conditions for both TestEnergyModelDP and TestSpinEnergyModelDP classes have been updated to skip tests on non-CPU devices only in the CI environment.

These modifications allow for more flexible test execution:

  • Tests will run on all devices (including GPUs) in local environments.
  • Tests will be skipped on non-CPU devices in the CI environment.

The changes are consistent, maintain the existing code structure, and don't introduce any apparent issues. They successfully balance the need for comprehensive local testing with efficient CI processes.

Also applies to: 116-116, 204-204

source/tests/universal/dpmodel/descriptor/test_descriptor.py (2)

29-29: LGTM: CI import added

The addition of the CI import from the utils module is appropriate and aligns with the PR objective to modify the CI testing process.


523-523: LGTM: Improved test skip condition

The modification to the @unittest.skipIf decorator is well-implemented and aligns with the PR objective. This change ensures that:

  1. Tests are skipped on non-CPU devices only in CI environments.
  2. Tests can run on all devices (including GPUs) in non-CI environments, such as local development.

This improvement allows for better test coverage during local development while optimizing CI performance.

source/tests/universal/dpmodel/atomc_model/test_atomic_model.py (7)

29-29: LGTM: CI import added correctly

The CI variable is imported from the utils module, which is consistent with the existing import structure. This addition will allow for conditional test execution based on the CI environment.


102-102: Approved: Enhanced test skipping logic for CI

The modification to the @unittest.skipIf decorator now includes the CI condition. This change ensures that the test is skipped when running on non-CPU devices in a CI environment, which aligns with the PR's objective of reducing GPU tests in CI. This optimization will likely improve CI efficiency while maintaining the ability to run these tests locally or on CPU environments.


169-169: Approved: Consistent test skipping logic applied

The modification to the @unittest.skipIf decorator for TestDosAtomicModelDP is consistent with the changes made to other test classes. This ensures uniform behavior across different atomic model tests, skipping them when running on non-CPU devices in a CI environment.


231-231: Approved: Uniform test skipping logic maintained

The modification to the @unittest.skipIf decorator for TestDipoleAtomicModelDP follows the same pattern as the other test classes. This consistent approach ensures that all atomic model tests behave uniformly with respect to skipping on non-CPU devices in CI environments.


294-294: Approved: Consistent test skipping logic applied to TestPolarAtomicModelDP

The modification to the @unittest.skipIf decorator for TestPolarAtomicModelDP maintains the consistent approach seen in other test classes. This change ensures that the polar atomic model tests are also skipped when running on non-CPU devices in CI environments, aligning with the overall objective of the PR.


433-433: Approved: Comprehensive application of CI-aware test skipping

The modification to the @unittest.skipIf decorator for TestPropertyAtomicModelDP completes the consistent application of the new test skipping logic across all atomic model test classes in this file. This change, along with the previous ones, ensures that all these tests are skipped when running on non-CPU devices in CI environments. This comprehensive approach aligns perfectly with the PR's objective of optimizing GPU CI performance while maintaining the ability to run these tests in other environments.


Line range hint 29-433: Summary: Successful implementation of CI-aware test skipping

The changes in this file consistently modify the @unittest.skipIf decorators for all atomic model test classes to include the CI condition. This implementation achieves the following:

  1. Aligns with the PR objective of skipping more tests on GPU CI.
  2. Maintains the ability to run these tests locally or on CPU environments.
  3. Applies a uniform approach across all test classes, enhancing maintainability.
  4. Potentially improves CI efficiency by reducing the number of tests run on GPU environments.

These modifications strike a good balance between optimizing CI performance and ensuring comprehensive testing capabilities.

source/tests/consistent/common.py (1)

6-6: LGTM: New imports support CI-specific test execution.

The addition of the unittest module and the CI and TEST_DEVICE imports from ..utils are appropriate for implementing conditional test execution based on the CI environment and test device. These changes align with the PR objective of skipping more tests on GPU CI.

Also applies to: 37-40

source/tests/universal/common/cases/model/utils.py (6)

25-25: LGTM: CI import added for test control

The addition of the CI import is appropriate and aligns with the PR objective to skip more tests on GPU CI. This variable will be used to conditionally skip tests in CI environments.


331-331: LGTM: Updated skip condition for CI environments

The modification to the @unittest.skipIf decorator is appropriate. It now skips the test_permutation method when not running on a CPU device and in a CI environment. This change aligns with the PR objective to reduce the number of tests executed in GPU CI pipelines.


417-417: LGTM: Consistent skip conditions applied to translation and rotation tests

The @unittest.skipIf decorators for test_trans and test_rot methods have been updated consistently with the test_permutation method. These changes ensure that translation and rotation tests are also skipped when not running on a CPU device and in a CI environment, further aligning with the PR objective to optimize GPU CI test execution.

Also applies to: 486-486


676-676: LGTM: Skip conditions applied to smoothness and autodiff tests

The @unittest.skipIf decorators for test_smooth and test_autodiff methods have been updated consistently with the other test methods. These changes ensure that smoothness and automatic differentiation tests are also skipped when not running on a CPU device and in a CI environment, further supporting the PR objective to optimize GPU CI test execution.

Also applies to: 783-783


923-923: LGTM: Unique skip condition for device consistency test

The @unittest.skipIf decorator for the test_device_consistence method has been updated with a unique condition. Unlike the other tests, this one is skipped when running on a CPU device in a CI environment. This change is logical as the device consistency test is primarily relevant for non-CPU devices, and skipping it on CPU in CI environments aligns with the overall goal of optimizing CI test execution.
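To make the contrast between the two conditions concrete, the following sketch tabulates them for each combination of device and CI flag. It assumes the conditions are exactly as this review describes them; `skip_most_tests`, `skip_device_consistence`, and `skip_table` are hypothetical names.

```python
def skip_most_tests(test_device: str, ci: bool) -> bool:
    # Condition described for test_permutation, test_trans, test_rot, etc.
    return test_device != "cpu" and ci


def skip_device_consistence(test_device: str, ci: bool) -> bool:
    # Condition described for test_device_consistence: the mirror image.
    return test_device == "cpu" and ci


def skip_table():
    """List (device, ci, skip_most, skip_device_consistence) for all cases."""
    rows = []
    for device in ("cpu", "cuda"):
        for ci in (False, True):
            rows.append(
                (
                    device,
                    ci,
                    skip_most_tests(device, ci),
                    skip_device_consistence(device, ci),
                )
            )
    return rows
```

Outside CI neither condition triggers, so every test runs locally; inside CI the two conditions are mutually exclusive, each firing on the opposite device type.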


25-25: Summary: Effective test optimization for CI environments

The changes in this file successfully implement the PR objective of skipping more tests on GPU CI. Here's a summary of the modifications:

  1. Added CI import to control test execution.
  2. Updated skip conditions for multiple test methods (test_permutation, test_trans, test_rot, test_smooth, test_autodiff) to skip when not on CPU and in CI.
  3. Modified test_device_consistence to skip on CPU in CI environments.

These changes will significantly reduce the time required for GPU CI pipelines while maintaining full test coverage for local development and CPU CI environments. This approach strikes a good balance between CI efficiency and comprehensive testing.

Also applies to: 331-331, 417-417, 486-486, 676-676, 783-783, 923-923

@njzjz njzjz marked this pull request as draft October 10, 2024 02:36
Signed-off-by: Jinzhe Zeng <[email protected]>
@njzjz njzjz marked this pull request as ready for review October 10, 2024 04:03
codecov bot commented Oct 10, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 83.50%. Comparing base (3939786) to head (6c28e45).
Report is 192 commits behind head on devel.

Additional details and impacted files
@@            Coverage Diff             @@
##            devel    #4200      +/-   ##
==========================================
- Coverage   83.50%   83.50%   -0.01%     
==========================================
  Files         539      539              
  Lines       52339    52339              
  Branches     3047     3047              
==========================================
- Hits        43708    43704       -4     
- Misses       7685     7687       +2     
- Partials      946      948       +2     

@njzjz njzjz requested review from iProzd and wanghan-iapcm October 10, 2024 08:11
@wanghan-iapcm wanghan-iapcm added this pull request to the merge queue Oct 11, 2024
Merged via the queue into deepmodeling:devel with commit 8174cf1 Oct 11, 2024
60 checks passed

3 participants