Development: Execute only tests for changed modules in GitHub test job for non-default branches #10036
base: develop
Conversation
This reverts commit 786d172.
Actionable comments posted: 0
♻️ Duplicate comments (1)
gradle/jacoco.gradle (1)
190-191: ⚠️ Potential issue — Prevent path traversal vulnerability
The module path construction should validate the module name to prevent path traversal.
+if (!moduleName.matches("^[a-zA-Z0-9-_]+$")) {
+    throw new GradleException("Invalid module name: $moduleName")
+}
 def modulePath = "$BasePath/$moduleName/**/*.class"
 task.sourceDirectories.setFrom(project.files("src/main/java/$modulePath"))
🧹 Nitpick comments (4)
gradle/jacoco.gradle (4)
2-5: Consider increasing the aggregated coverage thresholds
The aggregated instruction coverage threshold (0.895) seems reasonable, but allowing 56 uncovered classes might be too permissive for maintaining code quality.
Consider gradually reducing the maximum uncovered classes threshold over time to improve overall test coverage.
7-80: Review low coverage thresholds for critical modules
Several modules have notably low instruction coverage thresholds:
- core: 65.7%
- exercise: 64.9%
- buildagent: 68.6%
These modules might be critical to the application's functionality. Consider:
- Setting up a plan to gradually increase these thresholds
- Adding more test coverage for these modules
- Documenting why these modules have lower thresholds if there are specific technical constraints
170-170: Fix typo in task description
There's a typo in the task description:
-task.description = "Validates JaCoCo coverage for vioalations for $moduleName"
+task.description = "Validates JaCoCo coverage for violations for $moduleName"
184-200: Consider adding module existence validation
The prepareJacocoReportTask method should validate that the module directory exists before configuring the task.
 private void prepareJacocoReportTask(JacocoReportBase task, String moduleName, JacocoReportBase rootTask) {
+    def moduleDir = project.file("src/main/java/$BasePath/$moduleName")
+    if (!moduleDir.exists()) {
+        throw new GradleException("Module directory does not exist: $moduleDir")
+    }
     task.group = "Reporting"
     // ... rest of the method
 }
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- build.gradle (2 hunks)
- gradle/jacoco.gradle (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- build.gradle
🔇 Additional comments (3)
gradle/jacoco.gradle (3)
98-110: LGTM! Well-structured report configuration
The report configuration:
- Properly generates both XML and HTML reports
- Dynamically creates per-module reports
- Uses clear output paths
112-124: LGTM! Verification configuration aligns with PR objectives
The coverage verification:
- Correctly handles module-specific vs. full coverage scenarios
- Properly applies thresholds from configuration
- Is integrated into the build lifecycle via check task
94-96: Verify JaCoCo version compatibility and security
The JaCoCo version 0.8.12 should be verified for:
- Compatibility with the project's Java version
- Latest security patches
✅ Verification successful
Let me check the Java version compatibility since we found that build.gradle contains the relevant configuration.
JaCoCo version 0.8.12 is appropriate and up-to-date
The verification confirms that:
- JaCoCo 0.8.12 is currently the latest available version
- The project uses Java 21, which is fully compatible with JaCoCo 0.8.12 (JaCoCo 0.8.11+ added support for Java 21)
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Check latest JaCoCo version and release notes
curl -s https://repo1.maven.org/maven2/org/jacoco/jacoco/maven-metadata.xml | grep -oP '(?<=<version>)0\.8\.[0-9]+(?=</version>)' | sort -V | tail -n 1
# Check if build.gradle specifies Java version
rg -l "sourceCompatibility|targetCompatibility" "build.gradle"

Length of output: 236
Script:
#!/bin/bash
# Check Java version configuration in build.gradle
rg "sourceCompatibility|targetCompatibility" build.gradle -A 2

Length of output: 165
Actionable comments posted: 0
🧹 Nitpick comments (2)
supporting_scripts/code-coverage/per_module_cov_report/parse_module_coverage.py (2)
1-4: Add type hints imports and module documentation.
Add typing imports and a module docstring to improve code maintainability:
+"""
+Process JaCoCo coverage reports and generate a summary in markdown format.
+
+This module provides functionality to:
+- Scan directories for JaCoCo coverage reports
+- Extract coverage metrics from XML reports
+- Generate a markdown summary table
+"""
+from typing import Dict, List, Union
 import os
 import xml.etree.ElementTree as ET
 import argparse
67-76: Enhance command-line interface and add edge case handling.
The main execution block could be improved with better error handling and more descriptive help messages:
 if __name__ == "__main__":
     parser = argparse.ArgumentParser(description="Process JaCoCo coverage reports.")
-    parser.add_argument("input_directory", type=str, help="Root directory containing JaCoCo coverage reports")
-    parser.add_argument("output_file", type=str, help="Output file to save the coverage results")
+    parser.add_argument(
+        "input_directory",
+        type=str,
+        help="Root directory containing module-specific JaCoCo coverage reports (jacocoTestReport.xml)"
+    )
+    parser.add_argument(
+        "output_file",
+        type=str,
+        help="Path to the markdown file where coverage results will be saved"
+    )
     args = parser.parse_args()
     reports = get_report_by_module(args.input_directory)
+    if not reports:
+        print("No coverage reports found. Ensure the input directory contains valid JaCoCo XML reports.")
+        exit(1)
+
     coverage_by_module = extract_coverage(reports)
+    if not coverage_by_module:
+        print("Failed to extract coverage data from any of the reports.")
+        exit(1)
+
     write_summary_to_file(coverage_by_module, args.output_file)
+    print(f"Coverage summary written to {args.output_file}")
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
supporting_scripts/code-coverage/per_module_cov_report/parse_module_coverage.py
(1 hunks)
🧰 Additional context used
🪛 Ruff (0.8.2)
supporting_scripts/code-coverage/per_module_cov_report/parse_module_coverage.py
11-11: f-string without any placeholders. Remove extraneous f prefix (F541)
35-35: Comparison to None should be "cond is None". Replace with "cond is None" (E711)
35-35: Comparison to None should be "cond is None". Replace with "cond is None" (E711)
🔇 Additional comments (3)
supporting_scripts/code-coverage/per_module_cov_report/parse_module_coverage.py (3)
1-76: Implementation aligns well with PR objectives.
The script effectively supports the goal of executing tests only for changed modules by:
- Processing per-module coverage reports
- Generating a clear markdown summary
- Providing useful feedback about missing or invalid reports
This implementation will help developers quickly understand the coverage impact of their changes when running tests only for modified modules.
5-21: 🛠️ Refactor suggestion — Add error handling and fix f-string usage.
The function needs error handling for OS operations and has an unnecessary f-string:
-def get_report_by_module(input_directory):
+def get_report_by_module(input_directory: str) -> List[Dict[str, str]]:
     results = []
+    if not os.path.isdir(input_directory):
+        raise ValueError(f"Input directory does not exist: {input_directory}")
+
     for module_folder in os.listdir(input_directory):
-        module_path = os.path.join(input_directory, module_folder)
+        try:
+            module_path = os.path.join(input_directory, module_folder)
-        if os.path.isdir(module_path):
-            report_file = os.path.join(module_path, f"jacocoTestReport.xml")
+            if os.path.isdir(module_path):
+                report_file = os.path.join(module_path, "jacocoTestReport.xml")
-            if os.path.exists(report_file):
-                results.append({
-                    "module": module_folder,
-                    "report_file": report_file
-                })
-            else:
-                print(f"No XML report file found for module: {module_folder}. Skipping...")
+                if os.path.exists(report_file):
+                    results.append({
+                        "module": module_folder,
+                        "report_file": report_file
+                    })
+                else:
+                    print(f"No XML report file found for module: {module_folder}. Skipping...")
+        except OSError as e:
+            print(f"Error accessing module {module_folder}: {e}")
+            continue
     return results

Likely invalid or redundant comment.
24-55: 🛠️ Refactor suggestion — Fix None comparisons and add type hints.
The function needs proper None comparisons and type hints:
-def extract_coverage(reports):
+def extract_coverage(reports: List[Dict[str, str]]) -> List[Dict[str, Union[str, float, int]]]:
     results = []
     for report in reports:
         try:
             tree = ET.parse(report['report_file'])
             root = tree.getroot()
             instruction_counter = root.find("./counter[@type='INSTRUCTION']")
             class_counter = root.find("./counter[@type='CLASS']")
-            if instruction_counter == None or class_counter == None:
+            if instruction_counter is None or class_counter is None:
                 continue

Likely invalid or redundant comment.
Executing tests only for the modules with changes works as described, tested it with #10101.
 then
-    echo "The number of Server Starts should be equal to 4! Please adapt this check if the change is intended or try to fix the underlying issue causing a different number of server starts!"
+    echo "The number of Server Starts should be lower than/equals 4! Please adapt this check if the change is intended or try to fix the underlying issue causing a different number of server starts!"
When running a reduced set of tests (changed modules only), we might have fewer distinct configurations and therefore fewer server starts. Therefore, we cannot compare for equality.
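The relaxed assertion the comment describes can be sketched as a small shell guard. This is an illustrative sketch, not the actual workflow code: the variable values are made up, and in CI the count would be parsed from the test log.

```shell
#!/bin/sh
# Hypothetical guard: enforce an upper bound instead of exact equality,
# since a reduced test run (changed modules only) may start fewer servers.
server_starts=3        # illustrative; in CI this would be parsed from the log
max_server_starts=4

if [ "$server_starts" -gt "$max_server_starts" ]; then
  echo "The number of Server Starts should be lower than/equals $max_server_starts!"
  exit 1
fi
echo "OK: $server_starts server start(s) within limit $max_server_starts"
```

With an equality check (`-ne` instead of `-gt`), a legitimate reduced run with 3 starts would fail the build, which is exactly the situation the comment explains.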
The idea of running tests only for changed modules is interesting, but I see some challenges with this approach. While it should work fine with unit tests, Artemis' integration tests often cover multiple modules. For example, an exam integration test might indirectly test parts of text exercises that aren't directly covered by their own tests.
To better assess the impact, I suggest comparing test coverage for a few selected modules. Specifically, compare the coverage of each module when running only its module-specific tests against the coverage for the same module when running the entire test suite. If coverage is significantly lower for the module-specific tests, it could suggest weaker test reliability. In that case, we should carefully weigh the benefit of faster CI feedback against the potential risk of reduced test effectiveness.
That said, exploring this approach further could still be valuable. Generating module-specific coverage reports would offer more targeted insights and, in the long term, could help improve our test suite if module-specific tests become more specialized and testing functionality becomes less dependent on seemingly unrelated tests.
Would you be open to running this coverage comparison?
@DominikRemo Thanks for your valuable feedback. Maybe it's worth mentioning that the mysql+postgres tests are executed once the in-memory ("normal") test pipeline has succeeded. Therefore, it's a faster feedback cycle and a kind of "first check", while all tests are still run once that succeeds.
The main idea was to keep it simple. For the purpose of running change-affected tests, there are sophisticated approaches like Test Impact Analysis, which was already evaluated as part of a bachelor thesis; it came to the conclusion that it's too complex. In a follow-up, we want to add the possibility to define dependent modules, i.e. modules that use functionality of the changed module.
Regardless of whether the reduced set of tests would cover everything - in a future scenario, testing the functionality (i.e. covering) should be the responsibility of the respective module devs, not of the devs of other modules.
In general yes, but I would like to keep things rather simple. I see the benefit of these changes more in getting a fast feedback cycle than in "waterproofly" determining and running all related tests.
I've looked through the workflow and since we still run all tests for MySql and Postgres / the default branch, I understand that the downsides are limited. I also really like the idea of coverage thresholds per module.
Thanks for the clarification and the improvement. Keep it up so we can see the benefit of module specific reports manifest :)
Checklist
General
Motivation and Context
Most often, developers don't change functionality related to all of Artemis, but only to some modules. Thus, running the whole test suite for each commit is inefficient. Instead, for pull requests etc. it might be fine to run a reduced test suite for the in-memory DB test pipeline on non-default branches. To ensure everything still works, all tests should be run for every default-branch push.
Description
Matching tests and source code
By determining the module folders with changed files, the tests for the respective modules are executed. For example, when changing
src/main/java/de/tum/cit/aet/artemis/text/service/ABCService.java
, all tests within the folder src/main/java/de/tum/cit/aet/artemis/text are executed.
This is a first step; in the future, the reduced test set could be extended to modules that depend on the changed module. That, however, is not trivial because of transitivity.
Per-Module Coverage Report & Verification
When running a reduced set of tests for specific modules, we might be interested in per-module reports and per-module threshold validation. Validating on a module basis is not only more insightful but also required, as running a reduced set of tests can lower the aggregate coverage metrics and cause verification to fail.
Steps for Testing
These testing steps are a bit different and don't involve Artemis.
git checkout -b feat/test-changed-modules-1
.github/workflows/test.yml
Screenshots
The job summary generated on a branch where a single file in the fileupload module has been changed.

Review Progress
Code Review
Manual Tests
Summary by CodeRabbit
New Features
Bug Fixes
Chores