Auto rerun test build in FAILURE status #5747

Merged on Nov 20, 2024 (2 commits)



@llxia llxia commented Nov 12, 2024

This PR automatically reruns any child test build that ends in FAILURE status (Part 1). To enable this feature, params.RERUN_FAILURE must be set to true.
Rerun of the parent test build will be enabled in Part 2.
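
As a rough illustration of the behavior described above (not the PR's actual pipeline code), the selection of builds to rerun could be sketched as follows. The `ChildBuild` type and the `builds_to_rerun` helper are hypothetical names, and treating `RERUN_FAILURE` as a boolean that defaults to false is an assumption.

```python
from dataclasses import dataclass


@dataclass
class ChildBuild:
    """Hypothetical stand-in for a child test build and its Jenkins result."""
    name: str
    result: str  # e.g. "SUCCESS", "UNSTABLE", "FAILURE", "ABORTED"


def builds_to_rerun(children, params):
    """Return the names of child builds that would be rerun.

    Assumption: the feature is off unless params["RERUN_FAILURE"] is true,
    and only builds whose result is exactly "FAILURE" qualify.
    """
    if not params.get("RERUN_FAILURE", False):
        return []
    return [child.name for child in children if child.result == "FAILURE"]


children = [
    ChildBuild("child_job_1", "SUCCESS"),
    ChildBuild("child_job_2", "FAILURE"),
    ChildBuild("child_job_3", "UNSTABLE"),
]
print(builds_to_rerun(children, {"RERUN_FAILURE": True}))
print(builds_to_rerun(children, {"RERUN_FAILURE": False}))
```

In this sketch an UNSTABLE build is deliberately left alone, since the PR description only mentions FAILURE status.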

related: #5346


llxia commented Nov 12, 2024

TRSS: [screenshot]

@llxia llxia changed the title Auto failure rerun Auto rerun test build in FAILURE status Nov 12, 2024
@llxia llxia marked this pull request as ready for review November 12, 2024 15:26

@LongyuZhang LongyuZhang left a comment

LGTM

@smlambert

Since I have no visibility into tests run on hyc Jenkins, what difference did you see in your testing between these two runs?
- RERUN_FAILURE = true, RERUN_ITERATIONS = 1, and parallel: Grinder/44712
- RERUN_FAILURE = true, RERUN_ITERATIONS = 1, and parallel: Grinder/44713

Signed-off-by: Lan Xia <[email protected]>

llxia commented Nov 19, 2024

Sorry, the Grinder links have expired. I think the second entry has a typo: it is for the scenario RERUN_FAILURE = false, RERUN_ITERATIONS = 1, and parallel.


smlambert commented Nov 19, 2024

Public test runs:

JDK11, x64Linux and ppc64Aix:
AQA_Test_Pipeline/381/ - RERUN_FAILURE=true, RERUN_ITERATIONS=1, PARALLEL=none
AQA_Test_Pipeline/382/ - RERUN_FAILURE=true, RERUN_ITERATIONS=0, PARALLEL=none
AQA_Test_Pipeline/383/ - RERUN_FAILURE=false, RERUN_ITERATIONS=0, PARALLEL=none

JDK17, x86-32_windows and s390x_linux:
AQA_Test_Pipeline/384/ - RERUN_FAILURE=true, RERUN_ITERATIONS=1, PARALLEL=Dynamic, TEST_TIME=2

JDK8, x64Linux:
AQA_Test_Pipeline/385/ - RERUN_FAILURE=true, RERUN_ITERATIONS=1, PARALLEL=Dynamic, TEST_TIME=1

@smlambert

I do not think I got all of the AQA_Test_Pipeline parameters quite right, as many of the test jobs failed with:

hudson.plugins.git.GitException: Command "git fetch --tags --force --progress --prune -- origin +refs/heads/autoFailureRerun:refs/remotes/origin/autoFailureRerun" returned status code 128:
stdout: 
stderr: fatal: couldn't find remote ref refs/heads/autoFailureRerun

	at PluginClassLoader for git-client//org.jenkinsci.plugins.gitclient.CliGitAPIImpl.launchCommandIn(CliGitAPIImpl.java:2846)
	at PluginClassLoader for git-client//org.jenkinsci.plugins.gitclient.CliGitAPIImpl.launchCommandWithCredentials(CliGitAPIImpl.java:2185)
	at PluginClassLoader for git-client//org.jenkinsci.plugins.gitclient.CliGitAPIImpl$1.execute(CliGitAPIImpl.java:635)
	at PluginClassLoader for git//jenkins.plugins.git.GitSCMFileSystem$BuilderImpl.build(GitSCMFileSystem.java:408)
Caused: java.io.IOException
	at PluginClassLoader for git//jenkins.plugins.git.GitSCMFileSystem$BuilderImpl.build(GitSCMFileSystem.java:413)
	at PluginClassLoader for scm-api//jenkins.scm.api.SCMFileSystem.of(SCMFileSystem.java:219)
	at PluginClassLoader for workflow-cps//org.jenkinsci.plugins.workflow.cps.CpsScmFlowDefinition.create(CpsScmFlowDefinition.java:126)
	at PluginClassLoader for workflow-cps//org.jenkinsci.plugins.workflow.cps.CpsScmFlowDefinition.create(CpsScmFlowDefinition.java:73)
	at PluginClassLoader for workflow-job//org.jenkinsci.plugins.workflow.job.WorkflowRun.run(WorkflowRun.java:311)
	at hudson.model.ResourceController.execute(ResourceController.java:101)
	at hudson.model.Executor.run(Executor.java:446)
Finished: FAILURE

But at least it forced test job failures, which then triggered reruns, and that is the functionality I hoped the runs would demonstrate.

The TRSS instance does not show the rerun link in the tooltip, but it has not been synced in a while, so perhaps that is why. It does show the info circle in the top corner of the chiclet in the Grid view, though, indicating that there was a rerun.

[Screenshot, 2024-11-20, 4:16:39 PM]


llxia commented Nov 20, 2024

Re #5747 (comment): the rerun job was generated and triggered.

[screenshot]

https://ci.adoptium.net/job/Test_openjdk17_hs_extended.openjdk_s390x_linux/196/console

However, LIGHT_WEIGHT_CHECKOUT must be set to false when running against any personal repo. Otherwise, checking out the repo fails with the error: stderr: fatal: couldn't find remote ref refs/heads/autoFailureRerun
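
The constraint above could be caught before a run is launched with a small pre-flight check along these lines. This is only a sketch: `checkout_warnings`, the `REPO_URL` key, and the example URLs are hypothetical, and treating a missing LIGHT_WEIGHT_CHECKOUT as true is an assumption made so that the warning errs on the side of being shown.

```python
def checkout_warnings(params, upstream_repo):
    """Return warnings for parameter combinations expected to fail at checkout.

    Assumptions: params["REPO_URL"] (hypothetical key) names the repo under
    test, and a missing LIGHT_WEIGHT_CHECKOUT is treated as true.
    """
    warnings = []
    uses_personal_repo = params.get("REPO_URL", upstream_repo) != upstream_repo
    light_weight = params.get("LIGHT_WEIGHT_CHECKOUT", True)
    if uses_personal_repo and light_weight:
        warnings.append(
            "LIGHT_WEIGHT_CHECKOUT should be false when running a personal repo; "
            "otherwise the checkout may fail with \"couldn't find remote ref\"."
        )
    return warnings


# Hypothetical URLs, used only to exercise the check.
upstream = "https://example.org/upstream/tests.git"
personal = "https://example.org/someuser/tests.git"
print(checkout_warnings({"REPO_URL": personal, "LIGHT_WEIGHT_CHECKOUT": True}, upstream))
print(checkout_warnings({"REPO_URL": personal, "LIGHT_WEIGHT_CHECKOUT": False}, upstream))
```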

Note: This PR (Step 1) does not cover rerun at the parent level (i.e., the PARALLEL=none case). That will be covered in Step 2.


@smlambert smlambert left a comment

AQA_Test_Pipeline runs showed reruns were triggered for failed jobs.

@smlambert smlambert merged commit b7c212c into adoptium:master Nov 20, 2024
2 checks passed