arm64: failing test VMIlifecycle Softreboot a VirtualMachineInstance soft reboot vmi with agent connected should succeed #13501

Open
dhiller opened this issue Dec 12, 2024 · 5 comments · May be fixed by #13526 or kubevirt/project-infra#3833
Assignees
Labels
kind/failing-test Categorizes issue or PR as related to a failing test. kind/flake Categorizes issue or PR as related to a flaky test. priority/critical-urgent Categorizes an issue or pull request as critical and of urgent priority. sig/ci Denotes an issue or PR as being related to sig-ci, marks changes to the CI system. sig/compute wg/arch-arm Denotes an issue or PR that relates to the ARM architecture working group.

Comments

@dhiller
Contributor

dhiller commented Dec 12, 2024

What happened

Failing test detected: VMIlifecycle Softreboot a VirtualMachineInstance soft reboot vmi with agent connected should succeed
Example failure

/kind failing-test
/priority critical-urgent

/sig compute
/wg arch-arm

/assign @zhlhahaha

Additional context

Add any other context about the problem here.

Flake Action Plan

As the assignee, thoroughly review the issue and post the resulting report as a comment on this issue.

Then decide on one of the following actions:

The flake is a bug

  • Add the label by commenting /triage accepted
  • Create a PR to fix the bug
  • Reference this issue in the PR
  • Keep this issue open until the testcase does not fail anymore

The flake is non-critical or an issue that is hard to fix

  • Quarantine the test by creating a pull request that adds the [QUARANTINE] tag to the test name and the Quarantine decorator to the test
  • Reference this issue on the PR
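As a rough illustration of why the tag goes into the test name, here is a minimal sketch of name-based filtering. It is written in Python for brevity and is not how the KubeVirt test suite implements quarantining; the helper names `is_quarantined` and `select_runnable` and the second test name are hypothetical.

```python
# Sketch only: assumes a lane selects tests purely by name.
# The tag string comes from the action plan above; everything else is hypothetical.
QUARANTINE_TAG = "[QUARANTINE]"


def is_quarantined(test_name: str) -> bool:
    """Return True if the test name carries the quarantine tag."""
    return QUARANTINE_TAG in test_name


def select_runnable(test_names):
    """Keep only the tests that are not quarantined, preserving order."""
    return [name for name in test_names if not is_quarantined(name)]


if __name__ == "__main__":
    names = [
        "[QUARANTINE] soft reboot vmi with agent connected should succeed",
        "some other lifecycle test should succeed",  # hypothetical test name
    ]
    for name in select_runnable(names):
        print(name)
```

Because the tag is part of the name, any lane that filters by name can exclude the test without further changes to the test body.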

There was an infrastructure issue

An infra issue is anything "below" the testcase

  • Add the label by commenting /triage infra-issue
  • Close the issue, adding a comment with details about the infrastructure issue.

After the flake has been fixed, document what was learned from it in the fix PR

@dhiller dhiller added the kind/flake Categorizes issue or PR as related to a flaky test. label Dec 12, 2024
@kubevirt-bot kubevirt-bot added kind/failing-test Categorizes issue or PR as related to a failing test. priority/critical-urgent Categorizes an issue or pull request as critical and of urgent priority. sig/compute wg/arch-arm Denotes an issue or PR that relates to the ARM architecture working group. labels Dec 12, 2024
@dhiller
Contributor Author

dhiller commented Dec 12, 2024

/sig ci

@kubevirt-bot kubevirt-bot added the sig/ci Denotes an issue or PR as being related to sig-ci, marks changes to the CI system. label Dec 12, 2024
@dhiller
Contributor Author

dhiller commented Dec 12, 2024

@zhlhahaha hey, I am opening this as a tracker item. The failure seems to have started happening regularly around 7 days ago.

@zhlhahaha
Contributor

> @zhlhahaha hey, I am opening this as a tracker item. The failure seems to have started around 7 days ago to happen regularly

OK, I may not have time today; I will take a look tomorrow.

@zhlhahaha
Contributor

After some investigation, it seems that the issue is related to ACPI. I need to investigate this further.

@zhlhahaha zhlhahaha linked a pull request Dec 16, 2024 that will close this issue
@zhlhahaha
Contributor

Root Cause
The soft reboot tests include scenarios where virtual machines start without ACPI enabled. However, on the Arm64 platform, the ACPI feature is required for successful UEFI boot.
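The root cause above can be modeled as a simple architecture gate. The following is a minimal sketch in Python, not the actual KubeVirt test code (which is written in Go). The `VMISpec` type, the `can_boot` and `partition_cases` helpers, and the case names are all hypothetical; the only rule taken from this issue is that an Arm64 guest without ACPI is not expected to boot.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VMISpec:
    """Hypothetical, simplified stand-in for a VMI definition used by a test."""
    name: str
    acpi_enabled: bool


def can_boot(spec: VMISpec, arch: str) -> bool:
    """Assumption from the root cause: on arm64, UEFI boot requires ACPI."""
    if arch == "arm64":
        return spec.acpi_enabled
    # Other architectures are treated as unaffected in this sketch.
    return True


def partition_cases(specs, arch):
    """Split test cases into those expected to boot on arch and those to skip."""
    runnable = [s for s in specs if can_boot(s, arch)]
    skipped = [s for s in specs if not can_boot(s, arch)]
    return runnable, skipped


if __name__ == "__main__":
    cases = [
        VMISpec("with-acpi", acpi_enabled=True),        # hypothetical case
        VMISpec("without-acpi", acpi_enabled=False),    # hypothetical case
    ]
    runnable, skipped = partition_cases(cases, "arm64")
    print("run:", [s.name for s in runnable])
    print("skip:", [s.name for s in skipped])
```

A gate like this only decides which cases to run on a given architecture; whether an ACPI-less case should be skipped on Arm64 or rewritten to enable ACPI is a decision for the fix PR.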

Why the Issue Started Two Weeks Ago
Previously, the soft reboot tests were part of a separate test file and were not executed on the Arm64 platform. However, these tests were moved to vmi_lifecycle_test.go by this patch about two weeks ago. Since most test cases in this file run on the Arm64 platform, the tests began failing consistently in the Arm64 end-to-end test lane.
