Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update the advanced reboot test to collect tcpdump on the server physical port #15498

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

congh-nvidia
Copy link
Contributor

Description of PR

Summary:
We have been observing random packet drops on the ptf in multiple tests, the advanced reboot test also suffers from this.
We suspect it could be some performance issue of the ptf.
For the advanced reboot test, we can avoid it by moving the tcpdump from the logical interfaces in ptf container to the physical interface of the test server(vmhost).
With this change, the random packet drop is no longer observed in this test.

Type of change

  • Bug fix
  • Testbed and Framework(new/improvement)
  • Test case(new/improvement)

Back port request

  • 202012
  • 202205
  • 202305
  • 202311
  • 202405

Approach

What is the motivation for this PR?

To fix the ptf random packet drop issue in the advanced-reboot test.

How did you do it?

By moving the tcpdump from the logical interfaces in ptf container to the physical interface of the test server.

How did you verify/test it?

Run all the advanced reboot tests(fast/warm reboot, upgrade, and etc.) in our internal regression on all the platforms. No issues observed.

Any platform specific information?

Supported testbed topology if it's a new test case?

Documentation

@congh-nvidia congh-nvidia requested a review from prgeor as a code owner November 12, 2024 03:13
@congh-nvidia
Copy link
Contributor Author

/azpw run

@mssonicbld
Copy link
Collaborator

/AzurePipelines run

Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@congh-nvidia congh-nvidia force-pushed the advanced_reboot branch 4 times, most recently from d58c64c to 534412c Compare November 25, 2024 03:48
@mssonicbld
Copy link
Collaborator

The pre-commit check detected issues in the files touched by this pull request.
The pre-commit check is a mandatory check, please fix detected issues.

Detailed pre-commit check results:
trim trailing whitespace.................................................Passed
fix end of files.........................................................Passed
check yaml...........................................(no files to check)Skipped
check for added large files..............................................Passed
check python ast.........................................................Passed
flake8...................................................................Failed
- hook id: flake8
- exit code: 1

ansible/roles/test/files/ptftests/py3/advanced-reboot.py:678:35: F821 undefined name 'capture_pcap'

flake8...............................................(no files to check)Skipped
check conditional mark sort..........................(no files to check)Skipped

To run the pre-commit checks locally, you can follow below steps:

  1. Ensure that default python is python3. In sonic-mgmt docker container, default python is python2. You can run
    the check by activating the python3 virtual environment in sonic-mgmt docker container or outside of sonic-mgmt
    docker container.
  2. Ensure that the pre-commit package is installed:
sudo pip install pre-commit
  1. Go to repository root folder
  2. Install the pre-commit hooks:
pre-commit install
  1. Use pre-commit to check staged file:
pre-commit
  1. Alternatively, you can check committed files using:
pre-commit run --from-ref <commit_id> --to-ref <commit_id>

@bingwang-ms
Copy link
Collaborator

@saiarcot895, @Ryangwaite Can you help review this change?

@mssonicbld
Copy link
Collaborator

/azp run

Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@congh-nvidia
Copy link
Contributor Author

/azpw run

@mssonicbld
Copy link
Collaborator

/AzurePipelines run

Copy link

Azure Pipelines successfully started running 1 pipeline(s).

…ical port

Get the tcpdump on the physical interface of the test server instead of the logical ptf interfaces. This is to fix the issue that sometimes there could be random packet drop on the ptf.
Keep the original approch to capture on the ptf interfaces for the kvm testbed which doesn't have the vmhost external port.
@mssonicbld
Copy link
Collaborator

/azp run

Copy link

Azure Pipelines successfully started running 1 pipeline(s).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants