[SmartSwitch] Enhance PCIe device check to skip the warning log, if device is in detaching mode #546

Open
wants to merge 22 commits into
base: master

Conversation

vvolam

@vvolam vvolam commented Oct 5, 2024

Description

Add a new function, "is_device_in_detaching_mode", that queries state_db for devices in detaching mode. Enhance "check_pcie_devices" to call this function and skip the warning log when a device is in detaching mode.
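
A minimal sketch of the behavior described above, not the PR's actual code. The table name, the field name and value, the `(device_id, present)` input shape, and the plain dict standing in for state_db are all assumptions made for illustration.

```python
# Hypothetical sketch: DETACH_TABLE, the "dpu_state" field, and the dict-based
# stand-in for state_db are assumptions, not taken from the PR.
DETACH_TABLE = "PCIE_DETACH_INFO"  # assumed table name


def is_device_in_detaching_mode(state_db, device_id):
    """Return True if the (stand-in) state_db marks the device as detaching."""
    entry = state_db.get("{}|{}".format(DETACH_TABLE, device_id), {})
    return entry.get("dpu_state") == "detaching"  # assumed field and value


def check_pcie_devices(state_db, devices, log_warning):
    """Warn about missing devices, skipping those in detaching mode.

    devices is an iterable of (device_id, present) pairs.
    Returns the number of warnings emitted.
    """
    warned = 0
    for device_id, present in devices:
        if present:
            continue
        if is_device_in_detaching_mode(state_db, device_id):
            # The device is being detached on purpose, so stay quiet.
            continue
        log_warning("PCIe device {} not found".format(device_id))
        warned += 1
    return warned
```

With this shape, a device that is missing but flagged as detaching produces no warning, while any other missing device is still reported.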

Motivation and Context

These changes are based on the reboot HLD.

How Has This Been Tested?

Pending

Additional Information (Optional)

@vvolam vvolam changed the title Enhance PCIe device check to skip the warning log, if device is in detaching mode [SmartSwitch] Enhance PCIe device check to skip the warning log, if device is in detaching mode Nov 11, 2024
@vvolam vvolam marked this pull request as ready for review November 13, 2024 03:24
sonic-pcied/scripts/pcied (outdated, resolved)
@vvolam
Author

vvolam commented Nov 18, 2024

/azpw run

@mssonicbld
Collaborator

/AzurePipelines run

Azure Pipelines successfully started running 1 pipeline(s).

@vvolam vvolam requested a review from prgeor November 19, 2024 00:12
@vvolam
Author

vvolam commented Dec 12, 2024

@prgeor please help review and merge, if no other changes needed.

oleksandrivantsiv and others added 6 commits January 3, 2025 16:43
…c-net#535)

* Added new dynamic field 'last_sync_time' that shows when STORAGE_INFO for disk was last synced to STATE_DB

* Moved 'start' message to actual starting point of the daemon

* Added functions for formatted and epoch time for user friendly time display

* Made changes per prgeor review comments

* Pivot to SysLogger for all logging

* Increased log level so that they are seen in syslogs

* Code coverage improvement
…net#542)

When LC is absent for 30 minutes, the database cleanup kicks in. When LagId is released, it needs to be appended to the SYSTEM_LAG_IDS_FREE_LIST

This PR works with the following 2 PRs:
sonic-net/sonic-swss#3303
sonic-net/sonic-buildimage#20369

Signed-off-by: mlok <[email protected]>
…ATE_DB (sonic-net#560)

Fixed a bug in chassisd that caused an incorrect number of ASICs to be pushed to CHASSIS_STATE_DB.
* thermalctld: Add support for fans on non-CPU modules

* Add module fan to unit tests
assrinivasan and others added 5 commits January 3, 2025 16:43
Description
This PR advances the Azure pipeline for sonic-platform-daemons from bullseye to bookworm. This fixes the sonic-platform-daemons azp failures caused by the upgrade to bookworm. See the Pipelines - Run 20241210.8 logs for details.
Description
Fix non-CMIS transceivers in down state by bringing them out of low power mode in the SFF Manager Task.
This is intended to work together with the change in sonic-net/sonic-buildimage#20886.

Motivation and Context
Non-CMIS transceivers were not functioning correctly when put into Low Power mode. So XCVRD now brings them out of lpmode.

How Has This Been Tested?
Loaded an image containing this change alongside the change from sonic-net/sonic-buildimage#20886 on an Arista chassis containing a Clearwater2 linecard.
Verified that without this image some interfaces were in a down state but with the image all interfaces came up as expected.
…t#467)

Added SmartSwitch support in chassisd and enabling chassisd
…p run function from the initialization function (sonic-net#576)

Description
Move the PSU parent information generation to the loop run function from the initialization function

Motivation and Context
Fixes sonic-net#575

How Has This Been Tested?
Tested on a Cisco chassis: the PHYSICAL_ENTITY_INFO|PSU * entries can be re-inserted after a thermalctld restart.
Also monitored the state DB memory usage for hours; it works well:
…net#573)

Description
On the Nokia platform, the slot name of the Supervisor is the string "A" instead of a number, so converting it with "int" can fail with a backtrace. The slot value should be used in checks without any conversion. This fixes sonic-net/sonic-buildimage#21131

Motivation and Context
Modify _get_module_info so that it does not convert "slot" to a string value, and modify the code so that it does not convert the slot value to an int for any checks; the value returned by get_slot() is used directly. Also add the unit test test_moduleupdater_check_slot_string() to validate it.
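
The idea above can be illustrated with a short sketch. The helper name `is_same_slot` is hypothetical and not part of chassisd; it only shows why comparing the values returned by get_slot() directly is safe for both numeric and string slot names, while an int() conversion is not.

```python
def is_same_slot(module_slot, my_slot):
    """Compare slot identifiers exactly as returned, with no int() conversion.

    Hypothetical helper: works whether the platform reports slots as
    numbers (e.g. 1) or as strings (e.g. "A").
    """
    return module_slot == my_slot


def is_same_slot_with_int(module_slot, my_slot):
    """The fragile variant: int("A") raises ValueError on string slot names."""
    return int(module_slot) == int(my_slot)
```

Calling `is_same_slot("A", "A")` succeeds, whereas `is_same_slot_with_int("A", "A")` raises `ValueError` because "A" is not a valid integer literal.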

How Has This Been Tested?
Tested on 202405 branch


Signed-off-by: mlok <[email protected]>
@mssonicbld
Collaborator

/azp run

Azure Pipelines successfully started running 1 pipeline(s).

@@ -174,6 +180,52 @@ class ModuleConfigUpdater(logger.Logger):
self.log_info("Changing module {} to admin {} state".format(key, 'DOWN' if admin_state == MODULE_ADMIN_DOWN else 'UP'))
try_get(self.chassis.get_module(module_index).set_admin_state, admin_state, default=False)

#
# SmartSwitch Module Config Updater ========================================================
Contributor

@vvolam Looks like you need to re-sync, right?


if xcvr_inserted:
set_lp_success = (
sfp.set_lpmode(False)
Contributor

@vvolam Do you think we need error handling here to handle the failure case?
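
For illustration only, one shape such error handling might take. This is a sketch under assumptions, not the code under review: the helper name and the `log_error` callback are hypothetical, and it assumes `set_lpmode` returns a boolean and may raise `NotImplementedError` on platforms that do not implement it.

```python
def bring_out_of_lpmode(sfp, log_error):
    """Try to take a transceiver out of low power mode.

    Hypothetical helper: returns True on success, False on any failure,
    and reports the failure through the log_error callback.
    """
    try:
        ok = sfp.set_lpmode(False)
    except NotImplementedError:
        log_error("set_lpmode is not implemented on this platform")
        return False
    except Exception as exc:  # broad on purpose: a daemon loop should survive
        log_error("set_lpmode raised an error: {}".format(exc))
        return False
    if not ok:
        log_error("set_lpmode(False) reported failure")
        return False
    return True
```

Returning a plain boolean keeps the caller's control flow simple: it can retry on a later loop iteration instead of crashing the task.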

@mssonicbld
Collaborator

/azp run

Azure Pipelines successfully started running 1 pipeline(s).
