featured: use run() to run cli commands in place of check_call() #177
Hi @abdosi, please review.
feffc38 to 7b38356
Signed-off-by: anamehra [email protected]
a97c285 to 36cec0a
With run(), I am seeing extra data in the buffer, which causes the order check to fail:
2024-11-01T00:00:02.3225936Z E Actual: [call(['sudo', 'systemctl', 'daemon-reload'], capture_output=True, check=True, text=True),
2024-11-01T00:00:02.3226361Z E call().stdout.__str__(),
2024-11-01T00:00:02.3226594Z E call().stderr.__str__(),
2024-11-01T00:00:02.3227055Z E call(['sudo', 'systemctl', 'unmask', 'dhcp_relay.service'], capture_output=True, check=True, text=True),
2024-11-01T00:00:02.3227342Z E call().stdout.__str__(),
2024-11-01T00:00:02.3227570Z E call().stderr.__str__(),
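The extra entries in the log above can be reproduced outside the test suite. The following is a minimal sketch, assuming the test replaces subprocess.run with a MagicMock and that the code under test converts the mocked stdout and stderr to strings (for example, when logging them). The helper name run_and_log is hypothetical and is not taken from featured. It suggests that mock_calls records the __str__ calls made on the return value, while call_args_list records only the direct calls.

```python
from unittest import mock
from unittest.mock import call


def run_and_log(run, cmd):
    """Hypothetical stand-in for code that runs a command and stringifies its output."""
    result = run(cmd, capture_output=True, check=True, text=True)
    # Converting the mocked attributes to str is what adds the
    # call().stdout.__str__() and call().stderr.__str__() entries.
    return str(result.stdout), str(result.stderr)


mock_run = mock.MagicMock()
run_and_log(mock_run, ['sudo', 'systemctl', 'daemon-reload'])
run_and_log(mock_run, ['sudo', 'systemctl', 'unmask', 'dhcp_relay.service'])

expected = [
    call(['sudo', 'systemctl', 'daemon-reload'],
         capture_output=True, check=True, text=True),
    call(['sudo', 'systemctl', 'unmask', 'dhcp_relay.service'],
         capture_output=True, check=True, text=True),
]

# call_args_list holds only the direct calls to the mock, in order.
direct_calls_match = mock_run.call_args_list == expected
# mock_calls also holds calls made on the return value, including magic methods.
has_extra_entries = len(mock_run.mock_calls) > len(mock_run.call_args_list)
```

Under these assumptions, an order check written against call_args_list would not be affected by how the code under test handles the captured output.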
01bf6a1 to c268fb7
Hi @judyjoseph, please help with this PR review. Thanks
Lgtm
Hi @yejianquan, please help to get this one in 202405. Thanks
@bingwang-ms, could you help to cherry-pick? Thanks!
…ic-net#177)
Fixes: sonic-net/sonic-buildimage#20662

During some reboots, the featured.service script sometimes fails to start services such as pmon, snmp, and lldp. The logs show that the 'sudo systemctl enable' command failed with return code -13 (SIGPIPE):

2024 Oct 29 01:31:26.191236 aaa14-rp INFO featured: Running cmd: '['sudo', 'systemctl', 'unmask', 'pmon.service']'
2024 Oct 29 01:31:26.211167 aaa14-rp INFO systemd[1]: Reloading.
2024 Oct 29 01:31:27.212381 aaa14-rp INFO featured: Running cmd: '['sudo', 'systemctl', 'enable', 'pmon.service']'
2024 Oct 29 01:31:27.232428 aaa14-rp INFO systemd[1]: Reloading.
2024 Oct 29 01:31:28.135667 aaa14-rp ERR featured: ['sudo', 'systemctl', 'enable', 'pmon.service'] - failed: return code - -13, output:#012None
2024 Oct 29 01:31:28.135746 aaa14-rp ERR featured: Feature 'pmon.service' failed to be enabled and started
2024 Oct 29 01:34:08.661711 aaa14-rp INFO featured: Running cmd: '['sudo', 'systemctl', 'enable', 'gnmi.service']'
2024 Oct 29 01:34:08.677242 aaa14-rp INFO systemd[1]: Reloading.
2024 Oct 29 01:34:09.316554 aaa14-rp ERR featured: ['sudo', 'systemctl', 'enable', 'gnmi.service'] - failed: return code - -13, output:#012None
2024 Oct 29 01:34:09.316791 aaa14-rp ERR featured: Feature 'gnmi.service' failed to be enabled and started

The issue does not recover, and pmon and the other services never start. On the supervisor, this also causes swss, syncd, and other related dockers to stay down.

In general, systemctl enable does not work for some services such as pmon, snmp, and lldp, because no WantedBy directive is set for these services in the unit file. The command returns the following on stderr:

"The unit files have no installation config (WantedBy=, RequiredBy=, Also=, Alias= settings in the [Install] section, and DefaultInstance= for template units). This means they are not meant to be enabled using systemctl.
Possible reasons for having this kind of units are:
• A unit may be statically enabled by being symlinked from another unit's .wants/ or .requires/ directory.
• A unit's purpose may be to act as a helper for some other unit which has a requirement dependency on it.
• A unit may be started when needed via activation (socket, path, timer, D-Bus, udev, scripted systemctl call, ...).
• In case of template units, the unit is meant to be enabled with some instance name specified."

The featured Python script uses subprocess.check_call() to invoke the command, which appears to be unreliable at handling stderr and may cause SIGPIPE when the buffer holds a large amount of data. Modifying the function to use subprocess.run() resolves this issue; run() handles the returned data more reliably. Validated the change with multiple reboots.
Cherry-pick PR to 202405: #188
Signed-off-by: anamehra [email protected]
Fixes: sonic-net/sonic-buildimage#20662
During some reboots, the featured.service script sometimes fails to start services such as pmon, snmp, and lldp.
The logs show that the 'sudo systemctl enable' command failed with return code -13 (SIGPIPE).
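For context on the -13 value: Python's subprocess module reports a negative return code -N when the child process was terminated by signal N, and signal 13 is SIGPIPE on Linux. A small sketch of decoding such a code follows; describe_returncode is a hypothetical helper for illustration only and is not part of featured.

```python
import signal


def describe_returncode(returncode):
    """Return a readable description of a subprocess return code."""
    if returncode < 0:
        # On POSIX, subprocess uses -N to mean "terminated by signal N".
        try:
            return f"terminated by {signal.Signals(-returncode).name}"
        except ValueError:
            return f"terminated by unknown signal {-returncode}"
    return f"exited with status {returncode}"


print(describe_returncode(-13))
```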
The issue does not recover, and pmon and the other services never start. On the supervisor, this also causes swss, syncd, and other related dockers to stay down.
In general, systemctl enable does not work for some services such as pmon, snmp, and lldp, because no WantedBy directive is set for these services in the unit file.
The command reports the missing installation config on stderr.
The featured Python script uses subprocess.check_call() to invoke the command, which appears to be unreliable at handling stderr and may cause SIGPIPE when the buffer holds a large amount of data.
Modifying the function to use subprocess.run() resolves this issue.
run() handles the returned data more reliably.
Validated the change with multiple reboots.
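The change described above can be sketched as follows. This is a minimal illustration, not the actual featured code: the function name run_cmd and its return shape are assumptions, while the run() keyword arguments (capture_output=True, check=True, text=True) match those visible in the test log earlier in this conversation. The demo uses the current Python interpreter as the child process instead of systemctl.

```python
import subprocess
import sys


def run_cmd(cmd):
    """Run cmd with subprocess.run(), capturing stdout and stderr.

    Returns (ok, returncode, stdout, stderr).
    """
    try:
        result = subprocess.run(cmd, capture_output=True, check=True, text=True)
    except subprocess.CalledProcessError as err:
        # check=True raises on a non-zero exit; the captured output
        # is available on the exception object.
        return False, err.returncode, err.stdout, err.stderr
    return True, result.returncode, result.stdout, result.stderr


# The child writes a message to stderr but still exits with status 0.
child = [
    sys.executable,
    "-c",
    "import sys; sys.stderr.write('no installation config'); sys.exit(0)",
]
ok, returncode, stdout, stderr = run_cmd(child)
```

With capture_output=True, run() reads the child's stdout and stderr through pipes until the child exits, so a message on stderr does not by itself fail the call; only a non-zero exit status does.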