Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve MIBUpdater to re-connect DBConnector when re-init data. #290

Merged
merged 10 commits into from
Oct 30, 2023

Conversation

liuh-80
Copy link
Contributor

@liuh-80 liuh-80 commented Sep 14, 2023

Improve MIBUpdater to re-connect DBConnector when re-init data.

Work item tracking

Microsoft ADO (number only): 24705208

- What I did
Fix when redis restart, some MIBUpdater's db connection will broken and keeps report error to syslog issue.

There will be following message repeat in syslog:

#012  File "/usr/local/lib/python3.7/dist-packages/sonic_ax_impl/mibs/ietf/rfc2737.py", line 674, in _update_per_namespace_data#
#012    msg = pubsub.get_message()#
#012  File "/usr/lib/python3/dist-packages/swsscommon/swsscommon.py", line 1626, in get_message#
#012    return _swsscommon.PubSub_get_message(self, timeout)#
#012RuntimeError: RedisError: Failed to select, err=3: errstr=Server closed the connection

- How I did it
Re-connect DBConnector in every MIBUpdater's reinit_data method.

- How to verify it
Pass all UT

Manually test with following steps:

  1. in database container kill redis server.
  2. start redis in database container later: service redis-server start
  3. check syslog, confirm following log exist but now new log few minutes later after mibs re-init:

sudo cat /var/log/syslog | grep rfc2737.py | grep PubSub_get_message

Sep 22 03:30:17.223944 vlab-01 ERR snmp#snmp-subagent [ax_interface] ERROR: MIBUpdater.start() caught an unexpected exception during update_data()#012Traceback (most recent call last):#12 File "/usr/local/lib/python3.9/dist-packages/ax_interface/mib.py", line 43, in start#012 self.update_data()#12 File "/usr/local/lib/python3.9/dist-packages/sonic_ax_impl/mibs/ietf/rfc2737.py", line 326, in update_data#012 updater.update_data(i, self.statedb[i])#12 File "/usr/local/lib/python3.9/dist-packages/sonic_ax_impl/mibs/ietf/rfc2737.py", line 666, in update_data#012 self._update_per_namespace_data(self.pub_sub_dict[db_index])#12 File "/usr/local/lib/python3.9/dist-packages/sonic_ax_impl/mibs/ietf/rfc2737.py", line 675, in _update_per_namespace_data#012 msg = pubsub.get_message()#12 File "/usr/lib/python3/dist-packages/swsscommon/swsscommon.py", line 1996, in get_message#012 return _swsscommon.PubSub_get_message(self, timeout, interrupt_on_signal)#012RuntimeError: RedisError: Failed to select, err=3: errstr=Server closed the connection

- Description for the changelog
Improve MIBUpdater to re-connect DBConnector when re-init data.

@liuh-80
Copy link
Contributor Author

liuh-80 commented Sep 15, 2023

Pipeline build break, already create a fix PR. pending for #293 merge first

@liuh-80
Copy link
Contributor Author

liuh-80 commented Sep 18, 2023

/azp run

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@liuh-80 liuh-80 marked this pull request as ready for review September 19, 2023 01:03
@liuh-80 liuh-80 requested a review from qiluo-msft September 19, 2023 01:04
@qiluo-msft

This comment was marked as resolved.

@liuh-80

This comment was marked as resolved.

@@ -41,6 +42,13 @@ async def start(self):

# run the background update task
self.update_data()
self.redis_exception_happen = False
Copy link
Contributor

@qiluo-msft qiluo-msft Oct 9, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

redis_exception_happen

I am worried about the contract of member redis_exception_happen between multiple classes. How about define a new argument reconnect=False in reinit_data(), and keep redis_exception_happen as a private member only in this class. #Closed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed, move redis_exception_happen to local variable. also because the python method override issue, add new method reinit_connection() to MIBUpdater interface.

while self.run_event.is_set():
try:
# reinit internal structures
if self.update_counter > self.reinit_rate:
# reconnect when redis exception happen
if redis_exception_happen:
self.reinit_connection()
Copy link
Contributor

@qiluo-msft qiluo-msft Oct 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[](http://example.com/codeflow?start=24&length=4)

You can remove one extra level of indentation. #Closed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

@liuh-80 liuh-80 merged commit 347a6e4 into sonic-net:master Oct 30, 2023
4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants