Name resolution failure in manager._slack_notification() causes manager to crash #15

Open
douglatornell opened this issue Oct 13, 2022 · 0 comments

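I think we should catch the requests.exceptions.ConnectionError that requests.post() raises in manager._slack_notification() (caused here by urllib3.exceptions.NewConnectionError when name resolution fails) and live with the missed Slack notification instead of letting the manager crash.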

Traceback:

2022-10-13 04:59:13,179 CRITICAL [manager] unhandled exception:
Traceback (most recent call last):
  File "/SalishSeaCast/nowcast-env/lib/python3.10/site-packages/urllib3/connection.py", line 174, in _new_conn
    conn = connection.create_connection(
  File "/SalishSeaCast/nowcast-env/lib/python3.10/site-packages/urllib3/util/connection.py", line 72, in create_connection
    for res in socket.getaddrinfo(host, port, family, socket.SOCK_STREAM):
  File "/SalishSeaCast/nowcast-env/lib/python3.10/socket.py", line 955, in getaddrinfo
    for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
socket.gaierror: [Errno -3] Temporary failure in name resolution

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/SalishSeaCast/nowcast-env/lib/python3.10/site-packages/urllib3/connectionpool.py", line 703, in urlopen
    httplib_response = self._make_request(
  File "/SalishSeaCast/nowcast-env/lib/python3.10/site-packages/urllib3/connectionpool.py", line 386, in _make_request
    self._validate_conn(conn)
  File "/SalishSeaCast/nowcast-env/lib/python3.10/site-packages/urllib3/connectionpool.py", line 1040, in _validate_conn
    conn.connect()
  File "/SalishSeaCast/nowcast-env/lib/python3.10/site-packages/urllib3/connection.py", line 358, in connect
    self.sock = conn = self._new_conn()
  File "/SalishSeaCast/nowcast-env/lib/python3.10/site-packages/urllib3/connection.py", line 186, in _new_conn
    raise NewConnectionError(
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPSConnection object at 0x7f0ca333e6b0>: Failed to establish a new connection: [Errno -3] Temporary failure in name resolution

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/SalishSeaCast/nowcast-env/lib/python3.10/site-packages/requests/adapters.py", line 440, in send
    resp = conn.urlopen(
  File "/SalishSeaCast/nowcast-env/lib/python3.10/site-packages/urllib3/connectionpool.py", line 785, in urlopen
    retries = retries.increment(
  File "/SalishSeaCast/nowcast-env/lib/python3.10/site-packages/urllib3/util/retry.py", line 592, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='hooks.slack.com', port=443): Max retries exceeded with url: /services/TFR25L4LU/BN6BE538V/CoaynXHafLaNF0L2VBTSqXFT (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f0ca333e6b0>: Failed to establish a new connection: [Errno -3] Temporary failure in name resolution'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/SalishSeaCast/NEMO_Nowcast/nemo_nowcast/manager.py", line 267, in _process_messages
    self._try_messages()
  File "/SalishSeaCast/NEMO_Nowcast/nemo_nowcast/manager.py", line 287, in _try_messages
    reply, next_workers = self._message_handler(message)
  File "/SalishSeaCast/NEMO_Nowcast/nemo_nowcast/manager.py", line 309, in _message_handler
    reply, next_workers = self._handle_continue_msg(msg)
  File "/SalishSeaCast/NEMO_Nowcast/nemo_nowcast/manager.py", line 357, in _handle_continue_msg
    self._slack_notification(msg)
  File "/SalishSeaCast/NEMO_Nowcast/nemo_nowcast/manager.py", line 467, in _slack_notification
    requests.post(slack_url, json=slack_msg)
  File "/SalishSeaCast/nowcast-env/lib/python3.10/site-packages/requests/api.py", line 117, in post
    return request('post', url, data=data, json=json, **kwargs)
  File "/SalishSeaCast/nowcast-env/lib/python3.10/site-packages/requests/api.py", line 61, in request
    return session.request(method=method, url=url, **kwargs)
  File "/SalishSeaCast/nowcast-env/lib/python3.10/site-packages/requests/sessions.py", line 529, in request
    resp = self.send(prep, **send_kwargs)
  File "/SalishSeaCast/nowcast-env/lib/python3.10/site-packages/requests/sessions.py", line 645, in send
    r = adapter.send(request, **kwargs)
  File "/SalishSeaCast/nowcast-env/lib/python3.10/site-packages/requests/adapters.py", line 519, in send
    raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPSConnectionPool(host='hooks.slack.com', port=443): Max retries exceeded with url: /services/TFR25L4LU/BN6BE538V/CoaynXHafLaNF0L2VBTSqXFT (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f0ca333e6b0>: Failed to establish a new connection: [Errno -3] Temporary failure in name resolution'))
2022-10-13 04:59:13,621 CRITICAL [manager] shutting down
2022-10-13 04:59:13,648 CRITICAL [manager] ZMQError:
Traceback (most recent call last):
  File "/SalishSeaCast/NEMO_Nowcast/nemo_nowcast/manager.py", line 267, in _process_messages
    self._try_messages()
  File "/SalishSeaCast/NEMO_Nowcast/nemo_nowcast/manager.py", line 286, in _try_messages
    message = self._socket.recv_string()
  File "/SalishSeaCast/nowcast-env/lib/python3.10/site-packages/zmq/sugar/socket.py", line 742, in recv_string
    msg = self.recv(flags=flags)
  File "zmq/backend/cython/socket.pyx", line 781, in zmq.backend.cython.socket.Socket.recv
  File "zmq/backend/cython/socket.pyx", line 817, in zmq.backend.cython.socket.Socket.recv
  File "zmq/backend/cython/socket.pyx", line 191, in zmq.backend.cython.socket._recv_copy
  File "zmq/backend/cython/socket.pyx", line 186, in zmq.backend.cython.socket._recv_copy
  File "zmq/backend/cython/checkrc.pxd", line 28, in zmq.backend.cython.checkrc._check_rc
zmq.error.ZMQError: Operation cannot be accomplished in current state
2022-10-13 04:59:13,678 CRITICAL [manager] shutting down
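
A minimal sketch of one way to tolerate the failed notification, assuming the fix wraps the requests.post(slack_url, json=slack_msg) call shown at manager.py line 467 in the traceback. The helper name post_slack_notification, its post parameter (there only so the sketch can be exercised without network access), the logger name, and the log message are all hypothetical; this is not the actual manager._slack_notification() code. Note that the exception reaching the manager is requests.exceptions.ConnectionError, which wraps the urllib3 MaxRetryError and NewConnectionError, so that is the type caught here.

```python
import logging

import requests

# Hypothetical logger name, chosen to match the "[manager]" tag in the log above.
logger = logging.getLogger("manager")


def post_slack_notification(slack_url, slack_msg, post=requests.post):
    """Try to post slack_msg to slack_url; return True if the post was made.

    Hypothetical helper for illustration only. The post parameter is an
    assumption of this sketch so that it can be tested with a stand-in
    instead of a real HTTP request.
    """
    try:
        post(slack_url, json=slack_msg)
    except requests.exceptions.ConnectionError as exc:
        # Exception type at the bottom of the traceback; it is raised by
        # requests when urllib3 cannot open the connection, for example
        # after a temporary name resolution failure.
        logger.warning("Slack notification not sent: %s", exc)
        return False
    return True
```

With something like this in place, a name resolution failure would cost one Slack notification and a warning in the log, and _handle_continue_msg() would carry on. Other requests exceptions would still propagate, which seems reasonable if only connection failures are meant to be ignored.
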
@douglatornell douglatornell added bug Something isn't working major labels Oct 13, 2022
@douglatornell douglatornell added this to the v22.1 milestone Oct 13, 2022
@douglatornell douglatornell self-assigned this Oct 13, 2022
@douglatornell douglatornell modified the milestones: v22.1, v23.1 Jan 23, 2023
@douglatornell douglatornell modified the milestones: v24.1, v25.1 Nov 27, 2024