beech hung with all relays on relay board energized #306

jessicamillar · 2025-01-12T16:16:19Z

Date: Jan 12 2025

Biggest issue - beech hung with relays on board energized for 45 minutes

2025-01-12 09:09:58.350 Checking tls_paths_present
2025-01-12 09:09:58.350 Getting requested_names
2025-01-12 09:09:58.351 Loading layout
2025-01-12 09:09:58.391 Getting nodes run by scada
2025-01-12 09:09:58.392 Done
################################
# BELIEVE THIS WAS NOT ACTUALLY 10:02 FROM HERE ON
#######################################
2025-01-12 09:13:25.055 
2025-01-12 09:13:25.056 Env file: </home/pi/gw-scada-spaceheat-python/.env>  exists:True
2025-01-12 09:13:25.056 Settings:

2025-01-12 09:13:25.250 Checking tls_paths_present
2025-01-12 09:13:25.251 Getting requested_names
2025-01-12 09:13:25.251 Loading layout
2025-01-12 09:13:25.293 Getting nodes run by scada
2025-01-12 09:13:25.295 Done


2025-01-12 10:02:12.562 
Subscription info for <hw1.isone.me.versant.keene.beech.scada> [construction]
  Client name: <local>  topic_dst: <s2>

Other issues at beech

Since we are continuing to have problems with the beech power meter, I have moved the power-meter ShNode from primary to secondary scada. The code was in a somewhat broken state on beech2 until 11:30 am.

The beech dashboard is failing often - sometimes with a TLS error and sometimes with core dumps. See related TLS issue in proactor ... This is almost certainly due in part to the dashboard requiring power data (todo: get rid of HpHack in the dashboard) but also looks related to an underlying gwproactor issue. Note that there are essentially no crashes on the oak and fir dashboards.

Various notes and timeline

06:15 am: journalctl shows a strange report of gwspaceheat-restart.service starting and then deactivating

Jan 12 06:15:03 beech systemd[1]: Starting gwspaceheat-restart.service - Start gwspaceheat service if is not running; Designed to catch manually stopping and forgetting to restart service....
Jan 12 06:15:07 beech python[418550]: 2025-01-12 06:15:07.199 [relay1] sending DeEnergize to multiplexer
Jan 12 06:15:09 beech python[418550]: 2025-01-12 06:15:07.725 [pico-cycler] primary-flow pico_607636 flatlined
Jan 12 06:15:09 beech python[418550]: 2025-01-12 06:15:07.771 [pico-cycler] dist-flow2 pico_2a7e22 flatlined
Jan 12 06:15:12 beech systemd[1]: gwspaceheat-restart.service: Deactivated successfully.
Jan 12 06:15:12 beech systemd[1]: Finished gwspaceheat-restart.service - Start gwspaceheat service if is not running; Designed to catch manually stopping and forgetting to restart service.

06:20 am: First strange Scada restart

2025-01-12 06:20:10.572 ERROR in process_message
Traceback (most recent call last):
  File "/home/pi/gw-scada-spaceheat-python/gw_spaceheat/venv/lib/python3.11/site-packages/gwproactor/proactor_implementation.py", line 362, in process_messages
    await self.process_message(message)
  File "/home/pi/gw-scada-spaceheat-python/gw_spaceheat/venv/lib/python3.11/site-packages/gwproactor/proactor_implementation.py", line 477, in process_message
    self._watchdog.process_message(message)
  File "/home/pi/gw-scada-spaceheat-python/gw_spaceheat/venv/lib/python3.11/site-packages/gwproactor/watchdog.py", line 77, in process_message
    self._pat_external_watchdog()
  File "/home/pi/gw-scada-spaceheat-python/gw_spaceheat/venv/lib/python3.11/site-packages/gwproactor/watchdog.py", line 147, in _pat_external_watchdog
    subprocess.run(self._pat_external_watchdog_process_args, check=True)  # noqa: S603
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/pi/.pyenv/versions/3.11.9/lib/python3.11/subprocess.py", line 571, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['systemd-notify', '--pid=418550', 'WATCHDOG=1']' returned non-zero exit status 1.
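
For reference, a minimal sketch of the mechanism behind the traceback above: `subprocess.run(..., check=True)` raises `CalledProcessError` whenever the child exits non-zero, so a failing `systemd-notify` call surfaces as an exception inside `process_message`. This is not the gwproactor code. The helper name `pat_status` is hypothetical, and a Python child process stands in for `systemd-notify` so the example runs anywhere.

```python
import subprocess
import sys


def pat_status(args: list[str]) -> int:
    """Run a watchdog-pat style command and return its exit status.

    Hypothetical helper for illustration only; it is not part of gwproactor.
    """
    try:
        # check=True turns any non-zero exit status into CalledProcessError,
        # which is the exception shown in the traceback.
        subprocess.run(args, check=True)
    except subprocess.CalledProcessError as e:
        return e.returncode
    return 0


# Stand-ins (assumptions) for ['systemd-notify', '--pid=...', 'WATCHDOG=1']:
# one child that exits with status 1 and one that exits cleanly.
failing_args = [sys.executable, "-c", "import sys; sys.exit(1)"]
ok_args = [sys.executable, "-c", "pass"]

print(pat_status(failing_args))
print(pat_status(ok_args))
```

Whether a failed pat should be caught like this or allowed to propagate is a design question for gwproactor; the sketch only shows where the exception comes from.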

06:20 -> 06:47 am: Scada stuck in startup

2025-01-12 06:20:25.349 Getting nodes run by scada
2025-01-12 06:20:27.394 Done
2025-01-12 06:47:49.758
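
A rough sketch of how silent stretches like this one could be located in a proactor log, assuming every relevant line starts with a `YYYY-MM-DD HH:MM:SS.mmm` timestamp as in the excerpts here. The function names and the 60-second threshold are illustrative assumptions, not anything from the Scada code.

```python
from datetime import datetime
from typing import Optional

TS_FORMAT = "%Y-%m-%d %H:%M:%S.%f"
TS_LENGTH = 23  # len("2025-01-12 06:20:25.349")


def parse_ts(line: str) -> Optional[datetime]:
    """Return the leading timestamp of a log line, or None if there is none."""
    try:
        return datetime.strptime(line[:TS_LENGTH], TS_FORMAT)
    except ValueError:
        return None


def find_gaps(lines, threshold_s: float = 60.0):
    """Return (previous, current, seconds) for consecutive timestamps
    that are more than threshold_s apart."""
    gaps = []
    prev = None
    for line in lines:
        ts = parse_ts(line)
        if ts is None:
            continue  # traceback lines and other untimestamped text
        if prev is not None:
            delta = (ts - prev).total_seconds()
            if delta > threshold_s:
                gaps.append((prev, ts, delta))
        prev = ts
    return gaps


log_lines = [
    "2025-01-12 06:20:25.349 Getting nodes run by scada",
    "2025-01-12 06:20:27.394 Done",
    "2025-01-12 06:47:49.758",
]
gaps = find_gaps(log_lines)
for start, end, seconds in gaps:
    print(f"{start} -> {end}: {seconds / 60:.1f} min with no log output")
```

On the three lines above this reports a single gap of roughly 27 minutes, between `Done` at 06:20:27 and the next entry at 06:47:49.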


jessicamillar · 2025-01-12T16:39:43Z

I asked George to stop the scada code as soon as he ssh'd in (I had just realized that I was logged out of tailscale). Attached is the proactor.log that I captured at that point, before restarting the Scada.
beech.up_to_just_after_power_cycle.log.

Also attached are journalctl logs (from journalctl --since "2025-01-12 06:00:00" --until "2025-01-12 11:00:00" > journalctl.log)
jouranlctl.log

jessicamillar added the bug label Jan 12, 2025
