Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

geographic gatherer stopped after IndexError exception #100

Open
TAlonglong opened this issue Mar 22, 2022 · 13 comments
Open

geographic gatherer stopped after IndexError exception #100

TAlonglong opened this issue Mar 22, 2022 · 13 comments

Comments

@TAlonglong
Copy link
Collaborator

TAlonglong commented Mar 22, 2022

geographic gatherer stopped working, but did not exit with a value supervisor recognized to trigger a restart. This cause supervisord not to restart the gatherer. Supervisor reported:

2022-03-21 21:38:51,889 INFO exited: gatherer-winds (exit status 0; expected)

The log from geographic gatherer

Exception in thread Thread-7:
Traceback (most recent call last):
  File "/software/miniconda/envs/pytroll/lib/python3.9/threading.py", line 973, in _bootstrap_inner
    self.run()
  File "/software/miniconda/envs/pytroll/lib/python3.9/site-packages/pytroll_collectors/triggers/_base.py", line 227, in run
    self.publish_collection(next_timeout[0].finish_without_reset())
  File "/software/miniconda/envs/pytroll/lib/python3.9/site-packages/pytroll_collectors/triggers/_base.py", line 91, in publish_collection
    subject = self._get_topic(metadata[0])
IndexError: list index out of range
[ERROR: 2022-03-21 21:38:40 : pytroll_collectors.geographic_gatherer] Something went wrong
Traceback (most recent call last):
  File "/software/miniconda/envs/pytroll/lib/python3.9/site-packages/pytroll_collectors/geographic_gatherer.py", line 201, in run
    raise RuntimeError
RuntimeError
[INFO: 2022-03-21 21:38:40 : pytroll_collectors.geographic_gatherer] Ending publication the gathering of granules...
[INFO: 2022-03-21 21:38:40 : gatherer] GeographicGatherer has stopped.
Setting timezone to UTC

I guess the metadata does not have an element, but I don't understand why

def publish_collection(self, metadata):
"""Terminate the gathering."""
subject = self._get_topic(metadata[0])
mda = _merge_metadata(metadata)

@pnuu
Copy link
Member

pnuu commented Mar 22, 2022

What's the incoming message triggering this? And the configuration. I have a feeling I've seen something like this before.

@TAlonglong
Copy link
Collaborator Author

config

[DEFAULT]
regions = eurol

[ears-metop-ascat]
regions=ears_high_res
pattern = ascat_{start_time:%Y%m%d_%H%M%S}_{platform_name:s}_{orbit_number:s}_ear_o_coa_{ears_station:s}
format = EPS
type = binary
sensor = ascat
timeliness = 30
level = 0
variant = EARS
publish_topic = /CAT/EARS/METOP-ASCAT/BUFR
service=move_it_metop-ascat-ears-coa
topics=/metop/ascat/ears/coa/l2_bufr
duration=600
publish_message_after_each_reception=True

[ears-metop-ascat-global-057]
regions=ears_high_res
pattern = ascat_{start_time:%Y%m%d_%H%M%S}_{platform_name:s}_{orbit_number:s}_eps_o_057_ovw.l2_bufr
format = EPS
type = binary
sensor = ascat
timeliness = 120
level = 0
variant = EARS
publish_topic = /CAT/GLOBAL/METOP-ASCAT-057/BUFR
service=move_it_metop-ascat-global-057
topics=/metop/ascat/global/057/l2_bufr
duration=600
publish_message_after_each_reception=True

[global-metop-amv-l2d]
regions=amv_collection
pattern = W_XX-EUMETSAT-Darmstadt,SOUNDING+SATELLITE,{platform_name:s}+AVHRR_C_EUMP_{start_time:%Y%m%d%H%M%S}_{orbit_number:5s}_eps_o_amv_l2d.bin
format=BUFR
type=binary
data_processing_level=2
sensor=avhrr
#timeliness in minutes
timeliness=120
duration=180
variant=GLOBAL
publish_topic=/GLOBAL/METOP/AMV/2d
service=move_it_global-metop-amv-l2d
topics=/CHECK/GLOBAL/SEGMENTS/METOP/AMV/2d
publish_message_after_each_reception=True

@TAlonglong
Copy link
Collaborator Author

The last message printed before the error:

[DEBUG: 2022-03-21 21:38:39 : pytroll_collectors.triggers._base] Waiting 4583.930227 seconds until timeout
[DEBUG: 2022-03-21 21:38:39 : pytroll_collectors.triggers._base] mda: {'sensor': 'avhrr', 'platform_name': 'Metop-B', 'start_time': datetime.datetime(2022, 3, 21, 20, 52, 3), 'orbit_number': '49329', 'uri': 'ssh://satellite-wind-processing-b/data/pytroll/global-metop-amv-l2d/W_XX-EUMETSAT-Darmstadt,SOUNDING+SATELLITE,METOPB+AVHRR_C_EUMP_20220321205203_49329_eps_o_amv_l2d.bin', 'uid': 'W_XX-EUMETSAT-Darmstadt,SOUNDING+SATELLITE,METOPB+AVHRR_C_EUMP_20220321205203_49329_eps_o_amv_l2d.bin', 'origin': '157.249.16.182:9062', 'end_time': datetime.datetime(2022, 3, 21, 20, 55, 3)}
[DEBUG: 2022-03-21 21:38:39 : pytroll_collectors.region_collector] Adding area ID amv_collection to metadata for Metop-B
[INFO: 2022-03-21 21:38:39 : pytroll_collectors.region_collector] Added expected granule Metop-B (2022-03-21 20:52:03) to area amv_collection
[INFO: 2022-03-21 21:38:39 : pytroll_collectors.region_collector] Collection finished for Metop-B area amv_collection
[DEBUG: 2022-03-21 21:38:39 : pytroll_collectors.triggers._base] Composing topic.
[DEBUG: 2022-03-21 21:38:39 : pytroll_collectors.triggers._base] Is last file added: False
[INFO: 2022-03-21 21:38:39 : pytroll_collectors.triggers._base] sending pytroll://GLOBAL/METOP/AMV/2d collection ubuntu@satellite-wind-processing-b 2022-03-21T21:38:39.079863 v1.01 application/json {"sensor": "avhrr", "platform_name": "Metop-B", "start_time": "2022-03-21T20:13:03", "orbit_number": "49329", "origin": "157.249.16.188:9062", "end_time": "2022-03-21T20:55:03", "collection_area_id": "amv_collection", "collection": [{"start_time": "2022-03-21T20:13:03", "end_time": "2022-03-21T20:16:03", "uri": "ssh://satellite-wind-processing-b/data/pytroll/global-metop-amv-l2d/W_XX-EUMETSAT-Darmstadt,SOUNDING+SATELLITE,METOPB+AVHRR_C_EUMP_20220321201303_49329_eps_o_amv_l2d.bin", "uid": "W_XX-EUMETSAT-Darmstadt,SOUNDING+SATELLITE,METOPB+AVHRR_C_EUMP_20220321201303_49329_eps_o_amv_l2d.bin"}, {"start_time": "2022-03-21T20:16:03", "end_time": "2022-03-21T20:19:03", "uri": "ssh://satellite-wind-processing-b/data/pytroll/global-metop-amv-l2d/W_XX-EUMETSAT-Darmstadt,SOUNDING+SATELLITE,METOPB+AVHRR_C_EUMP_20220321201603_49329_eps_o_amv_l2d.bin", "uid": "W_XX-EUMETSAT-Darmstadt,SOUNDING+SATELLITE,METOPB+AVHRR_C_EUMP_20220321201603_49329_eps_o_amv_l2d.bin"}, {"start_time": "2022-03-21T20:19:03", "end_time": "2022-03-21T20:22:03", "uri": "ssh://satellite-wind-processing-b/data/pytroll/global-metop-amv-l2d/W_XX-EUMETSAT-Darmstadt,SOUNDING+SATELLITE,METOPB+AVHRR_C_EUMP_20220321201903_49329_eps_o_amv_l2d.bin", "uid": "W_XX-EUMETSAT-Darmstadt,SOUNDING+SATELLITE,METOPB+AVHRR_C_EUMP_20220321201903_49329_eps_o_amv_l2d.bin"}, {"start_time": "2022-03-21T20:22:03", "end_time": "2022-03-21T20:25:03", "uri": "ssh://satellite-wind-processing-b/data/pytroll/global-metop-amv-l2d/W_XX-EUMETSAT-Darmstadt,SOUNDING+SATELLITE,METOPB+AVHRR_C_EUMP_20220321202203_49329_eps_o_amv_l2d.bin", "uid": "W_XX-EUMETSAT-Darmstadt,SOUNDING+SATELLITE,METOPB+AVHRR_C_EUMP_20220321202203_49329_eps_o_amv_l2d.bin"}, {"start_time": "2022-03-21T20:25:03", "end_time": "2022-03-21T20:28:03", "uri": "ssh://satellite-wind-processing-b/data/pytroll/global-metop-amv-l2d/W_XX-EUMETSAT-Darmstadt,SOUNDING+SATELLITE,METOPB+AVHRR_C_EUMP_20220321202503_49329_eps_o_amv_l2d.bin", "uid": "W_XX-EUMETSAT-Darmstadt,SOUNDING+SATELLITE,METOPB+AVHRR_C_EUMP_20220321202503_49329_eps_o_amv_l2d.bin"}, {"start_time": "2022-03-21T20:28:03", "end_time": "2022-03-21T20:31:03", "uri": "ssh://satellite-wind-processing-b/data/pytroll/global-metop-amv-l2d/W_XX-EUMETSAT-Darmstadt,SOUNDING+SATELLITE,METOPB+AVHRR_C_EUMP_20220321202803_49329_eps_o_amv_l2d.bin", "uid": "W_XX-EUMETSAT-Darmstadt,SOUNDING+SATELLITE,METOPB+AVHRR_C_EUMP_20220321202803_49329_eps_o_amv_l2d.bin"}, {"start_time": "2022-03-21T20:31:03", "end_time": "2022-03-21T20:34:03", "uri": "ssh://satellite-wind-processing-b/data/pytroll/global-metop-amv-l2d/W_XX-EUMETSAT-Darmstadt,SOUNDING+SATELLITE,METOPB+AVHRR_C_EUMP_20220321203103_49329_eps_o_amv_l2d.bin", "uid": "W_XX-EUMETSAT-Darmstadt,SOUNDING+SATELLITE,METOPB+AVHRR_C_EUMP_20220321203103_49329_eps_o_amv_l2d.bin"}, {"start_time": "2022-03-21T20:34:03", "end_time": "2022-03-21T20:37:03", "uri": "ssh://satellite-wind-processing-b/data/pytroll/global-metop-amv-l2d/W_XX-EUMETSAT-Darmstadt,SOUNDING+SATELLITE,METOPB+AVHRR_C_EUMP_20220321203403_49329_eps_o_amv_l2d.bin", "uid": "W_XX-EUMETSAT-Darmstadt,SOUNDING+SATELLITE,METOPB+AVHRR_C_EUMP_20220321203403_49329_eps_o_amv_l2d.bin"}, {"start_time": "2022-03-21T20:37:03", "end_time": "2022-03-21T20:40:03", "uri": "ssh://satellite-wind-processing-b/data/pytroll/global-metop-amv-l2d/W_XX-EUMETSAT-Darmstadt,SOUNDING+SATELLITE,METOPB+AVHRR_C_EUMP_20220321203703_49329_eps_o_amv_l2d.bin", "uid": "W_XX-EUMETSAT-Darmstadt,SOUNDING+SATELLITE,METOPB+AVHRR_C_EUMP_20220321203703_49329_eps_o_amv_l2d.bin"}, {"start_time": "2022-03-21T20:40:03", "end_time": "2022-03-21T20:43:03", "uri": "ssh://satellite-wind-processing-b/data/pytroll/global-metop-amv-l2d/W_XX-EUMETSAT-Darmstadt,SOUNDING+SATELLITE,METOPB+AVHRR_C_EUMP_20220321204003_49329_eps_o_amv_l2d.bin", "uid": "W_XX-EUMETSAT-Darmstadt,SOUNDING+SATELLITE,METOPB+AVHRR_C_EUMP_20220321204003_49329_eps_o_amv_l2d.bin"}, {"start_time": "2022-03-21T20:43:03", "end_time": "2022-03-21T20:46:03", "uri": "ssh://satellite-wind-processing-b/data/pytroll/global-metop-amv-l2d/W_XX-EUMETSAT-Darmstadt,SOUNDING+SATELLITE,METOPB+AVHRR_C_EUMP_20220321204303_49329_eps_o_amv_l2d.bin", "uid": "W_XX-EUMETSAT-Darmstadt,SOUNDING+SATELLITE,METOPB+AVHRR_C_EUMP_20220321204303_49329_eps_o_amv_l2d.bin"}, {"start_time": "2022-03-21T20:46:03", "end_time": "2022-03-21T20:49:03", "uri": "ssh://satellite-wind-processing-b/data/pytroll/global-metop-amv-l2d/W_XX-EUMETSAT-Darmstadt,SOUNDING+SATELLITE,METOPB+AVHRR_C_EUMP_20220321204603_49329_eps_o_amv_l2d.bin", "uid": "W_XX-EUMETSAT-Darmstadt,SOUNDING+SATELLITE,METOPB+AVHRR_C_EUMP_20220321204603_49329_eps_o_amv_l2d.bin"}, {"start_time": "2022-03-21T20:49:03", "end_time": "2022-03-21T20:52:03", "uri": "ssh://satellite-wind-processing-b/data/pytroll/global-metop-amv-l2d/W_XX-EUMETSAT-Darmstadt,SOUNDING+SATELLITE,METOPB+AVHRR_C_EUMP_20220321204903_49329_eps_o_amv_l2d.bin", "uid": "W_XX-EUMETSAT-Darmstadt,SOUNDING+SATELLITE,METOPB+AVHRR_C_EUMP_20220321204903_49329_eps_o_amv_l2d.bin"}, {"start_time": "2022-03-21T20:52:03", "end_time": "2022-03-21T20:55:03", "uri": "ssh://satellite-wind-processing-b/data/pytroll/global-metop-amv-l2d/W_XX-EUMETSAT-Darmstadt,SOUNDING+SATELLITE,METOPB+AVHRR_C_EUMP_20220321205203_49329_eps_o_amv_l2d.bin", "uid": "W_XX-EUMETSAT-Darmstadt,SOUNDING+SATELLITE,METOPB+AVHRR_C_EUMP_20220321205203_49329_eps_o_amv_l2d.bin"}]}

It's a bit tricky to follow the log. I'm not sure if this is a mix of two gatherer configs. But it looks like it gets a file, add granule. This completes the gathering, but at the same time it says the last segment is not added. and then sends the message. And then the error comes.

@pnuu
Copy link
Member

pnuu commented Mar 23, 2022

If the log messages are hard to follow, try to follow the code 😅

In short, I have, at least yet, no good ideas why pytroll_collectors.region_collector.RegionCollector.granules is an empty list when it's being accessed on the publishing step.

@pnuu
Copy link
Member

pnuu commented Mar 23, 2022

I think the above incoming message isn't what triggered the crash. The completed collection were published correctly (the area and file naming match). I'd think the crashing one is the previous incoming message, from which the Waiting for ... and Is last file added log messages are from.

@TAlonglong
Copy link
Collaborator Author

OK, so here is the log with the message just before that. So it looks like this is not included and then we might have a race condition still handling of one segment when the next one comes in and is added, triggers complete and sends the message. All this when the previous message is still under handeling. Then the message is cleaned and when it tries to do some handeling then it is already removed?

Just a guess

[DEBUG: 2022-03-21 21:38:36 : pytroll_collectors.triggers._base] mda: {'sensor': 'avhrr', 'platform_name': 'Metop-B', 'start_time': datetime.datetime(2022, 3, 21, 20, 55, 3), 'orbit_number': '49329', 'uri': 'ssh://satellite-wind-processing-b/data/pytroll/global-metop-amv-l2d/W_XX-EUMETSAT-Darmstadt,SOUNDING+SATELLITE,METOPB+AVHRR_C_EUMP_20220321205503_49329_eps_o_amv_l2d.bin', 'uid': 'W_XX-EUMETSAT-Darmstadt,SOUNDING+SATELLITE,METOPB+AVHRR_C_EUMP_20220321205503_49329_eps_o_amv_l2d.bin', 'origin': '157.249.16.188:9062', 'end_time': datetime.datetime(2022, 3, 21, 20, 58, 3)}
[DEBUG: 2022-03-21 21:38:36 : pytroll_collectors.region_collector] Adding area ID amv_collection to metadata for Metop-B
[DEBUG: 2022-03-21 21:38:36 : pytroll_collectors.region_collector] Platform name Metop-B and sensor avhrr: Start and end times = 20220321 20:55:03 20220321 20:58:03
[DEBUG: 2022-03-21 21:38:36 : trollsched.satpass] kwargs: {'instrument': 'avhrr'}
[DEBUG: 2022-03-21 21:38:36 : trollsched.satpass] instrument: avhrr
[DEBUG: 2022-03-21 21:38:36 : pyorbital.tlefile] Reading TLE from /data/pytroll/tle/tle_db/tle-latest-pytroll.txt
[DEBUG: 2022-03-21 21:38:36 : trollsched.boundary] Instrument = avhrr
[DEBUG: 2022-03-21 21:38:39 : pytroll_collectors.triggers._base] Waiting 4583.930227 seconds until timeout
[DEBUG: 2022-03-21 21:38:39 : pytroll_collectors.triggers._base] mda: {'sensor': 'avhrr', 'platform_name': 'Metop-B', 'start_time': datetime.datetime(2022, 3, 21, 20, 52, 3), 'orbit_number': '49329', 'uri': 'ssh://satellite-wind-processing-b/data/pytroll/global-metop-amv-l2d/W_XX-EUMETSAT-Darmstadt,SOUNDING+SATELLITE,METOPB+AVHRR_C_EUMP_20220321205203_49329_eps_o_amv_l2d.bin', 'uid': 'W_XX-EUMETSAT-Darmstadt,SOUNDING+SATELLITE,METOPB+AVHRR_C_EUMP_20220321205203_49329_eps_o_amv_l2d.bin', 'origin': '157.249.16.182:9062', 'end_time': datetime.datetime(2022, 3, 21, 20, 55, 3)}
[DEBUG: 2022-03-21 21:38:39 : pytroll_collectors.region_collector] Adding area ID amv_collection to metadata for Metop-B
[INFO: 2022-03-21 21:38:39 : pytroll_collectors.region_collector] Added expected granule Metop-B (2022-03-21 20:52:03) to area amv_collection
[INFO: 2022-03-21 21:38:39 : pytroll_collectors.region_collector] Collection finished for Metop-B area amv_collection
[DEBUG: 2022-03-21 21:38:39 : pytroll_collectors.triggers._base] Composing topic.
[DEBUG: 2022-03-21 21:38:39 : pytroll_collectors.triggers._base] Is last file added: False

@pnuu
Copy link
Member

pnuu commented Mar 23, 2022

Could very well be. Were there any on-going collections for the other configured sections? The Waiting 4583.930227 seconds until timeout seems a bit odd being terminated so far ahead of time, but I guess that is becouse timeliness in the config has been set to 120 minutes, which is more than one full orbit. How large is the collection area?

@TAlonglong
Copy link
Collaborator Author

Hm, looks very big. I don't remember why this is so huge. Centerd around the north pole as far as can understanmd from the config

amv_collection:
  description: Norway - Area for amv collection
  projection:
    proj: stere
    lat_0: 90
    lon_0: 0
    lat_ts: 60
    ellps: WGS84
  shape:
    height: 3776
    width: 3776
  area_extent:
    lower_left_xy: [-7552173.63, -7552173.63]
    upper_right_xy: [7552173.63, 7552173.63]

@pnuu
Copy link
Member

pnuu commented Mar 23, 2022

That is indeed a huge area:

from pyproj import Proj

prj = Proj({"proj": "stere", "lat_0": 90, "lon_0": 0, "lat_ts": 60, "ellps": "WGS84"})
print(prj(-7552173.63, -7552173.63, inverse=True))

shows

(-45.0, 6.044472281577602)

So the area goes 6° north of Equator covering most of the northern hemisphere. The description is a bit optimistic for the Norwegian territory 😅

@gerritholl
Copy link
Member

Related?

#79
pytroll/pyresample#329

@TAlonglong
Copy link
Collaborator Author

#79 no, thats just the glob of the TLE environment variable that is empty. The other I have no idea.

@gerritholl
Copy link
Member

Both the TLE environment variable being empty and #329 lead to the geographic gatherer stopping, in case of #329 due to a strange area.

@gerritholl
Copy link
Member

Ah, no, never mind. That was trollflow2, not the gatherer, which silently stops in case of 329.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants