-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Size of message problems with hopskotch? #87
Comments
It would be good to know which error was thrown. Does anyone have a log file? I suspect this is unrelated, but just in case, scimma included a bug fix in hop-client |
I have an idea for a solution. According to Kafka documentation, there is a maximum message size of 1,048,588 bytes (~1 MB), but it is configurable. For the example that @KaraMelih mentioned in PR #98 of 100,000 integers, the size of those integers are 800,056 bytes, which is less that the threshold above, but we're sending multiple SNEWS messages in one go, so it quickly adds up. There are multiple paths in between
That last class— So... We should be able to increase the maximum message size by passing the kwarg into here: SNEWS_Publishing_Tools/snews_pt/messages.py Lines 51 to 53 in f9b990a
This would be the appropriate change (for 5x max message size): self.stream = Stream(until_eos=True, auth=self.auth).open(
self.obs_broker,
'w',
message_max_bytes=5242940 # <--- New
) I have not tested this. |
The scimma people report that the default max message size is 1MB. They're leery to increase that to head off future scaling problems, and are working on a large file offload service. But: now knowing this we can plan for it, with efficiencies like this PR helping. Could imagine daisy-chaining things together like SMS messages that go over 140 characters, too. |
@habig: and if that default is something we can override, like in the example above, then problem solved! No need for them to change the default on their end. |
We were just discussing this now. If raising the infrastructure limit is the answer (we need to test this), then can we raise it enough to solve the problem? If not, then we'd need to "packetize" things anyway. If we have to write that code, then living within our 1MB per packet means is probably the way to go. Marta says about 1s of data in the last test fit in the 1MB. So we'd need an order of magnitude more overhead to make ~10s of light curve go in one shot. |
We should in any case, warn the user about the large message size and what we do to process. Something like; "Input data is larger than >1mb, we will chunk it into pieces and send several messages" etc. So that if something goes wrong, it is more obvious where it might have gone wrong. Alternative to chunking, @justinvasel looked into compression options. We should probably check if kafka offers any of these internally. Then, we also need to validate whatever chunked/compressed gets unchunked/decompressed properly on the other end. |
When trying to push through the timing tier messages, we got some kafka warnings suggesting that we were trying to jam through too much.
We need to replicate that with a well-specified test case, collect the exact error message, and take it to SCIMMA to see what they think is happening.
The text was updated successfully, but these errors were encountered: