-
Notifications
You must be signed in to change notification settings - Fork 25
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Client errors are difficult to handle correctly #24
Comments
Tracked down the problematic messages served by jetstream us-east-1 this morning since they are still in retention for the next few hours bad message 1
bad message 2
edit: oh hey it's @DavidBuchanan314 👋🏼 |
The client can't assume a jetstream server will always produce valid messages, or that it won't forward evil messages that break parsing [like a JSON object with 10,000 levels nested `[[[[[[]]]]]]`](bluesky-social#24). This change makes the reader continue after encountering any messages that can't be decompressed or unmarshalled into JSON -- they will be logged as errors but the client will continue. The handler will not be called on these events. The sequential scheduler is also modified to no longer quit with error if its handler func returns an error -- it just logs. This matches the behaviour of the parallel scheduler. Maybe different behaviour is fine, I guess it's to taste.
this might not be a fix for your problem but other folks i know who use jetstream have vouched for jetstream-2 being generally more stable. your mileage may vary, though |
Thanks for the tip! I'm tired and my search attempts are failing -- any chance you have a link for jetstream-2? |
@heikadog I keep forgetting to get back to this but it eventually occurred to me that by "jetstream-2" you probably meant using If so, it's a good suggestion but separate from the problem described in this issue. For example, "evil" events in the actual stream like those I referenced in this comment will be transmitted by all jetstream instances and cause the read loop from the client helper in this repo to exit, even though the websocket connection was fine. |
Opening this issue just so that it's written down -- obviously y'all have way more important things happening right now 📈.
I'm struggling to figure out how to use the jetstream client package to both
When my app starts up, it fetches a
cursor
from the event it last received in order to pick up where it left off to use in its new connection. Unfortunately, it gets stuck where it can't advance at all if an event from the jetstream server cannot be processed: the app will retry starting exactly at that unprocessable message, fail, and try again, never getting a new value for its cursor.Unprocessable messages
My app was stuck this morning, and I found this as the cause:
Apparently the max depth for json unmarshalling is 10,000. So this was an unusual event 😅
The jetstream client currently just bails out on this (uncleanly), and since it's not a transient failure, any apps using this client with their last event as a cursor get stuck.
Possible options
Some ideas that come to my mind:
Client users could accept losing some events and manually advance the cursor forward some amount if they keep being unable to make progress after reconnecting. This will lose events. If clients should do this, it might be helpful to document that expectation.
Client users could choose a conservative
maxMessageSizeBytes
to avoid receiving some unprocessable events. Obviously not a complete solution, and I'm not even sure how low the threshold would have to be in this case. A big[[[[[[[[[[]]]]]]]]]]
in the event probably compresses very well.The jetstream client could ignore events that error on
c.decoder.DecodeAll
andjson.Unmarshal
, instead of ending reading. This will lose the problematic events, but maybe that's a reasonable outcome. But if websockets don't offer integrity for messages already, then it's possible that some messages will be lost that are actually valid. Maybe we get assurance of integrity already from TLS?The jetstream client could receive the error payload for unprocessable events, eg
and send a
nil
event with the error, letting the client user decide what to do. This is a bit annoying for client users....and probably other options
For now I'll fork to drop and log events that fail decoding or unmarshalling. Happy to PR it if you think it's a reasonable standard behaviour.
Sorry this became such an essay and thanks for making jetstream!!
The text was updated successfully, but these errors were encountered: