In my application, I've seen the following behavior twice this week. It seems to be rare, though it's hard to know how many occurrences have gone unreported:
User X can trigger database modifications (via methods) and other users see those changes.
However, when other users do the same, User X can't see the changes made (despite having a live subscription).
Reloading the page (triggering a new subscription) refreshes the database: User X can then see the current database (including all the new updates since the last page load), but User X still won't receive any future updates.
However, if User X has two tabs open with the same page and same subscriptions, reloading either tab has no effect: no new database records arrive. Presumably this is because the subscription isn't destroyed and renewed when only one of the two tabs reloads (because only one of the two copies of the subscription gets rebuilt).
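The two-tab observation is consistent with a server-side observer that is shared by identical subscriptions and only rebuilt when its last user goes away. The following is a toy model of that presumption, not redis-oplog's actual code: `SharedObserver`, `acquire`, `release`, and `reloadDemo` are hypothetical names, and the "database" is a plain array whose live updates are assumed to be lost.

```javascript
// Toy model (hypothetical names): one shared observer per identical subscription.
// The snapshot is only re-fetched when the reference count drops to zero.
class SharedObserver {
  constructor(fetchSnapshot) {
    this.fetchSnapshot = fetchSnapshot;
    this.refs = 0;
    this.snapshot = null;
    this.builds = 0; // how many times the snapshot was (re)built
  }

  acquire() {
    if (this.refs === 0) {
      this.snapshot = this.fetchSnapshot();
      this.builds += 1;
    }
    this.refs += 1;
    return this.snapshot;
  }

  release() {
    this.refs -= 1;
    if (this.refs === 0) this.snapshot = null;
  }
}

function reloadDemo() {
  const db = ['a'];
  const observer = new SharedObserver(() => [...db]);

  observer.acquire(); // tab 1 subscribes
  observer.acquire(); // tab 2 subscribes

  // A write arrives, but the live feed is assumed broken, so the
  // shared snapshot is never updated.
  db.push('b');

  // Reload tab 1 only: tab 2 keeps the observer alive, so nothing is rebuilt.
  observer.release();
  const afterOneReload = observer.acquire();

  // Close both tabs, then open one again: the observer is rebuilt from the db.
  observer.release();
  observer.release();
  const afterFullTeardown = observer.acquire();

  return { afterOneReload, afterFullTeardown, builds: observer.builds };
}
```

In this model, reloading a single tab returns the stale snapshot, while tearing down both copies forces a fresh read, which matches the behavior described above.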
Restarting all the Meteor servers fixes the problem (until it happens a few days later, with a different user X).
The only relevant logs I can see are rare instances of
RedisOplog - Connection to redis ended
RedisOplog - There was an error when re-connecting to redis {"delay":10000,"attempt":1,"error":{"errno":"ECONNRESET","code":"ECONNRESET","syscall":"read"},"total_retry_time":0,"times_connected":1}
RedisOplog - Established connection to redis.
So I suspect there's some edge case in the reconnect behavior. Possibly relevant: I sometimes see logs (from method calls) in between "an error when re-connecting" and "Established connection", so maybe something bad happens if methods get called or subscriptions start between disconnection and reconnection?
I read #291 which seems possibly relevant. Perhaps this line should be changed from onConnect to onReady as suggested in #291? But I'm still having trouble seeing why the redis subscription would never take (not get any new updates) instead of the cache just having a few invalid documents...
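To make the suspected "write-only" state concrete, here is a toy model using only Node's built-in `events` module. It is a sketch of the hypothesis, not of redis-oplog or the redis client: `ToyBroker`, `ToyServer`, the `resubscribeOnReconnect` flag, and `writeOnlyDemo` are all hypothetical. It assumes that a reconnect drops the listener's channel subscriptions, and that one server fails to restore them.

```javascript
const { EventEmitter } = require('events');

// Hypothetical in-memory stand-in for a pub/sub server.
class ToyBroker {
  constructor() {
    this.channels = new Map(); // channel name -> Set of listener connections
  }
  subscribe(channel, conn) {
    if (!this.channels.has(channel)) this.channels.set(channel, new Set());
    this.channels.get(channel).add(conn);
  }
  dropConnection(conn) {
    for (const subscribers of this.channels.values()) subscribers.delete(conn);
  }
  publish(channel, message) {
    for (const conn of this.channels.get(channel) || []) {
      conn.emit('message', channel, message);
    }
  }
}

// Hypothetical app server: publishes writes and listens for other servers' writes.
class ToyServer {
  constructor(broker, { resubscribeOnReconnect }) {
    this.broker = broker;
    this.resubscribeOnReconnect = resubscribeOnReconnect;
    this.wanted = new Set(); // channels this server should be listening on
    this.received = [];
    this.listener = new EventEmitter();
    this.listener.on('message', (channel, message) => this.received.push(message));
  }
  subscribe(channel) {
    this.wanted.add(channel);
    this.broker.subscribe(channel, this.listener);
  }
  write(channel, message) {
    this.broker.publish(channel, message);
  }
  reconnect() {
    // Assumption: the broker forgets this connection's subscriptions.
    this.broker.dropConnection(this.listener);
    if (this.resubscribeOnReconnect) {
      for (const channel of this.wanted) this.broker.subscribe(channel, this.listener);
    }
  }
}

function writeOnlyDemo() {
  const broker = new ToyBroker();
  const healthy = new ToyServer(broker, { resubscribeOnReconnect: true });
  const stuck = new ToyServer(broker, { resubscribeOnReconnect: false });
  healthy.subscribe('docs');
  stuck.subscribe('docs');

  healthy.reconnect();
  stuck.reconnect();

  stuck.write('docs', 'change from X');   // still reaches the healthy server
  healthy.write('docs', 'change from Y'); // never reaches the stuck server
  return { healthy: healthy.received, stuck: stuck.received };
}
```

Under these assumptions the stuck server's writes still propagate to everyone else, but it receives nothing at all until its subscriptions are rebuilt, which looks like the symptom for User X rather than a cache with a few invalid documents.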
In case it's relevant, my app runs in the following configuration:
One server runs nginx reverse proxy and redis server
One server runs Mongo
Four servers run the Meteor app. All of them have the same number of hiccups disconnecting from and reconnecting to redis (in the logs I have, each disconnected and reconnected exactly once). [This is weird, because the Redis server has been running uninterrupted for 15 days.] Presumably one of these servers is getting borked, but I don't know whether all users/subscriptions on that server are affected.
About 100 simultaneous users, but each user makes a lot of db operations via methods (up to 15/sec). Each method is lightweight, but some of them block. (I could handle around 50 simultaneous users with oplog tailing on one server, after which I got method call delays of 50-1000 seconds, but have had overall great performance since moving to the above configuration.)
FWIW, I also see some of these error messages:
RedisOplog - Warning - A race condition occurred when running upsert.
But I'm not worried about the races occurring during upsert, so I don't think this is a big issue.
edemaine changed the title from "redis in write-only state?" to "redis in write-only/unsubscribed state after reconnect?" on Oct 2, 2020