Have a new attache-check which goes into WARN (HTTP 201) when redis is syncing data #29

beautifulentropy · 2022-05-26T22:35:49Z

Redis replicas can take a while to sync data when they're being added to an existing cluster with a lot of data. It would be nice to show operators that even if the node is not ok it's at least working on getting there.
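A rough sketch of what such a check could decide, assuming it reads the reply to `INFO replication` and looks at the `role`, `master_link_status`, and `master_sync_in_progress` fields. The function names `parseInfo` and `checkStatus` are hypothetical, and this is not how attache-check is implemented. The 201 for WARN follows the issue title, and the 503 for a failing node is an assumption. Substitute whatever codes the health checker in use treats as warning and critical.

```go
package main

import (
	"bufio"
	"net/http"
	"strings"
)

// parseInfo turns the text of a Redis `INFO replication` reply into a
// key/value map. Section headers ("# Replication") and blank lines are skipped.
func parseInfo(info string) map[string]string {
	fields := make(map[string]string)
	sc := bufio.NewScanner(strings.NewReader(info))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		if k, v, ok := strings.Cut(line, ":"); ok {
			fields[k] = v
		}
	}
	return fields
}

// checkStatus maps replication state to an HTTP status code.
// Hypothetical mapping: 200 = OK, 201 = WARN (per the issue title),
// 503 = failing (an assumption, not taken from the issue).
func checkStatus(info string) int {
	f := parseInfo(info)
	switch {
	case f["role"] == "master":
		return http.StatusOK
	case f["role"] != "slave":
		// Unknown or missing role: treat as failing.
		return http.StatusServiceUnavailable
	case f["master_sync_in_progress"] == "1":
		// Replica is not ready yet, but it is actively syncing.
		return http.StatusCreated
	case f["master_link_status"] == "up":
		return http.StatusOK
	default:
		// Replica with the link down and no sync running.
		return http.StatusServiceUnavailable
	}
}
```

With a mapping like this, a replica that is still pulling data reports WARN rather than OK, so an operator can tell "working on it" apart from both "ready" and "broken".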

jcjones · 2022-06-23T16:42:07Z

I'll expand on this. In a cluster redeployment, the node whose logs we are watching was a master, and its replica 656dc9b7acaefd7065849db76bf0648460aa83e9 was promoted while it shut down. After it came back online, it became a replica again.

1436339:M 23 Jun 2022 16:31:49.560 # Configuration change detected. Reconfiguring myself as a replica of 656dc9b7acaefd7065849db76bf0648460aa83e9

Once it reports healthy, Nomad goes on to 656dc9b7acaefd7065849db76bf0648460aa83e9 and shuts it down while the sync is ongoing:

Jun 23, '22 09:33:04 -0700 	Killed 	Task successfully killed
Jun 23, '22 09:33:04 -0700 	Terminated 	Exit Code: 0
Jun 23, '22 09:33:02 -0700 	Killing 	Sent interrupt. Waiting 5m0s before force killing

This shows up in the Redis replica as:

1436339:S 23 Jun 2022 16:33:04.065 # I/O error trying to sync with MASTER: connection lost
1436339:S 23 Jun 2022 16:33:06.803 # Error condition on socket for SYNC: (null)
1436339:S 23 Jun 2022 16:33:09.823 # Error condition on socket for SYNC: (null)
1436339:S 23 Jun 2022 16:33:17.878 # Error condition on socket for SYNC: (null)
1436339:S 23 Jun 2022 16:33:18.783 # Currently unable to failover: Disconnected from master for longer than allowed. Please check the 'cluster-replica-validity-factor' configuration option.

And then we lose data.
