Add EventCacheCount as member of BinlogSyncerConfig to limit streamer's event channel size #830
Conversation
When consuming events from the stream is blocked, the channel may use too much memory
When the consumption speed of the event channel in the streamer cannot keep up with its production speed, events accumulate, and the current fixed channel size of 10240 can occupy significant memory, potentially triggering an Out Of Memory (OOM) condition. Making the size of the event channel configurable would allow the streamer's memory usage to be controlled.
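The idea can be sketched in a few lines: size the streamer's buffered channel from a config field instead of hard-coding 10240, so a full channel applies backpressure to the producer. This is a minimal illustration, not the library's actual code; `BinlogSyncerConfig` and `EventCacheCount` mirror the names discussed in this PR, while `BinlogStreamer`'s internals here are simplified stand-ins.

```go
package main

import "fmt"

// BinlogSyncerConfig mirrors the configurable count proposed in this PR;
// all other fields are omitted for brevity.
type BinlogSyncerConfig struct {
	EventCacheCount int
}

// BinlogStreamer is a simplified stand-in; the real streamer buffers
// *BinlogEvent values rather than ints.
type BinlogStreamer struct {
	ch chan int
}

func newStreamer(cfg BinlogSyncerConfig) *BinlogStreamer {
	n := cfg.EventCacheCount
	if n <= 0 {
		n = 10240 // fall back to the previous fixed default
	}
	return &BinlogStreamer{ch: make(chan int, n)}
}

func main() {
	s := newStreamer(BinlogSyncerConfig{EventCacheCount: 4})
	fmt.Println(cap(s.ch)) // 4

	s = newStreamer(BinlogSyncerConfig{})
	fmt.Println(cap(s.ch)) // 10240
}
```

With a small buffer, the producer goroutine blocks on send as soon as the consumer falls behind, so at most `EventCacheCount` events are ever held in memory.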
rest LGTM.
This PR is good enough. In the future, if it turns out to be hard to set a fixed cache count because binlog events vary in size, we can introduce another, memory-based limit: for example, in `GetEvent` and `parseEvent` we could maintain the sum of the approximate memory consumption of the cache queue, and block the streamer from reading the binlog when it costs too much memory.
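The memory-based alternative suggested above could look roughly like the following sketch: a byte budget that the producer acquires before queueing an event and the consumer releases after draining one. All names here (`memBudget`, `acquire`, `release`) are hypothetical illustrations of the suggestion, not code from this repository.

```go
package main

import (
	"fmt"
	"sync"
)

// memBudget tracks the approximate bytes of events queued in the cache.
// The producer blocks in acquire once the budget is exhausted; the
// consumer calls release after handling an event.
type memBudget struct {
	mu    sync.Mutex
	cond  *sync.Cond
	used  int
	limit int
}

func newMemBudget(limit int) *memBudget {
	b := &memBudget{limit: limit}
	b.cond = sync.NewCond(&b.mu)
	return b
}

// acquire blocks while adding sz bytes would exceed the limit.
// A single event larger than the whole limit is still admitted when the
// queue is empty, to avoid deadlocking on oversized events.
func (b *memBudget) acquire(sz int) {
	b.mu.Lock()
	for b.used+sz > b.limit && b.used > 0 {
		b.cond.Wait()
	}
	b.used += sz
	b.mu.Unlock()
}

// release returns sz bytes to the budget and wakes blocked producers.
func (b *memBudget) release(sz int) {
	b.mu.Lock()
	b.used -= sz
	b.cond.Broadcast()
	b.mu.Unlock()
}

func main() {
	b := newMemBudget(1 << 20) // 1 MiB budget
	b.acquire(512 * 1024)      // producer queues ~512 KiB of events
	b.release(512 * 1024)      // consumer drains them
	fmt.Println(b.used) // 0
}
```

As the author notes below, the trade-off is the cost of sizing each event on the hot path, which a simple count-based limit avoids.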
replication/binlogsyncer.go (outdated)

```diff
@@ -122,6 +122,8 @@ type BinlogSyncerConfig struct {
 	RowsEventDecodeFunc func(*RowsEvent, []byte) error

 	DiscardGTIDSet bool
+
+	EventCacheSize int
```
The name "XXXSize" makes me think it's using a unit of memory (EventCacheSize = 1024 reads as "the cache will not exceed 1KB"); maybe "EventCacheCount" is a better name.
Agreed! I'll work on the fixes.
Directly limiting memory usage requires dynamically calculating the memory occupancy of the event cache during streaming, which might reduce overall efficiency.
In scenarios where network conditions are not too poor, even a single-digit EventCacheCount performs little worse than 10240, while the event cache's memory is bounded by the size of a few events. That is enough to strike a balance between efficiency and resource occupancy.
…me more reflective of its actual meaning.
#829
I have tested on my own MySQL instance with an 8GB binlog: when the channel size is the default 10240, memory usage occasionally approaches 2GB during binlog transmission.
Additionally, I wrote a test program to calculate the theoretical maximum memory usage under different channel sizes. The theoretical results match the test.
This is the theoretical usage calc function:

```go
func (c *MySQLClient) GetMaxWindowSize(ctx context.Context, streamerLength uint16) (es *EventStatistic, err error) {
	es = new(EventStatistic)
	es.StreamerChannelLength = streamerLength
	startTime := time.Now()
	// ... (remainder of the function elided in the original comment)
}
```