Only retrieve updated query statement metrics #19321
        hostname=self._check.resolved_hostname,
    )

    monotonic_rows = self._filter_query_rows(monotonic_rows)
Is there a reason to filter the rows after retrieving them from the database? Can you still use `WHERE digest_text NOT LIKE 'EXPLAIN %' OR digest_text IS NULL` together with the `last_seen` filter?
We could, but it slows down the query, and since we're no longer limiting the result set, it's simple enough to filter them on the client. In practice, EXPLAINs never seem to account for a substantial number of rows.
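For illustration only, client-side filtering along these lines might look like the sketch below. The function name `filter_explain_rows` and the dict-shaped rows are assumptions, not the PR's actual `_filter_query_rows` implementation; the predicate simply mirrors the `NOT LIKE 'EXPLAIN %' OR digest_text IS NULL` condition from the question above.

```python
def filter_explain_rows(rows):
    """Drop EXPLAIN statements on the client instead of in the SQL WHERE clause.

    Hypothetical helper: `rows` is assumed to be a list of dicts with a
    `digest_text` key, as a dict-style cursor might return them.
    """
    kept = []
    for row in rows:
        text = row.get("digest_text")
        # Keep rows with no text (IS NULL) and rows that do not start with
        # 'EXPLAIN '. This is a case-sensitive prefix match, which is an
        # assumption; SQL LIKE may behave differently depending on collation.
        if text is None or not text.startswith("EXPLAIN "):
            kept.append(row)
    return kept
```

Because the filter runs over rows that were already fetched, its cost is linear in the number of returned rows, which stays small when only updated statements are retrieved.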
What does this PR do?
Updates MySQL statement collection to only gather statements that have executed since the last collection. This functionality is behind a flag to allow for slow rollout and testing before hopefully making it standard.
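As a rough sketch of the idea (not the code in this PR), the collector can remember when it last ran and add a `last_seen` condition on the next run. The table and column names come from MySQL's `performance_schema`; the class name, the `only_updated` flag name, and the use of the client clock are assumptions made for the example.

```python
import time

# performance_schema.events_statements_summary_by_digest and its columns are
# real MySQL names; everything else here is hypothetical.
BASE_QUERY = (
    "SELECT schema_name, digest, digest_text, count_star, sum_timer_wait\n"
    "FROM performance_schema.events_statements_summary_by_digest\n"
)


def build_statement_query(only_updated, last_collected_at=None):
    """Return (sql, params); restrict to recently seen digests when possible."""
    if only_updated and last_collected_at is not None:
        where = "WHERE last_seen >= FROM_UNIXTIME(%s)"
        return BASE_QUERY + where, (last_collected_at,)
    # First run, or flag disabled: fall back to reading the whole table.
    return BASE_QUERY, ()


class StatementCollector:
    """Tracks the previous collection time between runs (hypothetical class)."""

    def __init__(self, only_updated):
        self.only_updated = only_updated
        self.last_collected_at = None

    def next_query(self, now=None):
        now = time.time() if now is None else now
        query, params = build_statement_query(self.only_updated, self.last_collected_at)
        self.last_collected_at = now
        return query, params
```

The first call has no previous timestamp and therefore reads every row; later calls only see digests whose `last_seen` is at or after the previous collection. A real implementation would also need to account for clock differences between the agent and the database server, which this sketch ignores.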
Four variant builds were compared:
Testing was performed on the orders app with 1-minute bursts of high-cardinality queries every 5 minutes.
Somewhat surprisingly, querying for the digest first and then for the digest text made little difference in performance, even with large random statements (~1 KB) executed during bursts. When querying only for updated rows, the much smaller row count makes it practical to fetch the text in the same query, avoiding a second database query.
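To make the two query shapes concrete, here is a small self-contained sketch that emulates a digest summary table with an in-memory SQLite database. The table layout, function names, and sample data are all made up for the example, and the two-step function is only one possible reading of "querying for the digest and then for digest text"; it is not the check's actual code.

```python
import sqlite3


def make_table(conn):
    # Toy stand-in for a digest summary table; last_seen is a plain number here.
    conn.execute(
        "CREATE TABLE digests ("
        "digest TEXT PRIMARY KEY, digest_text TEXT, count_star INTEGER, last_seen REAL)"
    )
    conn.executemany(
        "INSERT INTO digests VALUES (?, ?, ?, ?)",
        [
            ("a1", "SELECT * FROM orders WHERE id = ?", 10, 100.0),
            ("b2", "UPDATE orders SET status = ? WHERE id = ?", 5, 160.0),
            ("c3", "SELECT * FROM customers WHERE id = ?", 7, 170.0),
        ],
    )


def fetch_two_step(conn):
    """Query 1 reads metrics for every digest; query 2 looks up the text."""
    metrics = conn.execute("SELECT digest, count_star FROM digests").fetchall()
    digests = [digest for digest, _ in metrics]
    if not digests:
        return {}
    placeholders = ",".join("?" for _ in digests)
    texts = dict(
        conn.execute(
            f"SELECT digest, digest_text FROM digests WHERE digest IN ({placeholders})",
            digests,
        ).fetchall()
    )
    return {digest: (count, texts[digest]) for digest, count in metrics}


def fetch_updated_one_step(conn, since):
    """Single query: metrics and text together, only for recently seen digests."""
    rows = conn.execute(
        "SELECT digest, count_star, digest_text FROM digests WHERE last_seen >= ?",
        (since,),
    ).fetchall()
    return {digest: (count, text) for digest, count, text in rows}
```

With the sample data, `fetch_updated_one_step(conn, 150.0)` returns only the two digests seen after the cutoff, text included, in one round trip.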
CPU usage is greatly reduced during normal load, and somewhat reduced during high-cardinality bursts.
Query time and overall statement metrics collection time (note that the y-axis is in seconds, not ms) are greatly reduced.
Motivation
Customers have complained about CPU usage by the MySQL check. This change uses a technique similar to the one used by the other database integrations to minimize the amount of unnecessary data retrieved for each statement collection.
Review checklist (to be filled by reviewers)
Add the `qa/skip-qa` label if the PR doesn't need to be tested during QA.
Add the `backport/<branch-name>` label to the PR and it will automatically open a backport PR once this one is merged.