Bloom gateway: Add metrics for store operations and chunk ref counts #11677

Merged: 7 commits, Jan 17, 2024

@chaudum chaudum commented Jan 15, 2024

What this PR does / why we need it:

For better observability of the bloom gateway, this PR adds two additional metrics that expose the number of chunk refs before and after filtering. These can be used to calculate the filter ratio of the gateways.
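
A minimal sketch of how a filter ratio could be derived from two such counters. The type `chunkRefCounters` and the helper `filterRatio` are hypothetical names chosen for illustration, and plain integers stand in for the actual metrics, whose names are not given in this description.

```go
package main

import "fmt"

// chunkRefCounters is a hypothetical stand-in for the two counters:
// chunk refs received before filtering and chunk refs left afterwards.
type chunkRefCounters struct {
	preFilter  uint64
	postFilter uint64
}

// record adds the chunk ref counts of one filter request to the counters.
func (c *chunkRefCounters) record(pre, post uint64) {
	c.preFilter += pre
	c.postFilter += post
}

// filterRatio returns the fraction of chunk refs removed by filtering.
// It assumes postFilter <= preFilter and returns 0 if nothing was recorded.
func filterRatio(c chunkRefCounters) float64 {
	if c.preFilter == 0 {
		return 0
	}
	return 1 - float64(c.postFilter)/float64(c.preFilter)
}

func main() {
	var c chunkRefCounters
	c.record(10, 1) // 10 refs in, 1 ref left after filtering
	c.record(30, 9) // 30 refs in, 9 refs left after filtering
	fmt.Printf("filter ratio: %.2f\n", filterRatio(c)) // prints "filter ratio: 0.75"
}
```

A ratio close to 1 would mean that most chunk refs are filtered out, while a ratio close to 0 would mean that filtering removes almost nothing.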

Also, the ForEach operation on the bloom store is measured so that the latency of fetching/extracting the blocks can be observed.
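
As a rough illustration of measuring such an operation, the sketch below wraps an iteration in a timer. The `forEachFunc` signature, the `timedForEach` helper, and the `observe` callback (standing in for a latency histogram) are assumptions made for this example and do not reflect the actual bloom store API.

```go
package main

import (
	"fmt"
	"time"
)

// forEachFunc is a hypothetical signature for an iteration over blocks;
// the real bloom store API may differ.
type forEachFunc func(callback func(block string) error) error

// timedForEach runs forEach and reports the elapsed time in seconds to
// observe, which stands in for a latency histogram's observe method.
func timedForEach(forEach forEachFunc, callback func(string) error, observe func(seconds float64)) error {
	start := time.Now()
	err := forEach(callback)
	// Record the duration regardless of whether the iteration failed.
	observe(time.Since(start).Seconds())
	return err
}

func main() {
	blocks := []string{"block-a", "block-b"}
	// A toy iteration that calls the callback once per block.
	forEach := func(cb func(string) error) error {
		for _, b := range blocks {
			if err := cb(b); err != nil {
				return err
			}
		}
		return nil
	}

	visited := 0
	err := timedForEach(
		forEach,
		func(string) error { visited++; return nil },
		func(s float64) { fmt.Printf("ForEach took %.6fs\n", s) },
	)
	fmt.Println("visited:", visited, "err:", err)
}
```

Observing the duration on both the success and the error path keeps slow, failing fetches visible in the latency data.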

@chaudum
Copy link
Contributor Author

chaudum commented Jan 15, 2024

There is still something wrong with the metric, as well as with the log line reporting the filtered/unfiltered chunks:

level=debug component=bloom-gateway msg="got partial result" task=01HM6BZ3JC0000000000000000 tenant=pSvmdWgapHuV fp_int=2996312756633854740 fp_hex=29950a93b721e314 chunks_to_remove=1 progress=1/10
level=debug component=bloom-gateway msg="got partial result" task=01HM6BZ3JC0000000000000000 tenant=pSvmdWgapHuV fp_int=3563335267916568957 fp_hex=3173828bfbc9d97d chunks_to_remove=1 progress=2/10
level=debug component=bloom-gateway msg="got partial result" task=01HM6BZ3JC0000000000000000 tenant=pSvmdWgapHuV fp_int=4498369798574941795 fp_hex=3e6d6b7f9867a663 chunks_to_remove=1 progress=3/10
level=debug component=bloom-gateway msg="got partial result" task=01HM6BZ3JC0000000000000000 tenant=pSvmdWgapHuV fp_int=6987154430012527099 fp_hex=60f75bef3fb191fb chunks_to_remove=1 progress=4/10
level=debug component=bloom-gateway msg="got partial result" task=01HM6BZ3JC0000000000000000 tenant=pSvmdWgapHuV fp_int=7247273752785407266 fp_hex=64937d1498c94d22 chunks_to_remove=1 progress=5/10
level=debug component=bloom-gateway msg="got partial result" task=01HM6BZ3JC0000000000000000 tenant=pSvmdWgapHuV fp_int=7408050871828394598 fp_hex=66ceaf04b55ab266 chunks_to_remove=1 progress=6/10
level=debug component=bloom-gateway msg="got partial result" task=01HM6BZ3JC0000000000000000 tenant=pSvmdWgapHuV fp_int=8189557340463894322 fp_hex=71a72702d20b0f32 chunks_to_remove=1 progress=7/10
level=debug component=bloom-gateway msg="got partial result" task=01HM6BZ3JC0000000000000000 tenant=pSvmdWgapHuV fp_int=14783519863536325070 fp_hex=cd299cfd4c0391ce chunks_to_remove=1 progress=8/10
level=debug component=bloom-gateway msg="got partial result" task=01HM6BZ3JC0000000000000000 tenant=pSvmdWgapHuV fp_int=15902666293836446393 fp_hex=dcb19ebd01b2fab9 chunks_to_remove=1 progress=9/10
level=debug component=bloom-gateway msg="got partial result" task=01HM6BZ3JC0000000000000000 tenant=pSvmdWgapHuV fp_int=16924052664881228960 fp_hex=eade503b323faca0 chunks_to_remove=0 progress=10/10
level=info component=index-gateway msg="return filtered chunk refs" unfiltered=10 filtered=10

@pull-request-size pull-request-size bot added size/L and removed size/M labels Jan 15, 2024

chaudum commented Jan 15, 2024

Fixed with e732ea3

@chaudum chaudum marked this pull request as ready for review January 15, 2024 13:33
@chaudum chaudum requested a review from a team as a code owner January 15, 2024 13:33

@vlad-diachenko vlad-diachenko left a comment

looks awesome. just one question

chaudum commented Jan 16, 2024

The test is still flaky

chaudum commented Jan 16, 2024

I ran the test with -count=20 and did not get an error. I am still not sure whether it is actually solved. From the logs I could verify the correctness, though.

@vlad-diachenko vlad-diachenko left a comment

looks brilliant 💎

pkg/bloomgateway/bloomgateway.go (review thread resolved)
This commit adds counter metrics for chunk refs pre filtering and post
filtering.
These metrics can be used to calculate the filter-ratio of the bloom
filters.

Signed-off-by: Christian Haudum <[email protected]>
The chunks were filtered successfully, but the removal of the filtered
chunks from the original list was buggy: only the chunks from the last
partial response of the block querier were removed.

Signed-off-by: Christian Haudum <[email protected]>
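
The kind of bug described in this commit message can be illustrated with a simplified sketch. The `partialResult` type and the `removeAll` function are hypothetical and not taken from the actual change. The sketch also assumes that chunk identifiers are unique across fingerprints. It shows the corrected behaviour, in which removals from every partial response are applied.

```go
package main

import "fmt"

// partialResult is a hypothetical, simplified form of one partial response:
// the chunk refs that should be removed for a single series fingerprint.
type partialResult struct {
	fingerprint uint64
	removals    []string
}

// removeAll drops the chunks named in every partial result from refs.
// A buggy variant would behave as if only the last element of results
// had been applied.
func removeAll(refs []string, results []partialResult) []string {
	// Collect removals across all partial results first.
	remove := make(map[string]struct{})
	for _, r := range results {
		for _, chk := range r.removals {
			remove[chk] = struct{}{}
		}
	}
	// Keep only the refs that no partial result asked to remove.
	kept := make([]string, 0, len(refs))
	for _, ref := range refs {
		if _, ok := remove[ref]; !ok {
			kept = append(kept, ref)
		}
	}
	return kept
}

func main() {
	refs := []string{"c1", "c2", "c3"}
	results := []partialResult{
		{fingerprint: 1, removals: []string{"c1"}},
		{fingerprint: 2, removals: []string{"c2"}},
		{fingerprint: 3, removals: nil}, // last partial result removes nothing
	}
	fmt.Println(removeAll(refs, results)) // prints [c3]
}
```

If only the last partial result were applied in this example, all three refs would be kept, which matches the symptom of identical unfiltered and filtered counts in the log output earlier in the conversation.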
Signed-off-by: Christian Haudum <[email protected]>
@chaudum chaudum force-pushed the chaudum/additional-bloomgateway-metrics branch from 8db007e to e144b42 Compare January 17, 2024 07:12
@chaudum chaudum enabled auto-merge (squash) January 17, 2024 07:15
@chaudum chaudum merged commit bdcb695 into main Jan 17, 2024
8 checks passed
@chaudum chaudum deleted the chaudum/additional-bloomgateway-metrics branch January 17, 2024 07:28
rhnasc pushed a commit to inloco/loki that referenced this pull request Apr 12, 2024
…rafana#11677)

For better observability of the bloom gateway, this PR adds two
additional metrics that expose the amount of chunk refs pre and post
filtering. This can be used to calculate the filter ratio of the
gateways.

The PR also adds a metric that observes the latency of the actual
processing time of bloom filters within the worker.

---------

Signed-off-by: Christian Haudum <[email protected]>