Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bloom-Gateway cache #11380

Merged
merged 11 commits into from
Dec 5, 2023
Merged

Bloom-Gateway cache #11380

merged 11 commits into from
Dec 5, 2023

Conversation

salvacorts
Copy link
Contributor

@salvacorts salvacorts commented Dec 4, 2023

What this PR does / why we need it:

This PR adds caching to the bloom-gateway client. It uses the result cache from #11343.

Here's how we:

  • Merge responses: group all chunks by FP and remove duplicated chunk checksums.
  • Extract responses based on time span: For all chunks in each FP, add to the extracted response only the chunks that overlaps with the desired start and end time.

Copy link
Contributor

github-actions bot commented Dec 4, 2023

Trivy scan found the following vulnerabilities:

@github-actions github-actions bot added the type/docs Issues related to technical documentation; the Docs Squad uses this label across many repositories label Dec 4, 2023
@salvacorts salvacorts force-pushed the salvacorts/bloom-gateway-cache branch from 035e329 to 957ce11 Compare December 4, 2023 15:25
pkg/bloomgateway/cache.go Outdated Show resolved Hide resolved
pkg/bloomgateway/cache.go Outdated Show resolved Hide resolved
pkg/bloomgateway/cache.go Outdated Show resolved Hide resolved
pkg/logproto/compat.go Show resolved Hide resolved

// GetQuery returns the query of the request.
// The query is the hash for the input chunks refs and the filter expressions.
func (m *FilterChunkRefRequest) GetQuery() string {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function is really expensive. Is it used as part of the cache key?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it's used to compute the key. I agree it's probably too expensive.

pkg/bloomgateway/cache.go Outdated Show resolved Hide resolved
pkg/bloomgateway/cache.go Outdated Show resolved Hide resolved
sort.Slice(chunkRefs, func(i, j int) bool {
return chunkRefs[i].Fingerprint < chunkRefs[j].Fingerprint
})
return slices.CompactFunc(chunkRefs, func(next, prev *logproto.GroupedChunkRefs) bool {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TIL: There is a slices.CompactFunc :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hm ;) great function ;) previously I would do it manually ;))))

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it looks like it will join the chunks only for the first neighbor elements with the same fingerprint. if we have more than 2 neighbors with the same fingerprint, we will just lose chunk ids...
I am not sure if it's expected right now to have more than 2 elements with the same fingerprint but I would not leave this function because in the future somebody will spend days trying to find this bug.

look at this code snippet, https://go.dev/play/p/c_bt6c1HNUj?v= ,
I created a snippet with users, the first 6 users have ID a , and the next 4 users have ID b. if we use CompactFunc to aggregate all the scores of the users, we won't have the expected results...

compacted 2
&{id:a scores:1}
&{id:b scores:13}

but it has to be:

compacted 2
&{id:a scores:15}
&{id:b scores:30}

Copy link
Contributor Author

@salvacorts salvacorts Dec 5, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes! It's quite useful to remove / aggregate duplicated items in a sorted list As Vlad mentioned, this is buggy.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right @vlad-diachenko good catch. I pushed a new commit with a test for this and a fix.

docs/sources/configure/_index.md Outdated Show resolved Hide resolved
pkg/bloomgateway/cache.go Outdated Show resolved Hide resolved
Copy link
Contributor

@vlad-diachenko vlad-diachenko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reviewed partially and left some comments, I believe Chirstian has more context here, so completely rely on his review

Comment on lines +55 to +60
for _, ref := range chunkRef.Refs {
if model.Time(end) < ref.From || ref.Through <= model.Time(start) {
continue
}
refs = append(refs, ref)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but sorting also requires the same count of iterations to ensure each item is in order. and after this, we will do a binary search. so, I believe it's better to leave it as is. wdyt?

sort.Slice(chunkRefs, func(i, j int) bool {
return chunkRefs[i].Fingerprint < chunkRefs[j].Fingerprint
})
return slices.CompactFunc(chunkRefs, func(next, prev *logproto.GroupedChunkRefs) bool {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hm ;) great function ;) previously I would do it manually ;))))

sort.Slice(chunkRefs, func(i, j int) bool {
return chunkRefs[i].Fingerprint < chunkRefs[j].Fingerprint
})
return slices.CompactFunc(chunkRefs, func(next, prev *logproto.GroupedChunkRefs) bool {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it looks like it will join the chunks only for the first neighbor elements with the same fingerprint. if we have more than 2 neighbors with the same fingerprint, we will just lose chunk ids...
I am not sure if it's expected right now to have more than 2 elements with the same fingerprint but I would not leave this function because in the future somebody will spend days trying to find this bug.

look at this code snippet, https://go.dev/play/p/c_bt6c1HNUj?v= ,
I created a snippet with users, the first 6 users have ID a , and the next 4 users have ID b. if we use CompactFunc to aggregate all the scores of the users, we won't have the expected results...

compacted 2
&{id:a scores:1}
&{id:b scores:13}

but it has to be:

compacted 2
&{id:a scores:15}
&{id:b scores:30}

pkg/bloomgateway/client.go Outdated Show resolved Hide resolved
@@ -109,9 +116,25 @@ func (i *ClientConfig) RegisterFlags(f *flag.FlagSet) {
// RegisterFlagsWithPrefix registers flags for the Bloom Gateway client configuration with a common prefix.
func (i *ClientConfig) RegisterFlagsWithPrefix(prefix string, f *flag.FlagSet) {
i.GRPCClientConfig.RegisterFlagsWithPrefix(prefix+"grpc", f)
i.Cache.RegisterFlagsWithPrefix(f, prefix+"cache.")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a bit of a trap:

You call RegisterFlagsWithPrefix on the inlined object resultscache.Config. Therefore not all flags are registered.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed by implementing RegisterFlagsWithPrefix in CacheConfig

h := xxhash.New()
for _, ref := range m.Refs {
encodeBuf = encodeBuf[:0]
_, _ = h.Write(fmt.Appendf(encodeBuf, "%d", ref.Fingerprint))
Copy link
Contributor

@chaudum chaudum Dec 5, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternative that does not require string->[]byte conversions:

Suggested change
_, _ = h.Write(fmt.Appendf(encodeBuf, "%d", ref.Fingerprint))
_, _ = h.Write(binary.LittleEndian.AppendUint64(encodeBuf[:0], ref.Fingerprint))

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!, leaving fmt.Appendf for the operator since we need to be able to print the operator id.

@salvacorts salvacorts marked this pull request as ready for review December 5, 2023 13:51
@salvacorts salvacorts requested a review from a team as a code owner December 5, 2023 13:51
Copy link
Contributor

@vlad-diachenko vlad-diachenko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@salvacorts salvacorts merged commit a0b462d into main Dec 5, 2023
6 checks passed
@salvacorts salvacorts deleted the salvacorts/bloom-gateway-cache branch December 5, 2023 15:24
rhnasc pushed a commit to inloco/loki that referenced this pull request Apr 12, 2024
**What this PR does / why we need it**:

This PR adds caching to the bloom-gateway client. It uses the result
cache from grafana#11343.

Here's how we:
- Merge responses: group all chunks by FP and remove duplicated chunk
checksums.
- Extract responses based on time span: For all chunks in each FP, add
to the extracted response only the chunks that overlaps with the desired
start and end time.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
size/XL type/docs Issues related to technical documentation; the Docs Squad uses this label across many repositories
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants