
Bloom compactor: Load blocks lazily in batches #11919

Merged 1 commit into main on Feb 13, 2024

Conversation

chaudum (Contributor) commented on Feb 12, 2024

What this PR does / why we need it:

To avoid loading a potentially large number of blocks upfront, this PR introduces lazy loading of blocks in batches, using an iterator that loads blocks on demand.

This is the first part of making block loading truly lazy.
As a second part, the merge builder interface will be changed so that it accepts a single iterator.
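To illustrate the idea of batched lazy loading, here is a minimal, self-contained sketch. It is not the actual Loki implementation: names such as `blockRef`, `blockQuerier`, `fetchFunc`, and `batchedLoader` are illustrative stand-ins for `bloomshipper.BlockRef`, `bloomshipper.CloseableBlockQuerier`, the `blocksFetcher` dependency, and the PR's `batchedBlockLoader`.

```go
package main

import "fmt"

// blockRef and blockQuerier stand in for the real bloomshipper types.
type blockRef int
type blockQuerier struct{ ref blockRef }

// fetchFunc stands in for the blocksFetcher dependency.
type fetchFunc func(refs []blockRef) []*blockQuerier

// batchedLoader yields blocks one at a time, but fetches them in
// batches, and only when the previous batch is exhausted.
type batchedLoader struct {
	fetch     fetchFunc
	refs      []blockRef
	batchSize int
	batch     []*blockQuerier
	cur       *blockQuerier
}

func (l *batchedLoader) Next() bool {
	if len(l.batch) == 0 {
		if len(l.refs) == 0 {
			return false
		}
		n := l.batchSize
		if n > len(l.refs) {
			n = len(l.refs)
		}
		l.batch = l.fetch(l.refs[:n]) // lazy: fetch only this batch
		l.refs = l.refs[n:]
	}
	l.cur, l.batch = l.batch[0], l.batch[1:]
	return true
}

func (l *batchedLoader) At() *blockQuerier { return l.cur }

func main() {
	fetchCalls := 0
	fetch := func(refs []blockRef) []*blockQuerier {
		fetchCalls++ // count round trips to storage
		out := make([]*blockQuerier, len(refs))
		for i, r := range refs {
			out[i] = &blockQuerier{ref: r}
		}
		return out
	}
	itr := &batchedLoader{fetch: fetch, refs: []blockRef{1, 2, 3, 4, 5}, batchSize: 2}
	seen := 0
	for itr.Next() {
		seen++
	}
	fmt.Println(seen, fetchCalls) // 5 blocks seen, 3 fetch calls
}
```

The point of the design is visible in the counters: five blocks are consumed, but storage is hit only three times (once per batch), and never before the consumer asks for the next batch.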

@chaudum chaudum requested review from owen-d and salvacorts February 12, 2024 13:23
@chaudum chaudum marked this pull request as ready for review February 12, 2024 13:23
@chaudum chaudum requested a review from a team as a code owner February 12, 2024 13:23
```go
func newBatchedBlockLoader(ctx context.Context, fetcher blocksFetcher, blocks []bloomshipper.BlockRef) (*batchedBlockLoader, error) {
	return &batchedBlockLoader{
		ctx:       ctx,
		batchSize: 10, // make configurable?
```

(excerpt truncated)
Contributor (review comment): I agree this should be configurable
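One way the hard-coded `10` could later be made configurable is to read it from a config struct with a fallback default. This is a hypothetical sketch, not code from the PR; `compactorConfig` and `BlockBatchSize` are invented names for illustration.

```go
package main

import "fmt"

// compactorConfig is a hypothetical config struct; the real Loki
// compactor config is more elaborate.
type compactorConfig struct {
	BlockBatchSize int
}

// batchSizeOrDefault returns the configured batch size, falling back
// to the value currently hard-coded in the PR.
func batchSizeOrDefault(cfg compactorConfig) int {
	if cfg.BlockBatchSize <= 0 {
		return 10 // current hard-coded default
	}
	return cfg.BlockBatchSize
}

func main() {
	fmt.Println(batchSizeOrDefault(compactorConfig{}))                   // 10
	fmt.Println(batchSizeOrDefault(compactorConfig{BlockBatchSize: 32})) // 32
}
```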

```go
		err.Add(itr.CloseBatch())
	default:
		// close remaining loaded blocks
		for itr.Next() && itr.Err() == nil {
```

(excerpt truncated)
Contributor (review comment): Would we ever reach here? As far as I can see, `closeLoadedBlocks` is only called from `buildBlocks`, which will pass whatever `loadWorkForGap` returns.

chaudum (Author): Since the function accepts the interface `v1.CloseableIterator[*bloomshipper.CloseableBlockQuerier]` rather than a concrete type, a non-batched version of the iterator could be passed.
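The pattern discussed here, a type switch that uses a batch-level close when available and otherwise drains the iterator closing item by item, can be sketched as follows. This is an illustrative reconstruction, not the PR's code: `closeableIterator`, `block`, `batchCloser`, and `sliceIterator` are invented stand-ins for the real `v1.CloseableIterator`, querier, and loader types.

```go
package main

import "fmt"

// closeableIterator plays the role of v1.CloseableIterator here.
type closeableIterator interface {
	Next() bool
	At() *block
	Err() error
}

type block struct{ closed bool }

func (b *block) Close() error { b.closed = true; return nil }

// batchCloser is the extra capability only a batched loader provides.
type batchCloser interface {
	CloseBatch() error
}

// closeLoadedBlocks closes whatever the iterator still holds: via
// CloseBatch when the concrete type supports it, otherwise by draining
// the iterator and closing each remaining block individually.
func closeLoadedBlocks(itr closeableIterator) {
	switch c := itr.(type) {
	case batchCloser:
		_ = c.CloseBatch()
	default:
		// close remaining loaded blocks one by one
		for itr.Next() && itr.Err() == nil {
			_ = itr.At().Close()
		}
	}
}

// sliceIterator is a plain, non-batched iterator that exercises the
// default branch of the type switch.
type sliceIterator struct {
	blocks []*block
	cur    *block
}

func (s *sliceIterator) Next() bool {
	if len(s.blocks) == 0 {
		return false
	}
	s.cur, s.blocks = s.blocks[0], s.blocks[1:]
	return true
}
func (s *sliceIterator) At() *block { return s.cur }
func (s *sliceIterator) Err() error { return nil }

func main() {
	blocks := []*block{{}, {}}
	closeLoadedBlocks(&sliceIterator{blocks: blocks})
	fmt.Println(blocks[0].closed, blocks[1].closed) // true true
}
```

This shows why the default branch is reachable: any caller holding the interface can hand in an iterator that has no `CloseBatch`, and the drain loop is the only way to release its blocks.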

owen-d added a commit that referenced this pull request Feb 13, 2024
…rs (#11924)

While reviewing #11919, I figured it'd be nice to make `batchedLoader` generic so we can reuse its logic. This lets me test it more easily and remove a lot of now-unnecessary adapter code (interfaces, types).
@chaudum chaudum force-pushed the chaudum/lazyloading-bloomcompactor-work branch from dafa7f0 to 35c21ac on February 13, 2024 07:56
@chaudum chaudum enabled auto-merge (squash) February 13, 2024 07:58
@chaudum chaudum merged commit eb8464a into main Feb 13, 2024
8 checks passed
@chaudum chaudum deleted the chaudum/lazyloading-bloomcompactor-work branch February 13, 2024 08:10
rhnasc pushed a commit to inloco/loki that referenced this pull request Apr 12, 2024
…rs (grafana#11924)

While reviewing grafana#11919, I figured it'd be nice to make `batchedLoader` generic so we can reuse its logic. This lets me test it more easily and remove a lot of now-unnecessary adapter code (interfaces, types).
rhnasc pushed a commit to inloco/loki that referenced this pull request Apr 12, 2024
To avoid loading a potentially large number of blocks upfront, this PR introduces lazy loading of blocks in batches, using an iterator that loads blocks on demand.

Signed-off-by: Christian Haudum <[email protected]>