fix(downstreamer leak): Use buffered channel to prevent goroutine leak #13757

Open
wants to merge 1 commit into
base: main
from

@joanmp-ndtx joanmp-ndtx commented Aug 5, 2024

What this PR does / why we need it:
Fixes a goroutine leak in downstreamer.go that occurs when a request returns an error.

Which issue(s) this PR fixes:
Fixes #13735

Special notes for your reviewer:
Quick fix using a buffered channel.

Checklist

  • Reviewed the CONTRIBUTING.md guide (required)
  • Documentation added
  • Tests updated
  • Title matches the required conventional commits format, see here
    • Note that Promtail is considered to be feature complete, and future development for logs collection will be in Grafana Alloy. As such, feat PRs are unlikely to be accepted unless a case can be made for the feature actually being a bug fix to existing behavior.
  • Changes that require user attention or interaction to upgrade are documented in docs/sources/setup/upgrade/_index.md
  • For Helm chart changes bump the Helm chart version in production/helm/loki/Chart.yaml and update production/helm/loki/CHANGELOG.md and production/helm/loki/README.md. Example PR
  • If the change is deprecating or removing a configuration option, update the deprecated-config.yaml and deleted-config.yaml files respectively in the tools/deprecated-config-checker directory. Example PR

@joanmp-ndtx joanmp-ndtx requested a review from a team as a code owner August 5, 2024 15:05
CLAassistant commented Aug 5, 2024

CLA assistant check
All committers have signed the CLA.

@cyriltovena
Contributor

Context cancellation should already handle this. Are you seeing a leak? Can you share more information?

@joanmp-ndtx
Author

Hi @cyriltovena,
Yes, we are seeing a goroutine leak.

The method returns, but goroutines remain blocked on the channel because nobody is reading from it.

Let's say that we have N jobs to execute at line 174.

  1. The first job fails and sends a message to the channel at line 190.
  2. Because of the error in this first job, the method returns at line 200.
  3. Nobody is receiving from the channel anymore, but N-1 goroutines are still waiting to send on it.

	// ForEachJob blocks until all are done. However, we want to process the
	// results as they come in. That's why we start everything in another
	// gorouting.
	go func() {
		err := concurrency.ForEachJob(ctx, len(queries), in.parallelism, func(ctx context.Context, i int) error {
			res, err := fn(queries[i])
			response := logql.Resp{
				I:   i,
				Res: res,
				Err: err,
			}
			// Feed the result into the channel unless the work has completed.
			select {
			case <-ctx.Done():
			case ch <- response:
			}
			return err
		})
		if err != nil {
			ch <- logql.Resp{
				I:   -1,
				Err: err,
			}
		}
		close(ch)
	}()
	for resp := range ch {
		if resp.Err != nil {
			return nil, resp.Err
		}
		if err := acc.Accumulate(ctx, resp.Res, resp.I); err != nil {
			return nil, err
		}
	}
	return acc.Result(), nil
}


Successfully merging this pull request may close these issues.

Goroutine leak in downstreamer
3 participants