
Pool goroutines #841

Closed
wants to merge 2 commits into from

Conversation

edsrzf
Contributor

@edsrzf edsrzf commented Aug 19, 2019

This is a draft proposal/approach for pooling goroutines. It's not ready to merge, but it's available for early feedback on the general approach before I go through the effort of adding tests, docs, and further configurability.

There are two problems I'd like to solve:

There may also be performance benefits to pooling goroutines, but I haven't measured yet. Judging from #465 it looks like there's potential.

The solution is relatively simple. It adds a new Pool type which can be configured with a goroutine limit. The limit defaults to 0, in which case there is no limit, but goroutines are still pooled and reused. RequestContext now has a Pool field that's used for kicking off goroutines, so goroutines are pooled per-request.
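
For illustration, here is a minimal sketch of the shape of API the description implies. The Go method, the worker handoff, and the channel mechanics are assumptions made for the sake of the example, not the PR's actual code:

package pool

import "sync"

// Pool reuses goroutines across tasks. A Limit of 0 means concurrency is
// uncapped, but finished goroutines are still kept around for reuse.
type Pool struct {
    Limit int // maximum concurrent goroutines; 0 = unlimited

    once sync.Once
    work chan func()
    sem  chan struct{}
}

func (p *Pool) init() {
    p.work = make(chan func())
    if p.Limit > 0 {
        p.sem = make(chan struct{}, p.Limit)
    }
}

// Go runs f on a pooled goroutine, spawning a new one only when no idle
// worker is available (and the limit, if any, has not been reached).
func (p *Pool) Go(f func()) {
    p.once.Do(p.init)
    select {
    case p.work <- f: // fast path: hand off to an idle worker
        return
    default:
    }
    if p.sem != nil {
        select {
        case p.work <- f: // a worker went idle while we waited
            return
        case p.sem <- struct{}{}: // reserve a slot for a new worker
        }
    }
    go p.worker(f)
}

// worker runs f, then stays alive to pick up more work, keeping its
// (possibly grown) stack. A real implementation would also need a way to
// retire idle workers when the request ends.
func (p *Pool) worker(f func()) {
    f()
    for next := range p.work {
        next()
    }
}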

I have:

  • Added tests covering the bug / feature (see testing)
  • Updated any relevant documentation (see docs)

@coveralls

Coverage Status

Coverage increased (+0.2%) to 62.034% when pulling 85a2c4d on edsrzf:goroutine-pool into 7e643fd on 99designs:master.

@vektah
Collaborator

vektah commented Aug 20, 2019

I had a similar experiment a while ago to benchmark the difference, though with one pool for everything. Most of the performance savings came from reusing stacks and avoiding calls to morestack (it wasn't a large saving).

I think Go already does some pooling of goroutines, but it doesn't hold onto the grown stacks. My feeling is that since this is a separate pool for each request, it probably won't have much of a performance impact, but it's worth measuring.

Limiting concurrency can be done very cheaply using atomic. I guess this hinges on what you want to do with dataloading, which probably needs to be moved into gqlgen.
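
A rough sketch of the atomic-counter idea (the Limiter name and the run-inline fallback are illustrative assumptions, not code from this PR):

package limit

import "sync/atomic"

// Limiter caps concurrent goroutines with a single atomic counter:
// no locks, no pool, just one atomic add per attempt.
type Limiter struct {
    running int64
    max     int64
}

// Go runs f on a new goroutine if a slot is free, otherwise synchronously
// on the caller's goroutine.
func (l *Limiter) Go(f func()) {
    if atomic.AddInt64(&l.running, 1) > l.max {
        atomic.AddInt64(&l.running, -1)
        f() // over the limit: run inline instead of spawning
        return
    }
    go func() {
        defer atomic.AddInt64(&l.running, -1)
        f()
    }()
}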

I've wondered for a long time whether dataloaders are actually doing too much. Caching could happen more naturally in the graph based on ID+type, and collecting batches might become much simpler if addressed on its own; with some interface changes gqlgen probably doesn't need goroutines per item.

@edsrzf
Contributor Author

edsrzf commented Aug 20, 2019

While limiting concurrency would be kind of nice, I'm mostly after a solution to the dataloader issues.

I'm curious whether you've put much thought into this:

with some interface changes gqlgen probably doesn't need goroutines per item

I'm keen to find a solution that most people are happy with, and I'm willing to do some work toward it.

@vektah
Collaborator

vektah commented Aug 20, 2019

I'm curious whether you've put much thought into this:

with some interface changes gqlgen probably doesn't need goroutines per item

A little, not enough. My rough thought process goes like this:

  • we could simply make resolvers work on slices, but that interface would be terrible for the common case where single loading is fine.
  • we could use an @batch annotation to mark some resolvers as batched, but that still means repeating the "fetch N users by N IDs" call in all the places that could return users.
  • for the Relay node and Apollo Federation interfaces, what we really want is a way to "fetch N entities by N IDs + types".

I'm not really sure how the relationship between the graph and the fetching interface works, though.
E.g. given the ability to look up users:

type ChatMessage {
   author: User
}

What should ChatMessage implement? Can we use directives?

  • could be thunks? (a rough sketch follows below)
  • could be returning a ref to ID + type. Is there some crossover with federated external references?
  • could be goroutines + pooling, but getting smarter about where concurrency actually happens and removing most of them.

Definitely more questions than answers.
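
To make the thunk option concrete, here is a rough sketch for the ChatMessage example above. The API is entirely hypothetical; the point is that IDs are collected first and the batch is fetched once when the first thunk is forced, with no goroutine per item:

package example

type User struct{ ID, Name string }

// UserThunk defers the actual fetch until the executor forces it.
type UserThunk func() (*User, error)

// userBatch collects IDs as resolvers run, then fetches them in one query.
// It assumes single-threaded execution, which is the point: no goroutines.
type userBatch struct {
    ids     []string
    results map[string]*User
}

// Load records the ID now and returns a thunk to be forced later.
func (b *userBatch) Load(id string) UserThunk {
    b.ids = append(b.ids, id)
    return func() (*User, error) {
        if b.results == nil {
            b.results = fetchUsers(b.ids) // one query for the whole batch
        }
        return b.results[id], nil
    }
}

// fetchUsers stands in for a single "fetch N users by N IDs" query.
func fetchUsers(ids []string) map[string]*User {
    out := make(map[string]*User, len(ids))
    for _, id := range ids {
        out[id] = &User{ID: id}
    }
    return out
}

A ChatMessage.author resolver would then return a thunk instead of a User, e.g. return r.users.Load(msg.AuthorID).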

@bugzpodder

we could simply make resolvers work on slices, but that interface would be terrible most of the time where single loading is fine.

My approach was to just separate the resolver into two steps, resolveLoader and resolveField. resolveLoader loads the data and assigns it to the field on the API struct; resolveField just returns whatever is in the field. It works well now that resolvers in gqlgen return a slice of pointers.
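
A sketch of that two-phase split, with hypothetical names (the actual implementation isn't shown in this thread): resolveLoader runs once per slice and fills the field; resolveField just reads it back.

package example

type User struct{ ID string }

type ChatMessage struct {
    AuthorID string
    Author   *User // populated by the loader pass
}

// resolveLoader does one batched fetch for the whole slice and assigns
// the results onto each struct.
func resolveLoader(msgs []*ChatMessage) error {
    ids := make([]string, len(msgs))
    for i, m := range msgs {
        ids[i] = m.AuthorID
    }
    users, err := fetchUsersByID(ids) // single "fetch N users by N IDs" call
    if err != nil {
        return err
    }
    for _, m := range msgs {
        m.Author = users[m.AuthorID]
    }
    return nil
}

// resolveField is trivial: it returns whatever the loader left behind.
func resolveField(m *ChatMessage) *User {
    return m.Author
}

// fetchUsersByID stands in for the real batched lookup.
func fetchUsersByID(ids []string) (map[string]*User, error) {
    out := make(map[string]*User, len(ids))
    for _, id := range ids {
        out[id] = &User{ID: id}
    }
    return out, nil
}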

@stale

stale bot commented Oct 21, 2019

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Oct 21, 2019
@edsrzf
Contributor Author

edsrzf commented Oct 25, 2019

I'm still keen to see some of the underlying issues here addressed, but it seems like this isn't the solution.

@edsrzf edsrzf closed this Oct 25, 2019