-
Notifications
You must be signed in to change notification settings - Fork 142
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add thread pool and concurrency_max_threads configuration option #470
Add thread pool and concurrency_max_threads configuration option #470
Conversation
This option allows to limit the maximum number of resources that can be sideloaded concurrently. With a properly configured connection pool, this ensures that the activerecord's connection pool is not exhausted by the sideloading process. Closes #469
Is it possible the test failure is flaky? It doesn't look related to the changes. |
@MattFenelon yep, I re-ran it and the tests pass now. I've run into some thread pool issues on the app I use graphiti with, but never traced it back to the sideloads… so this is an interesting find! It looks like the new code sets the default at 4? I'm not seeing where that default was set before your change, and just want to ensure that this doesn't do anything unexpected for people who are running just fine. In your configuration you set the value at 1? |
Nice. Thank you!
That's a little tricky to answer :) The short answer is that the code was using Concurrent::Promise and that uses concurrent-ruby's global io thread pool, which has a max size equal to Java's max int: https://github.com/ruby-concurrency/concurrent-ruby/blob/8b9b0da4a37585ce5eb71516aca55e93bde39115/lib/concurrent-ruby/concurrent/executor/ruby_thread_pool_executor.rb#L15. The problem with the default is that people will likely see these threading issues pop up so I think the default needs to be set lower. However, lowering the default will mean that people may see less concurrency happening than they were before. I went with 4 as that's what Rails' uses as its default pool size for async queries. But it's fairly arbitrary. I have tested it configured to 1 and that works happily because the fallback option Perhaps it would be worth updating the docs to tell people they need to set their db connection pool appropriately otherwise even with this change they may see threading issues. wdyt? |
@MattFenelon Thanks for the details.
But your theory is that if they're seeing those errors with the new explicit default set at 4, they'd have been seeing them with the implicit default being at 4 currently? I only recently became a maintainer on this repo and don't yet have access to editing docs. @richmolj's time is totally consumed by a tsunami of life so I'm not sure when he'll get me plugged into that side of things. https://www.graphiti.dev/guides/concepts/resources#concurrency If I'm understanding the issue correctly though, will having |
Sorry, I could have been clearer.
That's it exactly. There's basically no safe default without configuring the database pool correctly. This just lowers the possibility as there'll only ever be 4 threads side loading at once rather than We could set the default to 1 but even so there's still a chance the pool will become exhausted. For example, with puma set to 5 threads and the pool set to 5, if the system is processing 5 requests at the same time the connection pool is already exhausted and sideloading any resources at that point will raise the error. But |
I think your logic is sound? I'm not sure what the best default is that would hit that mark of best performance without issues. I feel tiny bit out of my depth when it comes to thread-related issues, but one thing that jumps out at me is that if this new setting is just for concurrency relating to sideloads, I think naming the configuration option to more clearly reflect that would be helpful. |
Sounds good. I went with graphiti/lib/graphiti/configuration.rb Lines 7 to 9 in f68b61f
Thanks for the reviews on this! |
I just realised that the executor initialisation wasn't thread-safe so I've fixed that. |
@MattFenelon Oh didn't realize that re: the concurrency flag. Your naming makes perfect sense then—nice work, and good catch on the thread-safety issue. I fixed a silly lint error only the CI robots care about, but the rest looks good to me and I'll cut a minor release with this change. Thanks for the contribution! |
Ha thanks. My editor really wants to auto format a lot of that code 😅 Looking forward to the release! |
# [1.6.0](v1.5.3...v1.6.0) (2024-03-20) ### Features * add thread pool and concurrency_max_threads configuration option ([#470](#470)) ([697d761](697d761))
🎉 This PR is included in version 1.6.0 🎉 The release is available on:
Your semantic-release bot 📦🚀 |
@MattFenelon I had to make a small change to the thread-safety logic (1.6.1) after running into some major lockups in production (as in, the web server would stop responding) with the change as it was. When trying it out locally, the change I made to how the mutex was set up made all the difference, but redeploying this just now still causes a lock-up in production when concurrency is enabled? I don't know what/why that is happening? I'm at the end of a long day here, but if you see this in the morning see if anything jumps out at you? |
Right ok. I think it might be because the mutex variable is not actually global but an instance variable on the class. It needs to be global to work properly. We could make it a constant or a class variable ‘@@mutex’. I just can’t remember all of the inheritance issues with constants and class variables off the top of my head.I’m not at a computer right now but I can raise a PR in a bit.
|
@jkeen I've raised a new PR to fix the issue. I can't say for certain that it was what caused the issue you saw but it seems likely. |
Thanks again for the merge @jkeen. Apologies for the muck up there. |
@MattFenelon No problem at all, thanks for the quick turnaround! |
…er handle on what's causing the thread-pools to hang
…er handle on what's causing the thread-pools to hang
…er handle on what's causing the thread-pools to hang
…n what's causing thread pool hangs revert: "thread pool scope and mutex need to be global across all instances of Scope for it to be a global thread pool (#471)" revert: "add thread pool and concurrency_max_threads configuration option (#470)" This reverts commit 51fb51c. This reverts commit 697d761.
…n what's causing thread pool hangs. refs #469 revert: "thread pool scope and mutex need to be global across all instances of Scope for it to be a global thread pool (#471)" revert: "add thread pool and concurrency_max_threads configuration option (#470)" This reverts commit 51fb51c. This reverts commit 697d761.
This option allows to limit the maximum number of resources that can be sideloaded concurrently. With a properly configured connection pool, this ensures that the activerecord's connection pool is not exhausted by the sideloading process. The thread pool configuration is based on ActiveRecord's global_thread_pool_async_query_executor. This was previously attempted but there were reports of deadlocks. This code differs from the original by using Concurrency::Delay assigned to a constant instead of a regular Mutex. The Delay+constant is how concurrent-ruby sets up their global thread pools so it's more likely to be correct. Closes graphiti-api#469 See: https://github.com/rails/rails/blob/94faed4dd4c915829afc2698e055a0265d9b5773/activerecord/lib/active_record.rb#L279-L287 See: graphiti-api#470
graphiti-api#470) This option allows to limit the maximum number of resources that can be sideloaded concurrently. With a properly configured connection pool, this ensures that the activerecord's connection pool is not exhausted by the sideloading process.
# [1.6.0](graphiti-api/graphiti@v1.5.3...v1.6.0) (2024-03-20) ### Features * add thread pool and concurrency_max_threads configuration option ([graphiti-api#470](graphiti-api#470)) ([697d761](graphiti-api@697d761))
…n what's causing thread pool hangs. refs graphiti-api#469 revert: "thread pool scope and mutex need to be global across all instances of Scope for it to be a global thread pool (graphiti-api#471)" revert: "add thread pool and concurrency_max_threads configuration option (graphiti-api#470)" This reverts commit 51fb51c. This reverts commit 697d761.
## [1.6.3](graphiti-api/graphiti@v1.6.2...v1.6.3) (2024-03-26) ### Bug Fixes * Remove thread pool executor logic until we get a better handle on what's causing thread pool hangs. refs [graphiti-api#469](graphiti-api#469) ([7941b6f](graphiti-api@7941b6f)), closes [graphiti-api#471](graphiti-api#471) [graphiti-api#470](graphiti-api#470)
This option allows to limit the maximum number of resources that can be
sideloaded concurrently. With a properly configured connection pool,
this ensures that the activerecord's connection pool is not exhausted by
the sideloading process.
The thread pool configuration is based on ActiveRecord's
global_thread_pool_async_query_executor.
Closes #469
See: https://github.com/rails/rails/blob/94faed4dd4c915829afc2698e055a0265d9b5773/activerecord/lib/active_record.rb#L279-L287