Plugin possible cause of RejectedExecutionException - unhandled events #53

carlmot · 2016-09-29T23:35:39Z

Bitbucket Data Center v4.5.1 (2 nodes).
Issue occurs during peak usage hours.

In logs a few entries for 'The event queue is full. Stacks for the processing threads follow'
Then many for 'RejectedExecutionException' after the previous occurs.

Atlassian has documented this issue here:
https://confluence.atlassian.com/bitbucketserverkb/rejectedexecutionexception-779171377.html
It points to "The event queue is full. Stacks for the processing threads follow:"
Looks like PR Harmony may be the cause judging by the log (snippet attached) and the code.
*Attempting to confirm issue is due to this plugin (e.g. by disabling the plugin during error events) - unable to do so as of today.

Searched plugins on GitHub for these issues.
Pull Request Notifier had one issue logged and had a fix (Processing on Bitbucket Server's event threads):
tomasbjerre/pull-request-notifier-for-bitbucket#78
Fix added (v 2.10/1.35) use of import java.util.concurrent.ExecutorService and handleEventAsync() instead of handleEvent().

Explanation and fix/proper way to handle events that can take longer than 'a second or two' here:
https://bitbucket.org/izymesdev/workzone/issues/79/buildeventlistener-is-doing-its-processing

PRHarmony_Log.txt

monitorjbl · 2016-09-30T11:34:11Z

Interesting, I wasn't aware the Datacenter version worked that way. So the BuildEventListener can certainly be handled asynchronously but I don't see how the merge checks could. Hopefully that's never an issue, but if it is you can always increase the queue depth.

bturner · 2018-01-03T21:01:49Z

@monitorjbl

I'm the principal developer for Bitbucket Server. I just wanted to note that there's nothing special about Data Center with regards to event processing; all versions of Bitbucket Server are the same. They all apply a limit to the number of events that can be queued before they start dropping them, to prevent OutOfMemoryErrors.

From looking over the source of your plugin, it looks like PullRequestListener.findPRByCommitId could be significantly improved to make it more performant. A couple of suggestions:

Use a larger page size than 10 when searching for open pull requests. 25 or 50 would produce a significant reduction in requests to the database without an undue impact on memory usage
Instead of using start += 10 to advance, use requests.getNextPageRequest()
- If that returns null, it means there are no more pages
- Based on that, I'd use a do/while loop instead, where the PageRequest is declared outside the loop instead of the Page<PullRequest>
- You can also check requests.getIsLastPage() to know when you're at the last page
- Regardless of exactly how you do it, these checks will prevent you from making a request beyond the last page to detect that there aren't any more results, saving another set of database queries
On the PullRequestSearchRequest.Builder, call withProperties(false) before build()ing
- You're not using PullRequest.getProperties(), so loading them only makes your search slower
- With this set to false, you could probably increase your page size even further; I might go as high as 100
Instead of using PullRequestService.getCommits, use PullRequestService.streamCommits and provide a CommitCallback
- In your callback, you can check the commit ID against what you're looking for and, if you find it, you can return false; from onCommit. That allows the system to eagerly drop out of the git rev-list process once a match is found, rather than listing up to a million commits (and holding them all in memory to do so!) only to have the code find a match on the second one and discard all the others
- Using a callback means you might need to use something like a MutableBoolean to allow you to set a flag that can be seen "outside" the callback to indicate that the pull request matched

You're welcome to make some changes and mention me, if you'd like a review.

Best regards,
Bryan Turner
Atlassian Bitbucket

* Using larger page size for open PR searches * Using PR commit stream instead of list

monitorjbl · 2018-01-04T02:16:10Z

@bturner I've made the changes you suggested. Is there a way I can test this out on a local cluster of Bitbucket Data Center? My biggest problem with supporting that platform is the SDK doesn't seem to support it so I can't test the plugin on it.

bturner · 2018-01-05T02:11:38Z

@monitorjbl,

In general you shouldn't need to do anything specific to support Data Center. The only time you need to tailor for it specifically is when the code needs to not execute in parallel when multiple nodes are present (scheduled jobs, for example, that need to only run on one node at most) or when you have cached data that needs to remain consistent across multiple nodes.

Our Building cluster-safe plugins document has a section (toward the bottom) about how to start a local cluster, but I'm not sure it's really clear enough. (Note that while it mentions AMPS 6.1.0, in order to test with Bitbucket Server 5+ you need to be using AMPS 6.3.0+, which includes all the cluster-relevant changes from AMPS 6.1.) Another alternative might be our full Data Center installation guide.

If it proves too difficult to run up a cluster using any of those resources, let me know and I'll see if I can find/write a better walkthrough.

Best regards,
Bryan Turner
Atlassian Bitbucket

Pistahh · 2018-02-14T12:21:11Z

Hi, we are hit by this issue too. I can see that a fix is already committed but I'm wondering if there is a new add-on release available that has this fix? We are on version 2.4.0.

Thanks,
Istvan

monitorjbl · 2018-02-16T15:50:06Z

v2.5.0 is available now

Pistahh · 2018-02-23T16:21:19Z

We use v2.5.0 in production now since a couple of days now, it works correctly and we no longer see the "event queue if full" error anymore. Thanks. :)

monitorjbl added a commit that referenced this issue Jan 4, 2018

Optimizing PR search requests (#53)

99d3f6d

* Using larger page size for open PR searches * Using PR commit stream instead of list

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Plugin possible cause of RejectedExecutionException - unhandled events #53

Plugin possible cause of RejectedExecutionException - unhandled events #53

carlmot commented Sep 29, 2016

monitorjbl commented Sep 30, 2016

bturner commented Jan 3, 2018 •

edited

Loading

monitorjbl commented Jan 4, 2018

bturner commented Jan 5, 2018

Pistahh commented Feb 14, 2018

monitorjbl commented Feb 16, 2018

Pistahh commented Feb 23, 2018

Plugin possible cause of RejectedExecutionException - unhandled events #53

Plugin possible cause of RejectedExecutionException - unhandled events #53

Comments

carlmot commented Sep 29, 2016

monitorjbl commented Sep 30, 2016

bturner commented Jan 3, 2018 • edited Loading

monitorjbl commented Jan 4, 2018

bturner commented Jan 5, 2018

Pistahh commented Feb 14, 2018

monitorjbl commented Feb 16, 2018

Pistahh commented Feb 23, 2018

bturner commented Jan 3, 2018 •

edited

Loading