Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[JENKINS-71849] Allow NoThrottle to be used even on github.com #653

Merged
merged 4 commits into from
Sep 15, 2023

Conversation

jglick
Copy link
Member

@jglick jglick commented Jan 11, 2023

Effectively reverts 7b39521 from #313. Whether this is actually safe to do (presuming you are a responsible adult and are using App authentication), and whether it in fact resolves JENKINS-68321 without introducing worse problems, I am unsure—use at your own risk.

Copy link
Member

@samrocketman samrocketman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm in favor of this change; context in https://issues.jenkins.io/browse/JENKINS-71849

@jglick
Copy link
Member Author

jglick commented Aug 17, 2023

I doubt I understand the implications of this change. If you think you do, do you want to become a plugin maintainer?

@samrocketman
Copy link
Member

I'll consider it. Currently in the process of moving (to Apex). Once that settles I can take a deeper look.

I've also wanted to get GraphQL integrated here and was considering a PR for that.

@samrocketman
Copy link
Member

samrocketman commented Aug 22, 2023

I plan to roll this out to our staging environment and run this PR in production as part of additional testing (at scale).

Stats for at-scale environment (single Jenkins controller)

  • Greater than 1000 multibranch pipelines with Greater than 20k jobs with GitHub branch source
  • Greater than 1000 users

@samrocketman
Copy link
Member

samrocketman commented Sep 6, 2023

I've deployed this patch to the version we're using in prod 1703.vd5a_2b_29c6cdc-hotfix. It is now running in production for me; I'll report back in a week or so after having time to evaluate it. If it's good, then I'll let you know; or if not, why not.

@samrocketman
Copy link
Member

samrocketman commented Sep 7, 2023

We're now seeing logs like:

2023-09-06 22:07:51.326+0000 [id=28504]	INFO	o.j.p.g.ApiRateLimitChecker$RateLimitCheckerAdapter#checkRateLimit: LocalChecker for rate limit was not set for this thread. Configured using system settings with API URL 'https://api.github.com/'.

But it shouldn't cause a problem. I reviewed the source code and it falls back to the global nothrottle configuration.

I also notice #654 might fix these messages as well. I'll do a thread dump when I get the chance to verify the thread stack traces, with the thread IDs, and with the lines of code to be sure.

I might consider bundling this other PR into our private fork as another hotfix but I'm going to let the current hotfix run to see how performance has changed.

It doesn't appear to be an actual error, though. It does indeed fall back to the global checker which is what we want (no throttling).

@samrocketman
Copy link
Member

samrocketman commented Sep 7, 2023

I also don't see how it would be possible for an individual thread to have a different rate limit checker than what the admin configured. Unless this is intended for plugins to implement directly to be "nice".

@sgleske-ias
Copy link

sgleske-ias commented Sep 8, 2023

This has conflicts; when you get a chance can you resolve the conflicts?

@jglick
Copy link
Member Author

jglick commented Sep 8, 2023

Tried to resolve conflicts with #684 the smart way and completely failed. Falling back to doing it by hand.

@samrocketman
Copy link
Member

What I ended up doing is checking out the files from master and manually applying the patch copying snippets from the diff. I guess I could have opened a PR to your branch; I'll do that next time

@jglick
Copy link
Member Author

jglick commented Sep 11, 2023

I eventually figured out #684 (comment)

@samrocketman
Copy link
Member

This has been running in prod for almost a week and builds appear to be serviced fine.

@jglick jglick marked this pull request as ready for review September 14, 2023 17:27
@jglick jglick requested a review from a team as a code owner September 14, 2023 17:27
@jglick jglick changed the title [JENKINS-68321] Allow NoThrottle to be used even on github.com [JENKINS-71849] Allow NoThrottle to be used even on github.com Sep 15, 2023
@jglick jglick merged commit 51d5810 into jenkinsci:master Sep 15, 2023
@jglick jglick deleted the NoThrottle-JENKINS-68321 branch September 15, 2023 15:28
@bitwiseman
Copy link
Contributor

@jglick @samrocketman
I'm concerned this could have consequences beyond the current user.

As I remember hearing, rate limit checking was originally added to this plugin because github.com decided that Jenkins clients (globally, beyond just the current user or Jenkins instance) were "abusive" and would be heavily penalized by GitHub.com.

The GitHub REST API best practices make that story plausible:

If you hit a rate limit, you should stop making requests until after the time specified by the x-ratelimit-reset header. Failure to do so may result in the banning of your integration.

The way GitHub identifies "an integration" is via user agent which is the same for all Jenkins instances using this plugin.

Strongly suggest you revert this PR as the potential consequences are catastrophic.

https://docs.github.com/en/rest/overview/resources-in-the-rest-api?apiVersion=2022-11-28#user-agent-required

@jglick
Copy link
Member Author

jglick commented Sep 18, 2023

Will leave it to someone more knowledgeable to decide what to do, since I do not know the history and specifically whether the original problems with rate limits applied to a different category of account, or to PATs vs. Apps.

@rsandell
Copy link
Member

We should probably revert this revert then?

@car-roll
Copy link
Contributor

car-roll commented Sep 18, 2023

I think this comment is telling.
I have seen a few anonymous limits issues over the past year and seems like it we still have some more anonymous requests lurking around causing problems.

@bitwiseman sorry I should have pinged you first for your input before I gave my okay. I believe you are correct in saying that the bug is that NoThrottle ThrottleOnOver is not allowing users to get to the limit. I think a new bug needs to be filed (or 71849 re-opened) and this commit should be reverted. Sorry for all the confusion.

car-roll added a commit to car-roll/github-branch-source-plugin that referenced this pull request Sep 18, 2023
@jtnord
Copy link
Member

jtnord commented Sep 18, 2023

I believe you are correct in saying that the bug is that NoThrottle is not allowing users to get to the limit. I think a new bug needs to be filed (or 71849 re-opened) and this commit should be reverted. Sorry for all the confusion.

you mean ThrottleOnOver not NoThrottle?

@car-roll
Copy link
Contributor

@jtnord you're right, ThrottleOnOver 🤦

car-roll added a commit that referenced this pull request Sep 18, 2023
@bitwiseman
Copy link
Contributor

@car-roll
No apologies needed. I haven't been available/present, so there's no guarantee that pinging me would have changed anything. :)

@jglick
Copy link
Member Author

jglick commented Sep 18, 2023

Reverted in #730 FTR

@sgleske-ias
Copy link

That's fair I'll raise this with GitHub support and get their opinion. In the mean time, unfortunately, we're going to have to maintain a fork of this plugin with this patch included for our production instance. The rate limiting bugs cause extreme user disruption. Somehow, the plugin forgets that it throttled a build and some builds will hang indefinitely in a waiting to run state. This patch resolved that issue for us.

I still think by banning the integration GotHub is referring to the specific app (GitHub app) and not the useragent. It doesn't make sense to adversely affect non-abusers with abusers so I question the line of thinking that there will be dire consequences.

Either way, I'll raise this with GitHub support and get their opinion. They did raise our limits upon request. We have Jenkins integrated with over 1000 repositories.

This plugin should rely on a GraphQL client ideally. At some point, I plan to develop it; hopefully before we reach a scale that it becomes necessary.

I'll get back with any findings.

@samrocketman
Copy link
Member

samrocketman commented Oct 5, 2023

@jglick @bitwiseman looks like NoThrottle shouldn't be used but not for the reasons I'd expect.

Reasoning

GitHub App credential type should not ever hit /meta API for credential validity. It should only hit /rate_limit even if only to check its own validity.

Sleuthing

See these lines of code; if NoThrottle check credential validity against /meta else check validity against /rate_limit for all other throttling types.

if (gitHubConfiguration != null
&& gitHubConfiguration.getApiRateLimitChecker() == ApiRateLimitChecker.NoThrottle) {
gitHub.getMeta();
} else {
gitHub.getRateLimit();
}
return true;
} catch (IOException e) {
if (LOGGER.isLoggable(FINE)) {
LOGGER.log(FINE, "Exception validating credentials on " + gitHub.getApiUrl(), e);
}
return false;
}

GitHub App credential checks its validity against the /rate_limit API normally. In GitHub documentation for /rate_limit API,

Accessing this endpoint does not count against your REST API rate limit.

However, GitHub documentation for /meta API makes no mention of no rate limiting so it is subject to eating up rate limits.

The /meta API probably should not ever be used for credential validity regardless of self-hosted enterprise or not. I found this code by debug logging on Connector class after GitHub app credential verification threw exceptions like

[Wed Oct 04 20:31:38 GMT 2023] Received Push event to branch master in repository REDACTED UPDATED event from REDACTED ⇒ https://jenkins.REDACTED.net/github-webhook/ with timestamp Wed Oct 04 20:31:11 GMT 2023

ERROR: Invalid scan credentials GitHub app credentials for higher API limits when cloning projects to connect to https://api.github.com, skipping
hudson.AbortException: Invalid scan credentials GitHub app credentials to connect to https://api.github.com, skipping

Related lines of code:

When using NoThrottle I saw exceptions of two types (our GH App has 25000 req/hr limit):

  • 95% exception rate for Exception validating credentials on https://api.github.com org.kohsuke.github.GHIOException: Ran out of retries for URL: https://api.github.com/meta
  • 5% exception rate for ception validating credentials on https://api.github.com org.kohsuke.github.HttpException: Abuse limit reached

The code checking for credential validity runs A LOT. So should always check validity against /rate_limit even if not checking for its own rate limit.

ThrottleOnOver

I've switched back to ThrottleOnOver and I will continue debugging further. Any findings I have which I think can help other JENKINS Jira issues get resolved I will post directly in those Jira issues.

In my case, we have a sufficiently large usage footprint that I think I can help the community squash some GHBS bugs. I already squashed a bug in my own plugin from this debugging jenkinsci/scm-filter-jervis-plugin#20

spotoczny pushed a commit to smartrecruiters/github-branch-source-plugin that referenced this pull request Oct 13, 2023
…kinsci#653)

* [JENKINS-68321] Allow `NoThrottle` to be used even on github.com

* Ignoring failing tests
spotoczny pushed a commit to smartrecruiters/github-branch-source-plugin that referenced this pull request Oct 13, 2023
@samrocketman
Copy link
Member

While debugging NoThrottle I found this bug hub4j/github-api#1728

@samrocketman
Copy link
Member

note @sgleske-ias is my proprietary GitHub account; so you know it is the same person

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants