
handle rate limiter & github server-side errors #63

Open · wants to merge 7 commits into master

Conversation

mredolatti

@mredolatti mredolatti commented May 24, 2019

Two issues were found while using this library:

  • Random, isolated 502 errors from GitHub make the whole job fail.
  • If the number of API calls exceeds GitHub's rate limit, the whole job fails.

To deal with these scenarios, simple retry logic was added; in a rate-limiting situation, we wait until the next reset time before moving forward with the next retry.
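A minimal sketch of the approach described above (the names `request_with_retries` and `do_request` are hypothetical stand-ins for the tap's actual request call, not code from this PR):

```python
import time

def request_with_retries(do_request, max_attempts=3):
    """Retry transient 502s and, on rate limiting, sleep until the
    X-RateLimit-Reset time before the next attempt. Illustrative only."""
    for _ in range(max_attempts):
        resp = do_request()
        if resp.status_code == 502:
            continue  # isolated server-side error: just try again
        if resp.status_code == 403 and resp.headers.get('X-RateLimit-Remaining') == '0':
            reset_at = int(resp.headers['X-RateLimit-Reset'])
            time.sleep(max(reset_at - time.time(), 0))  # wait out the window
            continue
        return resp
    raise RuntimeError("request failed after {} attempts".format(max_attempts))
```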

@cmerrick
Contributor

Hi @mredolatti, thanks for your contribution!

In order for us to evaluate and accept your PR, we ask that you sign a contribution license agreement. It's all electronic and will take just minutes.

@cmerrick
Contributor

You did it @mredolatti!

Thank you for signing the Singer Contribution License Agreement.

    raise AuthException(resp.text)
if resp.status_code == 404:
    raise NotFoundException(resp.text)
for _ in range(0, 3):  # 3 attempts
Contributor


Couldn't this be replaced with the backoff library?


Author


Hi @luandy64, I'm not familiar with that library, but I'll look into it as soon as possible and tidy up the PR.

Thanks and sorry for the delay!


Author


I'll take a look at this tonight and see if I can put the branch up to speed


@luandy64 @osterman could it be that the backoff library doesn't provide an easy way to use headers like X-RateLimit-Remaining and X-RateLimit-Reset to determine how long to back off? See litl/backoff#38.

@nehiljain

Are there plans to work on this, @osterman or @mredolatti? I am having the same issue as well.

@henriblancke

@cmerrick @mredolatti any way we can help to get this over the line?

@osterman

osterman commented Jul 7, 2020

@nehiljain Sorry, we have no resources available on our side to put towards it at this time.

@KBorders01

@osterman @mredolatti @luandy64, what needs to happen to merge this PR? It looks like the code that's there already is much better than the current code, which has no retry capability. Also, it doesn't look like backoff will handle the rate limit headers anyway.

This issue is blocking me from using Stitch's hosted Github tap, and instead I have to run it on my own.

@antoine-lizee

Up! Can we merge this?

@savicbo

savicbo commented May 1, 2022

@luandy64 can we merge this PR, or has it been implemented elsewhere? Hitting the GitHub rate limit is affecting our ability to use Stitch as well.

@mredolatti
Author

Is this still an issue? I took a look at backoff a while ago and didn't see a straightforward way to use it while relying on the response's headers to actually wait the correct amount of time. Do we need to add that behavior to backoff first and then update this?

@bgreen-litl

@mredolatti the latest backoff (2.0.1) has the backoff.runtime wait generator, which allows you to introspect the exception or response and yield a wait value based on it.

@loeakaodas
Contributor

@mredolatti @bgreen-litl basic backoff functionality was added in #143. This PR should probably be closed, and custom wait-time functionality can be added in a new PR based on the latest version in master.

@luandy64
Contributor

@savicbo @mredolatti @bgreen-litl @antoine-lizee @KBorders01 @henriblancke @nehiljain

Correct me if I'm wrong, but if you run v1.10.4 of this tap, there's already logic to handle the rate limit headers thanks to @loeakaodas

def rate_throttling(response):
    if int(response.headers['X-RateLimit-Remaining']) == 0:
        seconds_to_sleep = calculate_seconds(int(response.headers['X-RateLimit-Reset']))
        if seconds_to_sleep > MAX_SLEEP_SECONDS:
            message = "API rate limit exceeded, please try after {} seconds.".format(seconds_to_sleep)
            raise RateLimitExceeded(message) from None
        logger.info("API rate limit exceeded. Tap will retry the data collection after %s seconds.", seconds_to_sleep)
        time.sleep(seconds_to_sleep)
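calculate_seconds and MAX_SLEEP_SECONDS are helpers defined elsewhere in the tap. A plausible sketch of the former, assuming X-RateLimit-Reset is a Unix epoch timestamp (which is how GitHub documents it):

```python
import time

# X-RateLimit-Reset is the Unix epoch second at which the rate-limit window
# resets, so the sleep duration is its distance from "now", floored at zero.
def calculate_seconds(epoch):
    current = int(time.time())
    return max(epoch - current, 0)
```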

@KBorders01

@luandy64 that is correct, it looks like this issue has been resolved in another PR.
