
TransportError ... is now earlier than... occurring irregularly #2618

Closed
risoms opened this issue Feb 12, 2022 · 14 comments
Assignees: tim-finnigan
Labels: guidance (Question that needs advice or information.)

Comments


risoms commented Feb 12, 2022

Describe the bug

Note - This is the same issue as reported here. As suggested, I'm reopening it because the recommended solutions unfortunately haven't worked.

We are seeing TransportError occur regularly when using boto3. The gap between occurrences ranges from minutes to days.

Steps to reproduce

import boto3
from elasticsearch import Elasticsearch
from elasticsearch.helpers import scan
from requests_aws4auth import AWS4Auth

credentials = boto3.session.Session().get_credentials()
auth = AWS4Auth(credentials.access_key, credentials.secret_key, ...)
es = Elasticsearch(http_auth=auth, ...)
scan(client=es, query=body)

Expected behavior
When using elasticsearch.helpers.scan, we expect it to behave as an abstraction over the Elasticsearch scroll() API. Usually this works.
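
For reference, this is roughly the scroll loop that scan abstracts away (the index name, page size, and scroll window below are illustrative; es and body are the client and query from the snippet above):

# Rough sketch of what elasticsearch.helpers.scan wraps: an initial search
# that opens a scroll context, then repeated scroll() calls until no hits remain.
resp = es.search(index="my-index", body=body, scroll="2m", size=1000)
while resp["hits"]["hits"]:
    for hit in resp["hits"]["hits"]:
        print(hit["_id"])  # placeholder for real processing
    resp = es.scroll(scroll_id=resp["_scroll_id"], scroll="2m")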

Debug logs

Signature expired: 20210805T032751Z is now earlier than 20210817T205124Z (20210817T205624Z - 5 min.)
@risoms risoms added the needs-triage This issue or PR still needs to be triaged. label Feb 12, 2022

risoms commented Feb 12, 2022

Sorry for the delay. I'm assuming you meant the expiry_time, but let me know if I'm mistaken. The expiry_time is just sourced from refreshable credentials (i.e. whenever they are set to expire by IAM).

> How are you sourcing your credentials?

We are storing the credentials as a singleton that updates when they have been detected as out of date.

> I know you mentioned that you followed the steps in the article I linked, but could you confirm that you still see 169.254.169.123 as a source when running chrony sources -v?

@stobrien89 Yes, this is correct.

@tim-finnigan tim-finnigan self-assigned this Feb 14, 2022
@tim-finnigan tim-finnigan added guidance Question that needs advice or information. investigating This issue is being investigated and/or work is in progress to resolve the issue. and removed needs-triage This issue or PR still needs to be triaged. labels Feb 14, 2022
tim-finnigan (Contributor) commented:

Hi @risoms, thanks for following up. I found similar issues here and here that suggest this is a clock sync issue.

Have you tried syncing your machine's clock with NTP? For example, running sudo ntpdate pool.ntp.org on Ubuntu. You could also create a cron job to keep your system time in sync at a recurring interval.

@tim-finnigan tim-finnigan added response-requested Waiting on additional info and feedback. and removed investigating This issue is being investigated and/or work is in progress to resolve the issue. labels Feb 14, 2022

risoms commented Feb 15, 2022

@tim-finnigan No luck. The issue still persists after syncing with sudo ntpdate.

Also, the delta between t1 and t0 can sometimes be much larger than in the examples provided. We've seen cases where it is on the order of days or weeks.

@github-actions github-actions bot removed the response-requested Waiting on additional info and feedback. label Feb 15, 2022

risoms commented Feb 17, 2022

Also, is there any caching involved in this authentication? The t0 value varies across observations. For instance, we can see t0 as 20220214T034700 and, a few minutes later, as 20220214T011200 (yes, the value decreased).

tim-finnigan (Contributor) commented:

Hi @risoms, can you describe in more detail how you are authenticating? See the boto3 credentials documentation for reference. Caching with AssumeRole is described here:

> When you specify a profile that has an IAM role configuration, Boto3 will make an AssumeRole call to retrieve temporary credentials. Subsequent Boto3 API calls will use the cached temporary credentials until they expire, in which case Boto3 will then automatically refresh the credentials.
>
> Please note that Boto3 does not write these temporary credentials to disk. This means that temporary credentials from the AssumeRole calls are only cached in-memory within a single session. All clients created from that session will share the same temporary credentials.
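
To illustrate that caching behavior, a minimal sketch (the profile name is hypothetical, not taken from this issue):

import boto3

# Hypothetical profile with an IAM role configuration (role_arn/source_profile
# in ~/.aws/config); the name is illustrative.
session = boto3.Session(profile_name="my-role-profile")

s3 = session.client("s3")
sqs = session.client("sqs")
# The first API call made through this session triggers AssumeRole; later calls
# from either client reuse the cached temporary credentials until they expire,
# at which point they are refreshed automatically.

# A separate session keeps its own in-memory cache, so its first API call
# performs another AssumeRole.
other_session = boto3.Session(profile_name="my-role-profile")
other_s3 = other_session.client("s3")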

Also, are you still on “AWS Ubuntu 16.04.3 LTS xenial”? Have you seen this issue on any other systems?

@tim-finnigan tim-finnigan added the response-requested Waiting on additional info and feedback. label Feb 17, 2022

risoms commented Feb 18, 2022

@tim-finnigan I've included the code used to authenticate below. The authentication is stored as a singleton.

import boto3
from requests_aws4auth import AWS4Auth

credentials = boto3.session.Session().get_credentials()

# region and service are defined elsewhere
auth = AWS4Auth(
    credentials.access_key,
    credentials.secret_key,
    region,
    service,
    session_token=credentials.token,
)

When the credentials are detected as expired (using the check below), we re-authenticate.

cls.credentials.refresh_needed()
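
Putting the two pieces together, roughly (the class and method names are illustrative, not our actual application code):

import boto3
from requests_aws4auth import AWS4Auth

class EsAuth:
    # Illustrative singleton: rebuild AWS4Auth only when botocore reports that
    # the credentials need refreshing. Assumes get_credentials() returns
    # botocore RefreshableCredentials (e.g. instance-profile or assume-role
    # credentials), which expose refresh_needed().
    credentials = None
    auth = None

    @classmethod
    def get_auth(cls, region, service):
        if cls.credentials is None or cls.credentials.refresh_needed():
            cls.credentials = boto3.session.Session().get_credentials()
            cls.auth = AWS4Auth(
                cls.credentials.access_key,
                cls.credentials.secret_key,
                region,
                service,
                session_token=cls.credentials.token,
            )
        return cls.auth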

Yes, we're still on AWS Ubuntu 16.04.3 LTS xenial. This issue hasn't occurred anywhere else.

@github-actions github-actions bot removed the response-requested Waiting on additional info and feedback. label Feb 18, 2022
tim-finnigan (Contributor) commented:

Hi @risoms, is there a pattern in which services/APIs are being used when this error occurs? Do you have any third-party packages installed that may be causing a clock skew issue?

I know this link was shared in your previous issue but I want to post it again in case you want to retry configuring chrony: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/set-time.html#configure-amazon-time-service-ubuntu

Also, I saw that the JS SDK offers a correctClockSkew config option, and there is an open feature request to add that to boto3: boto/boto3#1252
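
In the meantime, a quick diagnostic sketch (not part of any SDK; the endpoint choice is just an example) to see how far the local clock is off, by comparing it to the Date header returned by an AWS endpoint:

import email.utils
from datetime import datetime, timezone

import requests

# A large offset here points to host clock skew rather than a boto3 problem.
resp = requests.get("https://sts.amazonaws.com")
server_time = email.utils.parsedate_to_datetime(resp.headers["Date"])
offset = (datetime.now(timezone.utc) - server_time).total_seconds()
print(f"Local clock differs from server time by {offset:+.1f} seconds")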

@tim-finnigan tim-finnigan added the response-requested Waiting on additional info and feedback. label Feb 25, 2022

risoms commented Feb 26, 2022

> Hi @risoms, is there a pattern in which services/APIs are being used when this error occurs? Do you have any third-party packages installed that may be causing a clock skew issue?

No patterns that we've been able to identify. Some things to note:

  • We have two Docker containers running in an AWS EC2 instance. This issue seems to occur in one and not the other.
  • Both containers are built from the same base image.
  • The container where this issue occurs serves our Flask application.
  • The other container runs a series of scheduled Python scripts.


risoms commented Feb 26, 2022

> I know this link was shared in your previous issue but I want to post it again in case you want to retry configuring chrony: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/set-time.html#configure-amazon-time-service-ubuntu

We've attempted using chrony to resolve this, sadly with no luck.

@github-actions github-actions bot removed the response-requested Waiting on additional info and feedback. label Feb 27, 2022
tim-finnigan (Contributor) commented:

Hi @risoms, I'm still not sure whether this is a boto3/botocore issue or a local clock sync issue. The fact that this happens in one of your Docker containers but not the other makes me think there is something specific to that container causing this. Do you think it could be something to do with how your Flask application is set up?

@tim-finnigan tim-finnigan added the response-requested Waiting on additional info and feedback. label Mar 4, 2022

github-actions bot commented Mar 9, 2022

Greetings! It looks like this issue hasn’t been active in longer than five days. We encourage you to check if this is still an issue in the latest release. In the absence of more information, we will be closing this issue soon. If you find that this is still a problem, please feel free to provide a comment or upvote with a reaction on the initial post to prevent automatic closure. If the issue is already closed, please feel free to open a new one.


risoms commented Mar 10, 2022

This can safely be closed. The issue was very much not related to boto3. We were able to identify the cause: the library freezegun was being used concurrently with querying AWS.
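
For anyone who hits the same symptom, a minimal illustration (hypothetical snippet, not our actual code) of how freezegun can cause this: it patches datetime globally, so any request signed while time is frozen carries a stale timestamp that the service rejects as expired.

from datetime import datetime, timezone

from freezegun import freeze_time

with freeze_time("2021-08-05 03:27:51"):
    # Everything in this block sees the frozen clock, including code that signs
    # AWS requests (e.g. SigV4 via AWS4Auth), so the signature timestamp can be
    # days older than the server's time and gets rejected as "Signature expired".
    print(datetime.now(timezone.utc))  # 2021-08-05 03:27:51+00:00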

Thank you for your patience and help with this!

@github-actions github-actions bot removed closing-soon response-requested Waiting on additional info and feedback. labels Mar 10, 2022
tim-finnigan (Contributor) commented:

Ok thanks for letting us know! Glad you were able to identify the problem.

github-actions bot commented:

⚠️COMMENT VISIBILITY WARNING⚠️

Comments on closed issues are hard for our team to see.
If you need more assistance, please either tag a team member or open a new issue that references this one.
If you wish to keep having a conversation with other community members under this issue feel free to do so.
