Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement retry for eventual consistency in IAM updates #235

Merged

Conversation

AlexVulaj
Copy link
Contributor

@AlexVulaj AlexVulaj commented Oct 23, 2023

What type of PR is this?

feature/bug fix

What this PR does / Why we need it?

The retry mechanism built into the sts.Client from the AWS SDK doesn't properly handle issues of IAM updates requiring a few seconds to resolve, so we have to roll our own retry that recreates the client on every failure. See aws/aws-sdk-go-v2#2332

Which Jira/Github issue(s) does this PR fix?

OSD-18992

Pre-checks (if applicable)

  • Ran unit tests locally
  • Validated the changes in a cluster
  • Included documentation changes with PR

@AlexVulaj AlexVulaj force-pushed the backplane-assume-implement-retry branch from 624f36b to 4b81bba Compare October 23, 2023 21:27
Comment on lines 22 to 27
EnvProd Environment = "prod"
ProdPayerAccountID = 922711891673
EnvStg Environment = "stg"
StgPayerAccountId = 277304166082
EnvInt Environment = "int"
IntPayerAccountId = 277304166082
Copy link
Contributor

@samanthajayasinghe samanthajayasinghe Oct 23, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if moving the payerAccountID into the backplane config file [~/.config/backplane] ?
Otherwise, this exposes the payer accountID in a public repo for all envs .
There may be more supported env soon, [eg: fedramp], and this approach may need to change then.

May be PayerAccountId match to here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I spoke about this with @jharrington22 way back when we began this work and we determined that the ARNs aren't secret so there's no issue with having them in GitHub.

Would you still like me to move these to a config file?


func (e *Environment) Set(env string) error {
switch strings.ToLower(env) {
case "int", "stg", "prod":
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This may not always true

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's fair. Though we can change this as needed.

@wanghaoran1988
Copy link
Member

/lgtm
/approve

@openshift-ci openshift-ci bot added lgtm Indicates that a PR is ready to be merged. approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Oct 23, 2023
@wanghaoran1988
Copy link
Member

/hold

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Oct 23, 2023
@AlexVulaj
Copy link
Contributor Author

Putting a hold on this momentarily.

For SRE convenience, we're going to remove the "env" flag entirely and have it pulled from the backplane config.

More context here: https://redhat-internal.slack.com/archives/CFJD1NZFT/p1698096531994639?thread_ts=1698096531.994639&cid=CFJD1NZFT

@AlexVulaj AlexVulaj force-pushed the backplane-assume-implement-retry branch from 4b81bba to 35b220f Compare October 24, 2023 14:29
@openshift-ci openshift-ci bot removed the lgtm Indicates that a PR is ready to be merged. label Oct 24, 2023
@AlexVulaj
Copy link
Contributor Author

/unhold

Moved the initial assume ARN to a config field.

@openshift-ci openshift-ci bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Oct 24, 2023
@AlexVulaj AlexVulaj force-pushed the backplane-assume-implement-retry branch from 35b220f to 37c5820 Compare October 24, 2023 15:17
@AlexVulaj AlexVulaj force-pushed the backplane-assume-implement-retry branch from 37c5820 to 710436c Compare October 24, 2023 15:24
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Oct 24, 2023

@AlexVulaj: all tests passed!

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@samanthajayasinghe
Copy link
Contributor

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Oct 24, 2023
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Oct 24, 2023

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: AlexVulaj, samanthajayasinghe, wanghaoran1988

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot merged commit 56b27b7 into openshift:main Oct 24, 2023
@AlexVulaj AlexVulaj deleted the backplane-assume-implement-retry branch October 24, 2023 20:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants