[aws] cache user identity by 'aws configure list' #4507
base: master
Signed-off-by: Aylei <[email protected]>
Thanks @aylei! This PR looks good to me!
@@ -773,6 +784,38 @@ def get_user_identities(cls) -> Optional[List[List[str]]]:
        # automatic switching for AWS. Currently we only support one identity.
        return [user_ids]

    @classmethod
    @functools.lru_cache(maxsize=1)  # Cache since getting identity is slow.
    def get_user_identities(cls) -> Optional[List[List[str]]]:
Let's add a docstring to this function that describes the caching behavior and the returned identity. It might also be good to move the docstring from _sts_get_caller_identity to here.
Good catch! Just moved the docstring here and added a description of the caching behavior.
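For readers following the thread, here is a minimal sketch of the in-process caching behavior under discussion: stacking `@classmethod` on top of `@functools.lru_cache(maxsize=1)`, as in the diff above. The class `FakeCloud`, the helper `_fetch_identity`, and the identity values are hypothetical stand-ins, not code from this PR.

```python
import functools
from typing import List, Optional


class FakeCloud:
    """Hypothetical stand-in for a cloud class (not the class in this PR)."""

    slow_calls = 0  # Counts how often the slow lookup actually runs.

    @classmethod
    def _fetch_identity(cls) -> List[str]:
        # Placeholder for a slow call such as an STS request.
        cls.slow_calls += 1
        return ['EXAMPLEUSERID', '123456789012']

    @classmethod
    @functools.lru_cache(maxsize=1)  # Cache since getting identity is slow.
    def get_user_identities(cls) -> Optional[List[List[str]]]:
        """Returns the cached identity list for the current process.

        The first call performs the slow lookup; later calls in the same
        process return the cached list, even if credentials change meanwhile.
        """
        return [cls._fetch_identity()]
```

Because the cache lives in process memory, it only helps repeated calls within one process; a fresh process pays the lookup cost again unless the result is also persisted elsewhere.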
I assume the time tests listed in the PR description has the
Signed-off-by: Aylei <[email protected]>
Yes, I updated the PR description to make the result clearer~
Thanks again for the quick fix @aylei! Just two minor comments before we merge it. : )
cache_path = catalog_common.get_catalog_path(
    f'aws/user-identity-{config_hash}.txt')
How about we move it to aws/.cache/user-identity-{config_hash}.txt?
"""Returns a [UserId, Account] list that uniquely identifies current AWS
principal (user, role or federated identity) whose credentials are used
to run current `sky` process. These identities are assumed to be stable
for the duration of the `sky` process. Modifying the credentials while
the `sky` process is running will not affect the identity returned by
this function.
Style: keep the first line of the docstring to a single line (within 80 characters); detailed descriptions can go below in a separate paragraph.
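One way the suggested docstring layout could look is sketched below. The function `describe_identity` and its parameters are hypothetical and exist only to carry the docstring; the wording is an illustrative condensation, not the text that was merged.

```python
from typing import List


def describe_identity(user_id: str, account: str) -> List[str]:
    """Returns a [UserId, Account] list identifying the current AWS principal.

    The principal may be a user, role or federated identity. The identity is
    assumed to be stable for the duration of the process, so modifying the
    credentials while the process is running does not change the result.
    """
    # Hypothetical body: simply package the two values.
    return [user_id, account]
```

The summary stays on one line, and the caveats about stability move to a separate paragraph after a blank line.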
Cache the `aws sts get-caller-identity` result, indexed by a hash of the output of `aws configure list`. This will speed up `sky launch` by ~1s in subsequent runs.
Original discussion: #3159 (comment)
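The idea in the description can be sketched roughly as follows. This is a simplified illustration, not the code in this PR: the helper names (`config_hash`, `cached_identity`), the SHA-256 hash, the JSON file format, and the injected `fetch` callable are all assumptions. In the PR itself the key is derived from the output of `aws configure list` and the cached value comes from `aws sts get-caller-identity`.

```python
import hashlib
import json
import pathlib
from typing import Callable, List


def config_hash(config_output: str) -> str:
    """Hashes the config listing so that changed credentials get a new key."""
    return hashlib.sha256(config_output.encode('utf-8')).hexdigest()


def cached_identity(config_output: str, cache_dir: pathlib.Path,
                    fetch: Callable[[], List[str]]) -> List[str]:
    """Returns the identity for this config, calling `fetch` only on a miss."""
    cache_path = cache_dir / f'user-identity-{config_hash(config_output)}.txt'
    if cache_path.exists():
        # Cache hit: skip the slow identity lookup entirely.
        return json.loads(cache_path.read_text())
    identity = fetch()  # Slow path (assumed to wrap the real identity call).
    cache_path.parent.mkdir(parents=True, exist_ok=True)
    cache_path.write_text(json.dumps(identity))
    return identity
```

Keying the file name on the config hash means that switching profiles or credentials produces a different cache file rather than returning a stale identity.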
A simple benchmark (enabled clouds: [aws, gcp, azure, kubernetes, runpod, cudo, paperspace]; the code is slightly modified to return before the confirmation is prompted):
@Michaelvll Please kindly take a look
Tested (run the relevant ones):
bash format.sh
pytest tests/test_smoke.py
pytest tests/test_smoke.py::test_fill_in_the_name
conda deactivate; bash -i tests/backward_compatibility_tests.sh