Authenticate using instance IAM roles instead of key? #40

Open
ThrawnCA opened this issue Nov 6, 2019 · 7 comments

Comments

@ThrawnCA

ThrawnCA commented Nov 6, 2019

Is it possible to authenticate to S3 by relying on the EC2 instance role, instead of providing an access key? As I understand it, Boto makes it pretty easy; if you simply don't provide a key, it will automatically use the role.

@TkTech
Owner

TkTech commented Nov 6, 2019

Have you tried it? Set the key and secret to None in your config.

@ThrawnCA
Author

ThrawnCA commented Nov 6, 2019

That results in InvalidCredsError.

ckanext.cloudstorage.driver_options = {"key":None, "secret":None}

[Wed Nov 06 11:01:45.771109 2019] [:error] File '/usr/lib/ckan/default/lib/python2.7/site-packages/pylons/controllers/core.py', line 221 in __call__
[Wed Nov 06 11:01:45.771111 2019] [:error]   response = self._dispatch_call()
[Wed Nov 06 11:01:45.771112 2019] [:error] File '/usr/lib/ckan/default/lib/python2.7/site-packages/pylons/controllers/core.py', line 172 in _dispatch_call
[Wed Nov 06 11:01:45.771117 2019] [:error]   response = self._inspect_call(func)
[Wed Nov 06 11:01:45.771119 2019] [:error] File '/usr/lib/ckan/default/lib/python2.7/site-packages/pylons/controllers/core.py', line 107 in _inspect_call
[Wed Nov 06 11:01:45.771121 2019] [:error]   result = self._perform_call(func, args)
[Wed Nov 06 11:01:45.771122 2019] [:error] File '/usr/lib/ckan/default/lib/python2.7/site-packages/pylons/controllers/core.py', line 60 in _perform_call
[Wed Nov 06 11:01:45.771124 2019] [:error]   return func(**args)
[Wed Nov 06 11:01:45.771126 2019] [:error] File '/usr/lib/ckan/default/src/ckan/ckan/controllers/package.py', line 697 in new_resource
[Wed Nov 06 11:01:45.771128 2019] [:error]   get_action('resource_create')(context, data)
[Wed Nov 06 11:01:45.771130 2019] [:error] File '/usr/lib/ckan/default/src/ckan/ckan/logic/__init__.py', line 464 in wrapped
[Wed Nov 06 11:01:45.771131 2019] [:error]   result = _action(context, data_dict, **kw)
[Wed Nov 06 11:01:45.771133 2019] [:error] File '/usr/lib/ckan/default/src/ckan/ckan/logic/action/create.py', line 327 in resource_create
[Wed Nov 06 11:01:45.771135 2019] [:error]   uploader.get_max_resource_size())
[Wed Nov 06 11:01:45.771137 2019] [:error] File '/usr/lib/ckan/default/src/ckanext-cloudstorage/ckanext/cloudstorage/storage.py', line 240 in upload
[Wed Nov 06 11:01:45.771139 2019] [:error]   self.filename
[Wed Nov 06 11:01:45.771140 2019] [:error] File '/usr/lib/ckan/default/lib/python2.7/site-packages/libcloud/storage/base.py', line 157 in upload_object_via_stream
[Wed Nov 06 11:01:45.771142 2019] [:error]   iterator, self, object_name, extra=extra, **kwargs)
[Wed Nov 06 11:01:45.771144 2019] [:error] File '/usr/lib/ckan/default/lib/python2.7/site-packages/libcloud/storage/drivers/s3.py', line 668 in upload_object_via_stream
[Wed Nov 06 11:01:45.771146 2019] [:error]   storage_class=ex_storage_class)
[Wed Nov 06 11:01:45.771148 2019] [:error] File '/usr/lib/ckan/default/lib/python2.7/site-packages/libcloud/storage/drivers/s3.py', line 825 in _put_object
[Wed Nov 06 11:01:45.771150 2019] [:error]   headers=headers, file_path=file_path, iterator=iterator)
[Wed Nov 06 11:01:45.771151 2019] [:error] File '/usr/lib/ckan/default/lib/python2.7/site-packages/libcloud/storage/base.py', line 654 in _upload_object
[Wed Nov 06 11:01:45.771153 2019] [:error]   **upload_func_kwargs)
[Wed Nov 06 11:01:45.771155 2019] [:error] File '/usr/lib/ckan/default/lib/python2.7/site-packages/libcloud/storage/drivers/s3.py', line 475 in _upload_multipart
[Wed Nov 06 11:01:45.771157 2019] [:error]   response.body = response.response.read()
[Wed Nov 06 11:01:45.771159 2019] [:error] File '/usr/lib/ckan/default/lib/python2.7/site-packages/libcloud/common/base.py', line 308 in response
[Wed Nov 06 11:01:45.771160 2019] [:error]   self.parse_error()
[Wed Nov 06 11:01:45.771162 2019] [:error] File '/usr/lib/ckan/default/lib/python2.7/site-packages/libcloud/storage/drivers/s3.py', line 88 in parse_error
[Wed Nov 06 11:01:45.771164 2019] [:error]   raise InvalidCredsError(self.body)
[Wed Nov 06 11:01:45.771168 2019] [:error] InvalidCredsError: <httplib.HTTPResponse instance at 0x7f44a80e0710>

If I'm reading the libcloud docs correctly, the driver is able to accept STS tokens, but it seems to be at a lower level than boto and doesn't generate tokens itself.
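For reference, here is a rough sketch of what generating those temporary credentials outside libcloud could look like. It reads the role credentials from the EC2 instance metadata service and maps them onto key/secret/token options. The function names are made up for illustration, the `token` option name is an assumption based on my reading of the libcloud docs, and instances that enforce IMDSv2 would need an extra session-token request that is not shown here.

```python
import json
from urllib.request import urlopen

# Link-local address of the EC2 instance metadata service (IMDSv1-style access).
METADATA_URL = "http://169.254.169.254/latest/meta-data/iam/security-credentials/"


def fetch_role_credentials(timeout=2):
    """Fetch the temporary credentials document for the instance's role.

    Only works on an EC2 instance that has an IAM role attached.
    """
    # The first request lists the role name(s) attached to the instance.
    listing = urlopen(METADATA_URL, timeout=timeout).read().decode("utf-8")
    role = listing.splitlines()[0].strip()
    # The second request returns a JSON document with the temporary keys.
    body = urlopen(METADATA_URL + role, timeout=timeout).read().decode("utf-8")
    return json.loads(body)


def to_driver_options(credentials):
    """Map the metadata document onto key/secret/token driver options.

    The "token" option name is an assumption about the libcloud S3 driver.
    """
    return {
        "key": credentials["AccessKeyId"],
        "secret": credentials["SecretAccessKey"],
        "token": credentials["Token"],
    }
```

The resulting dict would stand in for the static `driver_options` value. The credentials expire, so they would have to be re-fetched periodically rather than read once at startup.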

@TkTech
Owner

TkTech commented Nov 6, 2019

libcloud should be removed entirely for S3 and Azure blobs. Boto is already used to serve signed content as an optional dependency. We should make it a required dependency and use it exclusively.

@ThrawnCA
Author

ThrawnCA commented Nov 6, 2019

Ok, so...that sounds like an avenue for future development?

Is there a workaround currently, or will we need to look at something like https://github/bstutsky/ckanext-s3filestore instead? I like the look of ckanext-cloudstorage better, but we really want to use IAM roles instead of putting secret keys in config files.

@ThrawnCA
Author

ThrawnCA commented Nov 6, 2019

Looking at it further, regardless of which extension we go with, we'll probably end up doing some extra development on it. If you're interested, I'd be happy to assemble some pull requests to incorporate changes back in.

Questions thus far:

  • The extension currently relies on Boto (not Boto3) and Pylons. Do you want to update Boto?
  • We would like to route signed URLs through CloudFront, as per https://advancedweb.hu/2018/11/15/s3_signed_urls_cloudfront/, which ideally requires rounding off the signature timestamp so that the URL remains consistent for long enough to be cached. Is there an easy way to do that?
  • The ckanext-s3filestore extension uses several config entries with different names, but similar functions, to this one. Is there value in detecting and reusing them as a fallback? For example, ckanext.cloudstorage.container_name is equivalent to ckanext.s3filestore.aws_bucket_name, and ckanext.cloudstorage.driver_options covers both ckanext.s3filestore.aws_access_key_id and ckanext.s3filestore.aws_secret_access_key.
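
To make the third question concrete, the fallback could be sketched roughly as below. The helper names are hypothetical, `config` is treated as a plain dict, and I'm assuming `driver_options` is stored as a Python literal string (as in the `{"key":None, "secret":None}` value above) so that `ast.literal_eval` can parse it.

```python
from ast import literal_eval


def container_name(config):
    """Prefer the cloudstorage setting, else fall back to the s3filestore one."""
    name = config.get("ckanext.cloudstorage.container_name")
    if name is None:
        name = config.get("ckanext.s3filestore.aws_bucket_name")
    return name


def driver_options(config):
    """Prefer cloudstorage driver_options, else build them from s3filestore keys."""
    raw = config.get("ckanext.cloudstorage.driver_options")
    if raw is not None:
        # Assumption: the value is a Python literal, e.g. '{"key": None}'.
        return literal_eval(raw)
    return {
        "key": config.get("ckanext.s3filestore.aws_access_key_id"),
        "secret": config.get("ckanext.s3filestore.aws_secret_access_key"),
    }
```

With only the s3filestore entries present, `container_name` returns the bucket name and `driver_options` returns a dict built from the two key settings. When the cloudstorage entries exist they always win.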

@TkTech
Owner

TkTech commented Nov 6, 2019

Contributions are always welcome.

  • Boto is currently pretty much only used to generate the signed URLs when it's available. Updating should be trivial.
  • Currently no, it's hard-coded here https://github.com/TkTech/ckanext-cloudstorage/blob/master/ckanext/cloudstorage/storage.py#L308. Trivial to get the current time and round it. Should be pulling the maximum expiry from a setting too.
  • No, not something I want to support. Not hard to add so I'd suggest keeping it in your own fork if it's something you really need.
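
The rounding mentioned in the second point could look something like this minimal sketch. The function name and the `max_age` and `window` parameters are hypothetical (in practice `max_age` would come from a setting), and whether the signing library accepts an absolute expiry timestamp is something that would need checking.

```python
import time


def rounded_expiry(now, max_age=3600, window=900):
    """Return an absolute expiry timestamp that is stable within a window.

    Every call made inside the same `window`-second slot returns the same
    value, so the signed URL stays identical long enough to be cached.
    """
    # Snap the current time down to the start of its window.
    window_start = int(now) - (int(now) % window)
    # Expire a full window plus max_age after that point, so a URL issued at
    # the very end of a window is still valid for at least max_age seconds.
    return window_start + window + max_age


# Example call using the current time.
expires_at = rounded_expiry(time.time())
```

The trade-off is that a URL can live for up to `window + max_age` seconds instead of exactly `max_age`, in exchange for a cacheable URL.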

@ThrawnCA
Author

ThrawnCA commented Nov 7, 2019

So, looking through the forks of this repository, I notice that master...fjelltopp:master and master...6aika:master both add the ability to retrieve AWS keys dynamically when running on EC2. Anything you think is worth cherry-picking?
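
I haven't checked how those forks handle key expiry, but since role credentials are temporary, dynamic retrieval generally needs some kind of refresh. A minimal sketch, assuming a `fetch` callable that returns a dict with an `Expiration` field in the `YYYY-MM-DDTHH:MM:SSZ` format (the class name and the five-minute margin are made up for illustration):

```python
from datetime import datetime, timedelta, timezone


class CachedCredentials(object):
    """Cache temporary credentials and re-fetch them shortly before expiry."""

    def __init__(self, fetch, margin=timedelta(minutes=5)):
        self._fetch = fetch      # callable returning a credentials dict
        self._margin = margin    # refresh this long before the expiry time
        self._current = None
        self._expires = None

    def get(self, now=None):
        """Return valid credentials, refreshing them if they are near expiry."""
        if now is None:
            # Naive UTC datetime, to match the parsed Expiration value.
            now = datetime.now(timezone.utc).replace(tzinfo=None)
        if self._current is None or now >= self._expires - self._margin:
            self._current = self._fetch()
            self._expires = datetime.strptime(
                self._current["Expiration"], "%Y-%m-%dT%H:%M:%SZ"
            )
        return self._current
```

Calling `get()` before each upload would return the cached keys most of the time and only hit the credentials source again once the margin is reached.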
