-
-
Notifications
You must be signed in to change notification settings - Fork 56
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add generic uploader #25
base: master
Are you sure you want to change the base?
Conversation
ckanext/cloudstorage/plugin.py
Outdated
# Returning None here will use the default Uploader. | ||
return None | ||
# Custom uploader for generic file uploads | ||
print('get uploader') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove debugging print
Looking good, I'll give it a test this weekend. Do you know if there are any open-source implementations of Google's API that can be used for testing, such as Minio for S3? |
Not that I'm aware, but the google storage libcloud driver is using the S3 compatible API (see https://libcloud.readthedocs.io/en/latest/storage/drivers/google_storage.html). Not exactly sure what version of the S3 api that would be, but I would guess this might work with minio as long as you disable secure urls. Bit of a shame I'm only finding out about minio now, looks amazing and could probably have saved me some work! |
I forgot to mention how to authenticate with GCP, which I need to write up for the docs as well. For the driver options, the key is the service account ID with access to the bucket, secret is the path to the JSON key of this service account. It also needs a third option, |
@TkTech did you get a chance to test this? |
I've pushed some updates. Most noteworthy is the fact that the Also I've fixed some tests that weren't working. Would be great if you get a chance to have a look and let me know what you think. |
Currently everywhere a generic resource can be uploaded, it can be retrieved without being authenticated. From a security perspective I'm okay with this. However, many buckets that are using secure URLs entirely disable non-secure access to the bucket (versus a specific path using the advanced ACL rules). It would be better to have Please remove the cover report, we don't want that in git. Otherwise, definitely looking good! 👍 |
Google Cloud Storage doesn't allow folder-level permissions, so to have private resources but public generic files we need to set the ACL on upload. CKAN's redirect is a 302 with headers disallowing caching, for generic files we want caching so this adds a manual permanent redirect with headers allowing caching.
@TkTech Thanks. I've added the option. So now In the last few commits I've implemented a custom 301 redirect that is cacheable for public generic files, to prevent asset reloading on every new page. Another thing worth mentioning is google's ACL. Google Cloud Storage doesn't do folder level permissions, so when the resource files are private and generics public, I'm setting the ACL on generics on upload. I'm going to deploy this on some of my CKAN instances to test the plugin. Given I don't find any bugs I'm pretty much finished with this PR. Only other thing left to do is update documentation. |
Hi @gerbyzation , @TkTech , cc @akariv |
It's usable with a small conflict, if you need it right away @gerbyzation's work is 👍 v2 of cloudstorage is coming in the next few days as part of work funded by TBS open-data, and is a complete rewrite. |
@jqnatividad Been running my version with the generic uploader for static files for a year on a bunch of instances now, haven't had any issues. |
@TkTech any update on the new cloudstorage version? We can definitely help test or work on missing pieces. |
As promised in #23 (and #2), here is my work so far on adding support for generic file uploads. It's a little messy, but getting there
I still want to do some more work on the tests. Still figuring out how to properly do testing in CKAN. Initially I borrowed the testing approach from s3filestore, but they have a fake backend so need to work out if those tests are actually of any use when mocked out. I also want to write some more tests to make sure certain parts behave as expected. Everything (but especially the tests) need an overall clean as well.
And sorry for the somewhat polluted commit history, was trying to get a docker setup going to make things easy, ended up with https://github.com/okfn/docker-ckan which is quick to setup.
If there's any feedback please let me know!