Add generic uploader #25

Open
wants to merge 37 commits into
base: master
37 commits
a2bd0f2
add google cloud signed url support
gerbyzation Dec 5, 2017
66a8433
allow for both explicit sa credentials and implicit authentication
gerbyzation Dec 5, 2017
b55b528
call generate signed url on blob
gerbyzation Jan 13, 2018
9fe3dc9
add docker setup w/ local mounted volume
gerbyzation Jan 15, 2018
42acc13
tests run, prob still misconfigured
gerbyzation Jan 17, 2018
1a0b647
adding s3filestore for testing purps
gerbyzation Jan 18, 2018
b3da887
add first test borrowed from s3filestore
gerbyzation Jan 18, 2018
b3b2692
fix typo in unpacking
gerbyzation Jan 19, 2018
726929f
fix test dependencies and add more tests from s3filestore
gerbyzation Jan 19, 2018
d173b64
pin working dependencies
gerbyzation Jan 20, 2018
ae7ba0e
add tests for storage.py
gerbyzation Jan 24, 2018
668119d
Merge branch 'tests' of https://github.com/thedataplace/ckanext-cloud…
gerbyzation Jan 24, 2018
ab9c2ec
add first tests, inspired by s3filestore
gerbyzation Jan 29, 2018
59ac7fd
start work on uploader
gerbyzation Jan 29, 2018
f0c737a
get uploading & viewing generic files working (warning code is ☢))
gerbyzation Jan 30, 2018
2a56bf1
refactor upload methods
gerbyzation Jan 30, 2018
cd97074
Merge remote-tracking branch 'origin/gcloud-support' into static-uplo…
gerbyzation Jan 30, 2018
0545a92
refactor get_url_from_filename to get_url_from_path
gerbyzation Jan 30, 2018
30d4986
add plugin tests file
gerbyzation Jan 30, 2018
352f597
replace filepath joins with method
gerbyzation Jan 30, 2018
f976efd
fix tests
gerbyzation Jan 30, 2018
0aeed2c
fix google secure url, refactor to raise exception if no secure url a…
gerbyzation Jan 31, 2018
e23940c
ignore all .egg-info things
gerbyzation Jan 31, 2018
9eab142
remove docker setup from repo
gerbyzation Jan 31, 2018
f3896af
remove print in plugin.py
gerbyzation Feb 1, 2018
6c16b68
fix site-url in test
gerbyzation Mar 14, 2018
b9677cc
formatting fixed in storage.py
gerbyzation Mar 14, 2018
e32b28a
use secure_urls only for resources
gerbyzation Mar 14, 2018
a52c607
fix disabled tests
gerbyzation Mar 15, 2018
e34bbac
add cover report
gerbyzation Mar 15, 2018
f48b24b
test file uploads are always public
gerbyzation Mar 15, 2018
ffa8e1c
update cover report
gerbyzation Mar 15, 2018
18d9bb4
remove cover report
gerbyzation Mar 15, 2018
98747be
add use_secure_url_for_generics option
gerbyzation Mar 19, 2018
da87c3d
Set google ACL if needed, add permanent redirect
gerbyzation Mar 19, 2018
3919ea5
add google storage option and use_secure_urls_for_generics to readme
gerbyzation Mar 21, 2018
850498f
update note #2
gerbyzation Mar 21, 2018
44 changes: 44 additions & 0 deletions .env
@@ -0,0 +1,44 @@
# Variables in this file will be substituted into docker-compose.yml
# Save a copy of this file as .env and insert your own values.
# Verify correct substitution with "docker-compose config"
# If variables are newly added or enabled, please delete and rebuild the images to pull in changes:
# docker-compose down
# docker rmi -f docker_ckan docker_db
# docker rmi $(docker images -f dangling=true -q)
# docker-compose build
# docker-compose up -d
# docker-compose restart ckan # give the db service time to initialize the db cluster on first run

# Image: ckan
CKAN_SITE_ID=default
#
# On AWS, your CKAN_SITE_URL is the output of:
# curl -s http://169.254.169.254/latest/meta-data/public-hostname
# CKAN_SITE_URL=http://ec2-xxx-xxx-xxx-xxx.ap-southeast-2.compute.amazonaws.com
# When running locally, CKAN_SITE_URL must contain the port
CKAN_SITE_URL=http://localhost:5000
#
# CKAN_PORT must be available on the host: sudo netstat -na
# To apply change: docker-compose down && docker rmi docker_ckan && docker-compose build ckan
CKAN_PORT=5000
#
# Email settings
CKAN_SMTP_SERVER=smtp.corporateict.domain:25
CKAN_SMTP_STARTTLS=True
CKAN_SMTP_USER=user
CKAN_SMTP_PASSWORD=pass
CKAN_SMTP_MAIL_FROM=ckan@localhost
#
# Image: db
POSTGRES_PASSWORD=ckan
#
# POSTGRES_PORT must be available on the host: sudo netstat -na | grep 5432
# To apply change: docker-compose down && docker rmi docker_db docker_ckan && docker-compose build
POSTGRES_PORT=5432
#
# The datastore database will be created in the db container as docs
# Readwrite user/pass will be ckan:POSTGRES_PASSWORD
# Readonly user/pass will be datastore_ro:DATASTORE_READONLY_PASSWORD
DATASTORE_READONLY_PASSWORD=datastore cloudstorage

CKAN_SITE_TITLE='Testing subject'
4 changes: 3 additions & 1 deletion .gitignore
@@ -4,7 +4,7 @@ syntax: glob
 *.swp
 *.swo
 .DS_Store
-ckan.egg-info/*
+*.egg-info/
 sandbox/*
 dist

@@ -18,3 +18,5 @@ fl_notes.txt
 *.ini
 .noseids
 *~
+.coverage
+cover/
28 changes: 26 additions & 2 deletions README.md
@@ -29,6 +29,20 @@ For most drivers, this is all you need:

ckanext.cloudstorage.driver_options = {"key": "<your public key>", "secret": "<your secret key>"}

## Google Storage

To use the Google Storage driver, the following driver options are required:

{"key": "<service account ID>", "secret": "<path to SA private key>", "project": "<google project ID>"}

**Note on secure URLs with Google Storage**
Google Storage has no folder-level permissions, so the whole bucket must be made private when
secure URLs are used. This affects generic file uploads as well. To keep
generic files public (`ckanext.cloudstorage.use_secure_urls_for_generics` is `False` by default),
the ACL of a newly uploaded object is set to `public-read` when only
`ckanext.cloudstorage.use_secure_urls` is enabled. If you later decide to make the generic files private,
you will have to update the ACL on the already uploaded objects manually.

# Support

Most libcloud-based providers should work out of the box, but only those listed
@@ -39,6 +53,7 @@ below have been tested:
| Azure | YES | YES | YES (if `azure-storage` is installed) |
| AWS S3 | YES | YES | YES (if `boto` is installed) |
| Rackspace | YES | YES | No |
| Google Storage | YES | YES | YES (if `google-cloud-storage` and `pycrypto` are installed) |

# What are "Secure URLs"?

@@ -50,8 +65,18 @@ the resource. This means that the normal CKAN-provided access restrictions can
apply to resources with no further effort on your part, but still get all the
benefits of your CDN/blob storage.

# applies to resources
ckanext.cloudstorage.use_secure_urls = 1

# applies to generic uploads, e.g. group images, logo
ckanext.cloudstorage.use_secure_urls_for_generics = 1

The access permissions on the storage container must be set to match
these settings (if using Google Storage, see the note on secure URLs with Google Storage).

Leaving `use_secure_urls_for_generics` off is recommended, so that assets
such as the logo can be cached.

This option also enables multipart uploads, but you need to create the database tables
first. Run the following command from the extension folder:
`paster cloudstorage initdb -c /etc/ckan/default/production.ini`
@@ -77,8 +102,7 @@ cloudstorage will take care of the rest. Ex:

 1. You should disable public listing on the cloud service provider you're
 using, if supported.
-2. Currently, only resources are supported. This means that things like group
-and organization images still use CKAN's local file storage.
+2. Currently, the migration tool only supports resources.

 # FAQ

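Reviewer note on the Google Storage ACL rule described in the README change: a minimal sketch of the decision it describes, not the code from this PR. The function name `needs_public_read_acl`, its parameters, and the driver string `GOOGLE_STORAGE` are assumptions made for illustration; the real logic presumably lives in `storage.py`, which is not shown here.

```python
def needs_public_read_acl(driver, use_secure_urls, use_secure_urls_for_generics):
    """Decide whether a newly uploaded generic file needs an explicit public ACL.

    Hypothetical helper. Per the README note, the bucket is private when
    secure URLs are on, so generic files stay public only if their ACL is
    set to public-read at upload time.
    """
    # Assumption: the Google driver is identified by this string.
    is_google = driver == "GOOGLE_STORAGE"
    return is_google and use_secure_urls and not use_secure_urls_for_generics


if __name__ == "__main__":
    # Secure URLs for resources only: generic uploads need public-read.
    print(needs_public_read_acl("GOOGLE_STORAGE", True, False))
    # Secure URLs for generics too: nothing should be made public.
    print(needs_public_read_acl("GOOGLE_STORAGE", True, True))
```

Objects uploaded before `use_secure_urls_for_generics` is switched on are not covered by a check like this, which is why the README asks for a manual ACL update in that case.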
32 changes: 31 additions & 1 deletion ckanext/cloudstorage/controller.py
@@ -1,14 +1,18 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import os.path
import logging

from pylons import c
from pylons.i18n import _

from webob.exc import status_map
from ckan import logic, model
from ckan.lib import base, uploader
from ckan.common import is_flask_request
import ckan.lib.helpers as h

log = logging.getLogger(__name__)


class StorageController(base.BaseController):
    def resource_download(self, id, resource_id, filename=None):
@@ -52,3 +56,29 @@ def resource_download(self, id, resource_id, filename=None):
            base.abort(404, _('No download is available'))

        h.redirect_to(uploaded_url)

    def uploaded_file_redirect(self, upload_to, filename):
        '''Redirect static file requests to their location on cloudstorage.'''
        upload = uploader.get_uploader('notused')
        file_path = upload.path_from_filename(filename)
        uploaded_url = upload.get_url_from_path(file_path)

        if upload.use_secure_urls:
            h.redirect_to(uploaded_url)
        else:
            if is_flask_request():
                raise NotImplementedError(
                    "Permanent redirect for flask "
                    "requests is not implemented yet")
            else:
                # We are manually performing a redirect for Pylons
                # as this is the only way to set the caching headers
                # that make a 301 Moved Permanently response cacheable
                # (see https://github.com/Pylons/pylons/blob/master/pylons/controllers/util.py#L218-L229)
                exc = status_map[301]
                raise exc(
                    location=uploaded_url.encode('utf-8'),
                    headers={
                        "Cache-Control": "public, max-age=3600",
                        "Pragma": "none"
                    }
                )
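Reviewer note on the permanent redirect above: a framework-free sketch of the response that branch builds, assuming a WSGI-style `(status, headers)` pair. The helper name `permanent_redirect`, the `max_age` parameter, and the example URL are hypothetical; the PR itself raises the `webob` exception taken from `status_map[301]`.

```python
def permanent_redirect(location, max_age=3600):
    """Build the status line and headers for a cacheable 301 redirect.

    Hypothetical helper for illustration only; it mirrors the
    Cache-Control value used in the controller.
    """
    headers = [
        ("Location", location),
        # "public" lets shared caches store the redirect for max_age seconds.
        ("Cache-Control", "public, max-age={}".format(max_age)),
    ]
    return "301 Moved Permanently", headers


if __name__ == "__main__":
    # The URL is a placeholder, not a real bucket address.
    status, headers = permanent_redirect("https://example.com/storage/logo.png")
    print(status)
    for name, value in headers:
        print("{}: {}".format(name, value))
```

With secure URLs enabled the controller uses the plain `h.redirect_to` instead, presumably because a signed URL expires and a cached permanent redirect to it would go stale.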
11 changes: 8 additions & 3 deletions ckanext/cloudstorage/plugin.py
@@ -53,9 +53,8 @@ def get_resource_uploader(self, data_dict):
         return storage.ResourceCloudStorage(data_dict)

     def get_uploader(self, upload_to, old_filename=None):
-        # We don't provide misc-file storage (group images for example)
-        # Returning None here will use the default Uploader.
-        return None
+        # Custom uploader for generic file uploads
+        return storage.FileCloudStorage(upload_to, old_filename)

     def before_map(self, map):
         sm = SubMapper(
@@ -77,6 +76,12 @@ def before_map(self, map):
action='resource_download'
)

sm.connect(
'uploaded_file',
'/uploads/{upload_to}/{filename}',
action='uploaded_file_redirect'
)

return map

# IActions
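Reviewer note on the new `/uploads/{upload_to}/{filename}` route: a small standard-library sketch of how such a path template can be matched, assuming each `{name}` placeholder matches exactly one path segment. This does not use the Routes library, and `route_to_regex` is a hypothetical name; it only illustrates which request paths would reach `uploaded_file_redirect` under that assumption.

```python
import re


def route_to_regex(route):
    """Turn a '{name}' path template into a compiled regex.

    Hypothetical helper: literal parts of the template are used as-is,
    so it is only suitable for simple templates like the one in this PR.
    """
    # Each {name} becomes a named group matching one non-empty path segment.
    pattern = re.sub(r"\{(\w+)\}", r"(?P<\1>[^/]+)", route)
    return re.compile("^" + pattern + "$")


if __name__ == "__main__":
    matcher = route_to_regex("/uploads/{upload_to}/{filename}")
    match = matcher.match("/uploads/group/logo.png")
    print(match.groupdict())
    # A filename containing a slash does not fit the two-segment template.
    print(matcher.match("/uploads/group/nested/logo.png"))
```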