gaetk is a small collection of tools that tries to make programming Google AppEngine faster, more comfortable and less error prone.
It comes bundled with gae-session, webapp2 and bootstrap. Check the documentation of these projects for further information.
gaetk tries to stay compatible with "plain appengine" so you don't have to code specifically for gaetk. The idea is to encode patterns and best practices and allow you to use them where appropriate without making all your code gaetk-specific. It is slightly slanted towards company-internal applications.
Features:
- Sessions via gae-session, easy wrapping of GET, POST et al. via webapp2.
- efficient pagination with cursors
- Rendering via Jinja2
- automatic multi-format views (HTML, JSON & XML out of the box)
- hybrid Authentication via Google Apps and/or Username & Password
- Form-Based and HTTP Authentication
- transparent handling of HEAD requests
- Sequence generation
- Internal Memcache and Datastore Statistics (e.g. for graphing with munin)
- intelligent handler for robots.txt and auditing of which version is deployed
- efficient caching of entities (TO BE MERGED)
- Django-Like Admin interface (TO BE MERGED)
- un-deletion of deleted entities (TO BE MERGED)
- Message Framework (TO BE MERGED)
- groups / access controls (TO BE MERGED)
- mini CMS like Django flatpages (TO BE MERGED)
- in place editing (TO BE MERGED)
- Generic Configuration objects
- simple testing for REST APIs
- Handling of long running tasks with minimal effort via gaetk_longtask
- Profiling with gae_mini_profiler
- Nice error reports with my_ereporter
- Synchronisation with MySQL via approcket_industrial
Create an App Engine project, then:
mkdir lib
echo "import site, os.path" > lib/__init__.py
echo "site.addsitedir(os.path.dirname(__file__))" >> lib/__init__.py
git submodule add git@github.com:mdornseif/appengine-toolkit.git lib/appengine-toolkit
cp lib/appengine-toolkit/examples/Makefile .
cp lib/appengine-toolkit/examples/config.py .
cp lib/appengine-toolkit/examples/appengine_config.py .
sed -i -e "s/%%PUT_RANDOM_VALUE_HERE%%/`(date;shasum -a 384 /etc/* 2>/dev/null)|md5`-5a17/" appengine_config.py
# You might also want to install https://github.com/mdornseif/huTools for some additional functionality.
git submodule add git@github.com:hudora/huTools.git lib/huTools
git submodule foreach 'echo ./`basename ${path}` >> ../submodules.pth'
config.py is assumed to be imported everywhere to set up paths. It should contain as little code as possible. lib/ contains 3rd party modules.
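The path setup above relies on Python's standard `site.addsitedir()`, which adds a directory to `sys.path` and also processes any `.pth` files found in it. The following is a minimal sketch of that mechanism using a throwaway directory; the directory layout is made up for illustration and is not gaetk code.

```python
import os
import site
import sys
import tempfile

# Build a throwaway stand-in for the project's lib/ directory.
lib_dir = os.path.join(tempfile.mkdtemp(), "lib")
os.makedirs(os.path.join(lib_dir, "huTools"))

# Same shape as the lines appended by `git submodule foreach` above.
with open(os.path.join(lib_dir, "submodules.pth"), "w") as pth:
    pth.write("./huTools\n")

# This is the call that lib/__init__.py makes when it is imported.
site.addsitedir(lib_dir)

# Every existing directory named in a .pth file should now be importable.
expected = os.path.normcase(os.path.abspath(os.path.join(lib_dir, "huTools")))
on_path = expected in [os.path.normcase(os.path.abspath(p)) for p in sys.path]
print(on_path)
```

Entries in a `.pth` file that point to directories which do not exist are silently skipped, so a missing submodule checkout shows up as an ImportError later rather than at path setup time.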
BasicHandler is the suggested base class for your WSGI handlers. It tries to encapsulate some best practices and can work with your models to provide automatic generation of HTML/XML/JSON/PDF from the same code, efficient pagination, usage of the Jinja2 template engine, messages and authentication.
BasicHandler handles HTTP HEAD requests by doing an internal GET request.
All HTTP methods (GET, POST, etc.) have access to a session dictionary at self.session. See gae-session for further documentation.
All models are expected to implement something like this:
def get_url(self, abs_url=lambda x: x):
    return abs_url("/artnr/%s/" % self.id)
abs_url should be provided by the caller (view) to turn the relative URL into an absolute one. gaetk assumes an entity knows where its canonical representation is located on a local URL path and does not support "routes" or any other way of abstracting URL knowledge out of the models.
BasicHandler provides an abs_url() method to change a relative into an absolute URL. The usual calling convention in a handler is like this:
def get(self, keyid):
    entity = Entity.get(keyid)
    url = entity.get_url(abs_url=self.abs_url)
    ...
Also entities are expected to have an as_dict(abs_url=lambda x: x) method which should return a dictionary representation of an object.
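To illustrate the model contract described above, here is a small sketch using a plain Python class instead of a datastore model. `Article` and `make_abs_url` are hypothetical names invented for this example; only `get_url()` and `as_dict()` follow the conventions from the text.

```python
class Article(object):
    """Hypothetical plain-Python stand-in for a datastore model."""

    def __init__(self, artnr, name):
        self.id = artnr
        self.name = name

    def get_url(self, abs_url=lambda x: x):
        # The model only knows its local path; the caller decides about the host.
        return abs_url("/artnr/%s/" % self.id)

    def as_dict(self, abs_url=lambda x: x):
        return dict(id=self.id, name=self.name, url=self.get_url(abs_url=abs_url))


def make_abs_url(host):
    """Return a callable playing the role of a handler's abs_url() (assumption)."""
    return lambda path: "http://%s%s" % (host, path)


article = Article("14600", "Trailer")
relative = article.get_url()
absolute = article.as_dict(abs_url=make_abs_url("example.com"))["url"]
print(relative)
print(absolute)
```

Because the default `abs_url` is the identity function, models can be used outside of a request (e.g. in batch jobs) and simply yield relative URLs there.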
BasicHandler currently provides an error() method to send error codes to the client. Users are encouraged to raise HTTP status codes as exceptions instead. This results in much easier program flow and clearly documents a function ending with e.g. a redirect.
Typical usage is like this:
raise gaetk.handler.HTTP301_Moved(location='http://example.com/')
error() might be removed in future releases.
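The benefit of raising status codes can be shown with a framework-free sketch. The exception classes and the `dispatch()` function below are hypothetical stand-ins written for this example; they are not the classes gaetk provides.

```python
class StatusException(Exception):
    """Hypothetical base class carrying an HTTP status code and headers."""
    code = 500

    def __init__(self, **headers):
        Exception.__init__(self)
        self.headers = headers


class Moved(StatusException):
    code = 301


def view(path):
    """A view that ends early by raising instead of calling an error method."""
    if path == "/old/":
        raise Moved(location="http://example.com/new/")
    # No else branch or explicit return needed after the redirect.
    return "content of %s" % path


def dispatch(path):
    """Turn raised status exceptions into (code, headers, body) tuples."""
    try:
        return 200, {}, view(path)
    except StatusException as exc:
        return exc.code, exc.headers, ""


print(dispatch("/old/"))
print(dispatch("/current/"))
```

The raise works from any depth of the call stack, so helper functions can trigger a redirect or an error without the view having to check their return values.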
paginate() implements pagination of a query using offset and cursors. See the paginate() docstring for further information.
query = Rechnungsliste.all().filter('kundennr = ', kundennr).order('-monat')
values = self.paginate(query, 15)
# values is now something like:
# {more_objects=True, prev_objects=True, prev_start=10, next_start=30,
# objects: [...], cursor='ABCDQWERY'}
self.render(values, 'page.html')
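The offset part of such a pagination can be sketched on a plain list. This is a simplified illustration, not the gaetk implementation: `paginate_list` is a hypothetical function, cursors are left out, and only the key names are taken from the comment above.

```python
def paginate_list(items, limit, start=0):
    """Offset-based pagination over a list (cursors are omitted in this sketch)."""
    # Fetch one element more than needed to find out whether another page exists.
    window = items[start:start + limit + 1]
    return dict(
        objects=window[:limit],
        more_objects=len(window) > limit,
        prev_objects=start > 0,
        prev_start=max(start - limit, 0),
        next_start=start + limit,
    )


page = paginate_list(list(range(40)), 15, start=15)
last_page = paginate_list(list(range(40)), 15, start=30)
print(page["objects"][0], page["more_objects"], last_page["more_objects"])
```

Fetching `limit + 1` rows avoids a separate count query, which matters on the Datastore where counting is expensive.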
A template implementing pagination looks like this:
<div class="pagination">
<ul>
<li class="prev {% if not prev_objects %}disabled{% endif %}"><a href="?{{ prev_qs }}">← Zurück</a></li>
<li class="next {% if not more_objects %}disabled{% endif %}"><a href="?{{ next_qs }}">Vor →</a></li>
</ul>
</div>
gaetk uses Jinja2 for rendering templates. Jinja2 is imported on demand so you can use gaetk without having Jinja2 available. create_jinja2env() sets up the Jinja2 Environment; you might have to override it in a subclass if you use Jinja2 extensions.
default_template_vars() sets up values to be available to every template. You might want to override it to get an effect like Django's context processors.
rendered() returns a rendered template. render() does the same but also sends the template to the client.
It provides every template with the uri and credential, the values from default_template_vars() and the values passed to the method.
multirender() is a somewhat involved function to render some values in different ways. The use case is that you want to present the same data in a variety of formats without code duplication or large if/then/else cascades. We use it to provide the same data as HTML page, XML, JSON, PDF, CSV and EDIFACT/EANCOM.
The basic calling convention is:
def get(self, customerid, format='JSON'):
    query = Invoice.all().filter('customerid = ', customerid).order('-date')
    values = self.paginate(query, 15)
    self.multirender(format, values, html_template='page.html')
For HTML you can provide a template name in html_template; all other formats are directly created from the data via huTools.structured or user-supplied formatter functions.
Often the HTML needs additional data (e.g. the currently logged in user) which can be provided via html_addon.
multirender() can generate HTML, XML and JSON out of the box. If you want to provide other formats you must supply a dict of "mappers" which are given the input data and reformat it for output, e.g.:
from functools import partial
multirender(fmt, values,
            mappers=dict(xml=partial(dict2xml, roottag='response',
                                     listnames={'rechnungen': 'rechnung', 'odlines': 'odline'},
                                     pretty=True),
                         html=lambda x: '<body><head><title>%s</title></head></body>' % x))
This renders HTML without Jinja2 and uses huTools.structured.dict2xml to force a certain XML structure.
See the multirender() docstring for further information.
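The mapper idea boils down to a dispatch table keyed by format name. The sketch below shows that pattern in isolation; `multirender_sketch`, `CONTENT_TYPES` and `to_csv` are hypothetical names, and the real multirender() has more parameters and different defaults.

```python
import json

# Content types used by this sketch only.
CONTENT_TYPES = {"json": "application/json", "csv": "text/csv"}


def multirender_sketch(fmt, data, mappers=None):
    """Return (content_type, body) for data rendered in the requested format."""
    # Built-in renderers; user-supplied mappers add to or replace them.
    renderers = {"json": lambda d: json.dumps(d, sort_keys=True)}
    renderers.update(mappers or {})
    fmt = fmt.lower()
    if fmt not in renderers:
        raise ValueError("unsupported format: %s" % fmt)
    return CONTENT_TYPES.get(fmt, "text/plain"), renderers[fmt](data)


def to_csv(data):
    """Example mapper: one 'key,value' line per entry."""
    return "\n".join("%s,%s" % (key, data[key]) for key in sorted(data))


values = {"customerid": "4711", "total": 3}
as_json = multirender_sketch("JSON", values)
as_csv = multirender_sketch("csv", values, mappers={"csv": to_csv})
print(as_json)
print(as_csv)
```

Adding a new output format then means writing one mapper function; the handler code producing the data stays untouched.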
By default templates have access to four additional Jinja2 filters:
- ljustify(width) and rjustify(width) prefix or postfix the given string with spaces until it is width characters long.
- nicenum formats numbers with non-breaking spaces between thousands, e.g. <span class="currency">1 000</span>.
- eurocent is like nicenum but divides by 100 and uses only two decimal places, e.g. <span class="currency">9 999.99</span>.
See the jinja_filters.py docstrings for further documentation.
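The number formatting described for nicenum and eurocent can be approximated in a few lines of plain Python. This is a sketch of the described behaviour only; the real filters live in jinja_filters.py and also wrap the result in a span element, which is left out here.

```python
NBSP = u"\xa0"  # non-breaking space


def nicenum(value):
    """Format an integer with non-breaking spaces between thousands."""
    return u"{0:,}".format(int(value)).replace(u",", NBSP)


def eurocent(value):
    """Like nicenum, but treat the value as cents and show two decimal places."""
    return u"{0:,.2f}".format(int(value) / 100.0).replace(u",", NBSP)


print(nicenum(1000))
print(eurocent(999999))
```

A function like this could be registered on a Jinja2 environment through its `filters` dictionary, for example `env.filters['nicenum'] = nicenum`.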
gaetk has some authentication functionality which is described in a chapter below. BasicHandler supports this via the is_admin and the login_required methods.
authchecker is called on every request and can be overridden by subclasses. A simple way of always forcing authentication is like this:
class AuthenticatedHandler(gaetk.handler.BasicHandler):
    def authchecker(self, method, *args, **kwargs):
        self.login_required()
TBD: ALLOWED_PERMISSIONS
The Message Framework is not documented for now.
add_message(self, typ, html, ttl=15):
gaetk.handler.JsonResponseHandler helps to generate nice JSON and JSONP responses. A valid view will look like this:
class VersandauslastungHandler(gaetk.handler.JsonResponseHandler):
    def get(self):
        entity = BiStatistikZeile.get_by_key_name('versandauslastung_aktuell')
        ret = dict(werte=hujson.loads(entity.jsonValue),
                   ...)
        return (ret, 200, 60)
This will generate a JSON reply with 60 seconds of caching and a 200 status code. The reply will support JSONP via an optional callback parameter in the URL.
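What happens to the returned `(data, status, cachetime)` tuple can be sketched roughly as follows. `json_response` is a hypothetical function; the header names and content types are assumptions about a sensible implementation, not a description of what JsonResponseHandler actually sends.

```python
import json


def json_response(result, callback=None):
    """Turn a (data, status, cachetime) tuple into (status, headers, body)."""
    data, status, cachetime = result
    body = json.dumps(data, sort_keys=True)
    headers = {"Cache-Control": "max-age=%d" % cachetime,
               "Content-Type": "application/json"}
    if callback:
        # JSONP: wrap the JSON document in a call to the client-supplied function.
        body = "%s(%s)" % (callback, body)
        headers["Content-Type"] = "text/javascript"
    return status, headers, body


plain = json_response(({"werte": [1, 2]}, 200, 60))
jsonp = json_response(({"werte": [1, 2]}, 200, 60), callback="handle")
print(plain[2])
print(jsonp[2])
```

A production implementation should additionally validate the callback name, since it is taken from the URL and echoed into executable JavaScript.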
Generation of sequential numbers ('autoincrement') on Google App Engine is hard. See Stack Overflow for some discussion of the issues. gaetk implements sequence number generation based on transactions. This yields a performance of only half a dozen or so requests per second, but at least allows you to allocate more than one number in a single request.
>>> from gaetk.sequences import *
>>> init_sequence('invoice_number', start=1, end=0xffffffff)
>>> get_numbers('invoice_number', 2)
[1, 2]
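The idea of allocating a whole block of numbers inside one critical section can be sketched in memory. In this illustration a `threading.Lock` stands in for the datastore transaction, `SequenceSketch` is a hypothetical class, and treating `end` as inclusive is an assumption.

```python
import threading


class SequenceSketch(object):
    """In-memory illustration of block-wise sequence allocation."""

    def __init__(self):
        self._lock = threading.Lock()  # stands in for a datastore transaction
        self._next = {}
        self._end = {}

    def init_sequence(self, name, start=1, end=0xffffffff):
        with self._lock:
            # Initialising an existing sequence again leaves it untouched.
            self._next.setdefault(name, start)
            self._end.setdefault(name, end)

    def get_numbers(self, name, needed):
        with self._lock:
            first = self._next[name]
            if first + needed - 1 > self._end[name]:
                raise RuntimeError("sequence %r exhausted" % name)
            # Reserve the whole block in one step.
            self._next[name] = first + needed
        return list(range(first, first + needed))


seq = SequenceSketch()
seq.init_sequence('invoice_number', start=1, end=0xffffffff)
first_block = seq.get_numbers('invoice_number', 2)
second_block = seq.get_numbers('invoice_number', 3)
print(first_block, second_block)
```

Reserving several numbers per transaction is what keeps the low transaction rate from becoming the bottleneck: one write can serve a whole batch of documents.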
To use it you need an index.yaml like this:
indexes:
- kind: gaetkSequence
  ancestor: yes
  properties:
  - name: type
  - name: start
- kind: gaetkSequence
  ancestor: yes
  properties:
  - name: type
  - name: end
- kind: gaetkSequence
  properties:
  - name: active
  - name: type
  - name: start
Add the following lines to your app.yaml:
- url: /gaetk/static
  static_dir: lib/gaetk/static
- url: /gaetk/.*
  script: lib/gaetk/gaetk/defaulthandlers.py
- url: /robots.txt
  script: lib/gaetk/gaetk/defaulthandlers.py
- url: /version.txt
  script: lib/gaetk/gaetk/defaulthandlers.py
This will make bootstrap available at /gaetk/static/bootstrap/1.4.x/
It will also allow you to get JSON-encoded statistics at /gaetk/stats.json:
curl http://localhost:8080/gaetk/stats.json
{"datastore": {"count": 149608,
"kinds": 16,
"bytes": 95853319},
"memcache": {"hits": 1665726,
"items": 1171,
"bytes": 4588130,
"oldest_item_age": 2916,
"misses": 50674,
"byte_hits": 833839440}}
You might want to use Munin to graph these values. In addition the statistics are stored in the Datastore as gaetk_Stats for later inspection.
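A Munin plugin reports values as lines of the form `fieldname.value N`. The following sketch converts the statistics document shown above into such lines; the field names are made up for this example, and fetching the JSON over HTTP is left out.

```python
import json

# The sample reply from /gaetk/stats.json shown above.
SAMPLE = '''{"datastore": {"count": 149608, "kinds": 16, "bytes": 95853319},
 "memcache": {"hits": 1665726, "items": 1171, "bytes": 4588130,
              "oldest_item_age": 2916, "misses": 50674, "byte_hits": 833839440}}'''


def munin_values(stats_json):
    """Convert the statistics document into Munin 'field.value' lines."""
    stats = json.loads(stats_json)
    memcache = stats["memcache"]
    lookups = memcache["hits"] + memcache["misses"]
    # Guard against division by zero on a freshly flushed cache.
    hit_ratio = float(memcache["hits"]) / lookups if lookups else 0.0
    return [
        "entities.value %d" % stats["datastore"]["count"],
        "memcache_items.value %d" % memcache["items"],
        "memcache_hitratio.value %.4f" % hit_ratio,
    ]


lines = munin_values(SAMPLE)
print("\n".join(lines))
```

The hit ratio is derived rather than reported directly, which is a typical reason to put a small script between the JSON endpoint and the graphing tool.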
RobotTxtHandler allows serving a robots.txt file that disables crawler access to all app versions except the default version.
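The two possible robots.txt bodies for such a handler could look like the output of this sketch. `robots_txt` is a hypothetical function; how the real handler detects the default version is not shown here.

```python
def robots_txt(is_default_version):
    """Return a robots.txt body that only lets crawlers into the default version."""
    if is_default_version:
        # An empty Disallow line permits everything.
        return "User-agent: *\nDisallow:\n"
    # Non-default versions (e.g. test deployments) are closed to all crawlers.
    return "User-agent: *\nDisallow: /\n"


print(robots_txt(True))
print(robots_txt(False))
```

This keeps test and staging versions, which are reachable under their own hostnames, out of search engine indexes.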
VersionHandler allows clients to read the git revision. When deploying we do something like git show-ref --hash=7 HEAD > version.txt just before appcfg.py update and VersionHandler lets you retrieve that information.
In models.py you find a superclass LoggedModel for implementing audit logs for a model: All create, update and delete operations are logged via the AuditLog model.
Example:
import gaetk.models
from google.appengine.ext import db
class MyModel(gaetk.models.LoggedModel):
    name = db.StringProperty()
obj = MyModel(name=u'Alex')
print obj.logentries()
Please note: Changes in (subclasses of) UnindexedProperty (hence TextProperty and BlobProperty) are not logged.
Tools contains general helpers. It is independent of the rest of gaetk.
tools.split(s) splits a string at space characters while respecting quoting.
>>> split('''A "B and C" D 'E or F' G " "''')
['A',
'B and C',
'D',
'E or F',
'G',
'']
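Similar behaviour can be obtained from the standard library. The sketch below uses `shlex.split()` plus whitespace stripping to reproduce the result shown above; it is an approximation for illustration and not how tools.split() is implemented.

```python
import shlex


def split_sketch(s):
    """Split at whitespace while keeping quoted sections together."""
    # shlex keeps a quoted blank as ' '; stripping turns it into ''.
    return [part.strip() for part in shlex.split(s)]


parts = split_sketch('''A "B and C" D 'E or F' G " "''')
print(parts)
```

Note that `shlex.split()` raises a ValueError on unbalanced quotes, so user input should be handled accordingly.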
Infrastructure contains helpers for accessing the GAE infrastructure. It is independent of the rest of gaetk.
taskqueue_add_multi batch adds jobs to a Taskqueue:
tasks = []
for kdnnr in kunden.get_changed():
    tasks.append(dict(kundennr=kdnnr))
taskqueue_add_multi('softmq', '/some/path', tasks)
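Batch adding usually means splitting the task list into chunks that fit into a single API call. The sketch below shows only the chunking step; the batch size of 100 is an assumption based on the per-call limit of the task queue API, and `chunks` is a hypothetical helper rather than part of gaetk.

```python
def chunks(tasks, size=100):
    """Yield successive slices of at most `size` tasks."""
    for offset in range(0, len(tasks), size):
        yield tasks[offset:offset + size]


tasks = [dict(kundennr=str(10000 + i)) for i in range(250)]
batch_sizes = [len(batch) for batch in chunks(tasks)]
print(batch_sizes)
```

Each chunk can then be handed to the queue in one call, which saves a round trip per task compared to adding them one by one.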
The appengine toolkit provides a method for generic configuration objects.
These configuration objects are stored in Datastore and can hold any string value.
There are two convenience functions for getting and setting values, get_config and set_config.
>>> from gaetk import configuration
>>> configuration.get_config('MY-KEY-NAME')
None
>>> configuration.set_config('MY-KEY-NAME', u'5711')
>>> configuration.get_config('MY-KEY-NAME')
u'5711'
Configuration values are cached indefinitely. The module provides an HTTP handler as a means to flush all cached values.
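Why a flush mechanism is needed becomes clear from a small in-memory model of such a cache. `ConfigSketch` is a hypothetical class in which a dictionary stands in for the Datastore; it illustrates the caching behaviour described above and is not the gaetk implementation.

```python
class ConfigSketch(object):
    """Key/value configuration with a cache that never expires on its own."""

    def __init__(self):
        self.store = {}       # stands in for the Datastore
        self.cache = {}
        self.store_reads = 0  # counts how often the slow store is consulted

    def get_config(self, key, default=None):
        if key not in self.cache:
            self.store_reads += 1
            self.cache[key] = self.store.get(key, default)
        return self.cache[key]

    def set_config(self, key, value):
        self.store[key] = value
        self.cache[key] = value

    def flush(self):
        """What a flush handler would trigger: drop all cached values."""
        self.cache.clear()


config = ConfigSketch()
config.set_config('MY-KEY-NAME', u'5711')
config.store['MY-KEY-NAME'] = u'5712'  # changed behind the cache's back
stale = config.get_config('MY-KEY-NAME')
config.flush()
fresh = config.get_config('MY-KEY-NAME')
print(stale, fresh)
```

Values changed outside of set_config, for example through the Datastore viewer, stay invisible until the cache is flushed.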
gaetk comes with a simple framework to do simple-minded acceptance tests against an installed version of your app. The testing harness is installed in resttest_dsl.
A possible test script would look like this:
import sys
from resttest_dsl import create_testclient_from_cli, get_app_version

def main():
    """Main Entry Point"""
    # init with app id and some credentials
    client = create_testclient_from_cli(default_hostname='%s.<myappid>.appspot.com' % get_app_version(),
                                        default_credentials_user='someregular:6BTUKW5TQ',
                                        default_credentials_admin='someadmin:EEGE2OGZ7')
    client.GET('/').responds_http_status(200)
    client.GET('/').responds_content_type('text/html')
    # the two lines above can be written as
    client.GET('/').responds_http_status(200).responds_content_type('text/html')
    # or even shorter
    client.GET('/').responds_html()
    # checks for valid XML or JSON
    client.GET('/feed.atom').responds_xml()
    client.GET('/api.json').responds_json()
    # checks that there is a response within 1000 ms
    client.GET('/').responds_fast()
    # or 500 ms
    client.GET('/').responds_fast(500)
    # combines responds_html and responds_fast
    client.GET('/').responds_normal()
    # checks if there is a redirect
    client.GET('/artnr/1400/03').redirects_to('/artnr/14600/03.trailer/')
    # Internal pages only available to logged in users
    client.GET('/backend/').responds_access_denied()
    client.GET('/backend/', auth='user').responds_html()
    client.GET('/backend/', auth='admin').responds_html()
    # Admin pages available only to admin users
    client.GET('/admin/').responds_access_denied()
    client.GET('/admin/', auth='user').responds_access_denied()
    client.GET('/admin/', auth='admin').responds_html()
    sys.exit(client.errors)

if __name__ == "__main__":
    main()
It also comes with performance and broken-link monitoring support. To use it, add the following before sys.exit():
# `responds_normal()` checks for HTML, speed and broken links
client.GET('/').responds_normal()
client.GET('/wir/outlet.html').responds_normal()
# slowstats and brokenlinks contain statistics gathered during the run
from resttest_dsl import slowstats, brokenlinks
print "## slowest Pages "
for url, speed in slowstats.most_common(10):
    print "{0} ms {1}".format(speed, url)
if brokenlinks:
    print "## broken links"
    print brokenlinks.values()
    for link in brokenlinks:
        for source in brokenlinks[link]:
            print (u"{0} via {1}".format(link, source)).encode('utf-8')
If tidylib is installed, it will also output HTML errors.
Axel Schlüter for suggestions on abstracting login and JsonResponseHandler.
Contains gaesession.py by David Underhill - http://github.com/dound/gae-sessions Updated 2010-10-02 (v1.05), Licensed under the Apache License Version 2.0.
Contains code from webapp2, Copyright 2010 Rodrigo Moraes. Licensed under the Apache License, Version 2.0
Contains code from bootstrap, Copyright 2011 Twitter, Inc. Licensed under the Apache License, Version 2.0
Contains code from gae_mini_profiler, Copyright 2011 Ben Kamens. Licensed under the MIT license
Contains code from itsdangerous, Copyright (c) 2011 by Armin Ronacher and the Django Software Foundation. Licensed under a Berkeley-style License
gaetk code is Copyright 2010, 2011 Dr. Maximillian Dornseif & Hudora GmbH and dual licensed under GPLv3 and the Apache License Version 2.0.