Author: Colin Howe (@colinhowe)
License: Apache 2.0
Django Sampler allows you to sample a percentage of your queries (SQL, Mongo, etc.) and view the ones that are taking up the most time. The queries are grouped by where they originate in your code.
Install:
    pip install git+git://github.com/colinhowe/djangosampler.git#djangosampler
or download the source and run
    python setup.py install
Configure:
- Add djangosampler to your INSTALLED_APPS
- Add the tables (manage.py syncdb, or manage.py migrate if you use South)
- Add the views:
      urlpatterns += patterns('',
          (r'^sampler/', include('djangosampler.urls')),
      )
- Set DJANGO_SAMPLER_FREQ to a value between 0 and 1
- Set DJANGO_SAMPLER_PLUGINS to a list of plugins. For just sampling SQL a sensible default is:
      DJANGO_SAMPLER_PLUGINS = (
          'djangosampler.plugins.sql.Sql',
          # Plugins are applied in the same order as this list
      )
There are several plugins available and it is worthwhile reading through them to get the most use out of this tool.
If you are using cost-based sampling, set DJANGO_SAMPLER_BASE_TIME to the expected duration of a normal query, in seconds. The default is 0.005 (5 ms).
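Taken together, the settings above might look like the following fragment of settings.py. This is only a sketch: the numeric values and the placeholder entry in INSTALLED_APPS are illustrative assumptions, and treating DJANGO_SAMPLER_USE_COST as a boolean flag is also an assumption.

```python
# settings.py fragment (illustrative values only)

INSTALLED_APPS = (
    "django.contrib.contenttypes",  # placeholder for your existing apps
    "djangosampler",
)

# Sample roughly 10% of queries (assumed value; must be between 0 and 1).
DJANGO_SAMPLER_FREQ = 0.1

# Plugins are applied in the same order as this list.
DJANGO_SAMPLER_PLUGINS = (
    "djangosampler.plugins.sql.Sql",
)

# Assumption: a boolean flag switches on cost-based sampling.
DJANGO_SAMPLER_USE_COST = True

# Expected duration of a normal query, in seconds (5 ms).
DJANGO_SAMPLER_BASE_TIME = 0.005
```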
To run the tests:
- Ensure you have tox installed (pip install tox)
- Run tox
After letting the sampler run for a while you will be able to view queries (grouped by their origin) at the URL you configured.
Django Sampler has a plugin architecture that lets you control how much data is collected.
In your settings.py add the following:
    DJANGO_SAMPLER_PLUGINS = (
        'djangosampler.plugins.sql.Sql',
        # Plugins are applied in the same order as this list
    )
The example above will add the SQL plugin.
Available plugins and their settings are described in the Plugins section below.
DJANGO_SAMPLER_FREQ configures the fraction of queries that will be recorded. It should be between 0.0 and 1.0. If this is not set then no plugins will be installed and your code will run as normal.
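To make the meaning of the frequency concrete, here is a minimal sketch of frequency-based sampling in general. It is not Django Sampler's own code: the function name should_sample and the seeded generator are hypothetical, chosen only for illustration.

```python
import random


def should_sample(freq, rng=random):
    """Return True for roughly a fraction `freq` of calls."""
    if not 0.0 <= freq <= 1.0:
        raise ValueError("freq must be between 0.0 and 1.0")
    # random() is uniform on [0.0, 1.0), so this is True with probability freq.
    return rng.random() < freq


rng = random.Random(42)  # seeded so the run is repeatable
sampled = sum(should_sample(0.25, rng) for _ in range(10_000))
print(f"sampled {sampled} of 10000 calls")
```

With a frequency of 0.25, about a quarter of the calls are sampled; 0.0 samples nothing and 1.0 samples everything.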
DJANGO_SAMPLER_USE_COST enables cost-based sampling. This causes queries that run for a long time to be sampled more often than short queries.
The chance that a query is sampled is multiplied by the total time the query takes. If a query takes 2 seconds then it will be twice as likely to be sampled as a query that takes 1 second.
The cost for a query is adjusted to account for this as follows:
cost = max(1.0, time * DJANGO_SAMPLER_FREQ) / DJANGO_SAMPLER_FREQ
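The formula above can be tried out with a few numbers. The sketch below transcribes it directly (with time renamed to duration) and compares it with duration divided by a sampling probability of min(1.0, duration * freq). That probability is an assumption made here to illustrate "the chance is multiplied by the time"; it is not taken from the library, and how DJANGO_SAMPLER_BASE_TIME enters the calculation is not described, so it is left out.

```python
def query_cost(duration, freq):
    """Cost of one sampled query, transcribed from the formula above."""
    return max(1.0, duration * freq) / freq


def assumed_probability(duration, freq):
    """ASSUMPTION: chance that a query is sampled, capped at 1.0."""
    return min(1.0, duration * freq)


freq = 0.5  # illustrative value for DJANGO_SAMPLER_FREQ
for duration in (0.5, 1.0, 4.0):
    cost = query_cost(duration, freq)
    weighted = duration / assumed_probability(duration, freq)
    print(duration, cost, weighted)
```

While duration * freq is below 1 the cost stays at 1 / freq, and above that it equals the duration itself. Under the assumed probability, both cases work out to the duration divided by the chance of being sampled, so rarely sampled queries count for more when totals are added up.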
A list of available plugins follows. You can also write your own plugin; see the section 'Writing Your Own Plugins'.
Plugin class: djangosampler.plugins.sql.Sql
The SQL sampler plugin will sample a percentage of SQL queries that occur in your application. The samples will be grouped by query and stack traces will be recorded to find where the queries are originating.
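Grouping samples by query and by origin can be pictured with the sketch below. It is not the plugin's implementation: record_sample, the samples dictionary and the two example functions are hypothetical, and the caller is found with the standard library's traceback.extract_stack.

```python
import traceback
from collections import defaultdict

# Maps (query, filename, function name) -> list of durations in seconds.
samples = defaultdict(list)


def record_sample(query, duration):
    """Store a sample under the query text and the code that issued it."""
    # limit=2 keeps the two innermost frames: the caller, then this function.
    caller = traceback.extract_stack(limit=2)[0]
    samples[(query, caller.filename, caller.name)].append(duration)


def load_users():
    record_sample("SELECT * FROM users", 0.012)


def load_orders():
    record_sample("SELECT * FROM orders", 0.030)


load_users()
load_users()
load_orders()

for (query, filename, function), durations in samples.items():
    print(function, query, len(durations), sum(durations))
```

Both calls from load_users land in the same group, so the time spent on that query from that place in the code can be totalled.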
Plugin class: djangosampler.plugins.request.Request
The request plugin installs a Middleware that will sample the time taken by requests.
The sampling context manager is not strictly a plugin. It allows you to mark blocks of code and sample how long the blocks take to run. E.g.:
    from djangosampler.sampler import sampling

    with sampling('my_code', 'some_fn'):
        do_something_slow()
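For readers unfamiliar with timing context managers, the sketch below shows the general technique using only the standard library. It is a hypothetical stand-in, not the code behind sampling: the name timed_block, the argument names group and name, and the timings dictionary are assumptions, and unlike a sampler it records every block rather than a fraction of them.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

# Maps (group, name) -> list of elapsed times in seconds.
timings = defaultdict(list)


@contextmanager
def timed_block(group, name):
    """Hypothetical stand-in for `sampling`: time the enclosed block."""
    start = time.perf_counter()
    try:
        yield
    finally:
        # Record the elapsed time even if the block raises.
        timings[(group, name)].append(time.perf_counter() - start)


with timed_block("my_code", "some_fn"):
    time.sleep(0.01)  # stands in for do_something_slow()

print(timings[("my_code", "some_fn")])
```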
Plugin class: djangosampler.plugins.celery_task.Celery
The Celery plugin uses Celery's signals to sample the time taken to execute tasks.
Plugin class: djangosampler.plugins.mongo.Mongo
The MongoDB plugin will sample a percentage of Mongo commands (queries, inserts, etc) that occur in your application. The samples will be grouped by command and stack traces will be recorded to find where the queries are originating.
Writing Your Own Plugins: TODO. For now, look in the plugins folder and copy :)
Feedback is always welcome! GitHub or Twitter (@colinhowe) are the best places to reach me.