A micro multi-website analytics database service designed to be fast and robust, built with Go and SQLite.
Analytics databases tend to grow quickly. Querying the data for one specific website from a single shared database thus becomes very slow over time. Yet the analytics data of two different websites are almost entirely independent of each other.
The idea behind µAnalytics is to shard your analytics data on a key, which is usually a website name. Each shard thus only contains a single website's data, allowing faster response times and easy horizontal scaling.
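
To make the sharding idea concrete, here is a minimal sketch of how a website key could map to its own SQLite file. The `shardPath` helper and the `<root>/<website>/analytics.db` layout are assumptions made for illustration, not necessarily how µAnalytics organizes its files.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// shardPath returns the SQLite file that would hold one website's data.
// The "<root>/<website>/analytics.db" layout is hypothetical.
func shardPath(root, website string) string {
	return filepath.Join(root, website, "analytics.db")
}

func main() {
	// Each website resolves to a different file, so a query for one
	// website never has to scan another website's rows.
	for _, site := range []string{"website-1", "website-2"} {
		fmt.Println(shardPath("./dbs", site))
	}
}
```

Because every key resolves to a separate file, shards can be moved to other machines independently, which is what makes horizontal scaling straightforward.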
To handle requests even faster, µAnalytics automatically manages a pool of connections to multiple shards at a time.
By default, the service keeps 10 connections alive.
You can increase or decrease the maximum number of alive shards with the `--connections` flag when launching the app.
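
The following is a toy sketch of a bounded pool of shard connections. The `shardPool` type and its least-recently-used eviction policy are assumptions for illustration; the text above only states that the number of alive shards is capped.

```go
package main

import "fmt"

// shardPool is a toy stand-in for a pool of open shard connections.
// It only tracks shard names and a last-use counter; a real pool would
// hold database handles and also close entries after an idle timeout.
type shardPool struct {
	max      int
	tick     int
	lastUsed map[string]int
}

func newShardPool(max int) *shardPool {
	return &shardPool{max: max, lastUsed: make(map[string]int)}
}

// acquire marks a shard as in use. When the shard is not open yet and
// the pool is full, the least recently used shard is evicted first.
// It returns the evicted shard name, or "" if nothing was closed.
func (p *shardPool) acquire(name string) string {
	p.tick++
	evicted := ""
	if _, open := p.lastUsed[name]; !open && len(p.lastUsed) >= p.max {
		oldest := -1
		for n, used := range p.lastUsed {
			if oldest == -1 || used < oldest {
				oldest, evicted = used, n
			}
		}
		delete(p.lastUsed, evicted)
	}
	p.lastUsed[name] = p.tick
	return evicted
}

func main() {
	pool := newShardPool(2)
	for _, site := range []string{"a", "b", "a", "c"} {
		if closed := pool.acquire(site); closed != "" {
			fmt.Printf("opening %q closed %q\n", site, closed)
		}
	}
}
```

With a pool of two, opening a third shard closes whichever of the first two was used least recently.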
$ go install github.com/GitbookIO/micro-analytics
To launch the application, simply run:
$ ./micro-analytics
The command takes the following optional parameters:
Parameter | Environment Variable | Usage | Type | Default Value |
---|---|---|---|---|
`--user`, `-u` | `MA_USER` | Username for basic auth | String | "" |
`--password`, `-w` | `MA_PASSWORD` | Password for basic auth | String | "" |
`--port`, `-p` | `MA_PORT` | Port to listen on | String | "7070" |
`--root`, `-r` | `MA_ROOT` | Database directory | String | "./dbs" |
`--connections`, `-c` | `MA_POOL_SIZE` | Max number of alive shard connections | Number | 1000 |
`--idle-timeout`, `-i` | `MA_POOL_TIMEOUT` | Idle timeout for DB connections in seconds | Number | 60 |
`--cache-directory`, `-d` | `MA_CACHE_DIR` | Cache directory | String | ".diskache" |
If `--user` is provided, the service will automatically use basic access authentication on all requests.
The actual cache directory will be a subdirectory named after the app major version. The default will then be `./.diskache/0`.
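
As a small illustration of the cache directory rule, the sketch below derives the versioned subdirectory from a base directory and a version string. The `cacheDir` helper and the `0.1.0` version string are hypothetical.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// cacheDir appends the major version (the part before the first dot)
// to the configured cache directory.
func cacheDir(base, version string) string {
	major := strings.SplitN(version, ".", 2)[0]
	return filepath.Join(base, major)
}

func main() {
	// "0.1.0" is a made-up version used only for this example.
	fmt.Println(cacheDir(".diskache", "0.1.0"))
}
```

Keeping one subdirectory per major version means a cache written by an incompatible release is never read back.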
All shards of the µAnalytics database share the same TABLE schema:
CREATE TABLE visits (
time INTEGER,
event TEXT,
path TEXT,
ip TEXT,
platform TEXT,
refererDomain TEXT,
countryCode TEXT
)
Every query for a specific website can be restricted to a time range. Each of the following GET requests thus accepts these two optional query string parameters:
Name | Type | Description | Default | Example |
---|---|---|---|---|
`start` | Date | Start date to query a range | none | "2015-11-20T12:00:00.000Z" |
`end` | Date | End date to query a range | none | "2015-11-21T12:00:00.000Z" |
The dates can be passed either as:
- ISO (RFC3339): "2015-11-20T12:00:00.000Z"
- UTC (RFC1123): "Fri, 20 Nov 2015 12:00:00 GMT"
- A Unix timestamp as a String: "1448020800"
Name | Type | Description | Default | Example |
---|---|---|---|---|
`unique` | Boolean | Include the total number of unique visitors in response | none | true |
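
The query string parameters above can be assembled with Go's standard `net/url` package. This is only a sketch: the `rangeQuery` helper, the `localhost` host and the `my-website` name are placeholders, and 7070 is simply the default port.

```go
package main

import (
	"fmt"
	"net/url"
)

// rangeQuery builds the optional query string. Empty start or end
// values are left out, and unique is only sent when true.
func rangeQuery(start, end string, unique bool) string {
	q := url.Values{}
	if start != "" {
		q.Set("start", start)
	}
	if end != "" {
		q.Set("end", end)
	}
	if unique {
		q.Set("unique", "true")
	}
	return q.Encode() // keys are emitted in sorted order
}

func main() {
	// Host and website name are placeholders for this example.
	u := url.URL{
		Scheme:   "http",
		Host:     "localhost:7070",
		Path:     "/my-website",
		RawQuery: rangeQuery("1448020800", "1448107200", false),
	}
	fmt.Println(u.String())
}
```

Using `url.Values` also takes care of escaping, which matters for the RFC3339 and RFC1123 forms since they contain colons, commas and spaces.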
Except for `GET /:website`, every response to a GET request will contain the two following values:
Name | Type | Description |
---|---|---|
`total` | Integer | Total number of visits |
`unique` | Integer | Total number of unique visitors based on `ip`, set to 0 unless `unique=true` is passed as a query string parameter |
Returns the full analytics for a website.
{
"list": [
{
"time": "2015-11-25T16:00:00+01:00",
"event": "download",
"path": "/somewhere",
"ip": "127.0.0.1",
"platform": "Windows",
"refererDomain": "gitbook.com",
"countryCode": "fr"
},
...
]
}
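
On the client side, a response of this shape can be decoded into Go structs. The `Visit` type and `decodeVisits` helper below are illustrative names, not part of the service; the field names are taken from the JSON above.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// Visit mirrors one entry of the "list" array shown above.
type Visit struct {
	Time          time.Time `json:"time"`
	Event         string    `json:"event"`
	Path          string    `json:"path"`
	IP            string    `json:"ip"`
	Platform      string    `json:"platform"`
	RefererDomain string    `json:"refererDomain"`
	CountryCode   string    `json:"countryCode"`
}

// decodeVisits parses a response body of the form {"list": [...]}.
func decodeVisits(body []byte) ([]Visit, error) {
	var resp struct {
		List []Visit `json:"list"`
	}
	if err := json.Unmarshal(body, &resp); err != nil {
		return nil, err
	}
	return resp.List, nil
}

func main() {
	body := []byte(`{"list":[{"time":"2015-11-25T16:00:00+01:00","event":"download",
		"path":"/somewhere","ip":"127.0.0.1","platform":"Windows",
		"refererDomain":"gitbook.com","countryCode":"fr"}]}`)
	visits, err := decodeVisits(body)
	if err != nil {
		panic(err)
	}
	for _, v := range visits {
		fmt.Println(v.Time.UTC().Format(time.RFC3339), v.Event, v.Path, v.CountryCode)
	}
}
```

`time.Time` decodes RFC3339 strings directly, so the `time` field needs no custom parsing.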
Returns the count of analytics for a website. The `unique` query string parameter is not necessary for this request.
{
"total": 1000,
"unique": 900
}
Returns the number of visits per `countryCode`. `label` contains the country's full name.
{
"list": [
{
"id": "fr",
"label": "France",
"total": 1000,
"unique": 900
},
...
]
}
Returns the number of visits per `platform`.
{
"list": [
{
"id": "Linux",
"label": "Linux",
"total": 1000,
"unique": 900
},
...
]
}
Returns the number of visits per `refererDomain`.
{
"list": [
{
"id": "gitbook.com",
"label": "gitbook.com",
"total": 1000,
"unique": 900
},
...
]
}
Returns the number of visits per `event`.
{
"list": [
{
"id": "download",
"label": "download",
"total": 1000,
"unique": 900
},
...
]
}
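
The four aggregate responses above all count visits grouped by one column of the `visits` table. As a rough sketch of how such a count could be expressed against that schema, the hypothetical `aggregateQuery` helper below builds a parameterized SQL statement. This is an assumption for illustration and not the query the service actually runs; in particular, the inclusive time bounds and the `uniques` alias are choices made here.

```go
package main

import "fmt"

// aggregateQuery returns a SQL statement counting visits and distinct
// IPs per value of the given column. Only the four grouping columns
// documented above are accepted, so the column name is never taken
// verbatim from user input.
func aggregateQuery(column string) (string, error) {
	switch column {
	case "countryCode", "platform", "refererDomain", "event":
	default:
		return "", fmt.Errorf("unsupported column %q", column)
	}
	return fmt.Sprintf(
		"SELECT %s AS id, COUNT(*) AS total, COUNT(DISTINCT ip) AS uniques "+
			"FROM visits WHERE time >= ? AND time <= ? GROUP BY %s",
		column, column), nil
}

func main() {
	q, err := aggregateQuery("platform")
	if err != nil {
		panic(err)
	}
	fmt.Println(q)
}
```

The two `?` placeholders stand for the `start` and `end` bounds, bound as integers to match the `time INTEGER` column.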
Returns the number of visits as a time series. The interval in seconds can be specified as an optional query string parameter. Its default value is `86400`, equivalent to one day.
Name | Type | Description | Default | Example |
---|---|---|---|---|
`interval` | Integer | Interval of the time series, in seconds | 86400 (1 day) | 3600 |
Example with `interval` set to `3600`:
{
"list": [
{
"start": "2015-11-24T12:00:00.000Z",
"end": "2015-11-24T13:00:00.000Z",
"total": 450,
"unique": 390
},
{
"start": "2015-11-24T13:00:00.000Z",
"end": "2015-11-24T14:00:00.000Z",
"total": 550,
"unique": 510
},
...
]
}
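
To illustrate how visits fall into the intervals of such a time series, here is a minimal bucketing sketch. The `bucket` helper is hypothetical, and aligning buckets on multiples of the interval since the Unix epoch is an assumption; the service may align them differently, for instance on the `start` parameter.

```go
package main

import (
	"fmt"
	"time"
)

// bucket returns the [start, end) bounds, in Unix seconds, of the
// interval containing ts. It assumes ts >= 0 and interval > 0.
func bucket(ts, interval int64) (start, end int64) {
	start = ts - ts%interval
	return start, start + interval
}

func main() {
	// A visit at 2015-11-20T12:30:00Z with a one hour interval.
	start, end := bucket(1448022600, 3600)
	fmt.Println(
		time.Unix(start, 0).UTC().Format(time.RFC3339),
		time.Unix(end, 0).UTC().Format(time.RFC3339),
	)
}
```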
Insert new data for the specified website.
{
"time": "2015-11-24T13:00:00.000Z", // optional
"event": "download",
"ip": "127.0.0.1",
"path": "/README.md",
"headers": {
// ...
// HTTP headers received from your visitor
}
}
The `time` parameter is optional and is set to the date of your POST request by default.
Passing the HTTP headers in the POST body allows the service to extract the `refererDomain` and `platform` values.
The `countryCode` will be deduced from the passed `ip` parameter using MaxMind's GeoLite2 database.
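
A client could build this request with the standard library as sketched below. The `Visit` type and `newInsertRequest` helper are illustrative names; representing `headers` as a flat string map and sending a JSON content type are assumptions, as are the host and website name.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"net/url"
)

// Visit is the POST body described above. Time is left out so that the
// service falls back to the time of the request.
type Visit struct {
	Event   string            `json:"event"`
	IP      string            `json:"ip"`
	Path    string            `json:"path"`
	Headers map[string]string `json:"headers,omitempty"`
}

// newInsertRequest builds a POST /:website request. Basic auth is only
// added when user is non-empty, matching a service started with --user.
func newInsertRequest(baseURL, website, user, password string, v Visit) (*http.Request, error) {
	body, err := json.Marshal(v)
	if err != nil {
		return nil, err
	}
	target := baseURL + "/" + url.PathEscape(website)
	req, err := http.NewRequest(http.MethodPost, target, bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	if user != "" {
		req.SetBasicAuth(user, password)
	}
	return req, nil
}

func main() {
	req, err := newInsertRequest("http://localhost:7070", "my-website", "", "", Visit{
		Event:   "download",
		IP:      "127.0.0.1",
		Path:    "/README.md",
		Headers: map[string]string{"Referer": "https://www.gitbook.com/"},
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL)
	// Sending is left out so the sketch runs offline:
	// resp, err := http.DefaultClient.Do(req)
}
```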
Insert a list of analytics for a specific website. The analytics can be sent directly in DB format, with `time` being a String value.
`time` can be passed as either:
- ISO (RFC3339): "2015-11-20T12:00:00.000Z"
- UTC (RFC1123): "Fri, 20 Nov 2015 12:00:00 GMT"
- A Unix timestamp as a String: "1448020800"

If the `time` parameter is not provided, it defaults to the exact time at which the server processes the POST request.
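
The three accepted formats and the default can be handled with Go's `time` package as in the sketch below. The `parseTime` helper is hypothetical and the order in which formats are tried is an assumption; it is not taken from the service's source.

```go
package main

import (
	"fmt"
	"strconv"
	"time"
)

// parseTime accepts a Unix timestamp string, an RFC3339 date or an
// RFC1123 date. An empty value falls back to the current time, like an
// omitted time parameter.
func parseTime(value string) (time.Time, error) {
	if value == "" {
		return time.Now(), nil
	}
	if secs, err := strconv.ParseInt(value, 10, 64); err == nil {
		return time.Unix(secs, 0).UTC(), nil
	}
	for _, layout := range []string{time.RFC3339, time.RFC1123} {
		if parsed, err := time.Parse(layout, value); err == nil {
			return parsed, nil
		}
	}
	return time.Time{}, fmt.Errorf("unrecognized time format: %q", value)
}

func main() {
	inputs := []string{
		"2015-11-20T12:00:00.000Z",
		"Fri, 20 Nov 2015 12:00:00 GMT",
		"1448020800",
	}
	for _, in := range inputs {
		parsed, err := parseTime(in)
		if err != nil {
			panic(err)
		}
		fmt.Println(parsed.Unix())
	}
}
```

All three sample inputs denote the same instant, so each one yields the Unix timestamp 1448020800.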
As with the `POST /:website` method, each analytics entry can also have an optional `headers` parameter.
If the `refererDomain` and/or `platform` values are not passed in the JSON body, the `headers` parameter will be used to set these values automatically.
{
"list": [
{
"time": "1450098642",
"ip": "127.0.0.1",
"event": "download",
"path": "/somewhere",
"platform": "Apple Mac",
"refererDomain": "www.gitbook.com",
"countryCode": "fr"
},
{
"time": "2015-11-20T12:00:00.000Z",
"ip": "127.0.0.1",
"event": "login",
"path": "/someplace",
"headers": {
// ...
// HTTP headers received from your visitor
}
}
]
}
The `countryCode` will be reprocessed by the service using GeoLite2 based on the `ip`.
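
For entries that only carry `headers`, the `refererDomain` value has to be derived from them. The sketch below shows one plausible way to do so from a `Referer` header value; the `refererDomain` helper is hypothetical and is not the service's actual extraction logic.

```go
package main

import (
	"fmt"
	"net/url"
)

// refererDomain extracts the host name, without port, from a Referer
// header value. It returns "" when the value is empty or unparsable.
func refererDomain(referer string) string {
	u, err := url.Parse(referer)
	if err != nil {
		return ""
	}
	return u.Hostname()
}

func main() {
	fmt.Println(refererDomain("https://www.gitbook.com/some/page"))
}
```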
Insert a list of analytics for different websites. The analytics have the same format as for `POST /:website/bulk`, with a mandatory `website` parameter.
{
"list": [
{
"website": "website-1",
"time": "1450098642",
"ip": "127.0.0.1",
"event": "download",
"path": "/somewhere",
"platform": "Apple Mac",
"refererDomain": "www.gitbook.com",
"countryCode": "fr"
},
{
"website": "website-2",
"time": "2015-11-20T12:00:00.000Z",
"ip": "127.0.0.1",
"event": "login",
"path": "/someplace",
"headers": {
// ...
}
}
]
}
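
Since each website lives in its own shard, a mixed list like the one above has to be split per website before it is written. The following sketch shows that grouping step; the `Entry` type and `groupByWebsite` helper are illustrative, and rejecting the whole list when one entry lacks a `website` is an assumption.

```go
package main

import "fmt"

// Entry holds the fields of a bulk entry that matter for routing.
type Entry struct {
	Website string `json:"website"`
	Event   string `json:"event"`
	Path    string `json:"path"`
}

// groupByWebsite splits a mixed list into one slice per website, so
// that each group can be written to its own shard. The website
// parameter is mandatory, so an entry without it is an error.
func groupByWebsite(entries []Entry) (map[string][]Entry, error) {
	groups := make(map[string][]Entry)
	for i, e := range entries {
		if e.Website == "" {
			return nil, fmt.Errorf("entry %d: missing website", i)
		}
		groups[e.Website] = append(groups[e.Website], e)
	}
	return groups, nil
}

func main() {
	groups, err := groupByWebsite([]Entry{
		{Website: "website-1", Event: "download", Path: "/somewhere"},
		{Website: "website-2", Event: "login", Path: "/someplace"},
	})
	if err != nil {
		panic(err)
	}
	for site, list := range groups {
		fmt.Println(site, len(list))
	}
}
```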
Fully delete a shard from the file system.