This is a Cloudflare Workers implementation of the API required by the SeaStats Dashboard. In addition to the endpoints required by the dashboard for retrieving data, this implementation includes endpoints for uploading and managing data.
This project is funded by NCCS (BC Whales) and Orcasound, and developed by Soundspace Analytics.
The technology stack for this project was chosen for its low cost, minimal maintenance requirements, and high performance.
The technology stack is based on JavaScript and includes:
- the Cloudflare Workers platform
- Cloudflare KV key-value data storage
- Cloudflare R2 object storage
- the Cloudflare Wrangler command-line interface
- Node.js
The source of truth for this API is JSON files stored in R2. R2 storage is relatively slow to read from, so KV storage is used as a caching layer.
We couldn't use KV as the sole storage solution due to its 'eventually consistent' nature, which means that when you read from it you may be viewing stale data.
API requests that include an API key will always bypass the cache to ensure the most up-to-date data is returned.
Uploaded files such as images and audio are currently not cached in KV.
- When reading from KV (without an API key), the data may be up to 60 seconds old due to caching.
- Our data writing process involves reading, updating, and then saving JSON files, which means that parallel API calls that write to the same file could result in data loss. In general, data from different organisations will be in different JSON files, so writing to different organisations at the same time should be safe - but updating multiple stations within the same organisation in parallel may be problematic.
- When this API implementation was initially planned it was going to be very simple, and storing data in static JSON files was a good fit for the initial needs. Additionally, a suitable serverless database product wasn't available at the time. The complexity of this API has grown a lot since then, and we have tacked on a lot of features that would be better served by a more traditional database.
- Currently, when filtering with `fromDate` and `toDate` for data points that have a `startDateTime` property, we do not correct for the station's timezone. This means the client may receive additional data points and/or be missing data points within +/- 24 hours of the requested range.
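The read/update/save write pattern described above can be illustrated with a minimal sketch. The `readFile`/`writeFile` functions below are in-memory stand-ins for R2 reads and writes, not the actual implementation, but they show how two parallel updates to the same JSON file can lose data:

```javascript
// Minimal in-memory stand-in for an R2 object holding one organisation's JSON.
let stored = JSON.stringify({ stations: { "station-a": { name: "A" } } });

const readFile = async () => JSON.parse(stored);
const writeFile = async (obj) => { stored = JSON.stringify(obj); };

// Two parallel "API calls", each updating a different station in the same file.
async function updateStation(key, props) {
  const data = await readFile(); // 1. read
  data.stations[key] = props;    // 2. update
  await writeFile(data);         // 3. save
}

async function demo() {
  // Both calls read the original file before either has written, so the
  // write that lands last silently discards the other call's update.
  await Promise.all([
    updateStation("station-b", { name: "B" }),
    updateStation("station-c", { name: "C" }),
  ]);
  return Object.keys((await readFile()).stations);
}
```

Here `demo()` resolves without `station-b`: its update was overwritten by the later write. Updating stations in separate files (i.e. separate organisations) avoids the collision.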
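The timezone caveat above can be seen with a quick sketch: a `startDateTime` stored in UTC can fall on a different calendar date in the station's local time zone. The example timestamp below is hypothetical:

```javascript
// A data point timestamped shortly after midnight UTC on 2021-01-01.
const startDateTime = "2021-01-01T03:00:00.000Z";

// Calendar date of that instant in a given time zone
// (en-CA short dates format as YYYY-MM-DD).
const localDate = (iso, timeZone) =>
  new Intl.DateTimeFormat("en-CA", { timeZone, dateStyle: "short" })
    .format(new Date(iso));

const utcDate = localDate(startDateTime, "UTC");                   // "2021-01-01"
const stationDate = localDate(startDateTime, "America/Vancouver"); // "2020-12-31"

// Filtering with fromDate=2021-01-01 against the raw UTC value includes
// this point, even though it occurred on 2020-12-31 at the station.
```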
The following guide will help you set up two API instances on your own Cloudflare account. One will be for development purposes (a sandbox for testing) and the other will be for production.
You will need:
- a Cloudflare account
- Node.js installed on your computer
You can also optionally obtain:
- a custom domain name for the API to be served from
- a Sentry account for error tracking.
To better understand the Cloudflare Workers platform, you may want to follow the Getting Started guide.
Note: We'll use the `seastats-api-` namespace/prefix for the rest of this guide. Replace this with your own namespace if you prefer.
Open your Cloudflare dashboard and:
- Upgrade to a paid ($5/mo) Workers plan to enable the use of R2 storage and higher resource limits.
- An upgrade link can be found in the sidebar of the 'Workers & Pages > Overview' page.
- Note: Cloudflare Workers is very cost-effective but make sure you understand the pricing model and set up billing notifications to avoid unexpected charges.
- Manually create KV namespaces named
- seastats-api-dev
- seastats-api-prod
- Manually create R2 buckets named
- seastats-api-dev
- seastats-api-prod
- Clone this repository to your local machine.
- If you're new to git, you can learn about it here, or download the project files instead of cloning the repository.
- Duplicate the `wrangler.toml.example` file found in the root of this project as `wrangler.toml`, and fill in the necessary details:
  - Enter your Cloudflare account ID. This can be found in the sidebar of the 'Workers & Pages > Overview' page in your Cloudflare dashboard.
  - Enter your KV namespace IDs. These can be found in the Cloudflare dashboard under the 'Workers & Pages > KV' page. Hovering over an ID should show a button to copy it.
  - Update the R2 bucket names if you chose different ones.
  - If you want to use the `/dashboard.js` endpoint, enter `CLIENT_URL` values.
- Open a terminal window and:
  - `cd` to the root of this project.
  - Run `npx wrangler login` to authenticate with your Cloudflare account.
  - Run `npx wrangler publish` and `npx wrangler publish --env production` to push a development and production worker to your Cloudflare account.
  - Record the URLs provided by Cloudflare for each worker – these will be the endpoints for your API. (You can also find these URLs in your Cloudflare dashboard.)
- Now that your workers are deployed, uncomment the `services` sections in the `wrangler.toml` file (remove the `#` characters at the start of the lines), then run `npx wrangler publish` and `npx wrangler publish --env production` again. This allows the worker to call itself.
- Add an `ADMIN_API_KEY` environment variable to each worker. This API key will be used for admin functions such as creating new users.
  - Run `npx wrangler secret put ADMIN_API_KEY` to enter a value for the development worker.
    - When prompted, enter a long, randomly generated string. You can use a GUID generator to create a suitable value.
  - Run `npx wrangler secret put ADMIN_API_KEY --env production` and enter a value for the production worker. Make sure to use a different value from the development worker.
You should now be able to access your API at the URLs provided by Cloudflare. See the API documentation for more information on how to use the API, or read the Local development section if you wish to modify the API code. To easily test the API, see the Testing with Insomnia section, which includes an Insomnia collection you can import.
- You can use the `wrangler dev` command to run a local development server that will proxy requests to your Cloudflare workers. This is useful for testing changes before deploying them.
  - Before running this command for the first time, create a `.dev.vars` file in the root of the project with the following content, substituting in your own value between the quotes. This sets the API key for the local development server made available by the `wrangler dev` command. This can be any value, but should be different from the production and development API keys.

    `ADMIN_API_KEY="<some-random-value>"`
- Run `npx wrangler publish` to publish your local code to the development worker.
- Run `npx wrangler publish --env production` to publish your local code to the production worker.
When sending requests to the API, you can optionally include an API key as a Bearer Token in the `Authorization` header. An API key is required for all endpoints that modify data, and will also have the effect of bypassing the cache when included on public endpoints.
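For example, the request options for an authenticated call might be assembled as follows. This is a sketch: the API key value and the `fetch` URL shown in the comment are placeholders.

```javascript
// Build request options for an authenticated, data-modifying API call.
// The API key is sent as a Bearer token in the Authorization header.
const apiKey = "<your-api-key>"; // placeholder

const options = {
  method: "PUT",
  headers: {
    Authorization: `Bearer ${apiKey}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({ name: "Test Org" }),
};

// Usage (not executed here; substitute your own worker URL):
// fetch("https://example.com/api/v1/organization/test-org", options);
```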
There are two types of API keys, which we conceptualize as users.

**Admin user**
- Only one per API installation.
- Can perform administrative tasks, such as creating new users.
- Identified by the `ADMIN_API_KEY` environment variable.

**Organisation users**
- Can be created via the `PUT /user` endpoint.
- Can only edit the data of their organisation.
- Each organisation can have many users. A unique user should be created for each device that needs to access your API.
- When you create a user, you'll receive an API key in response. You can invalidate the current API key and get a new one by sending another PUT request to the `/user` endpoint with the user's name.
Security note: organisation user API keys are stored in plain text in a JSON file, but these files are "encrypted at rest".
Each station has a `timeZone` property that should be set to a TZ Database Name. This allows for localised display of data such as sunrise/sunset times, and for time-based data points to be correctly aligned with the station's local time zone when filtering.

All timestamps must be in UTC ISO 8601 format, for example `2021-01-01T12:00:00.000Z`.
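In JavaScript, `Date.prototype.toISOString()` always serialises in UTC with millisecond precision, which matches this format:

```javascript
// Build a UTC ISO 8601 timestamp as expected by the API.
const stamp = new Date(Date.UTC(2021, 0, 1, 12, 0, 0)).toISOString();
// stamp === "2021-01-01T12:00:00.000Z"
```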
Check the HTTP status code of the response. A 200 status code indicates success, while a 4xx or 5xx status code indicates an error.
Usually, the API will respond with a JSON object that includes some or all of the following properties:

- `success`: A boolean indicating whether the operation was successful.
- `msg`: A string with a message describing the result of the operation.
- `warnings`: An array of strings with any warnings that occurred during the operation. This may contain hints on fields which were not filled out.
- `errors`: An array of strings with any errors that occurred during the operation. This may contain validation errors.
- `data`: An object containing the data returned by the operation.
- `cacheHit`: A boolean indicating whether the response was served from the cache.
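A minimal client-side handler for this envelope might look like the following sketch. `handleResponse` is a hypothetical helper, not part of this project; the field names match the list above.

```javascript
// Throw on failure, surface warnings, and return the payload otherwise.
function handleResponse(body) {
  if (!body.success) {
    throw new Error(`API error: ${(body.errors || []).join("; ") || body.msg}`);
  }
  for (const warning of body.warnings || []) {
    console.warn("API warning:", warning);
  }
  return body.data;
}

// Example envelope in the shape returned by the API:
const payload = handleResponse({
  success: true,
  msg: "Operation was successful",
  warnings: [],
  cacheHit: true,
  data: { hello: "world" },
});
```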
Create or update an organization.
- Admin API key required.
- `ORGANIZATION_KEY` should be a unique string that represents the organization. It can only include letters, numbers, and hyphens.
- Omitting a property will cause the existing value (if any) to be retained.
- To 'delete' the value of a property, set the value to `null`.
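The retain/clear semantics above can be sketched as a simple merge. This is illustrative only, not the actual server code:

```javascript
// Merge a partial payload into an existing record:
// - properties omitted from the payload keep their existing value
// - properties explicitly set to null are cleared
function mergeUpdate(existing, payload) {
  const result = { ...existing };
  for (const [key, value] of Object.entries(payload)) {
    result[key] = value; // null overwrites; omitted keys never reach this loop
  }
  return result;
}

const updated = mergeUpdate(
  { name: "Test Org", logoUrl: "https://example.com/logo.png" },
  { logoUrl: null } // clear the logo, leave the name untouched
);
// updated => { name: "Test Org", logoUrl: null }
```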
JSON payload structure:
Key | Type | Description |
---|---|---|
name | string | (optional when updating) The name of the organization |
logoUrl | null or string | (optional) URL to an image to use as the logo for this organization |
metadata | null or object | (optional) Any additional data you want to store |
password | null or string | (optional) Set a password to limit access to this organization's data. Set to null to remove the password. |
Example JSON payload:
{
"name": "Test Org",
"logoUrl": null,
"metadata": {
"test": 1234,
"foo": "bar"
},
"password": null
}
Delete an organization, along with all of its associated data (users, stations, uploads, data points).
- Admin API key required
- See the client documentation for details.
Retrieve a list of organizations and their properties. All properties set on each organization will be shown except for `password`; instead, a `public` property will be included and set to `false` if the organization has a password.
Example response:
{
"success": true,
"msg": "Operation was successful",
"warnings": [],
"cacheHit": true,
"data": {
"test-org": {
"key": "test-org",
"name": "Test Org",
"logoUrl": null,
"metadata": {
"test": 1234,
"foo": "bar"
},
"public": true
},
"private-org": {
"key": "private-org",
"name": "Test Private Org",
"logoUrl": "https://example.com/logo.png",
"metadata": null,
"public": false
}
}
}
Upload a file for a particular organization.
- API key required.
- `FILE_PATH` is a filename that can optionally include `/` characters to denote a directory structure. Example: `audio/orca-sample.mp4`
- Set the body of the request to the contents of the file being uploaded.
- The `Content-Type` header should be set to the type of file being uploaded, for example `image/png` for a PNG image, or `audio/mp4` for an AAC audio file with a .mp4 file extension.
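An upload request might be assembled like this. The file bytes, API key, and `fetch` URL in the comment are all placeholders:

```javascript
// Upload a PNG as an organization file. The raw file bytes form the request
// body, and the Content-Type header tells the API what kind of file it is.
const fileBytes = new Uint8Array([0x89, 0x50, 0x4e, 0x47]); // placeholder bytes

const uploadOptions = {
  method: "PUT",
  headers: {
    Authorization: "Bearer <your-api-key>", // placeholder
    "Content-Type": "image/png",
  },
  body: fileBytes,
};

// Usage (not executed here; substitute your own worker URL and path):
// fetch("https://example.com/api/v1/organization-upload/test-org/images/logo.png", uploadOptions);
```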
Delete an existing organization upload.
- API key required.
Retrieve an organization upload (acts as a direct link to download the file).
Retrieve a list of organization uploads.
- Optionally append a path prefix to filter results: `/organization-uploads/{ORGANIZATION_KEY}/{PATH_PREFIX}`
Example response:
{
"success": true,
"msg": "Operation was successful",
"warnings": [],
"cacheHit": true,
"data": [
{
"key": "test-org/test-station-01/audio/orca-audio-sample.mp4",
"uploaded": "2024-05-30T20:38:56.093Z",
"url": "https://example.com/api/v1/organization-upload/test-org/test-station-01/audio/orca-audio-sample.mp4"
}
]
}
Create a new user, or cycle the API key of an existing user.
- Admin API key required
- User API key will be returned in the response
Example response:
{
"success": true,
"msg": "Operation was successful",
"warnings": [],
"cacheHit": false,
"data": {
"apiKey": "ccd72640-782c-4ff9-af09-8f342c9994d5-e8ca8396-18bd-4f4a-b003-cbe94b9b90a4-dd7fe9e4-0aaa-45fb-a31d-afe9a2a8a42d"
}
}
Delete a user.
- Admin API key required
Retrieve a list of users and the organization they are associated with. Does not include API keys.
- Admin API key required
Example response:
{
"success": true,
"msg": "Operation was successful",
"warnings": [],
"cacheHit": true,
"data": [
{
"organizationKey": "test-org",
"userKey": "erics-mac-mini"
},
{
"organizationKey": "test-org",
"userKey": "jon-bon-jovial"
}
]
}
- API key required.
- `STATION_KEY` should be a unique string that represents the station. It can only include letters, numbers, and hyphens.
- Omitting a property will cause the existing value (if any) to be retained.
- To 'delete' the value of a property, set the value to `null`.
JSON payload structure:
Key | Type | Description |
---|---|---|
name | string | (optional when updating) The name of the station |
latitude | number | (optional when updating) The latitude of the station |
longitude | number | (optional when updating) The longitude of the station |
timeZone | string | (optional when updating) The timezone of the station. Use a TZ Database Name such as 'America/Vancouver' for lunar/daylight calculations etc. |
logoUrl | null or string | (optional) URL to an image to use as the logo for this station |
audioVisualisation | string | (optional) The type of audio visualisation to use for this station. Can be 'spectrogram' or 'waveform' |
sidebarText | array of objects | (optional) An array of objects with label and text properties to display in the sidebar of the dashboard |
metadata | null or object | (optional) Any additional data you want to store |
Example JSON payload:
{
"name": "Test Station 01",
"latitude": 50.600408,
"longitude": -126.70807,
"timeZone": "America/Vancouver",
"logoUrl": null,
"audioVisualisation": "spectrogram",
"sidebarText": [
{
"label": "Region",
"text": "Blackfish Sound"
},
{
"label": "Species",
"text": "Humpback, Orca"
}
],
"metadata": {
"anything": 10,
"you": false,
"like": [
"can",
"be saved",
{
"in": "metadata"
}
]
}
}
Delete a station and all associated data points and file uploads.
- API key required.
Retrieve a station and its properties.
- See the client documentation for details.
Retrieve a list of stations and their properties. Includes the same data as the `/station/{ORGANIZATION_KEY}/{STATION_KEY}` endpoint but for all stations.
Upload a file for a particular station.
- API key required.
- `UPLOAD_TYPE_KEY` is a unique string that represents the type of upload. It can only include letters, numbers, and hyphens.
- `FILE_PATH` is a filename that can optionally include `/` characters to denote a directory structure. Example: `audio/orca-sample.mp4`
- Set the body of the request to the contents of the file being uploaded.
- The `Content-Type` header should be set to the type of file being uploaded, for example `image/png` for a PNG image, or `audio/mp4` for an AAC audio file with a .mp4 file extension.
Delete an existing station upload.
- API key required.
Retrieve a station upload (acts as a direct link to download the file).
Retrieve a list of station uploads.
- See the client documentation for details.
Create or replace data points for a particular station.
- API key required.
- See the client documentation for details on types and shapes of permitted data points.
- For data point types that have a `date` property, uploading a data point with the same date as an existing data point will overwrite the existing data point.
- For data point types that have a `startDateTime` property, uploading a data point with the same `startDateTime` as an existing data point will overwrite the existing data point.
- To determine whether a data point already exists, a key is constructed from the `dataPointType` and `date` or `startDateTime` properties. If a data point with the same key already exists, it will be overwritten. Some data point types may use additional identifier properties in the key; for example, the 'exceedance' data point type uses the 'band' and 'threshold' properties in the key, and 'callEvent' data points use the 'species' and 'callType' properties in the key.
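As a rough sketch of that keying scheme (the exact key format used internally by the API is not documented here, so the `|`-joined form below is illustrative):

```javascript
// Extra identifier properties used in the key for certain data point types.
const EXTRA_KEY_PROPS = {
  exceedance: ["band", "threshold"],
  callEvent: ["species", "callType"],
};

// Build a deduplication key from the type, the date or startDateTime,
// and any extra identifier properties for that type.
function dataPointKey(point) {
  const when = point.date ?? point.startDateTime;
  const extras = (EXTRA_KEY_PROPS[point.dataPointType] || [])
    .map((prop) => point[prop]);
  return [point.dataPointType, when, ...extras].join("|");
}

const a = dataPointKey({ dataPointType: "recordingCoverage", date: "2021-01-03", value: 0.5 });
const b = dataPointKey({ dataPointType: "recordingCoverage", date: "2021-01-03", value: 0.9 });
// a === b, so uploading the second data point would overwrite the first.
```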
Example JSON payload:
[
{
"dataPointType": "recordingCoverage",
"date": "2021-01-03",
"value": 0.16527777777777777
},
{
"dataPointType": "recordingCoverage",
"date": "2021-01-04",
"value": null
}
]
Delete all data points for a particular station that match the query parameters.
- API key required.
- Caution: if no query parameters are provided, all data points for the station will be deleted.
- See the client documentation for the query parameters that can be used as filters.
Retrieve data points for a particular station.
- See the client documentation for details.
Retrieve a summary of the data stored in the API. This essentially combines the output of the `/organizations` and `/stations` endpoints.
This endpoint is intended to be used in a script tag, and will take care of loading the SeaStats client javascript bundle and css in to the page so that a dashboard can be rendered. See the client documentation for details.
Usage: `<script src="https://example.com/api/v1/dashboard.js"></script>`
Sanitise the data structure of a particular organization and rebuild the cache.
- Admin API key required
Sanitise the data structure of a particular station and all of its data points, and rebuild the cache.
- Admin API key required
Sanitise data and rebuild the cache for all known orgs and stations.
- Admin API key required
Insomnia is a tool for testing APIs. Subscriptions are available, but the core features can be used for free. To get started, follow these steps:
- Download and install Insomnia
- Import the Insomnia collection found in the `insomnia` directory of this project, and open it.
- Insomnia allows you to create environments for holding variables. We suggest creating a separate environment for each station, so that you can re-use the same requests. A 'Test Station 01' environment is included in the collection. Edit it so that the `ADMIN_API_KEY` and `HOST` variables match your own setup.
- You can now use the requests in the collection to interact with your API. The requests are organised into folders that correspond to the API endpoints. Open the Admin > PUT Organization endpoint and click the 'Send' button to create a new organization. This will create a new organization with the key `test-org`, since that is the value of the `ORGANIZATION_KEY` variable in the selected environment. The `JSON` tab of the UI shows you the JSON payload that was sent to the API.
- Use the User > PUT Station endpoint to create a new station. This will create a station with the key `test-station-01` under the `test-org` organization.
- We now have an organization and a station, but no data points. Use the User > PUT Data Points endpoint to upload some sample data points for the station. The `JSON` tab of the UI shows you the JSON payload that was sent to the API.
- Finally, try the Public > GET Station and Public > GET Data Points endpoints to retrieve the data you uploaded. The response will be shown in the `Preview` tab of the UI.
Note: if you are using a local API endpoint provided by `wrangler dev`, you can use the API key found in the `.dev.vars` file for admin access.
In the event that the KV cache becomes corrupted or out of sync with the R2 storage (for example, after restoring from a backup), you can rebuild the cache with the `rebuild-all` endpoint.
- Create an API token in Cloudflare for accessing your bucket(s).
  - Make sure to copy the S3 Access Key ID and Secret Access Key on the confirmation screen, as you will not be able to access them afterwards.
  - The default 'object read only' permission will be fine if you only want to back up your files. If you want to be able to restore backups with rclone, you will need to enable write permissions. If you get a 403 permissions error while using rclone, you likely need to adjust your permissions.
- Copy your Cloudflare Account ID during the setup process.
- Configure rclone to access your R2 account.
  - Enter a name such as `seastats-r2` when prompted.
To download all data from your R2 storage to a local directory, use the command below. The first download may take some time depending on your internet speed and the data volume, but subsequent backups to the same directory should be much faster as only new or changed files would be downloaded.
Note that the sync command will delete files in the destination directory that are not present in the source directory. It's recommended to use the `--dry-run` flag first to preview what files would be affected. After you've checked the result, you can run the command again without that flag to actually sync the files.
rclone sync --dry-run seastats-r2:seastats-api-prod {YOUR_LOCAL_BACKUP_DIRECTORY}
Rclone can sync to a variety of destinations, including other cloud storage providers, so you can use it to back up your data to multiple locations.
- Make use of Cloudflare D1 and an ORM such as Drizzle
- Improve data access performance
- Simplify code
- Remove/reduce need for validation/casting/rebuilding JSON data
- Use a validation library such as JOI instead of custom code
- Provide resampled versions of images suitable for user's screen
- Make use of Cloudflare's cache API to set a TTL on requests that don't include an API key
- This should allow near-instantaneous subsequent JSON responses and faster image and audio loading
- Better password-protection solution for organisation data
- The current implementation is rudimentary and potentially insecure.