Skip to content
Will Roper edited this page Nov 8, 2022 · 3 revisions

WhereDoIVote uses partitioning to split our data access across 2 databases in production:

  • default: Every front-end has its own copy of the data about addresses, polling stations, polling districts and councils (essentially, the core data we need to serve a search result to a user)
  • logger: There is a shared RDS instance for storing logs, feedback, bug reports, etc

Some criteria for deciding what should be stored in which database:

  • default should store read-only data that we can afford to lose or can be trivially rebuilt. If we can load it in from a management command (e.g: AddressBase, local authority boundaries, polling station addresses, etc), it can go in default.
  • If we create it via the front-end or a user interaction (e.g: a bug report or user account), it needs to go in logger or we'll lose it next time we deploy.
  • If it is performance critical to serving a search result, it should be in default so we can scale out without bottlenecking on DB read capacity.
  • If it is something that will only be used by a small number of users (e.g: internal tooling), it can go in logger.
  • Data in default and data in logger must be free of dependencies, so we can't enforce foreign key integrity between a table which lives on the logger DB in production and a table which lives on the default DB in production. This is why LoggedPostcode.council, for example doesn't have a constraint. We need to be careful about this in local development because we generally develop with a single DB connection.

Database router config is stored in: polling_stations/db_routers.py but broadly, the things that live on the logger DB in production are:

  • /admin accounts
  • API tokens (most of the I/O for keeping track of API tokens is handled by redis)
  • Logs and User Feedback (see Stats and Feedback)
  • Bug reports
  • File Uploads

We also published a blog post about this in 2017 which is a little old but mostly still true.

RDS updates

Upgrade PostGIS

This needs to be done for every DB that has the PostGIS extension enabled.

Log into the DB

psql postgresql://<USER>@<ENDPOINT>:5432/<DBNAME>

eg: will@laptop:~$ psql postgresql://polling_stations@wdivcluster.f00b4rb42.eu-west-2.rds.amazonaws.com:5432/logger_staging

Then

# Get the version
=> select * from pg_available_extensions where name like '%postgis%';

# Discover what update paths exist
=> SELECT * FROM pg_extension_update_paths('postgis') WHERE source='2.3.7' AND target NOT LIKE '%next%' AND source<target AND path LIKE '%--%';

# Update to the most recent
=> alter extension postgis update to '3.1.5';
=> select postgis_extensions_upgrade();

# Check
select * from pg_available_extensions where name like '%postgis%';

You can then check what version of Postgres you can bump to from the PostGIS compatibility matrix.

Go to RDS console and modify the engine to the version you want.