- A table for every type of resource
  - An indexed column for most search parameters
  - Can use arrays, structs, range types, full-text indexes
  - NULLs are apparently stored very efficiently
- Store resources as JSON BLOBs
  - Every blob is immutable & identified by (_id, _versionId) → can be cached in some highly scalable non-ACID system, with PostgreSQL handling cache misses
  - JSONB is problematic because “jsonb will reject numbers that are outside the range of the PostgreSQL numeric data type” (link), but FHIR requires arbitrary-precision decimals
  - PostgreSQL already compresses large values (via TOAST), but compressing application-side could reduce CPU load / network traffic and enable caching of already-compressed blobs
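A sketch of both points, assuming blobs are stored in a plain text/bytea column rather than JSONB: in Go, decoding with `json.Number` keeps decimal literals intact instead of rounding them through float64, and gzip round-trips the blob so the same pre-compressed bytes can go to both PostgreSQL and a cache. The resource content is a made-up example.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"encoding/json"
	"fmt"
	"io"
)

// compressBlob compresses a resource blob application-side, so the same
// pre-compressed bytes can be stored in PostgreSQL (bytea) and in a cache.
func compressBlob(blob []byte) ([]byte, error) {
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	if _, err := zw.Write(blob); err != nil {
		return nil, err
	}
	if err := zw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

func decompressBlob(blob []byte) ([]byte, error) {
	zr, err := gzip.NewReader(bytes.NewReader(blob))
	if err != nil {
		return nil, err
	}
	defer zr.Close()
	return io.ReadAll(zr)
}

func main() {
	// A decimal with more precision than float64 can represent.
	src := []byte(`{"resourceType":"Observation","valueQuantity":{"value":0.10000000000000000001}}`)

	// json.Number preserves the original digits instead of rounding.
	dec := json.NewDecoder(bytes.NewReader(src))
	dec.UseNumber()
	var doc map[string]any
	if err := dec.Decode(&doc); err != nil {
		panic(err)
	}
	v := doc["valueQuantity"].(map[string]any)["value"].(json.Number)
	fmt.Println(v) // prints 0.10000000000000000001

	packed, _ := compressBlob(src)
	unpacked, _ := decompressBlob(packed)
	fmt.Println(bytes.Equal(src, unpacked)) // true
}
```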
- Single history table
  - Stores all historical versions, and perhaps even current versions (see below)
  - An indexed _lastUpdated column can satisfy whole-system history _since/_at queries
- Storage of current-version blobs has several possibilities:
  - (a) Only store in the history table
    - Queries get ids (_id + _versionId) from the index tables
    - Can immediately join to the history table to get resource blobs
    - Or can cache all the ids application-side (e.g. in Redis as auto-expiring keys) and retrieve resources in pages from PostgreSQL or another cache
      - Probably better if many queries return large numbers of paged results: the query is only run once, so less load on PostgreSQL
      - For queries returning too many ids (e.g. all Observations) this may not be appropriate, so we likely still need a way to batch results from PostgreSQL (e.g. sort & batch by _lastUpdated as a last resort)
  - (b) Only store in the resource-specific tables
    - Slightly more efficient current-version queries?
      - PostgreSQL only needs to go index value → table row → blob, whereas with (a) it goes index value → history table index → history table row → blob (?)
    - Perhaps better if many queries return small numbers of non-paged results?
    - Less efficient history queries
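The "sort & batch by _lastUpdated" fallback above amounts to keyset pagination on (_lastUpdated, _id), with the id as a tie-breaker so equal timestamps neither repeat nor skip rows. Here is an in-memory sketch of that logic; in SQL the equivalent predicate would be roughly `WHERE (_lastUpdated, _id) > ($1, $2) ORDER BY _lastUpdated, _id LIMIT $3` (all names hypothetical).

```go
package main

import (
	"fmt"
	"sort"
)

// ref identifies one resource; lastUpdated drives the batch order.
type ref struct {
	ID          int64
	LastUpdated int64 // unix millis, for simplicity
}

// nextBatch returns up to limit refs strictly after the cursor
// (afterUpdated, afterID), ordered by (lastUpdated, id) so that ties on
// the timestamp don't cause repeated or skipped rows between batches.
func nextBatch(all []ref, afterUpdated, afterID int64, limit int) []ref {
	sorted := append([]ref(nil), all...)
	sort.Slice(sorted, func(i, j int) bool {
		if sorted[i].LastUpdated != sorted[j].LastUpdated {
			return sorted[i].LastUpdated < sorted[j].LastUpdated
		}
		return sorted[i].ID < sorted[j].ID
	})
	var out []ref
	for _, r := range sorted {
		if r.LastUpdated < afterUpdated ||
			(r.LastUpdated == afterUpdated && r.ID <= afterID) {
			continue // at or before the cursor
		}
		out = append(out, r)
		if len(out) == limit {
			break
		}
	}
	return out
}

func main() {
	all := []ref{{1, 100}, {2, 100}, {3, 200}, {4, 50}}
	page1 := nextBatch(all, 0, 0, 2)
	fmt.Println(page1) // [{4 50} {1 100}]
	last := page1[len(page1)-1]
	fmt.Println(nextBatch(all, last.LastUpdated, last.ID, 2)) // [{2 100} {3 200}]
}
```

Note the tie on timestamp 100: a naive `_lastUpdated > cursor` predicate would skip id 2; comparing the (timestamp, id) pair avoids that.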
- Id generation
  - A global counter table so every instance of GoFHIR can pre-allocate a batch of ids; if an instance is stopped or crashes there will be holes in the id sequence, but that should be ok
  - Or use a strategy like MongoDB’s ObjectIds, which perhaps shards better?
  - Would be nice to also support CockroachDB, which is partially PostgreSQL-compatible: https://github.com/cockroachdb/cockroach
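A minimal sketch of the pre-allocation scheme, with the counter table simulated in memory. In PostgreSQL the grab would be a single atomic statement along the lines of `UPDATE id_counter SET next = next + $1 RETURNING next - $1` (table and column names hypothetical); a crashed instance simply leaves a hole of unused ids.

```go
package main

import (
	"fmt"
	"sync"
)

// counter stands in for the global counter table; grab plays the role
// of the atomic UPDATE ... RETURNING statement.
type counter struct {
	mu   sync.Mutex
	next int64
}

// grab atomically reserves n ids and returns the first one.
func (c *counter) grab(n int64) int64 {
	c.mu.Lock()
	defer c.mu.Unlock()
	start := c.next
	c.next += n
	return start
}

// allocator hands out ids from a locally pre-allocated block, going
// back to the counter only when the block is exhausted, so most id
// requests never touch the database.
type allocator struct {
	c          *counter
	batch      int64
	cur, limit int64
}

func (a *allocator) nextID() int64 {
	if a.cur == a.limit { // block exhausted (also true initially)
		a.cur = a.c.grab(a.batch)
		a.limit = a.cur + a.batch
	}
	id := a.cur
	a.cur++
	return id
}

func main() {
	c := &counter{next: 1}
	a := &allocator{c: c, batch: 3}
	b := &allocator{c: c, batch: 3}
	// a gets block 1..3, b gets block 4..6, a refills at 7.
	fmt.Println(a.nextID(), a.nextID(), b.nextID(), a.nextID(), a.nextID()) // 1 2 4 3 7
}
```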