Architectural Decision Records

fzhao99 edited this page Jun 21, 2022 · 17 revisions

Multiplex changes for Flu A and B

Date: 06/2022

Status: Accepted

Additional discussion available here

Context

As of February 2022, SimpleReport is a COVID-19-specific tool. To make it more flexible, we are expanding SimpleReport’s capabilities so that it can record Flu results in the app. We want to support the growing number of SimpleReport users testing on multiplex devices, which test for COVID-19, Flu A, and Flu B with a single patient sample.

Over 65,000 tests have been recorded in SimpleReport using these devices. We want users to be able to report flu results from these devices in tandem with COVID-19 results for three reasons:

  • To better support users as they track disease outbreaks in their facilities and organizations.
  • To provide a proof-of-concept to the CDC and other public health partners that flu data can be collected and is useful.
  • To begin the engineering work required to add additional diseases.

For this iteration of multiplex testing, we will not focus on how to report these results to public health departments through ReportStream, as many departments do not want this data yet. Instead, we’ll focus on how to collect and store these results within the SimpleReport system. We believe this will provide value to our end users by allowing them to review the test results and provide a more complete test result to the patient via SimpleReport’s SMS and SMTP test result delivery.

Decision

  • Before multiplex, the “results” data schema had two pieces: an immutable TestEvent object, which we send to ReportStream and which represents “something related to testing has happened in SR”, and a mutable TestOrder object, which we use to represent corrected/removed test results and which maps more closely to “a test for a person at a time has occurred”. These objects and their relationships were kept as they were for this round of multiplex.

  • Before multiplex, we tracked test outcomes/results as columns in the TestEvent/TestOrder tables. To implement multiplex, we replaced these columns with a Results table joined to the TestOrder and TestEvent tables. These joins maintain the mapping of Result <> TestOrder <> TestEvent. The Results table has a "Test Result" column that takes one of "positive"/"inconclusive"/"negative", similar to the columns that existed in the TestOrder/TestEvent tables.

  • For previous results that weren't corrected or removed, we backfilled the Results table with entries that join existing TestOrders/TestEvents to their respective positive/negative/inconclusive statuses, along with the new information needed for multiplex, such as the disease ID.

  • We decided that for corrected and removed tests, we’ll make a new result object that links the TestOrder (which we update with the removal/correction status) to a new TestEvent.

    • This means that for the existing removed/corrected tests, we’ll need to make a new result entry that links the old TestEvents (which are orphaned when a new TestEvent is made) to the TestOrders that they should be associated with. In other words, TestOrders are one-to-many on Results, which are one-to-one on TestEvents, which are many-to-one on TestOrders 🙃.
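
The relationships described above can be sketched with an in-memory SQLite database. This is a minimal illustration, not SimpleReport's actual schema: the table names, column names, the `correction_status` values, and the helper functions `open_db`, `record_test`, and `correct_test` are all hypothetical. It also assumes one Results row per disease, so a multiplex TestEvent carries several rows.

```python
import sqlite3

# Hypothetical, simplified schema. Names are illustrative only.
SCHEMA = """
CREATE TABLE test_order (
    id INTEGER PRIMARY KEY,
    -- mutable status: ORIGINAL, CORRECTED, or REMOVED (assumed values)
    correction_status TEXT NOT NULL DEFAULT 'ORIGINAL'
);
CREATE TABLE test_event (
    id INTEGER PRIMARY KEY,
    -- many TestEvents can point at one TestOrder
    test_order_id INTEGER NOT NULL REFERENCES test_order(id)
);
CREATE TABLE result (
    id INTEGER PRIMARY KEY,
    test_order_id INTEGER NOT NULL REFERENCES test_order(id),
    test_event_id INTEGER NOT NULL REFERENCES test_event(id),
    disease TEXT NOT NULL,
    test_result TEXT NOT NULL
        CHECK (test_result IN ('positive', 'negative', 'inconclusive'))
);
"""


def open_db():
    """Create an in-memory database with the sketch schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def _add_event_with_results(conn, order_id, results):
    """Insert one TestEvent plus one Results row per disease."""
    event_id = conn.execute(
        "INSERT INTO test_event (test_order_id) VALUES (?)", (order_id,)
    ).lastrowid
    conn.executemany(
        "INSERT INTO result (test_order_id, test_event_id, disease, test_result) "
        "VALUES (?, ?, ?, ?)",
        [(order_id, event_id, disease, outcome) for disease, outcome in results.items()],
    )
    return event_id


def record_test(conn, results):
    """Create a TestOrder and its first TestEvent with results."""
    order_id = conn.execute("INSERT INTO test_order DEFAULT VALUES").lastrowid
    return order_id, _add_event_with_results(conn, order_id, results)


def correct_test(conn, order_id, results):
    """Mark the TestOrder corrected and attach a new TestEvent with new results.

    The old TestEvent and its Results rows are left untouched, so the old
    event stays linked to the order through its existing rows.
    """
    conn.execute(
        "UPDATE test_order SET correction_status = 'CORRECTED' WHERE id = ?",
        (order_id,),
    )
    return _add_event_with_results(conn, order_id, results)


conn = open_db()
order_id, first_event = record_test(
    conn, {"COVID-19": "negative", "Flu A": "positive", "Flu B": "negative"}
)
second_event = correct_test(
    conn, order_id, {"COVID-19": "negative", "Flu A": "negative", "Flu B": "negative"}
)
```

After the correction, the single TestOrder has Results rows for both TestEvents, which is the one-to-many and many-to-one shape described in the last bullet.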

Consequences

The multiplex implementation has enabled SimpleReport to begin accepting Flu A and B results and lays the groundwork for adding future diseases more easily. These changes have also surfaced a potential need to refactor the Results flow, given the complicated workarounds to TestEvents/TestOrders that were needed to make multiplex work.

Flexible Database Migration

Date: 02/2022

Status: Accepted

Context

During the Omicron surge in late 2021 through the beginning of 2022, the volume of daily tests SimpleReport processed increased sharply. The system frequently alerted on database connection exhaustion errors, as there were not enough available connections to serve the number of simultaneous users. Initial remediation attempts included changing the number of connections requested by the application's connection pool and increasing the number of application replicas running at a given time. These solutions proved temporary, however, as SimpleReport's rate of adoption continued to outstrip its available capacity. A more permanent solution was established by changing the SKU of our existing database to increase both the fixed number of available connections and the available DB-specific compute resources. This still did not solve our issues with connection pool sizing, and it required careful management of application replicas to prevent inadvertent connection starvation. The issue resurfaced as attempts were made to migrate the audit log functionality away from the database to Splunk, a third-party log analytics provider.
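
The interaction between pool size, replica count, and the database's fixed connection limit comes down to simple arithmetic, sketched below. All numbers are hypothetical and are not SimpleReport's real settings; the function names and the `reserved` parameter (connections set aside for maintenance or admin use) are assumptions made for illustration.

```python
def connections_requested(replicas: int, pool_size: int) -> int:
    """Total connections the application tier may open at once."""
    return replicas * pool_size


def max_safe_replicas(max_connections: int, pool_size: int, reserved: int = 0) -> int:
    """Largest replica count whose pools still fit within the database limit."""
    return max(0, (max_connections - reserved) // pool_size)


# Hypothetical values, chosen only to show the effect of scaling out.
MAX_CONNECTIONS = 100
RESERVED = 5
POOL_SIZE = 10

for replicas in (4, 8, 12):
    requested = connections_requested(replicas, POOL_SIZE)
    fits = requested <= MAX_CONNECTIONS - RESERVED
    print(f"{replicas} replicas -> {requested} connections requested, fits: {fits}")
```

With a fixed limit, adding replicas to absorb load eventually requests more connections than the database offers, which is why replica counts had to be managed carefully.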

Decision

Ultimately, the team decided to move to the Azure Database for PostgreSQL - Flexible Server product.

Moving to the Flexible Server SKU provides a number of advantages, many of which are covered in this documentation. Specifically, the following have the greatest impact on SimpleReport:

  • High availability of database instances, ensuring minimal downtime in the event of a datacenter or machine outage.
  • Automated patching with a managed maintenance window, ensuring that Azure does not attempt to perform maintenance on our DB during peak SR usage hours.
  • Rapid scaling and performance management, enabling a quick response to customer demand.
  • The integration of PgBouncer connection pooling, which increases resource efficiency and minimizes the need for manual connection management.

Consequences

The Flexible DB rollout has helped reduce response time, according to the Azure metrics.
