Skip to content

Latest commit

 

History

History
881 lines (652 loc) · 45 KB

README.md

File metadata and controls

881 lines (652 loc) · 45 KB

IntegreSQL

IntegreSQL manages isolated PostgreSQL databases for your integration tests.

Do your engineers a favour by allowing them to write fast executing, parallel and deterministic integration tests utilizing real PostgreSQL test databases. Resemble your live environment in tests as close as possible.

sequenceDiagram
    You->>Testrunner: Start tests

    Testrunner->>IntegreSQL: New template database
    IntegreSQL->>PostgreSQL: 
    PostgreSQL-->>IntegreSQL: 
    IntegreSQL-->>Testrunner: 

    Testrunner->>PostgreSQL: Connect to template database, apply all migrations, seed all fixtures, ..., disconnect.
    PostgreSQL-->>Testrunner: 

    Testrunner->>IntegreSQL: Finalize the template database
    IntegreSQL-->>Testrunner: 
  
    Note over Testrunner,PostgreSQL: Your test runner can now get isolated test databases for this hash from the pool!

    loop Each test
    Testrunner->>IntegreSQL: Get test database (looks like template database)
    Testrunner->>PostgreSQL: 
    Note over Testrunner,PostgreSQL: Run your test code in an isolated test database!

    Testrunner-xPostgreSQL: Disconnect from the test database.
    end
Loading

Install

A minimal Docker image is published on GitHub Packages. See GitHub Releases.

docker pull ghcr.io/allaboutapps/integresql:<TAG>

Usage

IntegreSQL is a RESTful JSON API distributed as Docker image and go cli. It's language agnostic and manages multiple PostgreSQL templates and their separate pool of test databases for your tests. It keeps the pool of test databases warm (as it's running in the background) and is fit for parallel test execution with multiple test runners / processes.

Run using Docker (preferred)

Simply start a Docker (19.03 or above) container, provide the required environment variables and expose the server port:

docker run -d --name integresql -e INTEGRESQL_PORT=5000 -p 5000:5000 ghcr.io/allaboutapps/integresql:<TAG>

The container can also be included in your project via Docker Compose (1.25 or above):

version: "3.4"
services:

  # Your main service image
  service:
    depends_on:
      - postgres
      - integresql
    environment:
      PGDATABASE: &PGDATABASE "development"
      PGUSER: &PGUSER "dbuser"
      PGPASSWORD: &PGPASSWORD "9bed16f749d74a3c8bfbced18a7647f5"
      PGHOST: &PGHOST "postgres"
      PGPORT: &PGPORT "5432"
      PGSSLMODE: &PGSSLMODE "disable"

      # optional: env for integresql client testing
      # see https://github.com/allaboutapps/integresql-client-go
      # INTEGRESQL_CLIENT_BASE_URL: "http://integresql:5000/api"

      # [...] additional main service setup

  integresql:
    image: ghcr.io/allaboutapps/integresql:<TAG>
    ports:
      - "5000:5000"
    depends_on:
      - postgres
    environment: 
      PGHOST: *PGHOST
      PGUSER: *PGUSER
      PGPASSWORD: *PGPASSWORD

  postgres:
    image: postgres:12.2-alpine # should be the same version as used live
    # ATTENTION
    # fsync=off, synchronous_commit=off and full_page_writes=off
    # gives us a major speed up during local development and testing (~30%),
    # however you should NEVER use these settings in PRODUCTION unless
    # you want to have CORRUPTED data.
    # DO NOT COPY/PASTE THIS BLINDLY.
    # YOU HAVE BEEN WARNED.
    # Apply some performance improvements to pg as these guarantees are not needed while running locally
    command: "postgres -c 'shared_buffers=128MB' -c 'fsync=off' -c 'synchronous_commit=off' -c 'full_page_writes=off' -c 'max_connections=100' -c 'client_min_messages=warning'"
    expose:
      - "5432"
    ports:
      - "5432:5432"
    environment:
      POSTGRES_DB: *PGDATABASE
      POSTGRES_USER: *PGUSER
      POSTGRES_PASSWORD: *PGPASSWORD
    volumes:
      - pgvolume:/var/lib/postgresql/data

volumes:
  pgvolume: # declare a named volume to persist DB data

You may also refer to our go-starter docker-compose.yml.

Run locally (not recommended)

Installing IntegreSQL locally requires a working Go (1.14 or above) environment. Install the integresql executable to your Go bin folder:

# This installs the latest version of IntegreSQL into your $GOBIN
go install github.com/allaboutapps/integresql/cmd/server@latest

# you may want to rename the binary to integresql after installing:
mv $GOBIN/server $GOBIN/integresql

Running the IntegreSQL server locally requires configuration via exported environment variables (see below).

export INTEGRESQL_PORT=5000
export PGHOST=127.0.0.1
export PGUSER=test
export PGPASSWORD=testpass
integresql

Run within your CI/CD

You'll also want to use integresql within your CI/CD pipeline. We recommend using the Docker image. Simply run it next to the postgres service.

GitHub Actions

For a working sample see allaboutapps/go-starter.

jobs:
  build-test:
    runs-on: ubuntu-latest
    services:
      postgres:
        image: postgres:<TAG>
        env:
          POSTGRES_DB: "development"
          POSTGRES_USER: "dbuser"
          POSTGRES_PASSWORD: "dbpass"
        options: >-
          --health-cmd pg_isready
          --health-interval 10s
          --health-timeout 5s
          --health-retries 5
        ports:
          - 5432:5432
      integresql:
        image: ghcr.io/allaboutapps/integresql:<TAG>
        env:
          PGHOST: "postgres"
          PGUSER: "dbuser"
          PGPASSWORD: "dbpass"

Integrate

You will typically want to integrate by a client lib (see below), but you can also integrate by RESTful JSON calls directly. The flow is illustrated in the follow up section.

Integrate by client lib

It's simple to integrate especially if there is already an client library available for your specific language. We currently have those:

Integrate by RESTful JSON calls

A really good starting point to write your own integresql-client for a specific language can be found here (go code) and here (godoc). It's just RESTful JSON after all.

First start IntegreSQL and leave it running in the background (your PostgreSQL template and test database pool will then always be warm). When you trigger your test command (e.g. make test), 1..n test runners/processes can start in parallel and get ready and isoloated test database from the pool (after the template database(s) was/were initialized).

Once per test runner/process

Each test runner starts and need to communicate with IntegreSQL to setup 1..n template database pools. The following sections describe the flows/scenarios you need to implement.

Testrunner creates a new template database
sequenceDiagram
    You->>Testrunner: make test

    Note right of Testrunner: Compute a hash over all related <br/> files that affect your database<br/> (migrations, fixtures, imports, etc.)

    Note over Testrunner,IntegreSQL: Create a new PostgreSQL template database<br/> identified a the same unique hash <br/>payload: {"hash": "string"} 

    Testrunner->>IntegreSQL: InitializeTemplate: POST /api/v1/templates

    IntegreSQL->>PostgreSQL: CREATE DATABASE <br/>template_<hash>
    PostgreSQL-->>IntegreSQL: 

    IntegreSQL-->>Testrunner: StatusOK: 200

    Note over Testrunner,PostgreSQL: Parse the received database connection payload and connect to the template database.

    Testrunner->>PostgreSQL: Apply all migrations, seed all fixtures, ..., disconnect.
    PostgreSQL-->>Testrunner: 

    Note over Testrunner,IntegreSQL: Finalize the template so it can be used!

    Testrunner->>IntegreSQL: FinalizeTemplate: PUT /api/v1/templates/:hash
    IntegreSQL-->>Testrunner: StatusOK: 200

    Note over Testrunner,PostgreSQL: You can now get isolated test databases for this hash from the pool!

    loop Each test
      Testrunner->>IntegreSQL: GetTestDatabase: GET /api/v1/templates/:hash/tests
      Testrunner->>PostgreSQL: 
    end
Loading
Testrunner reuses an existing template database
sequenceDiagram

    You->>Testrunner: make test

    Note over Testrunner,IntegreSQL: Subsequent testrunners or multiple processes <br/> simply call with the same template hash again.

    Testrunner->>IntegreSQL: InitializeTemplate: POST /api/v1/templates
    IntegreSQL-->>Testrunner: StatusLocked: 423

    Note over Testrunner,IntegreSQL: Some other testrunner / process has already recreated <br/> this PostgreSQL template database identified by this hash<br/> (or is currently doing it), you can just consider<br/> the template ready at this point.

    Note over Testrunner,PostgreSQL: You can now get isolated test databases for this hash from the pool!

    loop Each test
      Testrunner->>IntegreSQL: GetTestDatabase: GET /api/v1/templates/:hash/tests
      Testrunner->>PostgreSQL: 
    end

Loading
Failure modes while template database setup: 503
sequenceDiagram

    You->>Testrunner: make test

    Testrunner->>IntegreSQL: InitializeTemplate: POST /api/v1/templates
    IntegreSQL-->>Testrunner: StatusServiceUnavailable: 503

    Note over Testrunner,PostgreSQL: Typically happens if IntegreSQL cannot communicate with<br/>PostgreSQL, fail the test runner process in this case (e.g. exit 1).

Loading

Per each test

New test database per test

Well, this is the normal flow to get a new isolated test database (prepopulated as its created from the template) for your test.

sequenceDiagram

    Note right of You: ...

    loop Each test

    Note right of Testrunner: Before each test, get a new isolated test database<br/> from the pool for the template hash.

    Testrunner->>IntegreSQL: GetTestDatabase: GET /api/v1/templates/:hash/tests

    Note over Testrunner,IntegreSQL: Blocks until the template is finalized

    Note right of IntegreSQL: The test databases for the template pool<br/>were already created and are simply returned.

    IntegreSQL-->>Testrunner: StatusOK: 200

    Note over Testrunner,PostgreSQL: Your runner now has a fully isolated PostgreSQL database<br/>from our already migrated/seeded template database to use within your test.

    Testrunner->>PostgreSQL: Directly connect to the test database.

    Note over Testrunner,PostgreSQL: Run your test code!

    Testrunner-xPostgreSQL: Disconnect from the test database

    Note over Testrunner,PostgreSQL: Your test is finished.

    end
Loading
Optional: Manually unlocking a test database after a readonly test
  • Returns the given test DB directly to the pool, without cleaning (recreating it).
  • This is optional! If you don't call this endpoints, the test database will be recreated in a FIFO manner (first in, first out) as soon as possible, even though it actually had no changes.
  • This is useful if you are sure, you did not do any changes to the database and thus want to skip the recreation process by returning it to the pool directly.
sequenceDiagram

    Note right of You: ...

    loop Each test

    Testrunner->>IntegreSQL: GetTestDatabase: GET /api/v1/templates/:hash/tests
    IntegreSQL-->>Testrunner: StatusOK: 200

    Testrunner->>PostgreSQL: Directly connect to the test database.

    Note over Testrunner,PostgreSQL: Run your **readonly** test code!

    Testrunner-xPostgreSQL: Disconnect from the test database

    Note over Testrunner,PostgreSQL: Your **readonly** test is finished.<br/> As you did not modify the test database, you can unlock it again<br/>(immediately available in the pool again).

    Testrunner->>IntegreSQL: ReturnTestDatabase: POST /api/v1/templates/:hash/tests/:id/unlock<br/>(previously and soft-deprecated DELETE /api/v1/templates/:hash/tests/:id) 
    IntegreSQL-->>Testrunner: StatusOK: 200

    end
Loading
Optional: Manually recreating a test database
  • Recreates the test DB according to the template and returns it back to the pool.
  • This is optional! If you don't call this endpoint, the test database will be recreated in a FIFO manner (first in, first out) as soon as possible.
  • This is useful if you have parallel testing with a mix of very long and super short tests. Our auto–FIFO recreation handling might block there.
sequenceDiagram

    Note right of You: ...

    loop Each test

    Testrunner->>IntegreSQL: GetTestDatabase: GET /api/v1/templates/:hash/tests
    IntegreSQL-->>Testrunner: StatusOK: 200

    Testrunner->>PostgreSQL: Directly connect to the test database.

    Note over Testrunner,PostgreSQL: Run your test code!

    Testrunner-xPostgreSQL: Disconnect from the test database

    Note over Testrunner,PostgreSQL: Your test is finished.<br/> As you don't want to wait for FIFO autocleaning,<br/> you can manually recreate the test database.

    Testrunner->>IntegreSQL: RecreateTestDatabase: POST /api/v1/templates/:hash/tests/:id/recreate
    IntegreSQL-->>Testrunner: StatusOK: 200

    end
Loading
Failure modes while getting a new test database

Some typical status codes you might encounter while getting a new test database.

StatusNotFound 404

Well, seems like someone forgot to call InitializeTemplate or it errored out.

StatusGone 410

There was an error during test setup with our fixtures, someone called DiscardTemplate, thus this template cannot be used.

StatusServiceUnavailable 503

Well, typically a PostgreSQL connectivity problem

Demo

If you want to take a look on how we integrate IntegreSQL - 🤭 - please just try our go-starter project or take a look at our test_database setup code.

Configuration

IntegreSQL requires little configuration, all of which has to be provided via environment variables (due to the intended usage in a Docker environment). The following settings are available:

Description Environment variable Required Default
Server listen address (defaults to all if empty) INTEGRESQL_ADDRESS ""
Server port INTEGRESQL_PORT 5000
PostgreSQL: host INTEGRESQL_PGHOST, PGHOST Yes "127.0.0.1"
PostgreSQL: port INTEGRESQL_PGPORT, PGPORT 5432
PostgreSQL: username INTEGRESQL_PGUSER, PGUSER, USER Yes "postgres"
PostgreSQL: password INTEGRESQL_PGPASSWORD, PGPASSWORD Yes ""
PostgreSQL: database for manager INTEGRESQL_PGDATABASE "postgres"
PostgreSQL: template database to use INTEGRESQL_ROOT_TEMPLATE "template0"
Managed databases: prefix INTEGRESQL_DB_PREFIX "integresql"
Managed template databases: prefix integresql_template_<HASH> INTEGRESQL_TEMPLATE_DB_PREFIX "template"
Managed test databases: prefix integresql_test_<HASH>_<ID> INTEGRESQL_TEST_DB_PREFIX "test"
Managed test databases: username INTEGRESQL_TEST_PGUSER PostgreSQL: username
Managed test databases: password INTEGRESQL_TEST_PGPASSWORD PostgreSQL: password
Managed test databases: minimal test pool size INTEGRESQL_TEST_INITIAL_POOL_SIZE runtime.NumCPU()
Managed test databases: maximal test pool size INTEGRESQL_TEST_MAX_POOL_SIZE runtime.NumCPU()*4
Maximal number of pool tasks running in parallel INTEGRESQL_POOL_MAX_PARALLEL_TASKS runtime.NumCPU()
Minimal time to wait after a test db recreate has failed INTEGRESQL_TEST_DB_RETRY_RECREATE_SLEEP_MIN_MS 250ms
The maximum possible sleep time between recreation retries INTEGRESQL_TEST_DB_RETRY_RECREATE_SLEEP_MAX_MS 3000ms
Get test-database blocks auto-recreation (FIFO) for this duration INTEGRESQL_TEST_DB_MINIMAL_LIFETIME_MS 250ms
Internal time to wait for a template-database to transition into the 'finalized' state INTEGRESQL_TEMPLATE_FINALIZE_TIMEOUT_MS 60000ms
Internal time to wait for a ready database INTEGRESQL_TEST_DB_GET_TIMEOUT_MS 60000ms
Enables pprof debug endpoints under /debug/* INTEGRESQL_DEBUG_ENDPOINTS false
Enables echo framework debug mode INTEGRESQL_ECHO_DEBUG false
Enables CORS INTEGRESQL_ECHO_ENABLE_CORS_MIDDLEWARE true
Enables logger INTEGRESQL_ECHO_ENABLE_LOGGER_MIDDLEWARE true
Enables recover INTEGRESQL_ECHO_ENABLE_RECOVER_MIDDLEWARE true
Sets request_id to context INTEGRESQL_ECHO_ENABLE_REQUEST_ID_MIDDLEWARE true
Auto-adds trailing slash INTEGRESQL_ECHO_ENABLE_TRAILING_SLASH_MIDDLEWARE true
Enables timeout middleware INTEGRESQL_ECHO_ENABLE_REQUEST_TIMEOUT_MIDDLEWARE true
Generic timeout handling for most endpoints INTEGRESQL_ECHO_REQUEST_TIMEOUT_MS 60000ms
Show logs of severity INTEGRESQL_LOGGER_LEVEL "info"
Request log severity INTEGRESQL_LOGGER_REQUEST_LEVEL "info"
Should the request-log include the body? INTEGRESQL_LOGGER_LOG_REQUEST_BODY false
Should the request-log include headers? INTEGRESQL_LOGGER_LOG_REQUEST_HEADER false
Should the request-log include the query? INTEGRESQL_LOGGER_LOG_REQUEST_QUERY false
Should the request-log include the response body? INTEGRESQL_LOGGER_LOG_RESPONSE_BODY false
Should the request-log include the response header? INTEGRESQL_LOGGER_LOG_RESPONSE_HEADER false
Should the console logger pretty-print the log (instead of json)? INTEGRESQL_LOGGER_PRETTY_PRINT_CONSOLE false

Architecture

TestDatabase states

The following describes the state and transitions of a TestDatabase.

stateDiagram-v2

    HashPool --> TestDatabase: Task EXTEND

    state TestDatabase {
        [*] --> ready: init
        ready --> dirty: GetTestDatabase()
        dirty --> ready: ReturnTestDatabase()
        dirty --> recreating: RecreateTestDatabase()\nTask CLEAN_DIRTY
        recreating --> ready: generation++
        recreating --> recreating: retry (still in use)
    }
Loading

Pool structure

The following describes the relationship between the components of IntegreSQL.

erDiagram
    Server ||--o| Manager : owns
    Manager {
        Template[] templateCollection
        HashPool[] poolCollection
    }
    Manager ||--o{ HashPool : has
    Manager ||--o{ Template : has
    Template {
        TemplateDatabase database
    }
    HashPool {
        TestDatabase database
    }
    HashPool ||--o{ TestDatabase : "manages"
    Template ||--|| TemplateDatabase : "sets"
    TestDatabase {
        int ID
        Database database
    }
    TemplateDatabase {
        Database database
    }
    Database {
        string TemplateHash
        Config DatabaseConfig
    }
    TestDatabase o|--|| Database : "is"
    TemplateDatabase o|--|| Database : "is"
Loading

Background

We came a long way to realize that something just did not feel right with our PostgreSQL integration testing strategies. This is a loose summary of how this project came to life.

Approach 0: Leaking database mutations for subsequent tests

Testing our customer backends actually started quite simple:

  • Test runner starts
  • Recreate a PostgreSQL test database
  • Apply all migrations
  • Seed all fixtures
  • Utilizing the same PostgreSQL test database for each test:
    • Run your test code
  • Test runner ends

It's quite easy to spot the problem with this approach. Data may be mutated by any single test and is visible from all subsequent tests. It becomes cumbersome to make changes in your test code if you can't rely on a clean state in each and every test.

Approach 1: Isolating by resetting

Let's try to fix that like this:

  • Test runner starts
  • Recreate a PostgreSQL test database
  • Before each test:
    • Truncate
    • Apply all migrations
    • Seed all fixtures
  • Utilizing the same PostgreSQL test database for each test:
    • Run your test code
  • Test runner ends

Well, it's now isolated - but testing time has increased by a rather high factor and is totally dependent on your truncate/migrate/seed operations.

Approach 2a: Isolation by transactions

What about using database transactions?

  • Test runner starts
  • Recreate a PostgreSQL test database
  • Apply all migrations
  • Seed all fixtures
  • Before each test:
    • Start a new database transaction
  • Utilizing the same PostgreSQL test database for each test:
    • Run your test code
  • After each test:
    • Rollback the database transaction
  • Test runner ends

After spending various time to rewrite all code to actually use the injected database transaction in each code, you realize that nested transactions are not supported and can only be poorly emulated using save points. All database transaction specific business code, especially their potential error state, is not properly testable this way. You therefore ditch this approach.

Approach 2b: Isolation by mocking

What about using database mocks?

  • Test runner starts
  • Utilizing an in-memory mock database isolated for each test:
    • Run your test code
  • Test runner ends

I'm generally not a fan of emulating database behavior through a mocking layer while testing/implementing. Even minor version changes of PostgreSQL plus it's extensions (e.g. PostGIS) may introduce slight differences, e.g. how indices are used, function deprecations, query planner, etc. . It might not even be an erroneous result, just performance regressions or slight sorting differences in the returned query result.

We try to approximate local/test and live as close as possible, therefore using the same database, with the same extensions in their exact same version is a hard requirement for us while implementing/testing locally.

Approach 3a: Isolation by templates

We discovered that using PostgreSQL templates and creating the actual new test database from them is quite fast, let's to this:

  • Test runner starts
  • Recreate a PostgreSQL template database
  • Apply all migrations
  • Seed all fixtures
  • Before each test:
    • Create a new PostgreSQL test database from our already migrated/seeded template database
  • Utilizing a new isolated PostgreSQL test database for each test:
    • Run your test code
  • Test runner ends

Well, we are up in speed again, but we still can do better, how about...

Approach 3b: Isolation by cached templates

  • Test runner starts
  • Check migrations/fixtures have changed (hash over all related files)
    • Yes
      • Recreate a PostgreSQL template database
      • Apply all migrations
      • Seed all fixtures
    • No, nothing has changed
      • Simply reuse the previous PostgreSQL template database
  • Before each test:
    • Create a new PostgreSQL test database from our already migrated/seeded template database
  • Utilizing a new isolated PostgreSQL test database for each test:
    • Run your test code
  • Test runner ends

This gives a significant speed bump as we no longer need to recreate our template database if no files related to the database structure or fixtures have changed. However, we still need to create a new PostgreSQL test database from a template before running any test. Even though this is quite fast, could we do better?

Approach 3c: Isolation by cached templates and pool

  • Test runner starts
  • Check migrations/fixtures have changed (hash over all related files)
    • Yes
      • Recreate a PostgreSQL template database
      • Apply all migrations
      • Seed all fixtures
    • No, nothing has changed
      • Simply reuse the previous PostgreSQL template database
  • Create a pool of n PostgreSQL test databases from our already migrated/seeded template database
  • Before each test:
    • Select the first new PostgreSQL test database that is ready from the test pool
  • Utilizing your selected PostgreSQL test database from the test pool for each test:
    • Run your test code
  • After each test:
    • If there are still tests lefts to run add some additional PostgreSQL test databases from our already migrated/seeded template database
  • Test runner ends

Finally, by keeping a warm pool of test database we arrive at the speed of Approach 0, while having the isolation gurantees of all subsequent approaches. This is actually the (simplified) strategy, that we have used in allaboutapps-backend-stack for many years.

Approach 3c benchmark 1: Baseline

Here's a quick benchmark of how this strategy typically performed back then:

--- ----------------<storageHelper strategy report>---------------- ---
    replicas switched:          50     avg=11ms min=1ms max=445ms
    replicas awaited:           1      prebuffer=8 avg=436ms max=436ms
    background replicas:        58     avg=272ms min=41ms max=474ms
    - warm up template (cold):  82%    2675ms
        * truncate:             62%    2032ms
        * migrate:              18%    594ms
        * seed:                 1%     45ms
    - switching:                17%    571ms
        * disconnect:           1%     42ms
        * switch replica:       14%    470ms
            - resolve next:     1%     34ms
            - await next:       13%    436ms
        * reinitialize:         1%     57ms
    strategy related time:      ---    3246ms
    vs total executed time:     20%    15538ms
--- ---------------</ storageHelper strategy report>--------------- ---

This is a rather small testsuite with 50 tests and with a tiny database. Thus the whole test run was finished in ~15sec. ~2.7sec were spend setting up the template within the warm up (truncate + migrate + seed) and ~0.6sec in total waiting for a new test/replica databases to become available for a test. We spend ~20% of our total execution time running / waiting inside our test strategy approach.

This a cold start. You pay for this warm-up flow only if no template database was cached by a previous test run (if your migrations + fixtures files - the hash over these files - hasn't changed).

A new test database (called a replica here) from this tiny template database took max. ~500ms to create, on avg. this was ~halfed and most importantly can be done in the background (while some tests already execute).

The cool thing about having a warm pool of replicas setup in the background, is that selecting new replicas from the pool is blazingly fast, as typically they will be already ready when it's time to execute the next test. For instance, it took ~500ms max. and 11ms on avg. to select a new replica for all subsequent tests (we only had to wait once until a replica became available for usage within a test - typically it's the first test to be executed).

Approach 3c benchmark 2: Small project

Let's look at a sightly bigger testsuite and see how this approach may possibly scale:

--- -----------------<storageHelper strategy report>------------------ ---
    replicas switched:             280    avg=26ms min=11ms max=447ms
    replicas awaited:              1      prebuffer=8 avg=417ms max=417ms
    background replicas:           288    avg=423ms min=105ms max=2574ms
    - warm up template (cold):     40%    5151ms
        * truncate:                8%     980ms
        * migrate:                 26%    3360ms
        * seed:                    4%     809ms
    - switching:                   60%    7461ms
        * disconnect:              2%     322ms
        * switch replica:          6%     775ms
            - resolve next:        2%     358ms
            - await next:          3%     417ms
        * reinitialize:            50%    6364ms
    strategy related time:         ---    12612ms
    vs total executed time:        11%    111094ms
--- ----------------</ storageHelper strategy report>----------------- ---

This test suite is larger and comes with 280 tests, the whole test run finished in ~1m50s (~390ms per test on avg.). ~5.2sec were spend setting up the template and ~7.5sec in total waiting for a new test / replica databases to become available for a test.

The rise in switching time is expected, as we need way more replicas / test databases this time, however we only spend ~11% running / waiting inside our test strategy approach. To put that into perspective, each test only had to wait ~26ms on avg. until it could finally execute (and typically, this is solely the time it needs to open up a new database connection).

This should hopefully give you some base understanding on why we consider this testing approach essential for our projects. It's the sweet combination of speed and isolation.

Final approach: IntegreSQL

We realized that having the above pool logic directly within the test runner is actually counterproductive and is further limiting usage from properly utilizing parallel testing (+performance).

As we switched to Go as our primary backend engineering language, we needed to rewrite the above logic anyways and decided to provide a safe and language agnostic way to utilize this testing strategy with PostgreSQL.

This is how IntegreSQL was born.

Benchmarks

Benchmark v1.1.0 vs v1.0.0

We focued on improving the pool manager performance in v1.1.0, especially when it comes to locking and thus request latency.

benchmark comparison v1.1.0

The main goal was to bring IntegreSQL's performance on par with our previous native Node.js implementation, which we also benchmarked:

# Previous Node.js implementation 
--- -----------------<storageHelper strategy report>------------------ ---
    replicas switched:             563    avg=14ms min=6ms max=316ms
    replicas awaited:              1      prebuffer=8 avg=301ms max=301ms
    background replicas:           571    avg=-ms min=-ms max=1180ms
    - warm up:                     32%    4041ms 
        * drop/cache check:        4%     561ms 
        * migrate/cache reuse:     25%    3177ms 
        * fixtures:                2%     302ms
        * special:                 0%     0ms
        * create pool:             0%     1ms
    - switching:                   67%    8294ms
        * disconnect:              1%     139ms
        * switch slave:            4%     591ms
            - resolve next:        2%     290ms
            - await next:          2%     301ms
        * reinitialize:            61%    7563ms
    strategy related time:                12335ms 
    vs total executed time:        11%    106184ms
--- ----------------</ storageHelper strategy report>----------------- ---
Done in 106.60s.

# IntegreSQL v1.1.0 (next version)
--- -----------------<integresql strategy report>------------------ ---
    replicas switched:             563    avg=70ms min=58ms max=603ms
    replicas awaited:              1      prebuffer=8 avg=72ms max=72ms
    background replicas:           571    avg=58ms min=49ms max=520ms
    - warm up:                     9%     4101ms
        * drop/cache check:        0%     1ms
        * migrate/cache reuse:     8%     3520ms
        * fixtures:                0%     296ms
        * special:                 0%     0ms
        * create pool:             0%     284ms
    - switching:                   90%    39865ms
        * disconnect:              0%     120ms
        * switch replica:          0%     261ms (563x min=0ms q25=0ms q50=0ms q75=1ms q95=1ms max=72ms)
            - resolve next:        0%     189ms
            - await next:          0%     72ms
        * reinitialize:            89%    39478ms (563x min=58ms q25=66ms q50=68ms q75=71ms q95=80ms max=531ms)
    strategy related time:                43966ms
    vs total executed time:        40%    109052ms
--- ----------------</ integresql strategy report>----------------- ---
Done in 109.45s.

# IntegreSQL v1.0.0 (previous version)
--- -----------------<integresql strategy report>------------------ ---
    replicas switched:             563    avg=131ms min=9ms max=2019ms
    replicas awaited:              94     prebuffer=8 avg=590ms max=1997ms
    background replicas:           571    avg=1292ms min=52ms max=3817ms
    - warm up:                     7%     6144ms
        * drop/cache check:        0%     0ms
        * migrate/cache reuse:     4%     3587ms
        * fixtures:                0%     298ms
        * special:                 0%     0ms
        * create pool:             2%     2259ms
    - switching:                   92%    73837ms
        * disconnect:              0%     112ms
        * switch replica:          64%    51552ms (563x min=0ms q25=0ms q50=0ms q75=1ms q95=773ms max=1997ms)
            - resolve next:        5%    3922ms
            - await next:          69%    55474ms
        * reinitialize:            27%    22169ms (563x min=9ms q25=12ms q50=15ms q75=19ms q95=187ms max=1201ms)
    strategy related time:                79981ms
    vs total executed time:        51%    153889ms
--- ----------------</ integresql strategy report>----------------- ---
Done in 154.29s.

Contributing

Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.

Please make sure to update tests as appropriate.

Development setup

IntegreSQL requires the following local setup for development:

The project makes use of the devcontainer functionality provided by Visual Studio Code so no local installation of a Go compiler is required when using VSCode as an IDE.

Should you prefer to develop IntegreSQL without the Docker setup, please ensure a working Go (1.14 or above) environment has been configured as well as a PostgreSQL instance is available (tested against version 12 or above, but should be compatible to lower versions) and the appropriate environment variables have been configured as described in the Install section.

Development quickstart

  1. Start the local docker-compose setup and open an interactive shell in the development container:
# Build the development Docker container, start it and open a shell
./docker-helper.sh --up
  1. Initialize the project, downloading all dependencies and tools required (executed within the dev container):
# Init dependencies/tools
make init

# Build executable (generate, format, build, vet)
make
  1. Execute project tests and start server:
# Execute tests
make test

# Run IntegreSQL server with config from environment
integresql

Maintainers

Previous maintainers

License

MIT © 2020-2024 aaa – all about apps GmbH | Nick Müller | Mario Ranftl and the IntegreSQL project contributors