benzht/docker-postgres-backup-local

postgres-backup-local

Back up PostgreSQL to the local filesystem with periodic rotating backups, based on schickling/postgres-backup-s3. Back up multiple databases from the same host by setting the database names in POSTGRES_DB, separated by commas or spaces.
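
As a rough illustration of the comma-or-space convention, the sketch below splits a POSTGRES_DB value into one name per line. The function name list_databases is hypothetical and not part of the image; the image's own parsing may differ.

```shell
#!/bin/sh
# Hypothetical helper (not part of the image): print one database name
# per line from a comma- or space-separated list such as "db1,db2 db3".
list_databases() {
  # Turn commas into spaces, then let the shell split on whitespace.
  for db in $(printf '%s' "$1" | tr ',' ' '); do
    printf '%s\n' "$db"
  done
}

# Falls back to a sample value when POSTGRES_DB is not set.
list_databases "${POSTGRES_DB:-db1,db2 db3}"
```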

Supports the following Docker architectures: linux/amd64, linux/arm64, linux/arm/v7, linux/s390x, linux/ppc64le.

Please consider reading the section "How the backups folder works?" in detail.

This application requires the docker volume /backups to be a POSIX-compliant filesystem to store the backups (mainly with support for hardlinks and softlinks). So filesystems like VFAT, EXFAT, SMB/CIFS, ... can't be used with this docker image.
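
To check a candidate host directory before mounting it as /backups, something like the following sketch can help. It only probes whether hard links and symbolic links can be created; the function name supports_links and the probe file name are assumptions made for this example.

```shell
#!/bin/sh
# Hypothetical pre-flight check (not part of the image): succeed only if
# the given directory supports both hard links and symbolic links.
supports_links() {
  dir=$1
  probe="$dir/.linkprobe.$$"
  touch "$probe" 2>/dev/null || return 1
  ok=0
  ln "$probe" "$probe.hard" 2>/dev/null || ok=1      # hard link
  ln -s "$probe" "$probe.soft" 2>/dev/null || ok=1   # symbolic link
  rm -f "$probe" "$probe.hard" "$probe.soft"
  return "$ok"
}

# Checks BACKUP_DIR if set, otherwise the current directory.
if supports_links "${BACKUP_DIR:-.}"; then
  echo "hard links and symlinks are supported"
else
  echo "this filesystem cannot be used for /backups" >&2
fi
```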

Usage

Docker:

docker run -u postgres:postgres -e POSTGRES_HOST=postgres -e POSTGRES_DB=dbname -e POSTGRES_USER=user -e POSTGRES_PASSWORD=password  prodrigestivill/postgres-backup-local

Docker Compose:

version: '2'
services:
    postgres:
        image: postgres
        restart: always
        environment:
            - POSTGRES_DB=database
            - POSTGRES_USER=username
            - POSTGRES_PASSWORD=password
         #  - POSTGRES_PASSWORD_FILE=/run/secrets/db_password <-- alternative for POSTGRES_PASSWORD (to use with docker secrets)
    pgbackups:
        image: prodrigestivill/postgres-backup-local
        restart: always
        user: postgres:postgres # Optional: see below
        volumes:
            - /var/opt/pgbackups:/backups
        links:
            - postgres
        depends_on:
            - postgres
        environment:
            - POSTGRES_HOST=postgres
            - POSTGRES_DB=database
            - POSTGRES_USER=username
            - POSTGRES_PASSWORD=password
         #  - POSTGRES_PASSWORD_FILE=/run/secrets/db_password <-- alternative for POSTGRES_PASSWORD (to use with docker secrets)
            - POSTGRES_EXTRA_OPTS=-Z1 --schema=public --blobs
            - SCHEDULE=@daily
            - BACKUP_ON_START=TRUE
            - BACKUP_KEEP_DAYS=7
            - BACKUP_KEEP_WEEKS=4
            - BACKUP_KEEP_MONTHS=6
            - HEALTHCHECK_PORT=8080

For security reasons, it is recommended to run it as user postgres:postgres.

When running as the postgres user, the system administrator must initialize the permissions of the destination folder as follows:

# for default images (debian)
mkdir -p /var/opt/pgbackups && chown -R 999:999 /var/opt/pgbackups
# for alpine images
mkdir -p /var/opt/pgbackups && chown -R 70:70 /var/opt/pgbackups

Environment Variables

Most variables are the same as in the official postgres image.

| env variable | description |
| --- | --- |
| BACKUP_DIR | Directory to save the backup at. Defaults to /backups. |
| BACKUP_SUFFIX | Filename suffix to save the backup. Defaults to .sql.gz. |
| BACKUP_ON_START | If set to TRUE, performs a backup on each container start or restart. Defaults to FALSE. |
| BACKUP_KEEP_DAYS | Number of daily backups to keep before removal. Defaults to 7. |
| BACKUP_KEEP_WEEKS | Number of weekly backups to keep before removal. Defaults to 4. |
| BACKUP_KEEP_MONTHS | Number of monthly backups to keep before removal. Defaults to 6. |
| BACKUP_KEEP_MINS | Number of minutes to keep backups in the last folder before removal. Defaults to 1440. |
| BACKUP_LATEST_TYPE | Type of latest pointer (symlink, hardlink, none). Defaults to symlink. |
| HEALTHCHECK_PORT | Port listening for cron-schedule health check. Defaults to 8080. |
| POSTGRES_DB | Comma or space separated list of postgres databases to back up. If POSTGRES_CLUSTER is set, this refers to the database to connect to for dumping global objects and discovering what other databases should be dumped (typically either postgres or template1). Required. |
| POSTGRES_DB_FILE | Alternative to POSTGRES_DB, but with one database per line, for usage with docker secrets. |
| POSTGRES_EXTRA_OPTS | Additional options for pg_dump (or pg_dumpall options if POSTGRES_CLUSTER is set). Defaults to -Z1. |
| POSTGRES_CLUSTER | Set to TRUE in order to use pg_dumpall instead. Also set POSTGRES_EXTRA_OPTS to any value or empty, since the default value is not compatible with pg_dumpall. |
| POSTGRES_HOST | Postgres connection parameter; postgres host to connect to. Required. |
| POSTGRES_PASSWORD | Postgres connection parameter; postgres password to connect with. Required. |
| POSTGRES_PASSWORD_FILE | Alternative to POSTGRES_PASSWORD, for usage with docker secrets. |
| POSTGRES_PASSFILE_STORE | Alternative to POSTGRES_PASSWORD in passfile format, for usage with postgres clusters. |
| POSTGRES_PORT | Postgres connection parameter; postgres port to connect to. Defaults to 5432. |
| POSTGRES_USER | Postgres connection parameter; postgres user to connect with. Required. |
| POSTGRES_USER_FILE | Alternative to POSTGRES_USER, for usage with docker secrets. |
| SCHEDULE | Cron-schedule specifying the interval between postgres backups. Defaults to @daily. |
| TZ | POSIX TZ variable specifying the timezone used to evaluate SCHEDULE cron (example "Europe/Paris"). |
| WEBHOOK_URL | URL to be called after an error or after a successful backup (POST with a JSON payload, check hooks/00-webhook file for more info). Default disabled. |
| WEBHOOK_ERROR_URL | URL to be called in case backup fails. Default disabled. |
| WEBHOOK_PRE_BACKUP_URL | URL to be called when backup starts. Default disabled. |
| WEBHOOK_POST_BACKUP_URL | URL to be called when backup completes successfully. Default disabled. |
| WEBHOOK_EXTRA_ARGS | Extra arguments for the curl execution in the webhook (check hooks/00-webhook file for more info). |
| POST_DUMP_HOOK | Allows actions to be performed on the database dumps, for instance tar-ing a directory dump and/or encrypting it. When defined, the hook is called with three parameters: POSTGRES_HOST, DB and DFILE (the name of the dumped file/dir). The hook is supposed to return the name of the 'processed' dump. |

Special Environment Variables

These variables are not intended to be used for normal deployment operations:

| env variable | description |
| --- | --- |
| POSTGRES_PORT_5432_TCP_ADDR | Sets POSTGRES_HOST when the latter is not set. |
| POSTGRES_PORT_5432_TCP_PORT | Sets POSTGRES_PORT when POSTGRES_HOST is not set. |
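
The fallback described in this table can be pictured roughly as follows. This is a sketch of the stated behaviour under the assumption that both variables are consulted only when POSTGRES_HOST is empty; the function name resolve_connection is hypothetical and the image's script may be organised differently.

```shell
#!/bin/sh
# Hypothetical sketch (not the image's actual code): when POSTGRES_HOST is
# empty, take host and port from the *_5432_TCP_* variables.
resolve_connection() {
  if [ -z "${POSTGRES_HOST:-}" ]; then
    POSTGRES_HOST="${POSTGRES_PORT_5432_TCP_ADDR:-}"
    POSTGRES_PORT="${POSTGRES_PORT_5432_TCP_PORT:-5432}"
  fi
  # Documented default port when nothing else is given.
  POSTGRES_PORT="${POSTGRES_PORT:-5432}"
  printf '%s:%s\n' "$POSTGRES_HOST" "$POSTGRES_PORT"
}
```

Calling resolve_connection with POSTGRES_HOST already set leaves the explicit host and port untouched.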

How the backups folder works?

First, a new backup is created in the last folder, with the full timestamp in its name.

Once this backup finishes successfully, it is hard linked (instead of copied, to avoid using more space) to the rest of the folders (daily, weekly and monthly). This step replaces the old backups for that category, so each category always stores only the latest backup (the monthly backup for a month is therefore the latest one of that month, not the first).

So the backup folders are structured as follows:

  • BACKUP_DIR/last/DB-YYYYMMDD-HHmmss.sql.gz: all the backups are stored separately in this folder.
  • BACKUP_DIR/daily/DB-YYYYMMDD.sql.gz: always store (hard link) the latest backup of that day.
  • BACKUP_DIR/weekly/DB-YYYYww.sql.gz: always store (hard link) the latest backup of that week (the last day of the week will be Sunday as it uses ISO week numbers).
  • BACKUP_DIR/monthly/DB-YYYYMM.sql.gz: always store (hard link) the latest backup of that month (normally the ~31st).

For simplicity, the following symlinks are also updated after each successful backup:

BACKUP_DIR/last/DB-latest.sql.gz -> BACKUP_DIR/last/DB-YYYYMMDD-HHmmss.sql.gz
BACKUP_DIR/daily/DB-latest.sql.gz -> BACKUP_DIR/daily/DB-YYYYMMDD.sql.gz
BACKUP_DIR/weekly/DB-latest.sql.gz -> BACKUP_DIR/weekly/DB-YYYYww.sql.gz
BACKUP_DIR/monthly/DB-latest.sql.gz -> BACKUP_DIR/monthly/DB-YYYYMM.sql.gz
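
The linking step described above can be sketched as follows. This is an illustration, not the image's actual script: the function name link_backup is hypothetical, the date formats (including %G%V for ISO year and week) are assumptions chosen to match the file names listed above, and the latest symlinks are created as relative links.

```shell
#!/bin/sh
# Hypothetical sketch of the link step (not the image's actual script).
# Usage: link_backup BACKUP_DIR DB FILE_IN_LAST [SUFFIX]
link_backup() {
  backup_dir=$1 db=$2 src=$3 suffix=${4:-.sql.gz}
  mkdir -p "$backup_dir/daily" "$backup_dir/weekly" "$backup_dir/monthly"
  daily="$backup_dir/daily/$db-$(date +%Y%m%d)$suffix"
  weekly="$backup_dir/weekly/$db-$(date +%G%V)$suffix"    # ISO year + ISO week
  monthly="$backup_dir/monthly/$db-$(date +%Y%m)$suffix"
  for target in "$daily" "$weekly" "$monthly"; do
    # Hard link: replaces any earlier backup of the same day/week/month.
    ln -f "$src" "$target"
    # Refresh the DB-latest pointer inside the same folder.
    ln -sfn "$(basename "$target")" "$(dirname "$target")/$db-latest$suffix"
  done
  ln -sfn "$(basename "$src")" "$(dirname "$src")/$db-latest$suffix"
}
```

Because the category files are hard links to the file in last, removing the file from last later does not free or damage the daily, weekly or monthly copies.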

For cleaning, the script removes the files for each category only if the new backup has been successful. To do so, it uses the following independent variables:

  • BACKUP_KEEP_MINS: will remove files from the last folder that are older than its value in minutes after a new successful backup, without affecting the rest of the backups (because they are hard links).
  • BACKUP_KEEP_DAYS: will remove files from the daily folder that are older than its value in days after a new successful backup.
  • BACKUP_KEEP_WEEKS: will remove files from the weekly folder that are older than its value in weeks after a new successful backup (remember that it starts counting from the end of each week, not the beginning).
  • BACKUP_KEEP_MONTHS: will remove files from the monthly folder that are older than its value in months (of 31 days) after a new successful backup (remember that it starts counting from the end of each month, not the beginning).
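
An age-based cleanup of this kind can be sketched with find. The helper names prune_older_than_days and keep_days_for are hypothetical, and the image's script may compute its thresholds differently; the sketch only assumes 7-day weeks and the 31-day months mentioned above. Note that find's -mtime +N matches files whose age, counted in whole days, is greater than N.

```shell
#!/bin/sh
# Hypothetical sketch (not the image's actual script): delete backups of DB
# in DIR that were last modified more than DAYS days ago.
# Usage: prune_older_than_days DIR DB DAYS [SUFFIX]
prune_older_than_days() {
  dir=$1 db=$2 days=$3 suffix=${4:-.sql.gz}
  find "$dir" -maxdepth 1 -name "$db-*$suffix" \
    ! -name "$db-latest$suffix" -mtime "+$days" -exec rm -rf {} +
}

# Convert the per-category settings into days.
keep_days_for() {
  case $1 in
    daily)   echo "${BACKUP_KEEP_DAYS:-7}" ;;
    weekly)  echo $(( ${BACKUP_KEEP_WEEKS:-4} * 7 )) ;;
    monthly) echo $(( ${BACKUP_KEEP_MONTHS:-6} * 31 )) ;;
  esac
}
```

For example, prune_older_than_days /backups/weekly mydb "$(keep_days_for weekly)" would clean the weekly folder for a database named mydb (a placeholder name).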

Hooks

The folder hooks inside the container can contain hooks/scripts to be run in different cases, receiving the exact situation as their first argument (error, pre-backup or post-backup).

Just create a script in that folder with execution permission so that run-parts can execute it on each state change.

As an example, take a look at the script already present there, which implements the WEBHOOK_URL functionality.
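
A minimal custom hook could look like the sketch below. It only relies on the first argument being one of the three states named above; the function name describe_event and the messages are made up for this example, and the script would need to be saved into the hooks folder with execution permission.

```shell
#!/bin/sh
# Hypothetical hook script: print a line depending on the state that is
# passed as the first argument (error, pre-backup or post-backup).
describe_event() {
  case $1 in
    pre-backup)  echo "backup starting" ;;
    post-backup) echo "backup finished" ;;
    error)       echo "backup failed" ;;
    *)           echo "unknown event: $1" ;;
  esac
}

# Only act when a state was actually passed.
if [ "$#" -gt 0 ]; then
  describe_event "$1"
fi
```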

Manual Backups

By default this container makes daily backups, but you can start a manual backup by running /backup.sh.

As an example, the following command creates one backup as the running user and saves it in the working folder.

docker run --rm -v "$PWD:/backups" -u "$(id -u):$(id -g)" -e POSTGRES_HOST=postgres -e POSTGRES_DB=dbname -e POSTGRES_USER=user -e POSTGRES_PASSWORD=password  prodrigestivill/postgres-backup-local /backup.sh

Automatic Periodic Backups

Set the SCHEDULE environment variable (for example -e SCHEDULE="@daily") to change the backup frequency. The default is daily.

More information about the scheduling can be found here.

Folders daily, weekly and monthly are created and populated using hard links to save disk space.

Restore examples

Some examples to restore/apply the backups.

Restore using the same container

To restore using the same backup container, replace $BACKUPFILE, $CONTAINER, $USERNAME and $DBNAME in the following command:

docker exec --tty --interactive $CONTAINER /bin/sh -c "zcat $BACKUPFILE | psql --username=$USERNAME --dbname=$DBNAME -W"

Restore using a new container

Replace $BACKUPFILE, $VERSION, $HOSTNAME, $PORT, $USERNAME and $DBNAME in the following command:

docker run --rm --tty --interactive -v $BACKUPFILE:/tmp/backupfile.sql.gz postgres:$VERSION /bin/sh -c "zcat /tmp/backupfile.sql.gz | psql --host=$HOSTNAME --port=$PORT --username=$USERNAME --dbname=$DBNAME -W"
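
Since both restore commands pipe the file through zcat, it can be worth confirming that the file is an intact gzip stream before starting. The sketch below is an optional extra, not something the image provides; the function name verify_backup is hypothetical, and the check only applies to gzip-compressed plain dumps such as the default .sql.gz files.

```shell
#!/bin/sh
# Hypothetical pre-restore check (not part of the image): succeed only if
# the given file is readable and passes gzip's integrity test.
verify_backup() {
  [ -r "$1" ] && gzip -t "$1" 2>/dev/null
}

# Only runs when a file name is passed to the script.
if [ "$#" -gt 0 ]; then
  if verify_backup "$1"; then
    echo "$1 looks intact"
  else
    echo "$1 is missing or corrupted" >&2
  fi
fi
```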

Use the POST_DUMP_HOOK to tar directory backup

As stated in "Compress directory backups as a single file" (#33), it is often more convenient to store directory backups as single files (tars), especially when saving in locations with file-count based billing (S3).

When we extend the example from above with the new POST_DUMP_HOOK like this

- POST_DUMP_HOOK=tar --create --file "${DFILE}.tar" --remove-files "${DFILE}"; DFILE="${DFILE}.tar"

then each dumped directory (${DFILE}) will be tar-ed and the tar-ed file will show up in the daily, weekly, and monthly directories.

Use the POST_DUMP_HOOK to encrypt the backup

My original motivation for this hook is actually the need to encrypt the database backups. To achieve that, we mount a script implementing the encryption (and some required resources like an X509 cert) in a directory and call it from the hook.

version: '2'
services:
     # ....
    pgbackups:
      # ....
      volumes:
        - ./scripts:/scripts:ro
      # ....
      environment:
        # ....
        - POSTGRES_EXTRA_OPTS=-Z6 --format=directory
        - BACKUP_SUFFIX=.crypt.tar
        - POST_DUMP_HOOK=/scripts/encrypt.sh "$${DFILE}"

The script in /scripts/encrypt.sh replaces the directory postgres dumped into with an encrypted tar'ed file with the same name.
