From ab10f156ab0d6c397c056a82ddfc8a238038d13d Mon Sep 17 00:00:00 2001 From: Ronan Dunklau Date: Tue, 27 Jul 2021 15:31:20 +0200 Subject: [PATCH] Initial documentation import. This adds a documentation using sphinx, which is basically the content of the README file, with some editing and a bit more content in some places. I left a lot of FIXME in the documentation since there is still much to do, but at least it will be a nice start. --- docs/.gitignore | 1 + docs/Makefile | 20 ++ docs/about.rst | 35 +++ docs/architecture.rst | 71 ++++++ docs/commands.rst | 132 ++++++++++ docs/conf.py | 56 +++++ docs/configuration.rst | 553 +++++++++++++++++++++++++++++++++++++++++ docs/development.rst | 98 ++++++++ docs/index.rst | 60 +++++ docs/install.rst | 103 ++++++++ docs/make.bat | 35 +++ docs/monitoring.rst | 47 ++++ docs/quickstart.rst | 179 +++++++++++++ 13 files changed, 1390 insertions(+) create mode 100644 docs/.gitignore create mode 100644 docs/Makefile create mode 100644 docs/about.rst create mode 100644 docs/architecture.rst create mode 100644 docs/commands.rst create mode 100644 docs/conf.py create mode 100644 docs/configuration.rst create mode 100644 docs/development.rst create mode 100644 docs/index.rst create mode 100644 docs/install.rst create mode 100644 docs/make.bat create mode 100644 docs/monitoring.rst create mode 100644 docs/quickstart.rst diff --git a/docs/.gitignore b/docs/.gitignore new file mode 100644 index 00000000..69fa449d --- /dev/null +++ b/docs/.gitignore @@ -0,0 +1 @@ +_build/ diff --git a/docs/Makefile b/docs/Makefile new file mode 100644 index 00000000..d4bb2cbb --- /dev/null +++ b/docs/Makefile @@ -0,0 +1,20 @@ +# Minimal makefile for Sphinx documentation +# + +# You can set these variables from the command line, and also +# from the environment for the first two. +SPHINXOPTS ?= +SPHINXBUILD ?= sphinx-build +SOURCEDIR = . +BUILDDIR = _build + +# Put it first so that "make" without argument is like "make help". +help: + @$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O) + +.PHONY: help Makefile + +# Catch-all target: route all unknown targets to Sphinx using the new +# "make mode" option. $(O) is meant as a shortcut for $(SPHINXOPTS). +%: Makefile + @$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O) diff --git a/docs/about.rst b/docs/about.rst new file mode 100644 index 00000000..aef21958 --- /dev/null +++ b/docs/about.rst @@ -0,0 +1,35 @@ +About PGHoard +============= + +Features +-------- + +* Automatic periodic basebackups +* Automatic transaction log (WAL/xlog) backups (using either ``pg_receivewal`` + (formerly ``pg_receivexlog``), ``archive_command`` or experimental PG native + replication protocol support with ``walreceiver``) +* Optional Standalone Hot Backup support +* Cloud object storage support (AWS S3, Google Cloud, OpenStack Swift, Azure, Ceph) +* Backup restoration directly from object storage, compressed and encrypted +* Point-in-time-recovery (PITR) +* Initialize a new standby from object storage backups, automatically configured as + a replicating hot-standby + +Fault-resilience and monitoring +------------------------------- + +* Persists over temporary object storage connectivity issues by retrying transfers +* Verifies WAL file headers before upload (backup) and after download (restore), + so that e.g. 
files recycled by PostgreSQL are ignored
* Automatic history cleanup (backups and related WAL files older than N days)
* "Archive sync" tool for detecting holes in WAL backup streams and fixing them
* "Archive cleanup" tool for deleting obsolete WAL files from the archive
* Keeps statistics updated in a file on disk (for monitoring tools)
* Creates alert files on disk on problems (for monitoring tools)


Performance
-----------

* Parallel compression and encryption
* WAL pre-fetching on restore
diff --git a/docs/architecture.rst b/docs/architecture.rst
new file mode 100644
index 00000000..b6466b6d
--- /dev/null
+++ b/docs/architecture.rst
@@ -0,0 +1,71 @@
Architecture
============

PostgreSQL Point-In-Time Recovery (PITR) relies on a database basebackup plus
the WAL files written after that point, which can be replayed to reach the
desired recovery point.

PGHoard runs as a daemon which is responsible for performing the main
tasks of a backup tool for PostgreSQL:

* Taking periodic basebackups
* Archiving the WAL
* Managing backup retention according to a policy

Basebackup
----------

The basebackups are taken by the pghoard daemon directly, with no need for an
external scheduler / crond.

When pghoard is first launched, it will take a basebackup. After that, the
frequency of basebackups is determined by the configuration.

Those basebackups can be taken in one of two ways:

* Either by copying the files directly from ``PGDATA``, using the
  ``local-tar`` or ``delta`` modes
* By calling ``pg_basebackup``, using the ``basic`` or ``pipe`` modes.

See :ref:`configuration_basebackup` for how to configure it.

Archiving
---------

PGHoard supports multiple operating models. If you don't want to modify the
archiving configuration of the backed-up server, or install anything particular
on that server, ``pghoard`` can fetch the WAL using ``pg_receivewal``
(formerly ``pg_receivexlog`` on PostgreSQL < 10).
It also provides its own replication client replacing ``pg_receivewal``, using
the ``walreceiver`` mode. This mode is currently experimental.

PGHoard also supports a traditional ``archive_command`` in the form of the
``pghoard_postgres_command`` utility.

See :ref:`configuration_archiving` for how to configure it.

Retention
---------

``pghoard`` expires backups according to the configured retention policy.
Whenever there are more backups than the specified count, the oldest backups
are removed along with their associated WAL files.

Compression and encryption
--------------------------

The PostgreSQL write-ahead log (WAL) and basebackups are compressed with
Snappy (default) in order to ensure good compression speed and relatively small
backup size. Zstandard and LZMA compression are also available. See
:ref:`configuration_compression` for more information.

Encryption is not enabled by default, but PGHoard can encrypt backed-up data at
rest. Each individual file is encrypted and authenticated with file-specific
keys. The file-specific keys are included in the backup, in turn encrypted with
a master RSA public/private key pair.

To get started, follow the encryption section of the quickstart guide,
:ref:`quickstart_encryption`. For a full reference see :ref:`configuration_encryption`.
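For illustration only, here is a minimal sketch of how these two features
appear together in the configuration file (key material truncated; see
:ref:`configuration` for the full reference)::

    {
        "compression": {
            "algorithm": "zstd",
            "level": 3
        },
        "backup_sites": {
            "mysite": {
                "encryption_key_id": "key1",
                "encryption_keys": {
                    "key1": {
                        "public": "-----BEGIN PUBLIC KEY-----...",
                        "private": "-----BEGIN PRIVATE KEY-----..."
                    }
                }
            }
        }
    }
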
Deployment examples
-------------------

FIXME: add schemas showing a deployment of pghoard on the same host with
diff --git a/docs/commands.rst b/docs/commands.rst
new file mode 100644
index 00000000..4f2d5e45
--- /dev/null
+++ b/docs/commands.rst
@@ -0,0 +1,132 @@
Commands
========


pghoard
-------

``pghoard`` is the main daemon process that should be run under a service
manager, such as ``systemd`` or ``supervisord``. It handles the backup of
the configured sites.

.. code-block::

    usage: pghoard [-h] [-D] [--version] [-s] [--config CONFIG] [config_file]

    postgresql automatic backup daemon

    positional arguments:
      config_file      configuration file path (for backward compatibility)

    optional arguments:
      -h, --help       show this help message and exit
      -D, --debug      Enable debug logging
      --version        show program version
      -s, --short-log  use non-verbose logging format
      --config CONFIG  configuration file path


.. _commands_restore:

pghoard_restore
---------------

``pghoard_restore`` is a command line tool that can be used to restore a
previous database backup from either ``pghoard`` itself or from one of the
supported object stores. ``pghoard_restore`` can also configure
``recovery.conf`` to use ``pghoard_postgres_command`` as the WAL
``restore_command``.

.. code-block::

    usage: pghoard_restore [-h] [-D] [--status-output-file STATUS_OUTPUT_FILE] [--version]
                           {list-basebackups-http,list-basebackups,get-basebackup} ...

positional arguments:

list-basebackups-http
    List available basebackups from an HTTP source
list-basebackups
    List basebackups from an object store
get-basebackup
    Download a basebackup from an object store

-h, --help       show this help message and exit
-D, --debug      Enable debug logging
--status-output-file STATUS_OUTPUT_FILE
                 Filename for status output JSON
--version        show program version

pghoard_archive_cleanup
-----------------------

``pghoard_archive_cleanup`` can be used to clean up any orphan WAL files
from the object store. After the configured number of basebackups has been
exceeded (configuration key ``basebackup_count``), ``pghoard`` deletes the
oldest basebackup and all WAL associated with it. Transient object storage
failures and other interruptions can cause the WAL deletion process to leave
orphan WAL files behind; these can be deleted with this tool.

.. code-block::

    usage: pghoard_archive_cleanup [-h] [--version] [--site SITE] [--config CONFIG] [--dry-run]

-h, --help       show this help message and exit
--version        show program version
--site SITE      pghoard site
--config CONFIG  pghoard config file
--dry-run        only list redundant segments and calculate total file size but do not delete

pghoard_archive_sync
--------------------

``pghoard_archive_sync`` can be used to see if any local files should
be archived but haven't been, or if any of the archived files have unexpected
content and need to be archived again. It can also be used to determine whether
there are any gaps in the required files in the WAL archive, from the current
WAL file back to the first WAL file of the latest basebackup.

.. code-block::

    usage: pghoard_archive_sync [-h] [-D] [--version] [--site SITE] [--config CONFIG]
                                [--max-hash-checks MAX_HASH_CHECKS] [--no-verify] [--create-new-backup-on-failure]

-h, --help       show this help message and exit
-D, --debug      Enable debug logging
--version        show program version
--site SITE      pghoard site
--config CONFIG  pghoard config file
--max-hash-checks MAX_HASH_CHECKS
                 Maximum number of files for which to validate the hash in addition to the basic existence check
--no-verify      do not verify archive integrity
--create-new-backup-on-failure
                 request a new basebackup if verification fails

pghoard_create_keys
-------------------

``pghoard_create_keys`` can be used to generate and output encryption keys
in the ``pghoard`` configuration format.

.. code-block::

    usage: pghoard_create_keys [-h] [-D] [--version] [--site SITE] --key-id KEY_ID [--bits BITS] [--config CONFIG]

-h, --help       show this help message and exit
-D, --debug      Enable debug logging
--version        show program version
--site SITE      backup site
--key-id KEY_ID  key alias as used with the encryption_key_id configuration directive
--bits BITS      length of the generated key in bits, default 3072
--config CONFIG  configuration file to store the keys in

pghoard_postgres_command
------------------------

``pghoard_postgres_command`` is a command line tool that can be used as
PostgreSQL's ``archive_command`` or ``restore_command``. It communicates with
``pghoard``'s locally running webserver to let it know there's a new file that
needs to be compressed, encrypted and stored in an object store (in archive
mode), or the inverse (in restore mode).
diff --git a/docs/conf.py b/docs/conf.py
new file mode 100644
index 00000000..d865b364
--- /dev/null
+++ b/docs/conf.py
@@ -0,0 +1,56 @@
# Configuration file for the Sphinx documentation builder.
#
# This file only contains a selection of the most common options. For a full
# list see the documentation:
# https://www.sphinx-doc.org/en/master/usage/configuration.html

# -- Path setup --------------------------------------------------------------

# If extensions (or modules to document with autodoc) are in another directory,
# add these directories to sys.path here. If the directory is relative to the
# documentation root, use os.path.abspath to make it absolute, like shown here.
#
import os
import sys
sys.path.insert(0, os.path.abspath('..'))
from version import get_project_version


# -- Project information -----------------------------------------------------

project = 'PGHoard'
copyright = '2021, Aiven'
author = 'Aiven'

# The full version, including alpha/beta/rc tags
release = get_project_version('pghoard/version.py')

# -- General configuration ---------------------------------------------------

# Add any Sphinx extension module names here, as strings. They can be
# extensions coming with Sphinx (named 'sphinx.ext.*') or your custom
# ones.
extensions = [
    "sphinx_rtd_theme"
]

# Add any paths that contain templates here, relative to this directory.
templates_path = ['_templates']

# List of patterns, relative to source directory, that match files and
# directories to ignore when looking for source files.
# This pattern also affects html_static_path and html_extra_path.
exclude_patterns = ['_build', 'Thumbs.db', '.DS_Store']


# -- Options for HTML output -------------------------------------------------

# The theme to use for HTML and HTML Help pages. See the documentation for
# a list of builtin themes.
#
html_theme = 'sphinx_rtd_theme'

# Add any paths that contain custom static files (such as style sheets) here,
# relative to this directory. They are copied after the builtin static files,
# so a file named "default.css" will overwrite the builtin "default.css".
html_static_path = ['_static']
diff --git a/docs/configuration.rst b/docs/configuration.rst
new file mode 100644
index 00000000..d0b8bc4f
--- /dev/null
+++ b/docs/configuration.rst
@@ -0,0 +1,553 @@
.. _configuration:

Configuration
=============

The configuration file is in the JSON format. It consists of nested
key-value pairs.

For example::

    {
        "json_state_file_path": "/var/lib/pghoard/pghoard_state.json",
        "backup_sites": {
            "mycluster": {
                "nodes": [
                    {
                        "host": "127.0.0.1",
                        "password": "secret",
                        "port": 5432,
                        "user": "backup",
                        "slot": "pghoard"
                    }
                ],
                "basebackup_count": 5,
                "basebackup_mode": "delta",
                "object_storage": {
                    "storage_type": "local",
                    "directory": "/tmp/pghoard/backups"
                }
            }
        }
    }

Global Configuration
--------------------

Global configuration options are specified at the top level.
In this documentation we group them by category.


Generic Configuration
~~~~~~~~~~~~~~~~~~~~~

active (default ``true``)
    Can also be set at the ``backup_site`` level to disable the taking of new
    backups and to stop the deletion of old ones.
backup_location
    Where ``pghoard`` will create its internal data structures for local
    state data.
hash_algorithm (default ``"sha1"``)
    The hash algorithm used for calculating checksums for WAL or other files.
    Must be one of the algorithms supported by Python's `hashlib
    <https://docs.python.org/3/library/hashlib.html>`_.
json_state_file_path (default ``"/var/lib/pghoard/pghoard_state.json"``)
    Location of the JSON state file which describes the state of the
    ``pghoard`` process.
maintenance_mode_file (default ``"/var/lib/pghoard/maintenance_mode_file"``)
    Trigger file for maintenance mode: if a file exists at
    this location, no new backup actions will be started.
    FIXME: define "new backup actions"
transfer (default see below)
    A JSON object defining the WAL/basebackup transfer parameters.

    Example::

        {
            "transfer": {
                "thread_count": 4,
                "upload_retries_warning_limit": 3
            }
        }

    thread_count (default ``min(cpu_count + 3, 20)``)
        Number of parallel uploads / downloads.
    upload_retries_warning_limit (default ``3``)
        Create an alert file ``upload_retries_warning`` after this many failed
        upload attempts. See (FIXME: link to alert system)
tar_executable (default ``"pghoard_gnutaremu"``)
    The tar command to use for restoring basebackups. This must be GNU tar
    because some advanced switches like ``--transform`` are needed. If this
    value is not defined (or is explicitly set to ``"pghoard_gnutaremu"``),
    Python's internal tarfile implementation is used. The Python implementation
    is somewhat slower than the actual tar command, and in environments with
    fast disk IO (compared to available CPU capacity) it is recommended to set
    this to ``"tar"``.
restore_prefetch (default ``transfer.thread_count``)
    Number of files to prefetch when performing archive recovery. The default
    equals the number of transfer agent threads, so that all of them can be
    utilized.


.. _configuration_logging:

Logging configuration
~~~~~~~~~~~~~~~~~~~~~

log_level (default ``"INFO"``)
    Determines the log level of ``pghoard``.
syslog (default ``false``)
    Enable / disable syslog logging.
syslog_address (default ``"/dev/log"``)
    Determines the syslog address to use in logging (requires syslog to be
    true as well).
syslog_facility (default ``"local2"``)
    Determines the syslog log facility (requires syslog to be true as well).


.. _configuration_monitoring:

Monitoring
~~~~~~~~~~

alert_file_dir (default ``backup_location`` if set, else ``os.getcwd()``)
    Directory in which alert files for replication warning and failover are
    created.
stats (default ``null``)
    When set, enables sending metrics to a statsd daemon that supports the
    Telegraf or DataDog syntax with tags.
    The value is a JSON object, for example::

        {
            "host": "<statsd host address>",
            "port": <statsd port>,
            "format": "<statsd message format>",
            "tags": {
                "<tag>": "<value>"
            }
        }

    host
        The statsd host address
    port
        The statsd listening port
    format (default ``"telegraf"``)
        Determines the statsd message format. The following formats are
        supported:

        - ``telegraf``: Telegraf's statsd format
        - ``datadog``: DataDog's DogStatsD format
    tags (default ``null``)
        The ``tags`` key can be used to enter optional tag values for the
        metrics.
push_gateway (default ``null``)
    When set, enables sending metrics to a Prometheus Pushgateway with tags.
    The value is a JSON object, for example::

        {
            "endpoint": "<pushgateway address>",
            "tags": {
                "<tag>": "<value>"
            }
        }

    endpoint
        The pushgateway address
    tags
        An object mapping tags to their values.


.. _configuration_http:

HTTP Server configuration
~~~~~~~~~~~~~~~~~~~~~~~~~

The pghoard daemon needs to listen on an HTTP port for the archive command and,
when not using an object store, for fetching basebackups and WAL files during
restore.

http_address (default ``"127.0.0.1"``)
    Address to bind the PGHoard HTTP server to. Set to an empty string to
    listen on all available addresses.
http_port (default ``16000``)
    HTTP webserver port. Used for the archive command and for fetching
    basebackups and WAL files when restoring without an object store.


.. _configuration_compression:

Compression
~~~~~~~~~~~

The PostgreSQL write-ahead log (WAL) and basebackups are compressed with
Snappy (default), Zstandard (configurable, level 3 by default) or LZMA
(configurable, level 0 by default) in order to ensure good compression speed
and relatively small backup size.
For performance-critical applications it is recommended to test compression
algorithms to find the most suitable trade-off for the particular use case.
E.g. Snappy is fast but yields larger compressed files; Zstandard (zstd), on
the other hand, offers a very wide range of compression/speed trade-offs.

The top-level ``compression`` key allows defining compression options::

    {
        "compression": {
            "algorithm": "snappy",
            "level": 3,
            "thread_count": 4
        }
    }

algorithm (default ``snappy``)
    The compression algorithm to use. Available algorithms are
    ``snappy``, ``zstd``, and ``lzma``.
level (default ``3`` for ``zstd``, ``0`` for ``lzma``)
    The compression level to use; its meaning depends on the algorithm used.
    ``snappy`` does not use a compression level.
thread_count (default ``cpu_count + 1``)
    The number of threads used for parallel compression.
    Contrary to ``basebackup_compression_threads``, this is the number of
    compression threads started by ``pghoard``, not the number of internal
    compression threads of libraries supporting them, and it therefore applies
    to any compression algorithm.


Backup sites
------------

The key ``backup_sites`` contains configuration for groups of PostgreSQL
clusters (here called ``sites``).
Each backup site defines how to back up the nodes
it comprises. Each site can be configured separately, under an identifying
site name (example: ``mysite``).

A backup site contains an array of at least one node. For each node, the
connection information is required. The keys for a node are libpq parameters,
for example::

    {
        "backup_sites": {
            "mysite": {
                "nodes": [
                    {
                        "host": "127.0.0.1",
                        "password": "secret",
                        "port": 5432,
                        "user": "backup",
                        "slot": "pghoard",
                        "sslmode": "require"
                    }
                ]
            }
        }
    }

It is advised to use a replication slot when using one of the WAL streaming
archiving modes (``pg_receivexlog`` or ``walreceiver``).

nodes (no default)
    A node can be described as an object of libpq key/value connection info
    pairs, a libpq connection string, or a ``postgres://`` connection URI.
    If, for example, you'd like to use a streaming replication slot, use the
    syntax ``{... "slot": "slotname"}``.
pg_data_directory (no default)
    This is used when the ``local-tar`` or ``delta`` ``basebackup_mode`` is in
    use. The data directory must point to PostgreSQL's ``$PGDATA`` and must be
    readable by the ``pghoard`` daemon.
prefix (default: the site name)
    Path prefix to use for all backups related to this site.
pg_bin_directory (default: find binaries in well-known directories)
    Where to find the ``pg_basebackup`` and ``pg_receivewal``
    (``pg_receivexlog`` for PG < 10) binaries.
    If a value is not supplied, ``pghoard`` will attempt to find matching
    binaries in various well-known locations. If ``pg_data_directory`` is set
    and points to a valid data directory, the lookup is restricted to the
    PostgreSQL version contained in the given data directory.


.. _configuration_basebackup:

Basebackup configuration
~~~~~~~~~~~~~~~~~~~~~~~~

The following options all concern various aspects of the basebackup process
and the retention policy.

basebackup_mode (default ``"basic"``)
    The way basebackups should be created. Four different modes are supported:
    the first two use ``pg_basebackup`` while the others read the files
    directly from the cluster. Neither the ``basic`` nor the ``pipe`` mode
    supports multiple tablespaces.

    ``basic``
        Runs ``pg_basebackup`` and waits for it to write an uncompressed tar
        file on disk before compressing and optionally encrypting it.
    ``pipe``
        Pipes the data directly from ``pg_basebackup`` to PGHoard's
        compression and encryption processing, reducing the amount of
        temporary disk space that's required.
    ``local-tar``
        Can be used only when running on the same host as the PostgreSQL
        cluster. Instead of using ``pg_basebackup``, PGHoard reads the files
        directly from ``$PGDATA`` in this mode and compresses and optionally
        encrypts them. This mode allows backing up user tablespaces. Note that
        the ``local-tar`` backup mode cannot be used on replica servers prior
        to PostgreSQL 9.6 unless the pgespresso extension is installed.
    ``delta``
        Similar to ``local-tar``, but only changed files are uploaded to the
        storage. On every backup a snapshot of the data files is taken,
        resulting in a manifest file that describes the hashes of all the
        files that need to be backed up. New hashes are uploaded to the
        storage and, on restoration, used together with the complementary
        manifest from the control file.
        In order to properly assess the efficiency of ``delta`` mode in
        comparison with ``local-tar``, you can use the
        ``local-tar-delta-stats`` mode, which behaves the same as
        ``local-tar`` but also collects the metrics as if it were ``delta``
        mode. This can help when deciding whether to switch to ``delta`` mode.
basebackup_thread (default ``1``)
    How many threads to use for tar, compress and encrypt tasks. Only applies
    to the ``local-tar`` basebackup mode. Only values 1 and 2 are likely to be
    sensible here; with a higher thread count the speed improvement is
    negligible and CPU time is lost switching between threads.


The following options define how to schedule basebackups.

basebackup_interval_hours (default ``24``)
    How often to take a new basebackup of a cluster. The shorter the interval,
    the faster your recovery will be, but the more CPU/IO usage is required
    from the servers the basebackup is taken from. If set to a null value,
    basebackups are not taken automatically at all.
basebackup_hour (default undefined)
    The hour of the day during which to start a new basebackup. If the backup
    interval is less than 24 hours, this is the base hour used to calculate
    the hours at which a backup should be taken. E.g. if the backup interval
    is 6 hours and this value is set to 1, backups will be taken at hours 1,
    7, 13 and 19. This value is only effective if
    ``basebackup_interval_hours`` and ``basebackup_minute`` are also set.
basebackup_minute (default undefined)
    The minute of the hour during which to start a new basebackup. This value
    is only effective if ``basebackup_interval_hours`` and ``basebackup_hour``
    are also set.


basebackup_chunks_in_progress (default ``5``)
    How many basebackup chunks may exist on disk simultaneously while a backup
    is being taken. For chunk size configuration see
    ``basebackup_chunk_size``.
basebackup_chunk_size (default ``2147483648``)
    The chunk size in which a ``local-tar`` basebackup is taken. The disk
    space needed for a successful backup is ``basebackup_chunk_size *
    basebackup_chunks_in_progress``.
basebackup_compression_threads (default ``0``)
    Number of threads to use within the compression library during basebackup.
    Only applicable when using a compression library that supports internal
    multithreading, namely zstd at the moment. The default value ``0`` means
    no multithreading.

The following options manage the retention policy.

basebackup_age_days_max (default ``null``)
    Maximum age for basebackups. Basebackups older than this will be removed.
    By default this value is not defined and basebackups are deleted based on
    total count instead.
basebackup_count (default ``2``)
    How many basebackups should be kept around for restoration purposes. The
    more there are, the more disk space is used. If
    ``basebackup_age_days_max`` is defined, this controls the maximum number
    of basebackups to keep; if the backup interval is less than 24 hours or
    extra backups are created, there can be more than one basebackup per day
    and it is often desirable to set ``basebackup_count`` to something
    slightly higher than the maximum age in days.
basebackup_count_min (default ``2``)
    Minimum number of basebackups to keep. This is only effective when
    ``basebackup_age_days_max`` has been defined. If for example the server is
    powered off and then back on a month later, all existing backups would be
    very old. However, in that case it is usually not desirable to immediately
    delete all old backups.
    This setting allows specifying a minimum number of backups
    that should always be preserved regardless of their age.



.. _configuration_archiving:

Archiving configuration
~~~~~~~~~~~~~~~~~~~~~~~


active_backup_mode (default ``pg_receivexlog``)
    Can be either ``pg_receivexlog`` or ``archive_command``. If set to
    ``pg_receivexlog``, ``pghoard`` will start up a ``pg_receivexlog`` process
    to be run against the database server. If ``archive_command`` is set, we
    rely on the user setting the correct ``archive_command`` in
    ``postgresql.conf``. You can also set this to the experimental
    ``walreceiver`` mode, whereby pghoard will communicate directly with
    PostgreSQL through the replication protocol. (Note: requires psycopg2 >=
    2.7.)


pg_receivexlog
    When the active backup mode is set to ``"pg_receivexlog"``, this object
    may optionally specify additional configuration options. The currently
    available options are all related to monitoring disk space availability
    and optionally pausing xlog/WAL receiving when disk space goes below a
    configured threshold. This is useful when PGHoard is configured to create
    its temporary files on a different volume than where the main PostgreSQL
    data directory resides. By default this logic is disabled; the minimum
    free bytes must be configured to enable it.

    Example::

        {
            "backup_sites": {
                "mysite": {
                    "pg_receivexlog": {
                        "disk_space_check_interval": 10,
                        "min_disk_free_bytes": null,
                        "resume_multiplier": 1.5
                    }
                }
            }
        }

    :disk_space_check_interval: (default ``10``)
        How often (in seconds) to check available disk space.
    :min_disk_free_bytes: (default ``null``)
        Minimum bytes (as an integer) that must be available in order to keep
        receiving xlogs/WAL from PostgreSQL. If available disk space goes
        below this limit, a ``STOP`` signal is sent to the
        ``pg_receivexlog`` / ``pg_receivewal`` process.
    :resume_multiplier: (default ``1.5``)
        Multiplier applied to ``min_disk_free_bytes`` to determine how much
        free disk space is required before receiving xlog/WAL is resumed
        (i.e. the ``CONT`` signal is sent to the ``pg_receivexlog`` /
        ``pg_receivewal`` process). A multiplier above 1 should be used to
        avoid constantly stopping and resuming the process.



.. _configuration_restore:

Restore configuration
---------------------



.. _configuration_storage:

Storage configuration
~~~~~~~~~~~~~~~~~~~~~

FIXME: reformat that according to what's been done above

``object_storage`` (no default)

Configured in ``backup_sites`` under a specific site. If set, it must be an
object describing a remote object storage. The object must contain a key
``storage_type`` describing the type of the store; other keys and values are
specific to the storage type.

``proxy_info`` (no default)

Dictionary specifying proxy information. The dictionary must contain the keys
``type``, ``host`` and ``port``. Type can be either ``socks5`` or ``http``.
Optionally, ``user`` and ``pass`` can be specified for proxy authentication.
Supported by the Azure, Google and S3 drivers.

The following object storage types are supported:

* ``local`` makes backups to a local directory, see
  ``pghoard-local-minimal.json`` for example.
  Required keys:

  * ``directory`` for the path to the backup target (local) storage directory

* ``sftp`` makes backups to an SFTP server. Required keys:

  * ``server``
  * ``port``
  * ``username``
  * ``password`` or ``private_key``

* ``google`` for Google Cloud Storage. Required configuration keys:

  * ``project_id`` containing the Google Storage project identifier
  * ``bucket_name`` bucket where you want to store the files
  * ``credential_file`` for the path to the Google JSON credential file

* ``s3`` for Amazon Web Services S3. Required configuration keys:

  * ``aws_access_key_id`` for the AWS access key id
  * ``aws_secret_access_key`` for the AWS secret access key
  * ``region`` S3 region of the bucket
  * ``bucket_name`` name of the S3 bucket

  Optional keys for Amazon Web Services S3:

  * ``encrypted`` if true, use server-side encryption. Default is false.

* ``s3`` for other S3 compatible services such as Ceph. Required
  configuration keys:

  * ``aws_access_key_id`` for the AWS access key id
  * ``aws_secret_access_key`` for the AWS secret access key
  * ``bucket_name`` name of the S3 bucket
  * ``host`` for overriding the host for non AWS-S3 implementations
  * ``port`` for overriding the port for non AWS-S3 implementations
  * ``is_secure`` for overriding the requirement for https for non AWS-S3
    implementations
  * ``is_verify_tls`` for configuring TLS verification for non AWS-S3
    implementations

* ``azure`` for Microsoft Azure Storage. Required configuration keys:

  * ``account_name`` for the name of the Azure Storage account
  * ``account_key`` for the secret key of the Azure Storage account
  * ``bucket_name`` for the name of the Azure Storage container used to store
    objects
  * ``azure_cloud`` Azure cloud selector, ``"public"`` (default) or
    ``"germany"``

* ``swift`` for OpenStack Swift. Required configuration keys:

  * ``user`` for the Swift user ('subuser' in Ceph RadosGW)
  * ``key`` for the Swift secret_key
  * ``auth_url`` for the Swift authentication URL
  * ``container_name`` name of the data container

  Optional configuration keys for Swift:

  * ``auth_version`` - ``2.0`` (default) or ``3.0`` for keystone, use ``1.0``
    with Ceph Rados GW.
  * ``segment_size`` - defaults to ``1024**3`` (1 gigabyte). Objects larger
    than this will be split into multiple segments on upload. Many Swift
    installations require large files (usually 5 gigabytes) to be segmented.
  * ``tenant_name``
  * ``region_name``
  * ``user_id`` - for auth_version 3.0
  * ``user_domain_id`` - for auth_version 3.0
  * ``user_domain_name`` - for auth_version 3.0
  * ``tenant_id`` - for auth_version 3.0
  * ``project_id`` - for auth_version 3.0
  * ``project_name`` - for auth_version 3.0
  * ``project_domain_id`` - for auth_version 3.0
  * ``project_domain_name`` - for auth_version 3.0
  * ``service_type`` - for auth_version 3.0
  * ``endpoint_type`` - for auth_version 3.0




.. _configuration_encryption:

Encryption
~~~~~~~~~~

It is possible to set up encryption on a per-site basis.

To generate this configuration, you can use ``pghoard_create_keys`` to
generate and output encryption keys in the ``pghoard`` configuration format.


encryption_key_id (no default)
    Specifies the encryption key used when storing encrypted backups. If this
    configuration directive is specified, you must also define the public key
    for storing backups as well as the private key for retrieving stored
    backups. These keys are specified with the ``encryption_keys`` dictionary.
encryption_keys (no default)
    This key is a mapping from key id to keys. Each key is in turn a mapping
    from ``public`` and ``private`` to PEM-encoded RSA public and private keys
    respectively. The public key needs to be specified for storing backups.
    The private key needs to be in place for restoring encrypted backups.

diff --git a/docs/development.rst b/docs/development.rst
new file mode 100644
index 00000000..15fbfd2f
--- /dev/null
+++ b/docs/development.rst
@@ -0,0 +1,98 @@
Development
===========

Requirements
------------

PGHoard can back up and restore PostgreSQL versions 9.3 and above. The
daemon is implemented in Python and works with CPython version 3.5 or newer.
The following Python modules are required:

* psycopg2_ to look up transaction log metadata
* requests_ for the internal client-server architecture

.. _`psycopg2`: http://initd.org/psycopg/
.. _`requests`: http://www.python-requests.org/en/latest/

Optional requirements include:

* azure_ for Microsoft Azure object storage (patched version required, see link)
* botocore_ for AWS S3 (or Ceph-S3) object storage
* google-api-client_ for Google Cloud object storage
* cryptography_ for backup encryption and decryption (version 0.8 or newer required)
* snappy_ for Snappy compression and decompression
* zstandard_ for Zstandard (zstd) compression and decompression
* systemd_ for systemd integration
* swiftclient_ for OpenStack Swift object storage
* paramiko_ for sftp object storage

.. _`azure`: https://github.com/aiven/azure-sdk-for-python/tree/aiven/rpm_fixes
.. _`botocore`: https://github.com/boto/botocore
.. _`google-api-client`: https://github.com/google/google-api-python-client
.. _`cryptography`: https://cryptography.io/
.. _`snappy`: https://github.com/andrix/python-snappy
.. _`zstandard`: https://github.com/indygreg/python-zstandard
.. _`systemd`: https://github.com/systemd/python-systemd
.. _`swiftclient`: https://github.com/openstack/python-swiftclient
.. _`paramiko`: https://github.com/paramiko/paramiko

Developing and testing PGHoard also requires the following utilities:
flake8_, pylint_ and pytest_.

.. _`flake8`: https://flake8.readthedocs.io/
.. _`pylint`: https://www.pylint.org/
.. _`pytest`: http://pytest.org/

PGHoard has been developed and tested on modern Linux x86-64 systems, but
should work on other platforms that provide the required modules.

Vagrant
-------

The Vagrantfile can be used to set up a Vagrant development environment,
consisting of two virtual machines.

1) PostgreSQL 9.4, Python 3.5 and 3.6::

    vagrant up
    vagrant ssh postgres9
    cd /vagrant
    source ~/venv3/bin/activate
    make test
    source ~/venv3.6/bin/activate
    make test

2) PostgreSQL 10 and Python 3.7::

    vagrant ssh postgres10
    cd /vagrant
    make test

Note: ``make deb`` does not work from Vagrant at the moment; hopefully this
will be resolved soon.

.. _building_from_source:

Building
--------

To build an installation package for your distribution, go to the root
directory of a PGHoard Git checkout and run:

Debian::

    make deb

This will produce a ``.deb`` package in the parent directory of the Git
checkout.

Fedora::

    make rpm

This will produce a ``.rpm`` package, usually in ``rpm/RPMS/noarch/``.

Python/Other::

    python setup.py bdist_egg

This will produce an egg file in the ``dist`` directory within the same
folder.
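For day-to-day development outside Vagrant, a typical local workflow looks
something like the following sketch (it assumes an active Python 3 virtualenv;
adjust targets and paths to your checkout)::

    # create and activate an isolated environment
    python3 -m venv venv
    source venv/bin/activate
    # install pghoard and its dependencies into the venv
    pip install -e .
    # run the linters and the pytest based test suite
    flake8 pghoard/
    pylint pghoard/
    make test
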
diff --git a/docs/index.rst b/docs/index.rst
new file mode 100644
index 00000000..49e51753
--- /dev/null
+++ b/docs/index.rst
@@ -0,0 +1,60 @@
.. PGHoard documentation master file, created by
   sphinx-quickstart on Tue Jul 27 13:52:50 2021.
   You can adapt this file completely to your liking, but it should at least
   contain the root `toctree` directive.

PGHoard
=======

.. |BuildStatus| image:: https://github.com/aiven/pghoard/actions/workflows/build.yml/badge.svg?branch=master
.. _BuildStatus: https://github.com/aiven/pghoard/actions


``pghoard`` is a PostgreSQL backup daemon and restore tooling that stores
backup data in cloud object stores.

.. toctree::
   :maxdepth: 2
   :caption: Contents:

   about
   quickstart
   architecture
   install
   commands
   monitoring
   configuration
   development

License
=======

PGHoard is licensed under the Apache License, Version 2.0. The full license
text is available in the ``LICENSE`` file and at
http://www.apache.org/licenses/LICENSE-2.0.txt


Credits
=======

PGHoard was created by Hannu Valtonen for
`Aiven`_ and is now maintained by Aiven developers.

.. _`Aiven`: https://aiven.io/

Recent contributors are listed on the GitHub project page,
https://github.com/aiven/pghoard/graphs/contributors


Contact
=======

Bug reports and patches are very welcome; please post them as GitHub issues
and pull requests at https://github.com/aiven/pghoard . Any possible
vulnerabilities or other serious issues should be reported directly to the
maintainers.


Copyright
=========

Copyright (C) 2015 Aiven Ltd
diff --git a/docs/install.rst b/docs/install.rst
new file mode 100644
index 00000000..5edf31fe
--- /dev/null
+++ b/docs/install.rst
@@ -0,0 +1,103 @@
Installation
============

To run ``PGHoard`` you need to install it and configure PostgreSQL according
to the backup and archiving modes you have chosen.

This section only describes how to install it using a package manager.
See :ref:`building_from_source` for other installation methods.


.. _installation_package:

Installation from your distribution package manager
----------------------------------------------------

RHEL
++++

FIXME: the RPM package seems to be available on yum.postgresql.org. Write a
proper documentation for that.

Debian
++++++

FIXME: can the package be included in apt.postgresql.org ? doesn't seem to be
the case for now.



Installation from pip
---------------------

You can also install it using pip:

``pip install pghoard``

FIXME: version of pghoard on pypi isn't up to date.


.. _installation_postgresql_configuration:

PostgreSQL Configuration
========================

PostgreSQL should be configured to allow replication connections and have a
high enough ``wal_level``.

wal_level
---------

``wal_level`` should be set to at least ``replica`` (or ``archive`` for
PostgreSQL versions prior to 9.6).

.. note:: Changing ``wal_level`` requires restarting PostgreSQL.


Replication connections
-----------------------

If you use one of the non-local basebackup strategies (``basic`` or
``pipe``), you will need to allow ``pg_basebackup`` to connect using a
replication connection.

Additionally, if you use a WAL-streaming archiving mode (``pg_receivexlog`` or
``walreceiver``) you will need another replication connection for those.

The parameter ``max_wal_senders`` must then be set accordingly to allow for
at least that number of connections.
You should of course take into account the
other replication connections that you may need, for one or several replicas.

Example::

    max_wal_senders = 4

.. note:: Changing ``max_wal_senders`` requires restarting PostgreSQL.

You also need a PostgreSQL user account with the ``REPLICATION`` attribute,
created using psql::

    -- create the user
    CREATE USER pghoard REPLICATION;
    -- Setup a password for the pghoard user
    \password pghoard

This user will need to be allowed to connect. For this you will need to edit
the ``pg_hba.conf`` file on your PostgreSQL cluster.

For example::

    # TYPE  DATABASE     USER     ADDRESS       METHOD
    host    replication  pghoard  127.0.0.1/32  md5

.. note:: See the PostgreSQL documentation for more information.

After editing, please reload the configuration with either::

    SELECT pg_reload_conf();

or by using your distribution's service manager (e.g. ``systemctl reload
postgresql``).

Now you can move on to :ref:`configuration` to see how to set up PGHoard.
diff --git a/docs/make.bat b/docs/make.bat
new file mode 100644
index 00000000..2119f510
--- /dev/null
+++ b/docs/make.bat
@@ -0,0 +1,35 @@
@ECHO OFF

pushd %~dp0

REM Command file for Sphinx documentation

if "%SPHINXBUILD%" == "" (
	set SPHINXBUILD=sphinx-build
)
set SOURCEDIR=.
set BUILDDIR=_build

if "%1" == "" goto help

%SPHINXBUILD% >NUL 2>NUL
if errorlevel 9009 (
	echo.
	echo.The 'sphinx-build' command was not found. Make sure you have Sphinx
	echo.installed, then set the SPHINXBUILD environment variable to point
	echo.to the full path of the 'sphinx-build' executable. Alternatively you
	echo.may add the Sphinx directory to PATH.
	echo.
	echo.If you don't have Sphinx installed, grab it from
	echo.http://sphinx-doc.org/
	exit /b 1
)

%SPHINXBUILD% -M %1 %SOURCEDIR% %BUILDDIR% %SPHINXOPTS% %O%
goto end

:help
%SPHINXBUILD% -M help %SOURCEDIR% %BUILDDIR% %SPHINXOPTS% %O%

:end
popd
diff --git a/docs/monitoring.rst b/docs/monitoring.rst
new file mode 100644
index 00000000..94efe533
--- /dev/null
+++ b/docs/monitoring.rst
@@ -0,0 +1,47 @@
Monitoring
==========

Any backup tool must be properly monitored to ensure backups are correctly
performed.

``pghoard`` provides several ways to monitor it.


.. note::
   In addition to monitoring, the restore process should be tested regularly.

Alert files
-----------

Alert files are created whenever an error condition occurs that requires human
intervention to solve. It is recommended to add checks for the existence
of these files to your alerting system.

:authentication_error:
    There has been a problem in the authentication of at least one of the
    PostgreSQL connections. This usually denotes a wrong username and/or
    password.
:configuration_error:
    There has been a problem with the configuration of at least one of the
    PostgreSQL connections. This usually denotes a missing ``pg_hba.conf``
    entry or incompatible settings in ``postgresql.conf``.
:upload_retries_warning:
    Upload of a file has failed more times than
    ``upload_retries_warning_limit``. Human intervention is needed to figure
    out why; delete the alert file once the situation has been fixed.
:version_mismatch_error:
    Your local PostgreSQL client versions of ``pg_basebackup`` or
    ``pg_receivewal`` (formerly ``pg_receivexlog``) do not match the server's
    PostgreSQL version. You need to update them to be on the same version
    level.
:version_unsupported_error:
    The server's PostgreSQL version is not supported.
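As an example, a simple shell check that could be wired into an alerting
system might look like the following sketch (it assumes ``alert_file_dir`` is
set to ``/var/lib/pghoard``; adjust the path to your configuration)::

    #!/bin/sh
    # Fail when any pghoard alert file exists in the alert directory.
    ALERT_DIR=/var/lib/pghoard
    alerts=$(ls "$ALERT_DIR"/*_error "$ALERT_DIR"/*_warning 2>/dev/null)
    if [ -n "$alerts" ]; then
        echo "CRITICAL: pghoard alert files present: $alerts"
        exit 2
    fi
    echo "OK: no pghoard alert files"
    exit 0
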
Metrics
-------

You can configure ``pghoard`` to send metrics to an external system. The
supported systems are described in :ref:`configuration_monitoring`.

FIXME: describe the different metrics and what kind of alert to trigger based
on them.
diff --git a/docs/quickstart.rst b/docs/quickstart.rst
new file mode 100644
index 00000000..0784145f
--- /dev/null
+++ b/docs/quickstart.rst
@@ -0,0 +1,179 @@
QuickStart
==========

This quickstart will help you set up PGHoard for a simple use case:

* "Remote" basebackup using pg_basebackup
* WAL archiving using the ``pg_receivewal`` tool
* Local archiving to ``/mnt/pghoard_backup/``

Local archiving is chosen because it is the easiest to demonstrate without
external dependencies. For the object storage of your choice (S3, Azure,
GCP...), refer to the appropriate section at :ref:`configuration_storage`.

Installation
------------

Follow the instructions at :ref:`installation_package`.
Then, set up PostgreSQL following :ref:`installation_postgresql_configuration`.

Configuration
-------------

It is advised to use a replication slot to prevent WAL files from being
recycled before they have been consumed.

You can use pg_receivewal to create your replication slot::

    pg_receivewal --create-slot -S pghoard_slot -U pghoard

Create a ``/var/lib/pghoard.json`` file containing the following information,
replacing the user and password you chose in the previous step::

    {
        "backup_location": "/mnt/pghoard_backup/state/",
        "backup_sites": {
            "my_test_cluster": {
                "nodes": [
                    {
                        "host": "127.0.0.1",
                        "password": "secret",
                        "user": "pghoard",
                        "slot": "pghoard_slot"
                    }
                ],
                "object_storage": {
                    "storage_type": "local",
                    "directory": "/mnt/pghoard_backup/"
                },
                "pg_data_directory": "/var/lib/postgres/data/",
                "pg_receivexlog_path": "/usr/bin/pg_receivewal",
                "pg_basebackup_path": "/usr/bin/pg_basebackup",
                "basebackup_interval_hours": 24,
                "active_backup_mode": "basic"
            }
        }
    }


Testing your first backup
-------------------------

Launching pghoard
~~~~~~~~~~~~~~~~~

Launch pghoard using::

    pghoard /var/lib/pghoard.json

If everything went well, you should see something like this in the logs of
pghoard::

    2021-07-30 15:56:48,678 PGBaseBackup Thread-23 INFO Started: ['/usr/bin/pg_basebackup', '--format', 'tar', '--label', 'pghoard_base_backup', '--verbose', '--pgdata', '/mnt/pghoard_backup/state/my_test_cluster/basebackup_incoming/2021-07-30_13-56_0', '--wal-method=none', '--progress', '--dbname', "dbname='replication' host='127.0.0.1' replication='true' user='pghoard'"], running as PID: 3652881, basebackup_location: '/mnt/pghoard_backup/state/my_test_cluster/basebackup_incoming/2021-07-30_13-56_0/base.tar'
    2021-07-30 15:56:48,805 PGBaseBackup Thread-23 INFO Ran: ['/usr/bin/pg_basebackup', '--format', 'tar', '--label', 'pghoard_base_backup', '--verbose', '--pgdata', '/mnt/pghoard_backup/state/my_test_cluster/basebackup_incoming/2021-07-30_13-56_0', '--wal-method=none', '--progress', '--dbname', "dbname='replication' host='127.0.0.1' replication='true' user='pghoard'"], took: 0.127s to run, returncode: 0
    2021-07-30 15:56:48,922 Compressor Thread-3 INFO Compressed 33009152 byte open file '/mnt/pghoard_backup/state/my_test_cluster/basebackup_incoming/2021-07-30_13-56_0/base.tar' to 6797509 bytes (21%), took: 0.091s
    2021-07-30 15:56:48,925 TransferAgent Thread-12 INFO Uploading file to object store: src='/mnt/pghoard_backup/state/my_test_cluster/basebackup/2021-07-30_13-56_0'
        dst='my_test_cluster/basebackup/2021-07-30_13-56_0'
    2021-07-30 15:56:48,928 TransferAgent Thread-12 INFO Deleting file: '/mnt/pghoard_backup/state/my_test_cluster/basebackup/2021-07-30_13-56_0' since it has been uploaded

This means that pghoard performed the following sequence of actions:

- it launched pg_basebackup to perform the first basebackup of your cluster,
  and stored it in a temporary location (``backup_location`` from the config
  file)
- then it "uploaded" it. Since we chose local storage for the backup, it is
  just copied to the destination.
- finally it removed the temporary files

This process would have been the same had you used a remote object storage
like ``S3`` or ``Swift``.

You can check the contents of the final storage location::

    ❯ tree /mnt/pghoard_backup/my_test_cluster
    /mnt/pghoard_backup/my_test_cluster
    └── basebackup
        ├── 2021-07-30_13-56_0
        └── 2021-07-30_13-56_0.metadata

Restoration
~~~~~~~~~~~


You can list your database basebackups by running::

    ❯ pghoard_restore list-basebackups --config /var/lib/pghoard.json -v
    Available 'my_test_cluster' basebackups:

    Basebackup                                     Backup size  Orig size  Start time
    ---------------------------------------------  -----------  ---------  --------------------
    my_test_cluster/basebackup/2021-07-30_13-56_0  6 MB         31 MB      2021-07-30T13:56:48Z
        metadata: {'backup-decision-time': '2021-07-30T13:56:48.673846+00:00', 'backup-reason': 'scheduled', 'start-wal-segment': '000000010000000000000081', 'pg-version': '130003', 'compression-algorithm': 'snappy', 'compression-level': '0', 'original-file-size': '33009152', 'host': 'myhost'}

If we wanted to restore to the latest point in time, we could fetch the
required basebackup by running::

    pghoard_restore get-basebackup --config /var/lib/pghoard.json \
        --target-dir <target-dir> --restore-to-master

    Basebackup complete.
    You can start PostgreSQL by running pg_ctl -D foo start
    On systemd based systems you can run systemctl start postgresql
    On SYSV Init based systems you can run /etc/init.d/postgresql start

Note that the ``target-dir`` needs to be either an empty or non-existent
directory, in which case PGHoard will automatically create it.

After this we would proceed to start both the PGHoard server process and the
PostgreSQL server normally by running (on systemd based systems, assuming
PostgreSQL 9.5 is used)::

    systemctl start pghoard
    systemctl start postgresql-9.5

This will make PostgreSQL start the recovery process up to the latest point
in time. PGHoard must be running before you start up the
PostgreSQL server. To see other possible restoration options please look at
:ref:`commands_restore`.


.. _quickstart_encryption:

Optional: Adding encryption
---------------------------


If you want to encrypt your backups, you need to generate a public / private
RSA key pair.

The ``pghoard_create_keys`` script is used for that::

    pghoard_create_keys --site my_test_cluster --key-id 1

It will output a config snippet of the form::

    {
        "backup_sites": {
            "my_test_cluster": {
                "encryption_key_id": "1",
                "encryption_keys": {
                    "1": {
                        "private": "-----BEGIN PRIVATE KEY----------END PRIVATE KEY-----\n",
                        "public": "-----BEGIN PUBLIC KEY----------END PUBLIC KEY-----\n"
                    }
                }
            }
        }
    }

If you want this server to perform both backup and restore, you will need to
copy both keys to your config file, under the ``backup_sites/my_test_cluster``
section.
If you only need to perform backups, you can store only the public key, in
which case the host running pghoard will not be able to decrypt the encrypted
backups.

.. danger::

   Always keep a safe copy of your private key! You WILL need it
   to access your backups.
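One way to keep an offline copy of the private key is to extract it from the
configuration file. The following sketch assumes the quickstart configuration
above and that the ``jq`` utility is installed::

    # extract the site's private key and restrict its permissions
    jq -r '.backup_sites.my_test_cluster.encryption_keys."1".private' \
        /var/lib/pghoard.json > pghoard-private-key.pem
    chmod 600 pghoard-private-key.pem

Store the resulting PEM file somewhere safe, separate from the backup host.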