Data backup and restore

Deepak Narayana Rao edited this page Oct 27, 2017 · 9 revisions

We follow a common backup, restore, and purging process for all data.

Backup Process

  • Data backup is scheduled via *_Backup jobs in Jenkins. These jobs run the respective *-backup Ansible roles in this repo.
  • Backup jobs run once every day @midnight.
  • Each backup is compressed and uploaded to Azure blob storage for long-term storage.
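
The compression step above can be sketched as follows. This is a minimal illustration, not the code of the *-backup roles: the function name `make_backup_archive` and the timestamped file name are assumptions made for the example, and the upload to Azure blob storage is deliberately left out.

```python
import tarfile
from datetime import datetime, timezone
from pathlib import Path


def make_backup_archive(source_dir, dest_dir, prefix, now=None):
    """Compress source_dir into <dest_dir>/<prefix>_<UTC timestamp>.tar.gz."""
    now = now or datetime.now(timezone.utc)
    # Hypothetical naming convention; the real roles may name archives differently.
    name = f"{prefix}_{now.strftime('%Y-%m-%d_%H%M%S')}.tar.gz"
    archive_path = Path(dest_dir) / name
    with tarfile.open(str(archive_path), "w:gz") as tar:
        # Store entries under the dump directory's own name, not its absolute path.
        tar.add(str(source_dir), arcname=Path(source_dir).name)
    return archive_path
```

A timestamp in the archive name keeps each nightly backup distinct, which is what lets a restore job later select one backup by name.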

Databases

Data stored in the following databases is periodically backed up using the process described above:

  • PostgreSQL - A full DB backup is taken using the Ansible role postgresql-backup, which uses pg_dumpall internally.
  • Elasticsearch - Snapshot-based backups are stored in Azure using the Azure repository plugin. Uses the Ansible role es-azure-snapshot.
  • Cassandra - Full DB backups are taken using the Ansible role cassandra-backup.
  • MongoDB - Full DB backups are taken using the Ansible role mongo-backup.

Jenkins

  • The Jenkins ThinBackup plugin is used for Jenkins backup.
  • This plugin is scheduled to run @midnight. It saves a new backup folder for each backup.
  • The Jenkins_Backup job compresses the latest backup folder and uploads it to Azure blob storage. This job is scheduled to run every day.

Restore Process

  • Restore from backup can be done via *_Restore Jenkins jobs. These jobs run the respective *-restore Ansible roles in this repo.
  • Restore jobs take backup_name as a parameter, which selects the backup (and therefore the point in time) to restore.
  • Restore jobs are run on demand.
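
Since restore jobs are parameterized, they can also be triggered through Jenkins' remote `buildWithParameters` endpoint instead of the UI. The sketch below only builds the trigger URL. The base URL and the job name `Postgresql_Restore` used in the usage note are hypothetical placeholders; the actual request must be an authenticated POST, which is not shown.

```python
from urllib.parse import quote, urlencode


def restore_job_url(jenkins_base_url, job_name, backup_name):
    """Build the Jenkins remote-trigger URL for a parameterized restore job."""
    base = jenkins_base_url.rstrip("/")
    # backup_name is the job parameter described in the list above.
    query = urlencode({"backup_name": backup_name})
    return f"{base}/job/{quote(job_name, safe='')}/buildWithParameters?{query}"
```

For example, `restore_job_url("https://jenkins.example.com", "Postgresql_Restore", "<backup name>")` yields a URL that can be POSTed with a Jenkins user and API token.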

Databases

These backups are stored in the Azure resource group <env>-db-backups. It contains a storage account named backups<env>, which has a container (folder) for each type of backup.

  • PostgreSQL - The name of the backup to restore can be found in the container postgresql-backup.
  • Elasticsearch - The name of the backup to restore can be found in the console logs of the Application_ElasticSearch_Backup job.

Note: Alternatively, you can get the snapshot name by looking at the last snapshot in the API response of curl --silent 10.10.3.7:9200/_snapshot/azurebackup/_all | jq . | less

  • Cassandra - The name of the backup to restore can be found in the container cassandra-backup.
  • MongoDB - The name of the backup to restore can be found in the container mongodb-backup.
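
For Elasticsearch, picking the snapshot name out of the `_snapshot/azurebackup/_all` response mentioned in the note can be scripted. The following is a minimal sketch, assuming the response has the usual shape: a top-level `snapshots` list whose entries carry `snapshot`, `state` and `start_time_in_millis` fields. The function name is an assumption made for the example.

```python
import json


def latest_snapshot_name(response_text):
    """Return the name of the most recently started successful snapshot, or None."""
    snapshots = json.loads(response_text).get("snapshots", [])
    # Skip failed or in-progress snapshots; they are not safe restore targets.
    successful = [s for s in snapshots if s.get("state") == "SUCCESS"]
    if not successful:
        return None
    newest = max(successful, key=lambda s: s["start_time_in_millis"])
    return newest["snapshot"]
```

The returned name is the value to pass as backup_name to the Elasticsearch restore job.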

Jenkins

These backups are stored in the Azure resource group admin-backups, in a storage account named backupsadmin. The name of the backup to restore can be found in the container jenkins-backup.

Purging Process

  • Backup data is retained for 30 days by default.
  • Purging of PostgreSQL, Cassandra and Jenkins backups in Azure blob storage is scheduled using an Azure logic app. The steps are listed in Azure blob storage purge setup.
  • Elasticsearch snapshots are purged using the Ansible role es5-snapshot-purge, which uses elasticsearch-curator internally. This purging step runs inside the Application_ElasticSearch_Backup job, after the backup step.
  • MongoDB will be removed soon, hence no purging has been set up.
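
The 30-day retention rule amounts to a simple cutoff comparison. The sketch below illustrates that rule only. It is not how the Azure logic app or elasticsearch-curator implements purging, and the function name and input shape (a mapping of backup name to creation time) are assumptions made for the example.

```python
from datetime import datetime, timedelta, timezone


def expired_backups(backups, now, retention_days=30):
    """Return the sorted names of backups created before the retention cutoff.

    backups maps a backup name to its timezone-aware creation datetime.
    """
    cutoff = now - timedelta(days=retention_days)
    # A backup created exactly at the cutoff is still retained.
    return sorted(name for name, created in backups.items() if created < cutoff)
```

Passing `now` explicitly, rather than reading the clock inside the function, makes the rule easy to check against known dates.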