Elasticsearch data backup #5074

facyber · 2021-08-05T09:24:28Z

Hello community, I wanted to share one possible way of backing up ES data. I have not tested backup and restore with a large amount of data, only a few GB, but maybe someone will get an idea for improving the approach.

Note: Please test this in a test environment before using it in production, and of course take a backup or snapshot first.

My architecture is Manager + Heavy Nodes, so this guide is written for that setup; I am not sure whether all the files are the same for other deployment types.

1) Create Curator YML file

This configuration first deletes all snapshots that are older than 7 days, which gives a weekly cycle. I suggest keeping several smaller backups rather than one big one: a big backup can take a long time depending on your resources, and there is a higher risk of something going wrong partway through, in which case you would need to start all over. If you want to keep more snapshots, play a bit with the unit and unit_count values. After deleting the old snapshots, it creates a new snapshot of all ES data, or better said, of all indices.

Create a Curator config file in /opt/so/saltstack/local/salt/curator/files/action/ folder (let's say we name it so-curator-snapshot-create.yml):

actions:
  1:
    action: delete_snapshots
    description: Delete snapshots older than 7 days
    options:
      repository: my_repository
      # While disable_action is True, Curator skips this action; set it to False to enable deletion.
      disable_action: True
    filters:
    - filtertype: age
      source: creation_date
      direction: older
      timestring: '%Y.%m.%d'
      unit: days
      unit_count: 7
  2:
    action: snapshot
    description: Weekly snapshot of all Elasticsearch data.
    options:
      repository: my_repository
      # Leaving name blank will result in the default 'curator-%Y%m%d%H%M%S'
      name: 'curator-%Y%m%d%W-%H%M%S'
      include_global_state: True
      partial: False
      wait_for_completion: True
      # -1 for max_wait means it will wait indefinitely for the action to complete.
      # It should be used for testing on larger amount of data to see how long will it take to create a snapshot
      max_wait: 3600
      wait_interval: 10
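    # Note: this period filter selects indices whose names carry a date from the previous week,
    # so indices without a date in the name may not be included.
    filters: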
    - filtertype: period
      period_type: relative
      source: name
      range_from: -1
      range_to: -1
      timestring: '%Y.%m.%d'
      unit: weeks
      week_starts_on: monday

Apply changes with sudo salt-call state.apply curator

2) Create Curator BASH script

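Next, create a Bash script that runs Curator with that action file through Docker. This is basically a copied and edited version of another script that ships with SO. Judging by the source path used in step 4, it presumably belongs at /opt/so/saltstack/local/salt/curator/files/bin/so-curator-snapshot-create.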

#!/bin/bash

# Avoid starting multiple instances
APP=snapshotcreate
lf=/tmp/$APP-pidLockFile
# create an empty lock file if none exists
touch "$lf"
read -r lastPID < "$lf"
# if lastPID is not empty and a process with that PID exists, exit
[ -n "$lastPID" ] && [ -d "/proc/$lastPID" ] && exit
echo $$ > "$lf"


docker exec so-curator curator --config /etc/curator/config/curator.yml /etc/curator/action/so-curator-snapshot-create.yml 2>&1
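
Before scheduling the script, it may be worth running the action file once in dry-run mode, where Curator only logs what it would do. The following is a minimal sketch, assuming the same so-curator container name and in-container paths as the script above; the function name is made up for this example.

```shell
#!/bin/bash
# Sketch only: a dry run of the snapshot action file, assuming the container
# name and in-container paths used in the script above.

CURATOR_CONFIG=/etc/curator/config/curator.yml
ACTION_FILE=/etc/curator/action/so-curator-snapshot-create.yml

# Any arguments are used as a command prefix, so "curator_dry_run echo"
# only prints the command, while "curator_dry_run" actually runs it.
curator_dry_run() {
  "$@" docker exec so-curator curator --dry-run \
    --config "$CURATOR_CONFIG" "$ACTION_FILE"
}

# Print the command for review; nothing is executed here.
curator_dry_run echo
```

Calling curator_dry_run without arguments (or as curator_dry_run sudo) runs the dry run for real; the log should then list the snapshots and indices each action would act on.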

3) ES backup folder

For this part I will first show the config files and then explain how it works (at least, how I believe it works). For the purposes of this example, let's say you have mounted a shared backup folder at /mnt/sharedfolder, where you first create a folder named esdata.

Then inside the Elasticsearch container create /tmp/esdata folder:

a) docker ps | grep elasticsearch to get the ES container ID
b) docker exec -u root -it <ES_ID> bash (if Docker reports that it cannot find user root, -u 0 was reported to work in a reply below)
c) mkdir /tmp/esdata
d) exit to leave the container

Then edit /opt/so/saltstack/local/salt/elasticsearch/files/elasticsearch.yml and add path.repo: /tmp/esdata just below path.logs.

Then edit /opt/so/saltstack/local/salt/elasticsearch/init.sls and under so-elasticsearch -> binds add /mnt/sharedfolder/esdata:/tmp/esdata:rw, so it should look something like this:

so-elasticsearch:
- binds:
   - /opt/so/conf/elasticsearch/elasticsearch.yml:/usr/share/elasticsearch/config/elasticsearch.yml:ro
   - /opt/so/conf/elasticsearch/log4j2.properties:/usr/share/elasticsearch/config/log4j2.properties:ro
   - /nsm/elasticsearch:/usr/share/elasticsearch/data:rw
   - /opt/so/log/elasticsearch:/var/log/elasticsearch:rw
   - /opt/so/conf/ca/cacerts:/etc/pki/ca-trust/extracted/java/cacerts:ro
   - /mnt/sharedfolder/esdata:/tmp/esdata:rw

Then use sudo salt-call state.apply elasticsearch or maybe sudo salt-call state.highstate to apply changes.
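
After applying the state, a quick check that the bind and the path.repo setting actually reached the container can save some guessing later. This is a rough sketch, assuming the so-elasticsearch container name and the example paths from this step; the function names are invented for illustration, and docker may need sudo on your system.

```shell
#!/bin/bash
# Sketch only: check that the bind mount and path.repo made it into the container.
# Assumes the so-elasticsearch container name and the example paths from this step.

# List "host_path:container_path" for every mount of the container.
list_binds() {
  docker inspect \
    --format '{{range .Mounts}}{{.Source}}:{{.Destination}}{{"\n"}}{{end}}' \
    so-elasticsearch
}

# Read "source:destination" lines on stdin; succeed if the given pair is present.
has_bind() {
  grep -qxF "$1:$2"
}

# Show the path.repo line from the config file as seen inside the container.
show_path_repo() {
  docker exec so-elasticsearch \
    grep 'path.repo' /usr/share/elasticsearch/config/elasticsearch.yml
}
```

For example: list_binds | has_bind /mnt/sharedfolder/esdata /tmp/esdata && echo "bind present", followed by show_path_repo.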

Now the explanation: when Curator creates a snapshot, I noticed that it is written inside the Docker container, not outside of it. As far as I could tell, the only folder where we are able to write data is /tmp, which is why we create /tmp/esdata to hold the data. The bind /mnt/sharedfolder/esdata:/tmp/esdata:rw then links the folder inside the container to the folder outside of it on the mounted share, so snapshot data created in /tmp/esdata is also available on the shared folder. If you delete something from inside the container, it is also deleted from the outside folder, so I suggest creating an additional Bash script that copies or moves the data from the mounted shared folder to another location, just to be safe.
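
The extra copy script suggested above could look roughly like the sketch below. It is only an illustration: /mnt/secondbackup is a hypothetical second location, the function name is made up, and each run makes a full copy of the repository folder, so keep an eye on disk space. It is probably safest to run it when no snapshot is in progress.

```shell
#!/bin/bash
# Sketch only: copy the snapshot repository folder into a timestamped folder elsewhere.
# /mnt/secondbackup is a hypothetical second location; adjust both paths.

copy_repo() {
  src=$1
  dest="$2/esdata-$(date +%Y%m%d-%H%M%S)"
  # "src/." copies the folder contents, including hidden files;
  # -a keeps permissions and timestamps where possible.
  mkdir -p "$dest" && cp -a "$src/." "$dest/" && echo "$dest"
}

# Example (not run here):
#   copy_repo /mnt/sharedfolder/esdata /mnt/secondbackup
```

On success the function prints the folder it created, which makes it easy to log or to clean up old copies later.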

4) Create a cronjob

Make a copy of /opt/so/saltstack/default/salt/curator/init.sls to /opt/so/saltstack/local/salt/curator/init.sls if you don't already have one and add the following somewhere in the file.

snapshotcreate:
  file.managed:
    - name: /usr/sbin/so-curator-snapshot-create
    - source: salt://curator/files/bin/so-curator-snapshot-create
    - user: 934
    - group: 939
    - mode: 755

so-curatorsnapshotcreate:
  cron.present:
    - name: /usr/sbin/so-curator-snapshot-create
    - user: root
    - minute: '0'
    - hour: '0'
    - daymonth: '*'
    - month: '*'
    - dayweek: '1'

Apply changes with sudo salt-call state.apply curator
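
The cron values above (minute 0, hour 0, dayweek 1) correspond to the schedule 0 0 * * 1, that is, every Monday at midnight. A small sketch for confirming that the entry landed in root's crontab after the state was applied; the function name is invented for this example.

```shell
#!/bin/bash
# Sketch only: confirm that root's crontab contains the snapshot job.

# Read crontab text on stdin; succeed if a line runs the snapshot script.
has_snapshot_cron() {
  grep -q '/usr/sbin/so-curator-snapshot-create'
}

# Example (not run here):
#   sudo crontab -l -u root | has_snapshot_cron && echo "cron entry present"
```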

And I believe that is it. You can go through the Curator documentation for more options to tailor this to your environment, as this was done mostly for testing purposes. Feel free to propose changes or point out anything that should be done differently.

Cheers

Answered by Git-Me-Some-Hub

Git-Me-Some-Hub · 2021-08-12T06:25:57Z

I was trying to work through this setup on my test cluster, since you graciously let me know you had posted it; however, I hit a snag.

I did get through this, but had to use a different command.

When logging into the Docker instance of so-elasticsearch I could not use root as the user; after some Googling I used 0 (zero) instead.
Error:
unable to find user root: no matching entries in passwd file

My current show stopper for the night is adding the binds. When I tried to run the state update, it reported an unknown mount point and broke the so-elasticsearch container. I had to remove the 2 config files and reboot.

Bind File: /opt/so/saltstack/local/salt/elasticsearch/init.sls
Error:
[ERROR ] {'container': {'Config': {'Volumes': {'old': {'/etc/pki/ca-trust/extracted/java/cacerts': {}, '/usr/share/elasticsearch/config/ca.crt': {}, '/usr/share/elasticsearch/config/elasticsearch.crt': {}, '/usr/share/elasticsearch/config/elasticsearch.key': {}, '/usr/share/elasticsearch/config/elasticsearch.p12': {}, '/usr/share/elasticsearch/config/elasticsearch.yml': {}, '/usr/share/elasticsearch/config/log4j2.properties': {}, '/usr/share/elasticsearch/data': {}, '/var/log/elasticsearch': {}}, 'new': {'/etc/pki/ca-trust/extracted/java/cacerts': {}, '/tmp/esdata': {}, '/usr/share/elasticsearch/config/ca.crt': {}, '/usr/share/elasticsearch/config/elasticsearch.crt': {}, '/usr/share/elasticsearch/config/elasticsearch.key': {}, '/usr/share/elasticsearch/config/elasticsearch.p12': {}, '/usr/share/elasticsearch/config/elasticsearch.yml': {}, '/usr/share/elasticsearch/config/log4j2.properties': {}, '/usr/share/elasticsearch/data': {}, '/var/log/elasticsearch': {}}}}, 'HostConfig': {'Binds': {'old': ['/opt/so/conf/elasticsearch/elasticsearch.yml:/usr/share/elasticsearch/config/elasticsearch.yml:ro', '/opt/so/conf/elasticsearch/log4j2.properties:/usr/share/elasticsearch/config/log4j2.properties:ro', '/nsm/elasticsearch:/usr/share/elasticsearch/data:rw', '/opt/so/log/elasticsearch:/var/log/elasticsearch:rw', '/opt/so/conf/ca/cacerts:/etc/pki/ca-trust/extracted/java/cacerts:ro', '/etc/pki/ca.crt:/usr/share/elasticsearch/config/ca.crt:ro', '/etc/pki/elasticsearch.crt:/usr/share/elasticsearch/config/elasticsearch.crt:ro', '/etc/pki/elasticsearch.key:/usr/share/elasticsearch/config/elasticsearch.key:ro', '/etc/pki/elasticsearch.p12:/usr/share/elasticsearch/config/elasticsearch.p12:ro'], 'new': ['/opt/so/conf/elasticsearch/elasticsearch.yml:/usr/share/elasticsearch/config/elasticsearch.yml:ro', '/opt/so/conf/elasticsearch/log4j2.properties:/usr/share/elasticsearch/config/log4j2.properties:ro', '/nsm/elasticsearch:/usr/share/elasticsearch/data:rw', '/opt/so/log/elasticsearch:/var/log/elasticsearch:rw', '/opt/so/conf/ca/cacerts:/etc/pki/ca-trust/extracted/java/cacerts:ro', '/mnt/sharedfolder/esdata:/tmp/esdata:rw', '/etc/pki/ca.crt:/usr/share/elasticsearch/config/ca.crt:ro', '/etc/pki/elasticsearch.crt:/usr/share/elasticsearch/config/elasticsearch.crt:ro', '/etc/pki/elasticsearch.key:/usr/share/elasticsearch/config/elasticsearch.key:ro', '/etc/pki/elasticsearch.p12:/usr/share/elasticsearch/config/elasticsearch.p12:ro']}}}, 'container_id': {'removed': ['00fee2ef43f5a4c39711f58c7510bcf221c4ee644c228b4992460db234cf0429'], 'added': '833aba38124328ec2cd86e65c7762bdd0baaa93f4b52c7de1a13245dc142c3fd'}}

[ERROR ] Container 'so-elasticsearch' is already configured as specified. Failed to start container 'so-elasticsearch': 'Error 500: OCI runtime create failed: container_linux.go:367: starting container process caused: process_linux.go:340: applying cgroup configuration for process caused: mountpoint for devices not found: unknown'.

I am able to write to the mount from a regular command prompt, and the folder has root:root ownership with 775 permissions; should it be something else?
I am using exactly the same file paths, and essentially the same names, for simplicity on my first go-around.

I would also like to note that it might be clearer for newer users to spell out that if the two config files are not present in the local folders, they need to be copied over from default.

point 3: cp /opt/so/saltstack/default/salt/elasticsearch/files/elasticsearch.yml /opt/so/saltstack/local/salt/elasticsearch/files/elasticsearch.yml
point 3: cp /opt/so/saltstack/default/salt/elasticsearch/init.sls /opt/so/saltstack/local/salt/elasticsearch/init.sls
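
A slightly more defensive variant of those two copies is sketched below: it creates the target folder if needed and leaves an existing local file alone, so earlier edits are not overwritten. The function name and the two root variables are made up for this example; the default paths are the ones used in this thread.

```shell
#!/bin/bash
# Sketch only: copy a file from the default salt tree to the local one,
# unless a local copy already exists.

DEFAULT_ROOT="${DEFAULT_ROOT:-/opt/so/saltstack/default/salt}"
LOCAL_ROOT="${LOCAL_ROOT:-/opt/so/saltstack/local/salt}"

copy_default_to_local() {
  rel=$1
  if [ -e "$LOCAL_ROOT/$rel" ]; then
    echo "already present, left untouched: $LOCAL_ROOT/$rel"
    return 0
  fi
  # Create the parent folder first, then copy while keeping mode and timestamps.
  mkdir -p "$(dirname "$LOCAL_ROOT/$rel")" &&
    cp -p "$DEFAULT_ROOT/$rel" "$LOCAL_ROOT/$rel"
}

# Examples (not run here; likely need sudo):
#   copy_default_to_local elasticsearch/files/elasticsearch.yml
#   copy_default_to_local elasticsearch/init.sls
```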

Thanks again for writing up what you have so far, would love to get this working.

14 replies

dougburks Oct 29, 2021
Maintainer

@IzacFrank Please see the documentation:
https://docs.securityonion.net/en/2.3/backups.html

IzacFrank Nov 1, 2021

@dougburks Thank you for the link, but it does not answer my question. I already checked that page and it seems to cover configuration backup only.

IzacFrank Nov 1, 2021

Hello @dougburks
What should I back up? JSON files, Elastic data, or MySQL databases? We may need them in the future, for example for forensic analysis.

Thank you.

Git-Me-Some-Hub Nov 1, 2021

@IzacFrank - Hey, if you are looking to back up the "logs" of Suricata and such, I assume you want all the indices for those things, correct? That would mean at the very least you need to back up so-suricata- and so-zeek-, given that I think the default retention is 7 days.

If you read any articles on backing up Elasticsearch data, all roads point to snapshots, so you are in the right place. Now you just need to get those offloaded to some other share.
Link: https://www.elastic.co/guide/en/elasticsearch/reference/current/snapshot-restore.html

Does that make sense and answer your question?

IzacFrank Nov 2, 2021

@Git-Me-Some-Hub Thank you so much! Yes, I didn't understand that part, that is now clear to me. Thanks again.

SaoPauloooo · 2021-11-05T23:13:09Z

@facyber Thank you for this guide. I've gotten really close following it! One question: How do you register the snapshot repository? (my_repository in your example)

Elasticsearch Documentation gives examples like

curl -X PUT "localhost:9200/_snapshot/my_backup?pretty" -H 'Content-Type: application/json' -d'
{  
  "type": "fs",  
  "settings": {  
    "location": "my_backup_location"
  }
}
'

But I cannot figure out how to channel that through Docker. Or is it possible to do it with so-elasticsearch-query?

This is the error I currently get when the CRON tries to run:

2021-11-05 20:53:54,729 INFO      Preparing Action ID: 2, "snapshot"
2021-11-05 20:53:54,778 INFO      Trying Action ID: 2, "snapshot": Weekly snapshot of all Elasticsearch data.
2021-11-05 20:53:55,247 ERROR     Failed to complete action: snapshot.  <class 'curator.exceptions.ActionError'>: Cannot snapshot indices to missing repository: my_repository

My install is:

from the ISO
Standalone
2.3.61

1 reply

SaoPauloooo Nov 5, 2021

Think I figured it out!

sudo so-elasticsearch-query _snapshot/my_repository -XPUT -d '{"type": "fs", "settings": {"location": "/tmp/esdata", "compress": true}}'

Thanks! Hopefully this helps someone in the future.
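
Once the repository is registered, the same so-elasticsearch-query wrapper can presumably be used to verify it and to list the snapshots it holds. A minimal sketch, assuming the wrapper forwards the path and extra curl options as in the command above; the function names are invented, while _verify and _cat/snapshots are standard Elasticsearch endpoints.

```shell
#!/bin/bash
# Sketch only: verify the registered repository and list its snapshots.
# Assumes so-elasticsearch-query forwards the path and curl options as shown above.

REPO=my_repository

# Arguments act as a command prefix: "verify_repo sudo" runs the query with sudo,
# "verify_repo echo" only prints it.
verify_repo() {
  "$@" so-elasticsearch-query "_snapshot/$REPO/_verify" -XPOST
}

list_snapshots() {
  "$@" so-elasticsearch-query "_cat/snapshots/$REPO?v"
}

# Print both commands for review; nothing is sent to Elasticsearch here.
verify_repo echo
list_snapshots echo
```

Running verify_repo sudo and list_snapshots sudo on the node performs the actual requests.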

abesinger · 2021-11-16T05:08:11Z

@SaoPauloooo @facyber Yes, this was quite helpful. I spent several hours trying to figure out why Elastic couldn't see my repo before I found this discussion. The key thing I was missing was having the mount point bound to the container.

A couple of extra things I found:

The repo directory needs to be writable by the elasticsearch user; the uid/gid may vary between installations. The mode can be set when you create the directory on the file server, on the search node, or in the container; the permissions are changed on the same inode no matter what. If the repo directory is not writable by elasticsearch, you'll get an error when trying to register the repo.
Copying and editing /opt/so/saltstack/default/salt/elasticsearch/files/elasticsearch.yml didn't work; I had to copy and edit /opt/so/saltstack/default/salt/elasticsearch/defaults.yaml.
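
The writability requirement can be checked before registering the repo with something like the sketch below. It assumes the container's default user is the one Elasticsearch runs as, which may not hold on every installation; the function name is made up for this example.

```shell
#!/bin/bash
# Sketch only: test whether a directory is writable by creating and removing a probe file.

# $1 is the directory; any further arguments are a command prefix,
# e.g. "docker exec so-elasticsearch" to run the check inside the container.
check_writable() {
  dir=$1
  shift
  "$@" sh -c 'f="$1/.write-test.$$"; touch "$f" && rm -f "$f"' sh "$dir"
}

# Examples (not run here):
#   check_writable /mnt/sharedfolder/esdata                  # as the current host user
#   check_writable /tmp/esdata docker exec so-elasticsearch  # as the container's default user
```

A non-zero exit status from the second form would suggest fixing the ownership or mode of the directory first.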

It would be nice if there were a way to specify the repos in the pillar files and not have to maintain local copies of the salt files. I tried adding some jinja2 magic to defaults.yaml to pick it up out of the pillar for the searchnode, but something about the way salt parses/loads that file doesn't seem to handle complete jinja2 syntax. How SO maps the salt and pillar files into the working config files is still a bit unobvious to me.

1 reply

facyber Nov 16, 2021
Author

@abesinger I am glad that you found a solution. I changed jobs, so I do not work on SO anymore, and it could be that some files have changed since then. :)

Cheers
