-
Notifications
You must be signed in to change notification settings - Fork 41
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Merge pull request #6151 from dimagi/ap/es/reindex-changelog
Changelog for Instructions to Reindex Indexes
- Loading branch information
Showing
3 changed files
with
188 additions
and
0 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,87 @@ | ||
title: Reindex All Indices For Elasticsearch Upgrade | ||
key: reindex-all-indices-for-es-upgrade | ||
date: 2023-10-25 | ||
optional_per_env: no | ||
# (optional) Min version of HQ that MUST be deployed before this change can be rolled out (commit hash) | ||
min_commcare_version: 045d45fa69b6dd83f870ed185c0d73adeef14350 | ||
max_commcare_version: | ||
context: | | ||
Reindex Elasticsearch Indices for upcoming ES Upgrade. | ||
details: | | ||
Currently CommCare HQ is running with Elasticsearch 2.4. Reindexing is required before upgrading Elasticsearch to version 5.6.16. | ||
update_steps: | | ||
1. Deploy latest version of CommCare HQ. | ||
2. Ensure that there is enough free space available in your cluster. The command to estimate disk space required for reindexing is: | ||
```sh | ||
cchq <env> django-manage elastic_sync_multiplexed estimated_size_for_reindex | ||
``` | ||
3. Check disk usage on each node | ||
- If you have separate data nodes, check disk usage on data nodes | ||
```sh | ||
cchq <env> run-shell-command es_data "df -h /opt/data" -b | ||
``` | ||
- If you don't have separate data nodes, check disk usage on ES nodes. | ||
```sh | ||
cchq <env> run-shell-command elasticsearch "df -h /opt/data" -b | ||
``` | ||
This will return disk usage for each node. You can check if the cumulative available space across all nodes is greater than the total recommended space from the `estimated_size_for_reindex` output. | ||
If you don't have enough space available in your elasticsearch cluster, it is recommended to add more storage capacity before reindexing. If you don't have the option to add more storage, you will need to reindex one index at a time. You should follow the process described in [Reindexing One Index at A Time](https://github.com/dimagi/commcare-hq/blob/56682492f20c60cdef0ccde6049b9945b3658493/corehq/apps/es/REINDEX_PROCESS.md#reindexing-one-index-at-a-time) | ||
4. It is advised to run the following steps in a tmux session as they might take a long time and can be detached/re-attached as needed for monitoring progress. | ||
- To start a tmux session in your django-manage machine, use the following steps, changing `<env>` to your environment name where needed: | ||
```bash | ||
cchq <env> tmux django_manage | ||
sudo -iu cchq | ||
cd /home/cchq/www/<env>/current | ||
source python_env/bin/activate | ||
``` | ||
5. The following steps should be performed serially for each canonical index name, replacing `${INDEX_CNAME}` with each of the following values `['apps', 'cases', 'case_search', 'domains', 'forms', 'groups', 'sms', 'users']` | ||
1. Set the `${INDEX_CNAME}` environment variable. For example, the first run should be: | ||
```bash | ||
INDEX_CNAME='apps' | ||
``` | ||
2. Start the reindex process | ||
```bash | ||
./manage.py elastic_sync_multiplexed start ${INDEX_CNAME} | ||
``` | ||
Note down the Task Number that is displayed by the command. It should be a numeric ID and will be required to verify the reindex process. | ||
3. Verify the reindex is completed by querying the logs and ensuring that the doc count matches between primary and secondary indices. The commands in this step should be run from the control machine. | ||
1. Login to the control machine in a new shell. | ||
``` | ||
cchq <env> ssh control | ||
``` | ||
2. From control machine, run the following command to query the reindex logs. | ||
``` | ||
cchq <env> run-shell-command elasticsearch "grep '<Task Number>.*ReindexResponse' /opt/data/elasticsearch*/logs/*.log" | ||
``` | ||
This command will query the Elasticsearch nodes to find any log entries containing the `ReindexResponse` for the given Task Number. The log should look something like: | ||
``` | ||
[2023-10-25 08:59:37,648][INFO] [tasks] 29216 finished with response ReindexResponse[took=1.8s,updated=0,created=1111,batches=2,versionConflicts=0,noops=0,retries=0,throttledUntil=0s,indexing_failures=[],search_failures=[]] | ||
``` | ||
Ensure that `search_failures` and `indexing_failures` are empty lists. | ||
3. Then verify doc counts match between primary and secondary indices using: | ||
``` | ||
cchq <env> django-manage elastic_sync_multiplexed display_doc_counts <index_cname> | ||
``` | ||
This command will display the document counts for both the primary and secondary indices for a given index. If the doc count matches between the two and there are no errors in the reindex logs, then reindexing is complete for that index. | ||
Please note that the counts may not match perfectly for high frequency indices like case_search, cases, and forms. In such cases, ensure the difference in counts is small (within one hundred) and there are no errors in reindex logs. | ||
If reindex fails for any index, please refer to the docs [here](https://github.com/dimagi/commcare-hq/blob/56682492f20c60cdef0ccde6049b9945b3658493/corehq/apps/es/REINDEX_PROCESS.md#common-issues-resolutions-during-reindex) for troubleshooting steps. | ||
4. If doc counts match and there are no errors present in the reindex logs, the reindex for the current index is complete. You can continue reindexing for the next index by repeating steps 5.1-5.3 with `${INDEX_CNAME}` set to 'cases', 'case_search' and so on for all indices. |
96 changes: 96 additions & 0 deletions
96
docs/source/changelog/0075-reindex-all-indexes-for-es-upgrade.md
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,96 @@ | ||
<!--THIS FILE IS AUTOGENERATED: DO NOT EDIT--> | ||
<!--See https://github.com/dimagi/commcare-cloud/blob/master/changelog/README.md for instructions--> | ||
# 75. Reindex All Indices For Elasticsearch Upgrade | ||
|
||
**Date:** 2023-10-25 | ||
|
||
**Optional per env:** _required on all environments_ | ||
|
||
|
||
## CommCare Version Dependency | ||
The following version of CommCare must be deployed before rolling out this change: | ||
[045d45fa](https://github.com/dimagi/commcare-hq/commit/045d45fa69b6dd83f870ed185c0d73adeef14350) | ||
|
||
|
||
## Change Context | ||
Reindex Elasticsearch Indices for upcoming ES Upgrade. | ||
|
||
## Details | ||
Currently CommCare HQ is running with Elasticsearch 2.4. Reindexing is required before upgrading Elasticsearch to version 5.6.16. | ||
|
||
## Steps to update | ||
1. Deploy latest version of CommCare HQ. | ||
2. Ensure that there is enough free space available in your cluster. The command to estimate disk space required for reindexing is: | ||
|
||
```sh | ||
cchq <env> django-manage elastic_sync_multiplexed estimated_size_for_reindex | ||
``` | ||
3. Check disk usage on each node | ||
- If you have separate data nodes, check disk usage on data nodes | ||
```sh | ||
cchq <env> run-shell-command es_data "df -h /opt/data" -b | ||
``` | ||
- If you don't have separate data nodes, check disk usage on ES nodes. | ||
```sh | ||
cchq <env> run-shell-command elasticsearch "df -h /opt/data" -b | ||
``` | ||
This will return disk usage for each node. You can check if the cumulative available space across all nodes is greater than the total recommended space from the `estimated_size_for_reindex` output. | ||
If you don't have enough space available in your elasticsearch cluster, it is recommended to add more storage capacity before reindexing. If you don't have the option to add more storage, you will need to reindex one index at a time. You should follow the process described in [Reindexing One Index at A Time](https://github.com/dimagi/commcare-hq/blob/56682492f20c60cdef0ccde6049b9945b3658493/corehq/apps/es/REINDEX_PROCESS.md#reindexing-one-index-at-a-time) | ||
4. It is advised to run the following steps in a tmux session as they might take a long time and can be detached/re-attached as needed for monitoring progress. | ||
- To start a tmux session in your django-manage machine, use the following steps, changing `<env>` to your environment name where needed: | ||
```bash | ||
cchq <env> tmux django_manage | ||
sudo -iu cchq | ||
cd /home/cchq/www/<env>/current | ||
source python_env/bin/activate | ||
``` | ||
5. The following steps should be performed serially for each canonical index name, replacing `${INDEX_CNAME}` with each of the following values `['apps', 'cases', 'case_search', 'domains', 'forms', 'groups', 'sms', 'users']` | ||
1. Set the `${INDEX_CNAME}` environment variable. For example, the first run should be: | ||
```bash | ||
INDEX_CNAME='apps' | ||
``` | ||
2. Start the reindex process | ||
```bash | ||
./manage.py elastic_sync_multiplexed start ${INDEX_CNAME} | ||
``` | ||
Note down the Task Number that is displayed by the command. It should be a numeric ID and will be required to verify the reindex process. | ||
3. Verify the reindex is completed by querying the logs and ensuring that the doc count matches between primary and secondary indices. The commands in this step should be run from the control machine. | ||
1. Login to the control machine in a new shell. | ||
``` | ||
cchq <env> ssh control | ||
``` | ||
2. From control machine, run the following command to query the reindex logs. | ||
``` | ||
cchq <env> run-shell-command elasticsearch "grep '<Task Number>.*ReindexResponse' /opt/data/elasticsearch*/logs/*.log" | ||
``` | ||
This command will query the Elasticsearch nodes to find any log entries containing the `ReindexResponse` for the given Task Number. The log should look something like: | ||
``` | ||
[2023-10-25 08:59:37,648][INFO] [tasks] 29216 finished with response ReindexResponse[took=1.8s,updated=0,created=1111,batches=2,versionConflicts=0,noops=0,retries=0,throttledUntil=0s,indexing_failures=[],search_failures=[]] | ||
``` | ||
Ensure that `search_failures` and `indexing_failures` are empty lists. | ||
3. Then verify doc counts match between primary and secondary indices using: | ||
``` | ||
cchq <env> django-manage elastic_sync_multiplexed display_doc_counts <index_cname> | ||
``` | ||
This command will display the document counts for both the primary and secondary indices for a given index. If the doc count matches between the two and there are no errors in the reindex logs, then reindexing is complete for that index. | ||
Please note that the counts may not match perfectly for high frequency indices like case_search, cases, and forms. In such cases, ensure the difference in counts is small (within one hundred) and there are no errors in reindex logs. | ||
If reindex fails for any index, please refer to the docs [here](https://github.com/dimagi/commcare-hq/blob/56682492f20c60cdef0ccde6049b9945b3658493/corehq/apps/es/REINDEX_PROCESS.md#common-issues-resolutions-during-reindex) for troubleshooting steps. | ||
4. If doc counts match and there are no errors present in the reindex logs, the reindex for the current index is complete. You can continue reindexing for the next index by repeating steps 5.1-5.3 with `${INDEX_CNAME}` set to 'cases', 'case_search' and so on for all indices. |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters