Skip to content

Commit

Permalink
Merge pull request #507 from maplelabs/prd_doc
Browse files Browse the repository at this point in the history
Added PRD which was missed in older merge.
  • Loading branch information
Akhil-Nair-ML authored Mar 24, 2023
2 parents 79fd2b0 + 95002e5 commit 51349e7
Show file tree
Hide file tree
Showing 2 changed files with 374 additions and 3 deletions.
6 changes: 3 additions & 3 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -14,11 +14,11 @@ To get started find the guidelines for [contributing OpenSearch Scaling Manager]

To learn more about Scaling Manager, take a look at the following resources:

- [Design Architecture](https://github.com/maplelabs/opensearch-scaling-manager/blob/master/docs/DesignArchitecture.md) - Design Architecture of Scaling Manager
- [Product Requirements Document](https://github.com/maplelabs/opensearch-scaling-manager/blob/master/docs/PRD.md)

- [Run scaling manager](https://github.com/maplelabs/opensearch-scaling-manager/blob/master/docs/RunScalingManager.md)
- [Design Architecture](https://github.com/maplelabs/opensearch-scaling-manager/blob/master/docs/DesignArchitecture.md) - Design Architecture of Scaling Manager

- [Run scaling manager](https://github.com/maplelabs/opensearch-scaling-manager/blob/master/docs/RunScalingManager.md)

## Open Search Cluster Simulator

Expand Down
371 changes: 371 additions & 0 deletions docs/PRD.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,371 @@
# OpenSearch Scaling Manager



## Overview

OpenSearch scale manager is used to elastically scale a cluster to ensure optimum cluster performance and expenses involved.



## Approach

Cluster Scaling can happen based on the following rules

- **Scale-Up**:

- If the load is gradually increasing scale up by 1 node.
- If the load increase is significant and persistent then scale up by more than 1 node.

- **Rule based cluster sizing**

- Ensure X Nodes are present in the Cluster at a given time.
- 5 Nodes, 9 A.M, Monday
- 3 Nodes, 9 A.M, Saturday
- Consider Cluster sizing if the number of Nodes present does not match the number specified in the rule.

- **Scale down**

- If the load is gradually falling and ingest is low, Scale down by 1 Node.



**Detecting gradual increase in load**

- Monotonic Increase in CPU Utilization and CPU Utilization > Threshold.

- Monotonic increase in Memory Utilization and Memory Utilization > Threshold.

- Monotonic increase in JVM Utilization and JVM Utilization > Threshold



**Detecting sudden increase in load**

- Calculate the ingest rate based on the data increase in current indexes. This rate is expressed on GB/day.

- If ingest rate is steady and increasing.

- Calculate the number of Nodes corresponding to Ingest Rate.

- If number of Nodes required > current Cluster size, increase the cluster by **N** nodes,

where N = number of Nodes required - Current number of Nodes in the Cluster.



**Rule based Cluster sizing**

- Rule can be executed only if Current Ingest Rate < Ingest Rate that new Cluster can handle.

- Else the rule is ignored with a Warning log.



**Detecting the gradual decrease in load**

- Check if Cluster Utilization is consistently low and monotonically reducing.

- Check Ingest Rate and if warrants that the Cluster size has to be reduced.

- If above checks are true, then trigger Scale-down by 1 Node.



There are 2 key modules for lifecycle manager:

1. **Analyzer**: Performs analysis and recommends action to be take.

2. **Executor**: Executes the recommended action.



**Avoiding Conflict of Rules**

- Each scheme generates recommendations based on its analysis. There could be situations where multiple recommendations are generated in an analysis cycle. Analyzer has to reduce these multiple recommendations into a single recommendation.
- One approach to reduce multiple recommendation to a single recommendation, is to choose the largest cluster size recommended.
- Any new recommendation can be acted upon only when the previous recommendation has been executed and the cluster is stable.

**Sensor Failure**

- The solution completely depends on input signals
- If any input signal is not available, Scaling manager generates error logs and does not make any recommendation.

**Executor Failure**

- If there is any error in performing scale-up/scale-down, Scaling manager generates error logs and does not generate further recommendations until error state is cleared.



**Simulator**

- Simulator is a critical module to test,validate and refine solution.
- Simulator will model Elasticsearch through the variables that matter to scaling manager
- CPU Utilization
- RAM Utilization
- JVM Utilization
- Ingest Rate
- Number of Nodes
- Ingest rate is represented using a cyclical function with some randomness.
- Recommendation affects the number of nodes in the cluster and in some cases the executor may fail.



## Architecture

The solution includes the following components

- Fetch Metrics

- Recommendation

- Trigger

- Provision

- State



**Fetch Metrics**

- Code is deployed on each node available in the cluster.

- Only the current master node will have the privilege to execute the code.

- Monitored metrics are indexed into Elasticsearch.

- Old data is purged periodically from the index.



**Recommendation**

- Performs checks based on the configuration file.
- Based on the checks, provides a Scale-up-by-1 or Scale-down-by-1 recommendation.
- The data is maintained in a command queue.



**Trigger**

- Checks the state of the cluster
- If the cluster is in normal state
- Give command to Provision module
- Updates state = provisioning
- Log "provision triggered - Up/Down Number of Nodes"
- Else
- Clear the command queue, commands are ignored since the cluster health criteria is not satisfied.



**Provision**

- Receives the command from the trigger module and updates state = Provision
- Take action based on provisioning command. The execution engine has to be cloud independent
- If provisioning completed successfully, update state = provision completed.
- If provisioning failed, update state = provisioning failed.



**State**

- Check the current state and status of cluster and update state
- If state == provision completed
- Check if all system metrics are in a normal state
- Cluster state is green
- No relocating shards
- Update State = Normal



**Installation of the solution**

- Package to be installed on all nodes of Elasticsearch cluster.
- Should automatically create an Index if not present.
- On uninstalling, index should be deleted.



**Configuration inputs**

- Cluster details

- IP/DNS
- Elasticsearch Credentials
- Cloud Type - AWS, GCP, Azure
- Cloud Credentials/ IAM role
- Base node type
- CPUs per node
- RAM per node in GB
- Disk per node in GB
- Maximum nodes allowed

- Task details

- Scale-up-by-1 [ Triggered based on OR of the conditions]
- Data nodes required
- CPU Utilization Rule
- Enabled
- Limit - 80%
- Stat - Average/Count
- Decision Period - 60 minutes.
- Memory Utilization Rule
- Enabled
- Limit - 90%
- Stat - Average/Count
- Decision Period - 60 minutes
- Shard Rule
- Enabled
- Limit - 900
- Decision Period - 60 minutes
- Scale-down-by-1 [Triggered based on AND of the conditions]
- Data nodes required.
- CPU Utilization Rule
- Enabled
- Limit - 60%
- Stat - Average/Count
- Decision Period - 2 days
- Memory Utilization Rule
- Enabled
- Limit - 60%
- Stat - Average/Count
- Decision Period - 2 days
- Shard Rule
- Enabled
- Limit - 900
- Decision Period - 2 days
- Implementation of the tasks
- Implements following functions
- Average(time period): Returns struct Cluster Stats which contains CPU, Memory, Shards - Maximum, Minimum , Average.
- Count(time period,threshold_list): Returns struct Cluster Stats which contains number of times violated for CPU, Memory, Shards.
- A module looks at all the tasks and invokes build stats, then it evaluates each task which compares with the cluster average.

- Fetch Metrics

- Metrics required

- Node level
- CPU utilization
- RAM Utilization
- Heap Utilization
- Disk Utilization
- Shards count
- Cluster level
- Active Master nodes count
- Active Data nodes count
- Overall cluster status
- Initializing shards count
- Unassigned shards count
- Relocating shards count

- Periodicity: 5 minutes

- Maintains data for 72 hours in the Elasticsearch index.



- Elasticsearch simulator

- This module has following functions

- Model behavior based on configuration file

- Provides Metrics based on the API call

- Alter behavior based on provisioning commands


- Configuration file

- Data node count

- Shards

- Index count

- Shards per Index

- Aggregate data rate function

- Saw tooth(max,min,max_time,min_time)
- Add random(min %, max %)

- Index roll-over size

- Index clean-up age

- Disk function

- Derived from shards
- Disk clean-up limits

- Search events

- Simple search

- Probability
- CPU Load
- Memory Load

- Medium search

- Probability
- CPU Load
- Memory Load

- Complex search

- Probability

- CPU Load

- Memory Load


- Behavior

- Initial focus on Index and Shards

- Define Initial indexes

- Number of Shards = N * Indexes *shards per Index *2

- Calculate Index size every cycle based on

- Ingest function
- Random value

- If Index size = Threshold, then Roll over

- If disk size limit reached, clean up old indexes

- Equally distribute index and shard across all nodes.


- Disk size

- Totals disk size = sum of all indexes

- Equal distribution of disk size per node


- CPU Utilization

- CPU Load in cores = function(Ingest Rate) + function(CPU load from search)

- CPU Utilization of cluster = used cores/ total CPU cores

- CPU Utilization of a node is same as the cluster since it is equally distributed


- Memory Utilization

- Memory Load = function(Ingest Rate) + function(Memory load from search)
- Memory Utilization of a cluster = used memory / total memory
- Memory Utilization of a node is same as the cluster since it is equally distributed

0 comments on commit 51349e7

Please sign in to comment.