
Advanced SaaS Offering with increased uptime #4361

Merged · 11 commits · Oct 4, 2024

### Camunda 8 SaaS

Camunda 8 defines four fixed hardware packages you can select from (1x, 2x, 3x, 4x) when choosing your cluster [type](/components/concepts/clusters.md#cluster-type) and [size](/components/concepts/clusters.md#cluster-size). The following table gives you an indication of what requirements you can fulfill with each cluster size.

The numbers in the table were measured using Camunda 8 (version 8.6) and [the benchmark project](https://github.com/camunda-community-hub/camunda-8-benchmark) running on its own Kubernetes cluster and using a [ten-task process](https://github.com/camunda-community-hub/camunda-8-benchmark/blob/main/src/main/resources/bpmn/typical_process.bpmn). To calculate day-based metrics, an equal distribution over 24 hours is assumed.

| | Basic | Standard | Advanced |
| :---------------------------------------------------------------------------------- | :------------- | :------------- | :------------- |
| Max Throughput **Tasks/day** **\*** | 4.5 M | ? M | ? M |
| Max Throughput **Tasks/second** **\*** | 52 | ? | ? |
| Max Throughput **Process Instances/day** **\*\*** | 2.9 M | ? M | ? M |
| Max Total Number of Process Instances stored (in Elasticsearch in total) **\*\*\*** | ? | ? | ? |
| Typical cluster size for licensing **\*\*\*\*** | 1x, 2x, 3x, 4x | 1x, 2x, 3x, 4x | 1x, 2x, 3x, 4x |

**\*** Tasks (Service Tasks, Send Tasks, User Tasks, and so on) completed per day is the primary metric, as this is easy to measure and has a strong influence on resource consumption. This number assumes a constant load over the day.
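Under the constant-load assumption, the daily figure follows directly from the sustained tasks-per-second throughput; a minimal sketch (the 52 tasks/second input is the Basic column value from the table above):

```python
# Convert a sustained task throughput into a daily figure,
# assuming a constant load over the day (as the table does).
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def tasks_per_day(tasks_per_second: float) -> int:
    return round(tasks_per_second * SECONDS_PER_DAY)


# Basic cluster: 52 tasks/second sustained.
print(tasks_per_day(52))  # 4,492,800, i.e. roughly the 4.5 M tasks/day above
```
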

**\*\*** As Tasks are the primary resource driver, the number of Process Instances supported by a cluster is calculated based on the assumption of an average of 10 tasks per process. Customers can calculate a more accurate Process Instance estimate using their anticipated number of tasks per process.
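A customer-specific estimate following this rule of thumb can be sketched as below; the workload numbers used here are hypothetical, not taken from the table:

```python
# Estimate process instances per day from an anticipated task load,
# since tasks are the primary resource driver.
def process_instances_per_day(tasks_per_day: float, avg_tasks_per_process: float = 10) -> float:
    # The documentation's default assumption is 10 tasks per process.
    return tasks_per_day / avg_tasks_per_process


# Hypothetical workload: 1,000,000 tasks/day with 5 tasks per process.
print(process_instances_per_day(1_000_000, 5))  # 200000.0
```
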

**\*\*\*** Total number of process instances within the retention period, regardless of whether they are active or finished. This is limited by the disk space, CPU, and memory available to Elasticsearch for running and historical process instances. It is calculated assuming a typical set of process variables per process instance. Note that it makes a difference whether you add one or two strings (requiring ~1 KB of space) to your process instances or attach a full JSON document of 1 MB, as this data needs to be stored in various places, influencing memory and disk requirements. If this number increases, runtime throughput is retained, but Tasklist, Operate, and/or Optimize may lag behind.

Data retention influences the amount of data kept for completed instances in your cluster. The default data retention is 30 days, meaning data older than 30 days is removed from Operate and Tasklist. If a process instance is still active, it remains fully functional in runtime, but you cannot access historical data older than 30 days from Operate and Tasklist. For Optimize, data retention is set to six months, meaning data older than six months is removed from Optimize. Within certain limits, data retention can be adjusted by Camunda on request.
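As a back-of-the-envelope sketch, the retention window bounds how many completed instances accumulate in storage; the daily completion rate used here is hypothetical:

```python
# Rough estimate of completed instances retained in Elasticsearch:
# instances completed per day multiplied by the retention window.
def stored_instances(completed_per_day: int, retention_days: int = 30) -> int:
    # 30 days is the default retention for Operate and Tasklist.
    return completed_per_day * retention_days


# Hypothetical: 100,000 completed instances/day under default retention.
print(stored_instances(100_000))  # 3,000,000 retained instances
```
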

**\*\*\*\*** This indicates which cluster sizes are typically suitable for your Camunda license. It assumes an evenly distributed load over time; peak loads may necessitate a larger cluster.

:::note
Contact your Customer Success Manager if you require a cluster size above these requirements. This requires custom sizing and pricing.
:::

You might wonder why the total number of process instances stored is so low. This is related to the limited resources provided to Elasticsearch, which can cause performance problems if too much data is stored there. By increasing the memory available to Elasticsearch, you can also increase that number. Even with this relatively low number, the throughput of the core workflow engine is guaranteed during peak loads, as its performance is not affected. You can also increase the memory for Elasticsearch later if required.

docs/components/concepts/clusters.md (50 additions, 39 deletions)

A [cluster](../../guides/create-cluster.md) is a provided group of production-ready nodes that run Camunda 8.

When [creating a cluster](/components/console/manage-clusters/create-cluster.md), you can customize the cluster **type** and **size** to meet your organization's availability and scalability needs, and to provide control over cluster performance, uptime, and disaster recovery guarantees.

:::note

Prior to 8.6, clusters were configured by hardware size (S, M, L).

- To learn more about clusters prior to 8.6, see previous documentation versions.
- To learn more about migrating your existing clusters to the newer model, contact your Customer Success Manager.

:::

## Cluster type

The cluster type defines the level of availability and uptime for the cluster.

You can choose from three different cluster types:

- Use a **Basic** cluster for experimentation, early development, and basic use cases that do not require a guaranteed high uptime.
- Use a **Standard** cluster for production-ready use cases, with a guaranteed higher uptime.
- Use an **Advanced** cluster for production, with guaranteed minimal disruption and the highest uptime.

| Type | Basic | Standard | Advanced |
| :---------------------------------------------------------------------------- | :------------------------------------------------------- | :------------------------------------------- | :----------------------------------------------------- |
| Usage | Experimentation, early development, and basic use cases. | Production-ready use cases with high uptime. | Production with minimal disruption and highest uptime. |
| Uptime Percentage<br/> (Core Automation Cluster<strong>\*</strong>) | 99% | 99.5% | 99.9% |
| RTO/RPO<strong>\*\*</strong><br/>(Core Automation Cluster<strong>\*</strong>) | RTO: 8 hours<br/>RPO: 24 hours | RTO: 2 hours<br/>RPO: 4 hours | RTO: < 1 hour<br/>RPO: < 1 hour |

<p><strong>* Core Automation Cluster</strong> means the components critical for automating processes and decisions, such as Zeebe, Operate, Tasklist, Optimize, and Connectors.</p>
<p><strong>** RTO (Recovery Time Objective)</strong> means the maximum allowable time that a system or application can be down after a failure or disaster before it must be restored. It defines the target time to get the system back up and running. <strong>RPO (Recovery Point Objective)</strong> means the maximum acceptable amount of data loss measured in time. It indicates the point in time to which data must be restored to resume normal operations after a failure. It defines how much data you can afford to lose. The RTO/RPO figures shown in the table are provided on a best-effort basis and are not guaranteed.</p>
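To put the uptime percentages into perspective, the implied monthly downtime budget for each cluster type can be sketched as follows (assuming, for illustration, a 30-day month):

```python
# Translate a monthly uptime percentage into an allowed-downtime budget.
HOURS_PER_MONTH = 30 * 24  # assuming a 30-day month


def downtime_budget_hours(uptime_percent: float) -> float:
    return HOURS_PER_MONTH * (1 - uptime_percent / 100)


# Uptime percentages from the table: Basic, Standard, Advanced.
for pct in (99.0, 99.5, 99.9):
    print(f"{pct}% uptime -> {downtime_budget_hours(pct):.1f} h/month downtime budget")
```
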

:::info
See [Camunda Enterprise General Terms](https://legal.camunda.com/licensing-and-other-legal-terms#camunda-enterprise-general-terms) for term definitions for **Monthly Uptime Percentage** and **Downtime**.
:::

## Cluster size

The cluster size defines the cluster performance and capacity.

Choose the cluster size that best meets your cluster environment requirements. See [sizing your environment](/components/best-practices/architecture/sizing-your-environment.md#sizing-your-runtime-environment).

- You can choose from four cluster sizes: 1x, 2x, 3x, and 4x.
- Each increase in size boosts cluster performance and adds capacity, allowing the cluster to serve a larger workload.
- Increased usage such as higher throughput or longer data retention requires a larger cluster size.
- Each size increase uses one of your available cluster reservations.

:::note

Contact your Customer Success Manager to:

- Increase the cluster size beyond the maximum 4x size. This requires custom sizing and pricing.
- Increase the cluster size of an existing cluster.

:::

## Free Trial clusters

Free Trial clusters have the same functionality as a production cluster, but are Basic type and 1x size, and are only available during your trial period. You cannot convert a Free Trial cluster to a different kind of cluster.

Once you sign up for a Free Trial, you are able to create one production cluster for the duration of your trial.

When your Free Trial plan expires, you are automatically transferred to the Free Plan. This plan allows you to model BPMN and DMN collaboratively, but does not support execution of your models. Any cluster created during your trial is deleted, and you cannot create new clusters.

### Auto-pause

Free Trial `dev` (or untagged) clusters are automatically paused eight hours after a cluster is created or resumed from a paused state. Auto-pause occurs regardless of cluster usage.

You can resume a paused cluster at any time, which typically takes five to ten minutes to complete. See [resume your cluster](/components/console/manage-clusters/manage-cluster.md#resume-a-cluster).

- Clusters tagged as `test`, `stage`, or `prod` do not auto-pause.
- Paused clusters are automatically deleted after 30 consecutive paused days. You can change the tag to avoid cluster deletion.
- No data is lost while a cluster is paused. All execution and configuration is saved, but cluster components such as Zeebe and Operate are temporarily disabled until you resume the cluster.

:::tip

To prevent auto-pause, you can:

- Tag the cluster as `test`, `stage`, or `prod` instead of `dev`.
- [Upgrade your Free Trial plan](https://camunda.com/pricing/) to a Starter, Professional, or Enterprise plan.

:::caution
**Cluster auto-pause** is not yet generally available and applies only to non-Enterprise clusters. Development clusters are paused if they go unused for two hours. When a cluster is paused, some functionality remains available; for example, BPMN timers and BPMN message catch events may still execute. To resume your cluster, review [how to resume a cluster](/components/console/manage-clusters/manage-cluster.md#resume-a-cluster).
:::