From ccceec0b396d9788d33c72ea71509fd3acadc194 Mon Sep 17 00:00:00 2001 From: Julien Clarysse Date: Tue, 3 Dec 2024 18:29:42 +0100 Subject: [PATCH 1/3] update(MirrorMaker): improve configuration concepts Explained the different configuration layers and moved monitoring instructions to howto section. [DOC-1169] --- .../concepts/mirrormaker2-tuning.md | 67 +++++++++---------- .../howto/monitor-replication-execution.md | 28 ++++++++ sidebars.ts | 1 + 3 files changed, 62 insertions(+), 34 deletions(-) create mode 100644 docs/products/kafka/kafka-mirrormaker/howto/monitor-replication-execution.md diff --git a/docs/products/kafka/kafka-mirrormaker/concepts/mirrormaker2-tuning.md b/docs/products/kafka/kafka-mirrormaker/concepts/mirrormaker2-tuning.md index 19c7a339c..ad4826921 100644 --- a/docs/products/kafka/kafka-mirrormaker/concepts/mirrormaker2-tuning.md +++ b/docs/products/kafka/kafka-mirrormaker/concepts/mirrormaker2-tuning.md @@ -1,12 +1,34 @@ --- -title: MirrorMaker 2 common parameters +title: Configuration parameters --- -MirrorMaker 2 (MM2) offers a suite of parameters to help with data replication and monitoring within Apache Kafka® ecosystems. -This topic outlines common parameters you can adjust, along with tips for -validating MM2's performance. +Apache Kafka® MirrorMaker 2 provides a suite of configuration parameters +to help with data replication within Apache Kafka® ecosystems. -1. Increase the value of `kafka_mirrormaker.tasks_max_per_cpu` in the +## Configuration layers + +1. **Service** configurations apply to the nodes and workers of Apache Kafka® MirrorMaker 2 cluster. + - They are documented under [Advanced parameters for Aiven for Apache Kafka® MirrorMaker 2](/docs/products/kafka/kafka-mirrormaker/reference/advanced-params). + - An example of service configuration is `kafka_mirrormaker.emit_checkpoints_enabled`: Whether to emit consumer group offset checkpoints to target cluster periodically. + - Changing the value of such parameter leads to a restart of the workers (along with their connectors and tasks). +1. **Replication-flow** configurations apply to the connectors (Source, Sink, Checkpoint, Heartbeat). + - They are documented under [Aiven Terraform provider mirrormaker_replication_flow resource documentation](https://registry.terraform.io/providers/aiven/aiven/latest/docs/resources/mirrormaker_replication_flow). + - An example of replication-flow configuration is `topics`: List of topics and/or regular expressions to replicate (see [topics included in a replication flow](/docs/products/kafka/kafka-mirrormaker/concepts/replication-flow-topics-regex)). + - Chaging the value of such parameter leads to the restart of impacted connectors (along with their tasks). +1. **Integration** configurations apply to the consumers and producers of the connectors. + - They are documented under [Aiven Terraform provider service_integration resource documentation](https://registry.terraform.io/providers/aiven/aiven/latest/docs/resources/service_integration#nested-schema-for-kafka_mirrormaker_user_configkafka_mirrormaker). + - An example of integration configuration is `consumer_fetch_min_bytes`: The minimum amount of data the server should return for a fetch request. + - Changing the value of such parameter leads to a restart of the workers (along with their connectors and tasks). + +:::note +Most configurations are directly inherited from the upstream [KIP-382: MirrorMaker 2.0 - Configuration Properties](https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=95650722#KIP382:MirrorMaker2.0-ConnectorConfigurationProperties). +::: + +## Common parameters + +This section outlines common parameters which you can adjust. + +1. Increase the value of `kafka_mirrormaker.tasks_max_per_cpu` _in the advanced options. Setting this to match the number of partitions can enhance performance. 1. Ensure the interval seconds for the following settings match. You @@ -19,32 +41,9 @@ validating MM2's performance. 1. To exclude internal topics, add these patterns to your topic blacklist: - `.*[\-\.]internal` `.*\.replica` `__.*` `connect.*` -1. Depending on your use case, consider adjusting these parameters: - - `kafka_mirrormaker.consumer_fetch_min_bytes` - - `kafka_mirrormaker.producer_batch_size` - - `kafka_mirrormaker.producer_buffer_memory` - - `kafka_mirrormaker.producer_linger_ms` - - `kafka_mirrormaker.producer_max_request_size` - -## MirrorMaker 2 validation tips - -To ensure MirrorMaker 2 is up-to-date with message processing, monitor -these: - -1. **Consumer lag metric**: Monitor the `kafka.consumer_lag` metric. - -1. **Dashboard metrics**: If MirrorMaker 2 stops adding records to a - topic, the `jmx.kafka.connect.mirror.record_count` metric stops - increasing, showing a flat line on the dashboard. - -1. **Retrieve latest messages with \`kt\`**: Use - [kt](https://github.com/fgeller/kt) to retrieve the latest messages - from all partitions with the following command: - - ``` - kt consume -auth ./mykafka.conf \ - -brokers SERVICE-PROJECT.aivencloud.com:PORT \ - -topic topicname -offsets all=newest:newest | \ - jq -c -s 'sort_by(.partition) | .[] | \ - {partition: .partition, value: .value, timestamp: .timestamp}' - ``` +1. Depending on your use case, consider adjusting these integration parameters: + - `consumer_fetch_min_bytes` + - `producer_batch_size` + - `producer_buffer_memory` + - `producer_linger_ms` + - `producer_max_request_size` diff --git a/docs/products/kafka/kafka-mirrormaker/howto/monitor-replication-execution.md b/docs/products/kafka/kafka-mirrormaker/howto/monitor-replication-execution.md new file mode 100644 index 000000000..d6f473676 --- /dev/null +++ b/docs/products/kafka/kafka-mirrormaker/howto/monitor-replication-execution.md @@ -0,0 +1,28 @@ +--- +title: Monitor replication execution +--- + +Apache Kafka® MirrorMaker 2 leverages Kafka Connect to help with state management +and monitoring. + +## Tips + +To ensure that the replication is up-to-date with message processing, check this: + +1. **Consumer lag metric**: Monitor the `kafka.consumer_lag` metric. + +1. **Dashboard metrics**: If MirrorMaker 2 stops adding records to a + topic, the `jmx.kafka.connect.mirror.record_count` metric stops + increasing, showing a flat line on the dashboard. + +1. **Retrieve latest messages with \`kt\`**: Use + [kt](https://github.com/fgeller/kt) to retrieve the latest messages + from all partitions with the following command: + + ``` + kt consume -auth ./mykafka.conf \ + -brokers SERVICE-PROJECT.aivencloud.com:PORT \ + -topic topicname -offsets all=newest:newest | \ + jq -c -s 'sort_by(.partition) | .[] | \ + {partition: .partition, value: .value, timestamp: .timestamp}' + ``` diff --git a/sidebars.ts b/sidebars.ts index 1492d1bb1..6d1e02433 100644 --- a/sidebars.ts +++ b/sidebars.ts @@ -1060,6 +1060,7 @@ const sidebars: SidebarsConfig = { items: [ 'products/kafka/kafka-mirrormaker/howto/integrate-external-kafka-cluster', 'products/kafka/kafka-mirrormaker/howto/setup-replication-flow', + 'products/kafka/kafka-mirrormaker/howto/monitor-replication-execution', 'products/kafka/kafka-mirrormaker/howto/remove-mirrormaker-prefix', 'products/kafka/kafka-mirrormaker/howto/datadog-customised-metrics', 'products/kafka/kafka-mirrormaker/howto/log-analysis-offset-sync-tool', From 6141a4897fe619ea32933baa841676b108aa55a0 Mon Sep 17 00:00:00 2001 From: Harshini Rangaswamy Date: Fri, 13 Dec 2024 15:19:02 +0100 Subject: [PATCH 2/3] udpate: content --- .../concepts/mirrormaker2-tuning.md | 117 ++++++++++++------ .../howto/monitor-replication-execution.md | 37 +++--- 2 files changed, 100 insertions(+), 54 deletions(-) diff --git a/docs/products/kafka/kafka-mirrormaker/concepts/mirrormaker2-tuning.md b/docs/products/kafka/kafka-mirrormaker/concepts/mirrormaker2-tuning.md index ad4826921..da3ec30c2 100644 --- a/docs/products/kafka/kafka-mirrormaker/concepts/mirrormaker2-tuning.md +++ b/docs/products/kafka/kafka-mirrormaker/concepts/mirrormaker2-tuning.md @@ -2,48 +2,93 @@ title: Configuration parameters --- -Apache Kafka® MirrorMaker 2 provides a suite of configuration parameters -to help with data replication within Apache Kafka® ecosystems. +Explore Aiven for Apache Kafka® MirrorMaker 2 configuration layers, including service, replication flow, and integration settings, to optimize data replication in your Apache Kafka® ecosystem. ## Configuration layers -1. **Service** configurations apply to the nodes and workers of Apache Kafka® MirrorMaker 2 cluster. - - They are documented under [Advanced parameters for Aiven for Apache Kafka® MirrorMaker 2](/docs/products/kafka/kafka-mirrormaker/reference/advanced-params). - - An example of service configuration is `kafka_mirrormaker.emit_checkpoints_enabled`: Whether to emit consumer group offset checkpoints to target cluster periodically. - - Changing the value of such parameter leads to a restart of the workers (along with their connectors and tasks). -1. **Replication-flow** configurations apply to the connectors (Source, Sink, Checkpoint, Heartbeat). - - They are documented under [Aiven Terraform provider mirrormaker_replication_flow resource documentation](https://registry.terraform.io/providers/aiven/aiven/latest/docs/resources/mirrormaker_replication_flow). - - An example of replication-flow configuration is `topics`: List of topics and/or regular expressions to replicate (see [topics included in a replication flow](/docs/products/kafka/kafka-mirrormaker/concepts/replication-flow-topics-regex)). - - Chaging the value of such parameter leads to the restart of impacted connectors (along with their tasks). -1. **Integration** configurations apply to the consumers and producers of the connectors. - - They are documented under [Aiven Terraform provider service_integration resource documentation](https://registry.terraform.io/providers/aiven/aiven/latest/docs/resources/service_integration#nested-schema-for-kafka_mirrormaker_user_configkafka_mirrormaker). - - An example of integration configuration is `consumer_fetch_min_bytes`: The minimum amount of data the server should return for a fetch request. - - Changing the value of such parameter leads to a restart of the workers (along with their connectors and tasks). +Aiven for Apache Kafka® MirrorMaker 2 configurations are organized into three layers: +**service**, **replication flow**, and **integration**. Each layer controls a specific +aspect of the replication process. + +### Service configurations + +Service configurations control the behavior of nodes and workers in the +Aiven for Apache Kafka® MirrorMaker 2 cluster. + +**Example of a service configuration**: + +- Parameter: [`kafka_mirrormaker.emit_checkpoints_enabled`](https://aiven.io/docs/products/kafka/kafka-mirrormaker/reference/advanced-params#kafka_mirrormaker_emit_checkpoints_enabled) +- Description: Enables or disables periodically emitting consumer group offset + checkpoints to the target cluster. +- Impact: + - Automatically restarts the workers. + - Restarts all connectors and tasks. + +### Replication-flow configurations + +Replication-flow configurations manage the behavior of connectors, such as Source, Sink, +Checkpoint, and Heartbeat. + +**Example of a replication-flow configuration**: + +- Parameter: [`topics`](https://registry.terraform.io/providers/aiven/aiven/latest/docs/resources/mirrormaker_replication_flow) +- Description: Specifies a list of topics or regular expressions to replicate. + For more information, see the [topics included in a replication flow](/docs/products/kafka/kafka-mirrormaker/concepts/replication-flow-topics-regex). +- Impact: + - Automatically restarts the affected connectors. + - Restarts their associated tasks. + +### Integration configurations + +Integration configurations fine-tune the interaction between producers and consumers +within connectors. + +**Example of an integration configuration**: + +- Parameter: [`consumer_fetch_min_bytes`](https://registry.terraform.io/providers/aiven/aiven/latest/docs/resources/service_integration#nested-schema-for-kafka_mirrormaker_user_configkafka_mirrormaker) +- Description: Sets the minimum amount of data the server should return for a fetch + request. +- Impact: + - Automatically restarts the workers. + - Restarts all connectors and tasks. :::note -Most configurations are directly inherited from the upstream [KIP-382: MirrorMaker 2.0 - Configuration Properties](https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=95650722#KIP382:MirrorMaker2.0-ConnectorConfigurationProperties). +Most configuration parameters are derived from +[KIP-382: MirrorMaker 2.0 - Configuration Properties](https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=95650722#KIP382:MirrorMaker2.0-ConnectorConfigurationProperties). Refer to this resource for additional details. ::: ## Common parameters -This section outlines common parameters which you can adjust. - -1. Increase the value of `kafka_mirrormaker.tasks_max_per_cpu` _in the - advanced options. Setting this to match the number of partitions can - enhance performance. -1. Ensure the interval seconds for the following settings match. You - can reduce these intervals for more frequent data synchronization: - - Advanced options: - - `kafka_mirrormaker.emit_checkpoints_interval_seconds` - - `kafka_mirrormaker.sync_group_offsets_interval_seconds` - - Replication flow: - - `Sync interval in seconds`. -1. To exclude internal topics, add these patterns to your topic - blacklist: - - `.*[\-\.]internal` `.*\.replica` `__.*` `connect.*` -1. Depending on your use case, consider adjusting these integration parameters: - - `consumer_fetch_min_bytes` - - `producer_batch_size` - - `producer_buffer_memory` - - `producer_linger_ms` - - `producer_max_request_size` +This section describes common parameters that can be adjusted to optimize the performance +and behavior of Aiven for Apache Kafka MirrorMaker 2 replication. + +1. **Optimize task allocation**: + Increase the value of + [`kafka_mirrormaker.tasks_max_per_cpu`](/docs/products/kafka/kafka-mirrormaker/reference/advanced-params#kafka_mirrormaker_tasks_max_per_cpu) + in the advanced configuration. + Setting this to match the number of partitions can improve performance. + +1. **Align interval settings**: + Ensure the following interval settings match to achieve more frequent and synchronized + data replication: + - **Advanced configurations**: + - [`kafka_mirrormaker.emit_checkpoints_interval_seconds`](/docs/products/kafka/kafka-mirrormaker/reference/advanced-params#kafka_mirrormaker_emit_checkpoints_interval_seconds) + - [`kafka_mirrormaker.sync_group_offsets_interval_seconds`](/docs/products/kafka/kafka-mirrormaker/reference/advanced-params#kafka_mirrormaker_sync_group_offsets_interval_seconds) + - **Replication flow**: + - `Sync interval in seconds` + +1. **Exclude internal topics**: + Add these patterns to your topic blacklist to exclude internal topics: + - `.*[\-\.]internal` + - `.*\.replica` + - `__.*` + - `connect.*` + +1. **Adjust integration parameters**: + Modify these integration parameters based on your use case to improve producer and + consumer performance: + - `consumer_fetch_min_bytes` + - `producer_batch_size` + - `producer_buffer_memory` + - `producer_linger_ms` + - `producer_max_request_size` diff --git a/docs/products/kafka/kafka-mirrormaker/howto/monitor-replication-execution.md b/docs/products/kafka/kafka-mirrormaker/howto/monitor-replication-execution.md index d6f473676..47ebc249e 100644 --- a/docs/products/kafka/kafka-mirrormaker/howto/monitor-replication-execution.md +++ b/docs/products/kafka/kafka-mirrormaker/howto/monitor-replication-execution.md @@ -2,27 +2,28 @@ title: Monitor replication execution --- -Apache Kafka® MirrorMaker 2 leverages Kafka Connect to help with state management -and monitoring. +Apache Kafka® MirrorMaker 2 uses Apache Kafka® Connect for state management and monitoring, allowing you to track the status of replication flows and address potential issues. -## Tips -To ensure that the replication is up-to-date with message processing, check this: +## Monitoring tips -1. **Consumer lag metric**: Monitor the `kafka.consumer_lag` metric. +Follow these tips to ensure that replication is up-to-date with message processing: -1. **Dashboard metrics**: If MirrorMaker 2 stops adding records to a - topic, the `jmx.kafka.connect.mirror.record_count` metric stops - increasing, showing a flat line on the dashboard. +1. **Monitor consumer lag:** Use the `kafka.consumer_lag` metric to track replication + progress and identify delays. -1. **Retrieve latest messages with \`kt\`**: Use - [kt](https://github.com/fgeller/kt) to retrieve the latest messages - from all partitions with the following command: +1. **Track dashboard metrics:** Check the `jmx.kafka.connect.mirror.record_count` metric. + If MirrorMaker 2 stops adding records to a topic, this metric will show a flat line, + indicating no new records are being replicated. - ``` - kt consume -auth ./mykafka.conf \ - -brokers SERVICE-PROJECT.aivencloud.com:PORT \ - -topic topicname -offsets all=newest:newest | \ - jq -c -s 'sort_by(.partition) | .[] | \ - {partition: .partition, value: .value, timestamp: .timestamp}' - ``` +1. **Retrieve the latest messages using `kt`:** Use the + [**kt**](https://github.com/fgeller/kt) tool to fetch the latest messages from all + partitions. Run the following command: + + ```bash + kt consume -auth ./mykafka.conf \ + -brokers SERVICE-PROJECT.aivencloud.com:PORT \ + -topic topicname -offsets all=newest:newest | \ + jq -c -s 'sort_by(.partition) | .[] | \ + {partition: .partition, value: .value, timestamp: .timestamp}' + ``` From 724ad084754f7868c47c3bec1da254400db05eda Mon Sep 17 00:00:00 2001 From: Harshini Rangaswamy Date: Fri, 13 Dec 2024 15:38:55 +0100 Subject: [PATCH 3/3] fix: SEO --- .../kafka/kafka-mirrormaker/concepts/mirrormaker2-tuning.md | 5 +++-- .../kafka-mirrormaker/howto/monitor-replication-execution.md | 4 +--- 2 files changed, 4 insertions(+), 5 deletions(-) diff --git a/docs/products/kafka/kafka-mirrormaker/concepts/mirrormaker2-tuning.md b/docs/products/kafka/kafka-mirrormaker/concepts/mirrormaker2-tuning.md index da3ec30c2..2fde5669b 100644 --- a/docs/products/kafka/kafka-mirrormaker/concepts/mirrormaker2-tuning.md +++ b/docs/products/kafka/kafka-mirrormaker/concepts/mirrormaker2-tuning.md @@ -1,8 +1,9 @@ --- -title: Configuration parameters +title: Configuration parameters for Aiven for Apache Kafka® MirrorMaker 2 --- -Explore Aiven for Apache Kafka® MirrorMaker 2 configuration layers, including service, replication flow, and integration settings, to optimize data replication in your Apache Kafka® ecosystem. +Learn about the configuration layers in Aiven for Apache Kafka® MirrorMaker 2, including service, replication flow, and integration settings. +Optimize data replication and performance in your Kafka ecosystem. ## Configuration layers diff --git a/docs/products/kafka/kafka-mirrormaker/howto/monitor-replication-execution.md b/docs/products/kafka/kafka-mirrormaker/howto/monitor-replication-execution.md index 47ebc249e..4095a7d83 100644 --- a/docs/products/kafka/kafka-mirrormaker/howto/monitor-replication-execution.md +++ b/docs/products/kafka/kafka-mirrormaker/howto/monitor-replication-execution.md @@ -1,9 +1,7 @@ --- title: Monitor replication execution --- - -Apache Kafka® MirrorMaker 2 uses Apache Kafka® Connect for state management and monitoring, allowing you to track the status of replication flows and address potential issues. - +Apache Kafka® MirrorMaker 2 uses Kafka® Connect for monitoring and state management, helping you track replication flows and address issues. ## Monitoring tips