add: Azure Blob Storage sink connector for Aiven for Apache Kafka #503

Merged
merged 6 commits into from
Oct 15, 2024
Changes from 4 commits
@@ -7,48 +7,47 @@ Discover a variety of connectors available for use with any Aiven for Apache Kaf

## Source connectors

Source connectors enable the integration of data from an existing technology into an
Apache Kafka topic. The available source connectors include:

- [Couchbase](https://github.com/couchbase/kafka-connect-couchbase)
- [Debezium for MongoDB®](https://debezium.io/docs/connectors/mongodb/)
- [Debezium for MySQL](https://debezium.io/docs/connectors/mysql/)
- [Debezium for PostgreSQL®](/docs/products/kafka/kafka-connect/howto/debezium-source-connector-pg)
- [Debezium for SQL Server](https://debezium.io/docs/connectors/sqlserver/)
- [Google Cloud Pub/Sub](https://github.com/googleapis/java-pubsub-group-kafka-connector/)
- [Google Cloud Pub/Sub Lite](https://github.com/googleapis/java-pubsub-group-kafka-connector/)
- [JDBC](https://github.com/aiven/jdbc-connector-for-apache-kafka/blob/master/docs/source-connector.md)
- [Official MongoDB®](https://www.mongodb.com/docs/kafka-connector/current/)
- [Stream Reactor Cassandra®](https://docs.lenses.io/5.1/connectors/sources/cassandrasourceconnector/)
- [Stream Reactor MQTT](https://docs.lenses.io/5.1/connectors/sources/mqttsourceconnector/)

## Sink connectors

Sink connectors enable the integration of data from an existing Apache Kafka topic to a
target technology. The available sink connectors include:

- [Amazon S3 sink connector](/docs/products/kafka/kafka-connect/howto/s3-sink-connector-aiven)
- [Azure Blob Storage sink connector](/docs/products/kafka/kafka-connect/howto/azure-blob-sink)
- [Confluent Amazon S3 sink](/docs/products/kafka/kafka-connect/howto/s3-sink-connector-confluent)
- [Couchbase®](https://github.com/couchbase/kafka-connect-couchbase)
- [Elasticsearch](/docs/products/kafka/kafka-connect/howto/elasticsearch-sink)
- [Google BigQuery](https://github.com/confluentinc/kafka-connect-bigquery)
- [Google Cloud Pub/Sub](https://github.com/googleapis/java-pubsub-group-kafka-connector/)
- [Google Cloud Pub/Sub Lite](https://github.com/googleapis/java-pubsub-group-kafka-connector/)
- [Google Cloud Storage](/docs/products/kafka/kafka-connect/howto/gcs-sink)
- [HTTP](https://github.com/aiven/http-connector-for-apache-kafka)
- [IBM MQ sink connector](/docs/products/kafka/kafka-connect/howto/ibm-mq-sink-connector)
- [JDBC](https://github.com/aiven/jdbc-connector-for-apache-kafka/blob/master/docs/sink-connector.md)
- [Official MongoDB®](https://docs.mongodb.com/kafka-connector/current/)
- [OpenSearch®](/docs/products/kafka/kafka-connect/howto/opensearch-sink)
- [Snowflake](https://docs.snowflake.com/en/user-guide/kafka-connector)
- [Splunk](https://github.com/splunk/kafka-connect-splunk)
- [Stream Reactor Cassandra®](https://docs.lenses.io/5.1/connectors/sinks/cassandrasinkconnector/)
- [Stream Reactor InfluxDB®](https://docs.lenses.io/5.1/connectors/sinks/influxsinkconnector/)
- [Stream Reactor MongoDB®](https://docs.lenses.io/5.1/connectors/sinks/mongosinkconnector/)
- [Stream Reactor MQTT](https://docs.lenses.io/5.1/connectors/sinks/mqttsinkconnector/)
- [Stream Reactor Redis®](https://docs.lenses.io/5.1/connectors/sinks/redissinkconnector/)

## Preview connectors

139 changes: 139 additions & 0 deletions docs/products/kafka/kafka-connect/howto/azure-blob-sink.md
@@ -0,0 +1,139 @@
---
title: Create an Azure Blob Storage sink connector for Aiven for Apache Kafka®
sidebar_label: Azure Blob sink connector
---

import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';
import ConsoleLabel from "@site/src/components/ConsoleIcons";

The Azure Blob Storage sink connector moves data from Apache Kafka® topics to Azure Blob Storage containers for long-term storage, such as archiving or creating backups.

## Prerequisites

Before you begin, make sure you have:

- An
[Aiven for Apache Kafka® service](https://docs.aiven.io/docs/products/kafka/kafka-connect/howto/enable-connect)
with Kafka Connect enabled, or a
[dedicated Aiven for Apache Kafka Connect® service](https://docs.aiven.io/docs/products/kafka/kafka-connect/get-started#apache_kafka_connect_dedicated_cluster).
- Access to an Azure Storage account and a container with the following:
- Azure Storage connection string: Required to authenticate and
connect to your Azure Storage account.
- Azure Storage container name: The name of the Blob Storage container where data is
saved.
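The connection string bundles the account name, key, and endpoint into one semicolon-separated value. As a quick sanity check before pasting it into the connector configuration, you can split it into its parts. This is a minimal sketch using placeholder credentials, not part of the connector itself:

```python
# Split an Azure Storage connection string into its key/value segments.
# partition("=") splits only on the first "=", so base64 AccountKey values
# that end in "=" padding are preserved. Placeholder credentials below.
def parse_connection_string(conn_str: str) -> dict:
    parts = {}
    for segment in conn_str.split(";"):
        key, sep, value = segment.partition("=")
        if sep:
            parts[key] = value
    return parts

conn = parse_connection_string(
    "DefaultEndpointsProtocol=https;AccountName=myaccount;"
    "AccountKey=mykey;EndpointSuffix=core.windows.net"
)
print(conn["AccountName"])  # myaccount
```

If `AccountName` or `AccountKey` comes back empty, fix the string before configuring the connector.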

## Create an Azure Blob Storage sink connector configuration file

Create a file named `azure_blob_sink_connector.json` with the following configuration:

```json
{
"name": "azure_blob_sink",
"connector.class": "io.aiven.kafka.connect.azure.sink.AzureBlobSinkConnector",
"tasks.max": "1",
"topics": "test-topic",
"azure.storage.connection.string": "DefaultEndpointsProtocol=https;AccountName=myaccount;AccountKey=mykey;EndpointSuffix=core.windows.net",
"azure.storage.container.name": "my-container",
"key.converter": "org.apache.kafka.connect.storage.StringConverter",
"value.converter": "org.apache.kafka.connect.storage.StringConverter",
"header.converter": "org.apache.kafka.connect.storage.StringConverter",
"file.name.prefix": "connect-azure-blob-sink/test-run/",
"file.compression.type": "gzip",
"format.output.fields": "key,value,offset,timestamp",
"reload.action": "restart"
}
```

Parameters:

- `name`: Name of the connector.
- `topics`: Apache Kafka topics to sink data from.
- `azure.storage.connection.string`: Azure Storage connection string.
- `azure.storage.container.name`: Blob Storage container name.
- `key.converter`: Class to convert the Kafka record key.
- `value.converter`: Class to convert the Kafka record value.
- `header.converter`: Class to convert message headers.
- `file.name.prefix`: Prefix for the files created in Azure Blob Storage.
- `file.compression.type`: Compression type for the files, such as `gzip`.
- `reload.action`: Action to take when reloading the connector, set to `restart`.

You can view the full set of available parameters and advanced configuration options
in the [Aiven Azure Blob Storage sink connector GitHub repository](https://github.com/Aiven-Open/cloud-storage-connectors-for-apache-kafka/blob/main/azure-sink-connector/README.md).
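Before submitting the file, it can help to confirm that the JSON parses and that the settings this guide relies on are present. The following is a sketch only; the required-key list is an assumption drawn from the parameters above, not an official schema:

```python
import json

# Minimal connector configuration from this guide (placeholder credentials).
config_text = """
{
    "name": "azure_blob_sink",
    "connector.class": "io.aiven.kafka.connect.azure.sink.AzureBlobSinkConnector",
    "topics": "test-topic",
    "azure.storage.connection.string": "DefaultEndpointsProtocol=https;AccountName=myaccount;AccountKey=mykey;EndpointSuffix=core.windows.net",
    "azure.storage.container.name": "my-container"
}
"""

# Settings this guide treats as required; see the connector README for the
# authoritative schema.
REQUIRED = [
    "name",
    "connector.class",
    "topics",
    "azure.storage.connection.string",
    "azure.storage.container.name",
]

config = json.loads(config_text)  # raises ValueError on malformed JSON
missing = [key for key in REQUIRED if key not in config]
print("missing:", missing)  # missing: []
```

An empty `missing` list means the file is at least structurally ready to submit.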

## Create the connector

<Tabs groupId="setup-method">
<TabItem value="console" label="Aiven Console" default>

1. Access the [Aiven Console](https://console.aiven.io/).
1. Select your Aiven for Apache Kafka® or Aiven for Apache Kafka Connect® service.
1. Click <ConsoleLabel name="Connectors"/>.
1. Click **Create connector** if Kafka Connect is already enabled on the service.
If not, click **Enable connector on this service**.

Alternatively, to enable connectors:

1. Click <ConsoleLabel name="Service settings"/> in the sidebar.
1. In the **Service management** section, click
<ConsoleLabel name="Actions"/> > **Enable Kafka connect**.

1. In the list of sink connectors, find **Azure Blob Storage sink** and click **Get started**.
1. On the **Azure Blob Storage sink** connector page, go to the **Common** tab.
1. Locate the **Connector configuration** text box and click <ConsoleLabel name="edit"/>.
1. Paste the configuration from your `azure_blob_sink_connector.json` file into the
text box.
1. Click **Create connector**.
1. Verify the connector status on the <ConsoleLabel name="Connectors"/> page.

Ensure that data from the Apache Kafka topics is being successfully delivered to the
Azure Blob Storage container.

</TabItem>

<TabItem value="cli" label="Aiven CLI">

To create the Azure Blob Storage sink connector using the Aiven CLI, run:

```bash
avn service connector create SERVICE_NAME @azure_blob_sink_connector.json
```

Parameters:

- `SERVICE_NAME`: Name of your Aiven for Apache Kafka® service.
- `@azure_blob_sink_connector.json`: Path to your JSON configuration file.

</TabItem>
</Tabs>

## Example: Define and create an Azure Blob Storage sink connector

This example shows how to create an Azure Blob Storage sink connector with the following properties:

- Connector name: `azure_blob_sink`
- Apache Kafka topic: `test-topic`
- Azure Storage connection string: `DefaultEndpointsProtocol=https;AccountName=myaccount;AccountKey=mykey;EndpointSuffix=core.windows.net`
- Azure container: `my-container`
- Output fields: `key, value, offset, timestamp`
- File name prefix: `connect-azure-blob-sink/test-run/`
- Compression type: `gzip`

```json
{
"name": "azure_blob_sink",
"connector.class": "io.aiven.kafka.connect.azure.sink.AzureBlobSinkConnector",
"tasks.max": "1",
"topics": "test-topic",
"azure.storage.connection.string": "DefaultEndpointsProtocol=https;AccountName=myaccount;AccountKey=mykey;EndpointSuffix=core.windows.net",
"azure.storage.container.name": "my-container",
"format.output.fields": "key,value,offset,timestamp",
"file.name.prefix": "connect-azure-blob-sink/test-run/",
"file.compression.type": "gzip"
}
```

Once this configuration is saved in the `azure_blob_sink_connector.json` file, you can
create the connector using the Aiven Console or CLI, and verify that data from the
Apache Kafka topic `test-topic` is successfully delivered to your Azure Blob
Storage container.
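With gzip compression and the prefix above, the connector writes objects per topic partition and offset range. Assuming a default file name layout of `{topic}-{partition}-{start_offset}` plus the compression extension (an assumption to verify against the connector README for your version), the resulting blob names can be sketched as:

```python
# Predict blob names for the example configuration. The name layout is an
# assumption based on the connector's documented defaults; verify it for
# your connector version before relying on it.
prefix = "connect-azure-blob-sink/test-run/"
topic = "test-topic"

def blob_name(partition: int, start_offset: int) -> str:
    # gzip-compressed output gets a .gz extension
    return f"{prefix}{topic}-{partition}-{start_offset}.gz"

print(blob_name(0, 0))    # connect-azure-blob-sink/test-run/test-topic-0-0.gz
print(blob_name(1, 100))  # connect-azure-blob-sink/test-run/test-topic-1-100.gz
```

Listing blobs under the `connect-azure-blob-sink/test-run/` prefix in your container should show names of this shape once records flow.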
40 changes: 20 additions & 20 deletions sidebars.ts
@@ -963,33 +963,33 @@ const sidebars: SidebarsConfig = {
{
type: 'category',
label: 'Sink connectors',
items: [
'products/kafka/kafka-connect/howto/azure-blob-sink',
'products/kafka/kafka-connect/howto/cassandra-streamreactor-sink',
'products/kafka/kafka-connect/howto/couchbase-sink',
'products/kafka/kafka-connect/howto/elasticsearch-sink',
'products/kafka/kafka-connect/howto/gcp-bigquery-sink-prereq',
'products/kafka/kafka-connect/howto/gcp-bigquery-sink',
'products/kafka/kafka-connect/howto/gcp-pubsub-lite-sink',
'products/kafka/kafka-connect/howto/gcp-pubsub-sink',
'products/kafka/kafka-connect/howto/gcs-sink-prereq',
'products/kafka/kafka-connect/howto/gcs-sink',
'products/kafka/kafka-connect/howto/http-sink',
'products/kafka/kafka-connect/howto/ibm-mq-sink-connector',
'products/kafka/kafka-connect/howto/influx-sink',
'products/kafka/kafka-connect/howto/jdbc-sink',
'products/kafka/kafka-connect/howto/mongodb-sink-lenses',
'products/kafka/kafka-connect/howto/mongodb-sink-mongo',
'products/kafka/kafka-connect/howto/mqtt-sink-connector',
'products/kafka/kafka-connect/howto/opensearch-sink',
'products/kafka/kafka-connect/howto/redis-streamreactor-sink',
'products/kafka/kafka-connect/howto/s3-iam-assume-role',
'products/kafka/kafka-connect/howto/s3-sink-connector-aiven',
'products/kafka/kafka-connect/howto/s3-sink-connector-confluent',
'products/kafka/kafka-connect/howto/s3-sink-prereq',
'products/kafka/kafka-connect/howto/snowflake-sink-prereq',
'products/kafka/kafka-connect/howto/snowflake-sink',
'products/kafka/kafka-connect/howto/splunk-sink',
],
},
],