
lovely-kafka-backup

This application is a backup tool for Kafka. It comes with connectors to read data from Kafka and write it to different storage systems, and provides a CLI for manual backup and restore operations.

Features

  • Backup Kafka topics to any S3-compatible storage system (e.g. AWS S3, Minio)
  • Support for multiple backup file formats (e.g. Binary, Avro, JSON)
  • Backup and restore Kafka topics using the kbackup cli
  • Repair corrupted Kafka log files using the kbackup cli
  • Prometheus metrics for Kafka S3 Backup Connector

Requirements

  • Docker
  • Kafka
  • S3-compatible storage system (e.g. AWS S3, Minio)

Kafka S3 Backup Connector Setup

By default (no CMD or ENTRYPOINT defined) the container starts the Kafka S3 Backup Connector. This connector uses the Kafka Connect S3 Connector to back up Kafka topics to S3 and uses the Kafka Connect API to stream data from Kafka.

The connector is created automatically after the Kafka-Connect worker has started. See the Configuration Properties for a list of all available properties.

The connector can be configured using a configuration file. See localdev for an example configuration file.
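For illustration, a minimal connector configuration might look like the following sketch. It assumes the Confluent S3 sink connector and uses placeholder values for the bucket and endpoint; the exact property set used by this image may differ, so treat the localdev file as the authoritative example.

name=kafka-s3-backup
connector.class=io.confluent.connect.s3.S3SinkConnector
tasks.max=1
# back up all topics matching this pattern
topics.regex=.*
# placeholder bucket and endpoint (e.g. a local Minio)
s3.bucket.name=my-backup-bucket
store.url=http://localhost:9000
storage.class=io.confluent.connect.s3.storage.S3Storage
# raw bytes keep records restorable byte-for-byte
format.class=io.confluent.connect.s3.format.bytearray.ByteArrayFormat
# number of records per backup object
flush.size=1000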

Metrics

Metrics are exported using the Prometheus JMX Exporter, which is configured to expose metrics in Prometheus format at /metrics.

The following environment variables can be used to configure the exporter:

  • METRICS_ENABLED: Enable or disable the metrics exporter. Default: true
  • METRICS_PORT: The port the metrics are exposed on. Default: 9876
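
With the defaults above, and assuming the port is reachable from the host, the endpoint can be verified with a plain HTTP request:

curl http://localhost:9876/metrics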

Kbackup CLI

The Docker image also contains a CLI for manual restore operations. The CLI can be run using the kbackup command. kbackup generally assumes that the topics it writes to either exist or are auto-created; it doesn't create any topics.

Usage

The CLI can be run using Gradle:

./gradlew :cli:run --args="kbackup <subcommand> <args>..."

or on container startup with docker:

docker run --network host lovelysystems/lovely-kafka-backup:dev kbackup <subcommand> <args>...

NOTE: --network host is required if Kafka is also running locally in a container

or kubectl:

kubectl run my-cli-container --rm -i --image=lovelysystems/lovely-kafka-backup:dev -- kbackup <subcommand> <args>...

or within a standalone container:

docker run --rm -it --entrypoint /bin/bash lovelysystems/lovely-kafka-backup:dev
kbackup <subcommand>

or from within a running container:

docker exec -it <container> /bin/bash
kbackup <subcommand>

Subcommands

Restore

To restore records from a backup, run the restore subcommand. It reads backed-up records from S3 and appends them to the target topics. Record offsets are not restored.

Demo call

kbackup restore --bucket user-devices --s3Endpoint http://localhost:9000 --bootstrapServers localhost:9092

This command restores all the backed-up records in the bucket user-devices on the S3 storage hosted at http://localhost:9000 to their original topics.

All options:

| Option name | Short option | Required | Format | Description |
| --- | --- | --- | --- | --- |
| bucket | b | always | String | Bucket in which the backup is stored |
| prefix | p | | String | Limits the restore to S3 objects that begin with the specified prefix. |
| key | k | | Regex | Filters S3 objects by the specified key pattern. |
| partition | | | Number | Partition of the source topics to restore. |
| from | | | Number | Start Kafka offset of objects to restore. If not set, records from the earliest available offset are restored. |
| to | | | Number | End Kafka offset of records to restore (exclusive). If not set, records up to the latest available offset are restored. |
| s3Endpoint | | if not restoring from AWS | URL | Endpoint for the S3 backup storage |
| bootstrapServers | | if the env var KAFKA_BOOTSTRAP_SERVERS is not set | (list of) URLs | Kafka cluster to restore the backup to |
| topic | t | | String | The target topic to restore records to. If not set, records are restored to their original topic names. |
| profile | | | String | Profile to use for S3 access. If not set, the AWS_PROFILE environment variable or the default profile is used. |
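
As a sketch of a more selective restore combining the options above (the prefix and target topic values are placeholders, since the backup layout depends on the connector configuration):

kbackup restore \
  --bucket user-devices \
  --prefix topics/user-devices \
  --partition 0 \
  --from 1000 --to 2000 \
  --topic user-devices-restored \
  --s3Endpoint http://localhost:9000 \
  --bootstrapServers localhost:9092

This restores only the records with offsets 1000 to 1999 of partition 0 into the topic user-devices-restored instead of the original topic.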

S3 Config

S3 can be configured using the environment variables AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY, or by setting a profile with the --profile parameter. The profile takes priority.

NOTE: configuration via profile is mostly useful for development and running the CLI via gradle :cli:run. To use it in Docker, the config files from ~/.aws need to be mounted into the container.
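
For example, a profile from the host can be made available to the containerized CLI like this (assuming the process in the image runs as root, so the config is mounted to /root/.aws; my-profile is a placeholder):

docker run --rm -it --network host -v ~/.aws:/root/.aws:ro lovelysystems/lovely-kafka-backup:dev kbackup restore --bucket user-devices --profile my-profile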

KafkaConfig

Additional Kafka configs can be set via environment variables prefixed with KAFKA_. If a corresponding argument is passed, the argument takes priority.
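
For example, the bootstrap servers can be provided via the environment instead of the --bootstrapServers argument (KAFKA_BOOTSTRAP_SERVERS is the variable documented above; whether other client settings follow the same naming scheme is an assumption):

export KAFKA_BOOTSTRAP_SERVERS=localhost:9092
kbackup restore --bucket user-devices --s3Endpoint http://localhost:9000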

Repair

Repairs corrupted records with records from the backup:

Demo call

kbackup repair --bucket s3-backup --data-directory kafka-data

This call checks the Kafka data in kafka-data and repairs any corrupted records with the backed-up records in s3-backup.

All options:

| Option name | Short | Required | Description |
| --- | --- | --- | --- |
| bucket | b | always | Backup bucket to repair from |
| data-directory | | always | Data directory that contains the Kafka data |
| filter | f | | Glob pattern to filter log directories to repair. Defaults to all. |
| skipBroken | s | | Flag. If set, records that aren't in the backup are skipped. |
| repair | r | | Flag. If set, the files are repaired; otherwise they are just listed. |
| s3Endpoint | | | URL to S3. If not given, defaults to AWS S3. |
| profile | | | The profile to use for S3 access |
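
A typical workflow is to first list the affected files and only then apply the fixes, e.g. restricted to one topic's log directories (the glob value is a placeholder):

kbackup repair --bucket s3-backup --data-directory kafka-data --filter 'user-devices-*'
kbackup repair --bucket s3-backup --data-directory kafka-data --filter 'user-devices-*' --repair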

S3 Config

S3 can be configured using the environment variables AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY, or by setting a profile with the --profile parameter. The profile takes priority.

NOTE: configuration via profile is mostly useful for development and running the CLI via gradle :cli:run. To use it in Docker, the config files from ~/.aws need to be mounted into the container, as shown in the S3 Config section above.

Run Tests

Tests can be run using Gradle:

./gradlew test

Publish a new version

Use Gradle to build and publish a new version of the Docker image. The version is read from the CHANGES.md file.

./gradlew createTag
./gradlew pushDockerImage

Publish the lovely-kafka-format library to GitHub Packages:

./gradlew publish