Reliability is not just a matter of specific Kafka features. An entire reliable system needs to be built, including the application architecture, the way applications use the producer and consumer APIs, producer and consumer configuration, topic configuration, and broker configuration. Making the system more reliable always involves trade-offs in application complexity, performance, availability, or disk-space usage. By understanding all the options and common patterns, as well as the requirements of each use case, more informed decisions can be made about how reliable the application and Kafka deployment need to be and which trade-offs make sense.
Guarantees provided by Kafka:
- Kafka provides an ordering guarantee for messages within a partition. If message B was written after message A, using the same producer in the same partition, then Kafka guarantees that the offset of message B will be higher than that of message A, and that consumers will read message B after message A.
- Produced messages are considered “committed” once they have been written to the partition on all of its in-sync replicas (but not necessarily flushed to disk). Producers can choose to receive acknowledgments of sent messages when the message is fully committed, when it is written to the leader, or when it is sent over the network.
- Messages that are committed will not be lost as long as at least one replica remains alive.
- Consumers can only read messages that are committed.
These guarantees can then be "tweaked" by changing Kafka's configuration, depending on the business needs and trade-offs made taking into account other important considerations, such as availability, high throughput, low latency, and hardware costs.
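As a concrete illustration of the acknowledgment levels above, here is a minimal Java producer sketch. The broker address, topic name, and record contents are placeholders; the `acks` line is the only point of the example:

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class AcksExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder broker
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

        // acks=0   -> acknowledged as soon as the message is sent over the network
        // acks=1   -> acknowledged once the leader has written the message
        // acks=all -> acknowledged only once all in-sync replicas have the message ("committed")
        props.put(ProducerConfig.ACKS_CONFIG, "all");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("example-topic", "key", "value"));
        }
    }
}
```

With `acks=all` (and a matching `min.insync.replicas` on the topic), a successful send means the message is committed in the sense described above.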
If we allow out-of-sync replicas to become leaders, we risk data loss and inconsistencies. If we don’t allow them to become leaders, we face lower availability as we must wait for the original leader to become available before the partition is back online.
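This trade-off is governed by the `unclean.leader.election.enable` setting (and, for the committed-write guarantee, `min.insync.replicas`). The sketch below assumes the Java AdminClient and an illustrative topic named `example-topic`; the values shown are examples, not recommendations:

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.AlterConfigOp;
import org.apache.kafka.clients.admin.ConfigEntry;
import org.apache.kafka.common.config.ConfigResource;

public class UncleanElectionConfig {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder broker

        try (Admin admin = Admin.create(props)) {
            ConfigResource topic = new ConfigResource(ConfigResource.Type.TOPIC, "example-topic");
            Collection<AlterConfigOp> ops = Arrays.asList(
                // Favor consistency: never elect an out-of-sync replica as leader
                new AlterConfigOp(new ConfigEntry("unclean.leader.election.enable", "false"),
                                  AlterConfigOp.OpType.SET),
                // Require at least two in-sync replicas before a write counts as committed
                new AlterConfigOp(new ConfigEntry("min.insync.replicas", "2"),
                                  AlterConfigOp.OpType.SET));
            admin.incrementalAlterConfigs(Collections.singletonMap(topic, ops)).all().get();
        }
    }
}
```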
There are two important things that everyone who writes applications that produce to Kafka must pay attention to:
- Use the correct `acks` configuration to match reliability requirements
- Handle errors correctly, both in configuration and in code (see the sketch after this list)
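The following Java sketch combines both points; the retry-related values and topic name are illustrative, not prescriptive. The client retries transient, retriable errors on its own, while non-retriable errors, and retriable errors that persist past the delivery timeout, surface in the send callback and must be handled by the application:

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.errors.RetriableException;
import org.apache.kafka.common.serialization.StringSerializer;

public class ReliableProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder broker
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

        // Configuration side: wait for a full commit and let the client retry transient failures
        props.put(ProducerConfig.ACKS_CONFIG, "all");
        props.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, "true");       // avoid duplicates on retry
        props.put(ProducerConfig.DELIVERY_TIMEOUT_MS_CONFIG, "120000");    // illustrative value

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("example-topic", "key", "value"),
                (metadata, exception) -> {
                    if (exception == null) {
                        return; // acknowledged by all in-sync replicas
                    }
                    if (exception instanceof RetriableException) {
                        // The client already retried until the delivery timeout expired;
                        // decide whether to resend, park the message, or alert.
                        System.err.println("Retriable error persisted: " + exception);
                    } else {
                        // Code side: non-retriable errors (e.g. message too large) need
                        // application-level handling, such as dead-lettering the record.
                        System.err.println("Non-retriable error: " + exception);
                    }
                });
        }
    }
}
```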
Some failure conditions under which integration tests can be run:
- Clients lose connectivity to one of the brokers
- High latency between client and broker
- Disk full
- Hanging disk (also called “brown out”)
- Leader election
- Rolling restart of brokers
- Rolling restart of consumers
- Rolling restart of producers
Trogdor, the test framework for Apache Kafka, can be used to simulate system faults.
Burrow, the Kafka consumer lag checking tool, can be useful for monitoring consumer lag.