This document covers the fundamentals of Kafka, including producers, consumers, consumer groups, and common configuration properties, with both theoretical explanations and coding examples.
Apache Kafka is a distributed streaming platform used for building real-time data pipelines and streaming applications.
It provides:
- Publish/Subscribe messaging.
- Scalability through topic partitions, and durability through records persisted to disk.
- Fault Tolerance via replication.
- Stream Processing with Kafka Streams or external frameworks.

Topics:
- A logical channel where records are published.
- Split into partitions for parallelism and scalability.
- Each partition is an ordered, immutable log of records.
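
The partitioned-log idea can be sketched with a small in-memory model. This is only an illustration, not Kafka code: `TopicSketch`, `partitionFor`, `append`, and `read` are hypothetical names, and `String.hashCode` stands in for the hash Kafka's default partitioner actually uses for keyed records (murmur2).

```java
import java.util.ArrayList;
import java.util.List;

/** Toy in-memory model of a topic: a fixed number of append-only partition logs. */
final class TopicSketch {
    private final List<List<String>> partitions = new ArrayList<>();

    TopicSketch(int numPartitions) {
        for (int i = 0; i < numPartitions; i++) {
            partitions.add(new ArrayList<>());
        }
    }

    /** Maps a key to a partition. String.hashCode is a stand-in, not Kafka's real hash. */
    int partitionFor(String key) {
        return Math.floorMod(key.hashCode(), partitions.size());
    }

    /** Appends a record to the key's partition and returns its offset in that partition. */
    long append(String key, String value) {
        List<String> log = partitions.get(partitionFor(key));
        log.add(value);
        return log.size() - 1;
    }

    /** Reads the record stored at the given offset of a partition. */
    String read(int partition, long offset) {
        return partitions.get(partition).get((int) offset);
    }

    public static void main(String[] args) {
        TopicSketch topic = new TopicSketch(3);
        long first = topic.append("user-1", "login");
        long second = topic.append("user-1", "logout");
        int p = topic.partitionFor("user-1");
        // Records with the same key share a partition, so their relative order is kept.
        System.out.println("partition=" + p + " offsets=" + first + "," + second);
        System.out.println(topic.read(p, 0) + " then " + topic.read(p, 1));
    }
}
```

Because both records share a key, they land in the same partition and are read back in the order they were written. Ordering across different partitions is not modeled here.
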

Producers:
- Applications that publish (write) data to Kafka topics.
- Can control durability, ordering, and delivery guarantees via configs.

Consumers:
- Applications that subscribe to (read) data from Kafka topics.
- Can be part of a consumer group for parallel consumption.

Consumer groups:
- A set of consumers that share the work of reading from topic partitions.
- Each partition is consumed by only one consumer in the group at a time.
- Provides scalability and fault tolerance.

Common producer configuration properties:

| Property | Description | Example |
|---|---|---|
| `bootstrap.servers` | List of Kafka brokers | `localhost:9092` |
| `key.serializer` | Serializer class for keys | `StringSerializer` |
| `value.serializer` | Serializer class for values | `StringSerializer` |
| `acks` | Level of acknowledgment | `all` (leader + ISR must ack) |
| `retries` | Number of retries if request fails | `3` |
| `linger.ms` | Delay before sending batch (ms) | `5` |
| `batch.size` | Batch size in bytes | `16384` |
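
As a rough sketch, the properties in the table could be collected into a `java.util.Properties` object like this. The class and method names (`ProducerConfigSketch`, `producerProps`) are hypothetical, and plain string keys are used so the snippet compiles without the Kafka client library on the classpath.

```java
import java.util.Properties;

/** Builds the producer settings from the table using plain string keys. */
final class ProducerConfigSketch {

    static Properties producerProps(String bootstrapServers) {
        Properties props = new Properties();
        props.put("bootstrap.servers", bootstrapServers);
        // Fully qualified names of the serializer classes shipped with kafka-clients.
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("acks", "all");         // wait for the leader and all in-sync replicas
        props.put("retries", "3");        // retry a failed send up to 3 times
        props.put("linger.ms", "5");      // wait up to 5 ms to fill a batch
        props.put("batch.size", "16384"); // batch size in bytes
        return props;
    }

    public static void main(String[] args) {
        Properties props = producerProps("localhost:9092");
        props.forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

With the `kafka-clients` dependency available, an object like this would typically be passed to `new KafkaProducer<String, String>(props)`. That step is left out here so the example stays dependency-free.
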
🔑 Example: ACKS_CONFIG
props.put(ProducerConfig.ACKS_CONFIG, "all");
// "all" means leader + all in-sync replicas must acknowledge
// Provides the strongest durability guarantees

Consumer group example:
- Suppose a topic has 4 partitions.
- Group A with 2 consumers → each consumer will consume from 2 partitions.
- Group B with 4 consumers → each consumer will consume from 1 partition.
- If a consumer dies, Kafka will rebalance partitions among the remaining consumers.
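
The 4-partition scenario above can be reproduced with a small simulation. It uses a plain round-robin split as a stand-in; Kafka's built-in assignors (range, round-robin, sticky, cooperative sticky) are more involved. The names `GroupAssignmentSketch` and `assign`, and the consumer ids, are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Simulates how partitions are shared among the consumers of one group. */
final class GroupAssignmentSketch {

    /** Spreads partitions 0..numPartitions-1 over the consumers in round-robin order. */
    static Map<String, List<Integer>> assign(int numPartitions, List<String> consumers) {
        Map<String, List<Integer>> assignment = new LinkedHashMap<>();
        for (String consumer : consumers) {
            assignment.put(consumer, new ArrayList<>());
        }
        if (consumers.isEmpty()) {
            return assignment; // nobody in the group, nothing to assign
        }
        for (int p = 0; p < numPartitions; p++) {
            // Each partition gets exactly one owner within the group.
            String owner = consumers.get(p % consumers.size());
            assignment.get(owner).add(p);
        }
        return assignment;
    }

    public static void main(String[] args) {
        // Group A: 2 consumers, 2 partitions each.
        System.out.println(assign(4, List.of("a1", "a2")));             // {a1=[0, 2], a2=[1, 3]}
        // Group B: 4 consumers, 1 partition each.
        System.out.println(assign(4, List.of("b1", "b2", "b3", "b4"))); // {b1=[0], b2=[1], b3=[2], b4=[3]}
        // Group A after a2 dies: a rebalance hands all partitions to a1.
        System.out.println(assign(4, List.of("a1")));                   // {a1=[0, 1, 2, 3]}
    }
}
```

Re-running `assign` with the surviving members is a simplified picture of a rebalance: every partition still has exactly one owner in the group, but the owners change.
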

Common CLI commands:
- Produce messages: `kafka-console-producer.sh --bootstrap-server localhost:9092 --topic test-topic` (Kafka releases older than 2.5 use `--broker-list` instead of `--bootstrap-server`)
- Consume messages: `kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic test-topic --from-beginning`
- List topics: `kafka-topics.sh --list --bootstrap-server localhost:9092`
- Describe a topic: `kafka-topics.sh --describe --topic test-topic --bootstrap-server localhost:9092`