Kafka's Log, Partitions, and Consumer Groups
Kafka’s Log, Partitions, and Consumer Groups
The Black Box
The logistics platform’s tracking dashboard consumer group has 3 consumers reading from a topic with 12 partitions. Consumer lag is normally under 100 messages. Periodically, lag spikes to 50,000 messages and takes 3 minutes to recover. The application code has not changed. The message rate has not changed. The cause is a consumer group rebalance.
The Mechanism
Offset Management
Each consumer in a group tracks its position in each assigned partition via an offset: the index of the next message to read. Offsets are stored in Kafka’s internal __consumer_offsets topic.
// Concept: manual offset commit vs auto-commit
// Auto-commit (default): offsets are committed every 5 seconds regardless
// of whether the message was successfully processed.
// If the consumer crashes between commit and processing, the message is skipped.
// Manual commit: offset is committed after successful processing.
Properties props = new Properties();
props.put("enable.auto.commit", "false"); // Disable auto-commit
KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
consumer.subscribe(List.of("package-events"));
while (true) {
ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(100));
for (ConsumerRecord<String, String> record : records) {
processPackageEvent(record.value()); // Process first
}
consumer.commitSync(); // Commit after processing
// If the consumer crashes before commitSync(), the uncommitted messages
// are re-delivered on the next poll. At-least-once delivery.
}
The tradeoff: auto-commit risks skipping messages. Manual commit risks re-processing messages. For the logistics platform’s tracking dashboard, re-processing a status update is harmless (idempotent). Skipping a status update causes the dashboard to show stale data. Manual commit is the correct choice.
Rebalancing
Rebalancing redistributes partitions among consumers when the group membership changes: a consumer joins, leaves, or is considered dead (missed heartbeat).
Eager rebalancing (default in older Kafka versions): all consumers release all partitions, then all partitions are reassigned. During this window, no consumer processes any messages. For a group with 12 partitions, the entire group pauses.
Cooperative rebalancing (available since Kafka 2.4): only the affected partitions are revoked and reassigned. Consumers continue processing their unaffected partitions.
// Concept: cooperative rebalancing to minimize processing gaps
Properties props = new Properties();
props.put("partition.assignment.strategy",
"org.apache.kafka.clients.consumer.CooperativeStickyAssignor");
// With CooperativeStickyAssignor:
// 1. Consumer C3 crashes.
// 2. Kafka detects C3's missed heartbeat (session.timeout.ms, default 45s).
// 3. Only C3's 4 partitions are revoked.
// 4. C1 and C2 continue processing their existing partitions without interruption.
// 5. C3's 4 partitions are distributed to C1 and C2.
// Processing gap: only for C3's partitions, not the entire group.
Consumer Lag Diagnosis
When lag grows on specific partitions, the consumer assigned to those partitions is the bottleneck.
# Concept: identifying the slow consumer
kafka-consumer-groups.sh --describe --group tracking-dashboard
# GROUP TOPIC PARTITION CURRENT-OFFSET LOG-END-OFFSET LAG CONSUMER-ID
# tracking-dashboard package-events 0 4521890 4521895 5 consumer-1
# tracking-dashboard package-events 1 3892100 3892100 0 consumer-1
# tracking-dashboard package-events 2 5102340 5102380 40 consumer-1
# tracking-dashboard package-events 3 2891000 2891000 0 consumer-1
# tracking-dashboard package-events 4 6201000 6201000 0 consumer-2
# ...
# tracking-dashboard package-events 8 3100000 3150000 50000 consumer-3
# tracking-dashboard package-events 9 2900000 2950200 50200 consumer-3
# consumer-3 is lagging on partitions 8 and 9 by 50,000+ messages.
# Possible causes:
# 1. Slow processing (e.g., a slow downstream database write)
# 2. GC pauses on consumer-3's JVM
# 3. Network issues between consumer-3 and Kafka
# 4. Uneven partition sizing (partitions 8 and 9 have higher message rates)
The Observable Consequence
The 50,000-message lag spike in the logistics platform was caused by eager rebalancing. When consumer-3 was restarted for a deployment:
- consumer-3 leaves the group. Kafka revokes all 12 partitions from all consumers.
- All consumers stop processing. The processing gap begins.
- Kafka reassigns 12 partitions to consumer-1 and consumer-2.
- consumer-1 and consumer-2 resume processing, now handling 6 partitions each instead of 4.
- During the gap (approximately 45 seconds: session timeout + rebalance), messages accumulated.
- With 1,000 messages/second, the 45-second gap produces 45,000 messages of lag.
Switching to CooperativeStickyAssignor reduces the gap to approximately 5 seconds (only consumer-3’s partitions pause), reducing the lag spike to 5,000 messages.
The Decision Rule
Use manual offset commit when message loss is unacceptable. Use auto-commit only for purely informational consumers where skipping a message has no business impact.
Use CooperativeStickyAssignor for all consumer groups. The only reason to use the default RangeAssignor is if you are running a Kafka version older than 2.4, in which case the upgrade is the fix.
Match the number of partitions to the maximum parallelism you need. 12 partitions allow up to 12 consumers in a group. More partitions than consumers means some consumers read multiple partitions. Fewer partitions than consumers means some consumers sit idle. For the logistics platform with 3 dashboard consumers, 12 partitions provide room to scale to 12 consumers without repartitioning.