reactive microservices architecture transactional outbox and event sourcing with java 21

Idempotency and Exactly-Once Semantics

Chapter 8 of 10
Summary

Idempotent consumers and Kafka's exactly-once semantics make the effect of each message occur once, even when producers retry or consumers restart.

Introduction to Idempotency and Exactly-Once Semantics

In distributed systems, idempotency is the property that processing a message several times has the same effect as processing it once. It underpins exactly-once semantics (EOS), a delivery guarantee under which the effect of a message on the target state occurs precisely once, even if a producer retries or a consumer restarts. In Kafka, EOS depends on several cooperating pieces: transactional IDs on producers, the transaction coordinator on the broker side, and the right consumer settings.

Definitions and Key Concepts

The rest of this chapter relies on three terms:

  • Idempotent Consumer: A message processing pattern where the outcome of processing the same message multiple times is identical to processing it once.
  • Exactly-Once Semantics (EOS): A delivery guarantee ensuring that the effect of a message on the target state occurs precisely once, even in the face of retries or restarts.
  • Transactional ID: A unique, stable identifier configured on a Kafka producer (transactional.id). It lets the transaction coordinator recognize the same logical producer across sessions and manage its transaction state when the producer restarts.
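To make the role of the transactional ID more concrete, the toy model below shows one way a coordinator can tell a restarted producer from its earlier session: it keeps an epoch per transactional ID and accepts commits only from the newest one. This is a sketch of the idea, not Kafka's implementation; ToyCoordinator, initProducer, and commit are hypothetical names invented for the illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of epoch-based fencing. ToyCoordinator is hypothetical, not a Kafka class.
class ToyCoordinator {
    // Latest epoch handed out for each transactional ID.
    private final Map<String, Integer> currentEpoch = new HashMap<>();

    // Called when a producer session starts: bump and return the epoch for this ID.
    int initProducer(String transactionalId) {
        return currentEpoch.merge(transactionalId, 1, Integer::sum);
    }

    // A commit is accepted only from the newest session for that transactional ID.
    boolean commit(String transactionalId, int epoch) {
        return currentEpoch.getOrDefault(transactionalId, 0) == epoch;
    }

    public static void main(String[] args) {
        ToyCoordinator coordinator = new ToyCoordinator();
        int oldSession = coordinator.initProducer("order-service-tx-0");
        int newSession = coordinator.initProducer("order-service-tx-0"); // restart after a crash
        System.out.println("old session commit accepted: "
                + coordinator.commit("order-service-tx-0", oldSession));
        System.out.println("new session commit accepted: "
                + coordinator.commit("order-service-tx-0", newSession));
    }
}
```

The old session is rejected because its epoch is stale, which is why the transactional ID has to stay the same across restarts of one logical producer.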

Achieving Exactly-Once Semantics in Kafka

Kafka introduced EOS in version 0.11 (KIP-98) and later improved its performance and scalability with KIP-447, which Kafka Streams exposes as the exactly_once_v2 processing guarantee. To achieve EOS, producers must be configured with a transactional ID, and consumers must set isolation.level to read_committed. With that setting, consumers see only committed data and never process messages that belong to an aborted transaction. In a consume-transform-produce loop, the consumed offsets also have to be committed as part of the producer's transaction, so that output records and input progress succeed or fail together.
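As a minimal sketch of the client settings just described, the following builds the producer and consumer properties with plain java.util.Properties and the standard Kafka configuration keys. The broker address, the transactional ID, and the group ID are placeholder assumptions, and serializer settings are omitted for brevity.

```java
import java.util.Properties;

// Sketch of EOS-related client settings. All values are placeholders, not taken from the text.
class EosClientConfig {

    static Properties producerProps(String transactionalId) {
        Properties p = new Properties();
        p.put("bootstrap.servers", "localhost:9092"); // assumed broker address
        p.put("transactional.id", transactionalId);   // keep stable across restarts of this producer
        p.put("enable.idempotence", "true");          // required when transactional.id is set
        p.put("acks", "all");                         // required by idempotence
        return p;
    }

    static Properties consumerProps(String groupId) {
        Properties c = new Properties();
        c.put("bootstrap.servers", "localhost:9092"); // assumed broker address
        c.put("group.id", groupId);
        c.put("isolation.level", "read_committed");   // hide records from open or aborted transactions
        c.put("enable.auto.commit", "false");         // offsets are committed through the transaction
        return c;
    }

    public static void main(String[] args) {
        System.out.println(producerProps("order-service-tx-0"));
        System.out.println(consumerProps("order-service"));
    }
}
```

These Properties objects would be passed to the KafkaProducer and KafkaConsumer constructors together with the serializer and deserializer settings of the application.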

Implementation of Idempotent Consumer Pattern

The idempotent consumer pattern can be implemented with a processed events log. Before executing the business logic, the consumer checks whether the event ID is already in the log. If it is, the event is skipped; otherwise the business logic runs and the event ID is recorded. The guarantee holds only when the state change and the log entry are written in the same database transaction: if the consumer then crashes after executing the business logic but before committing the offset, the redelivered event is found in the log and is not applied a second time.

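// One database transaction covers the duplicate check, the state change, and the log entry
@Transactional
public void handleEvent(EventRecord event) {
    // Skip events that are already recorded in the processed events log
    if (processedEventRepository.existsById(event.id())) {
        log.info("Duplicate event detected: {}", event.id());
        return;
    }

    // Execute business logic
    updateBusinessState(event.payload());

    // Record the event in the same transaction as the state change,
    // so both commit together or roll back together
    processedEventRepository.save(new ProcessedEvent(event.id(), "ORDER_SERVICE"));
}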
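The effect of the pattern can be tried out without a broker or database. The sketch below replaces the processed events log and the business state with in-memory collections and redelivers one event, as would happen if the offset commit were lost after processing. The class, the Event record, and the account names are illustrative assumptions; the synchronized method stands in for the single database transaction.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// In-memory stand-in for the idempotent consumer; all names are illustrative.
class IdempotentConsumerDemo {
    record Event(String id, String account, long amount) {}

    private final Set<String> processedIds = new HashSet<>();   // stands in for the processed events log
    private final Map<String, Long> balances = new HashMap<>(); // stands in for the business state

    // Returns true if the event was applied, false if it was a duplicate.
    // synchronized plays the role of the single database transaction here.
    synchronized boolean handle(Event event) {
        // Set.add returns false when the ID is already present.
        if (!processedIds.add(event.id())) {
            return false;
        }
        balances.merge(event.account(), event.amount(), Long::sum);
        return true;
    }

    synchronized long balance(String account) {
        return balances.getOrDefault(account, 0L);
    }

    public static void main(String[] args) {
        IdempotentConsumerDemo consumer = new IdempotentConsumerDemo();
        Event e1 = new Event("evt-1", "acc-42", 100);
        Event e2 = new Event("evt-2", "acc-42", 50);
        // evt-1 is delivered twice, simulating a redelivery after a lost offset commit.
        for (Event e : List.of(e1, e1, e2)) {
            System.out.println(e.id() + " applied=" + consumer.handle(e));
        }
        System.out.println("balance=" + consumer.balance("acc-42")); // 150, not 250
    }
}
```

Recording the ID and testing for a duplicate in one step (Set.add here) avoids a gap between check and write. In a real database a unique key on the event ID is a common way to get the same effect, though that is a design choice rather than something the snippet above prescribes.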

Configuration Comparison for Kafka Consumer/Producer Settings

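The following table compares the settings used for at-least-once and exactly-once semantics in Kafka. Note that processing.guarantee is a Kafka Streams setting, while the other rows are plain producer and consumer settings: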

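| Parameter            | At-Least-Once    | Exactly-Once (v2) |
|----------------------|------------------|-------------------|
| enable.idempotence   | true/false       | true (forced)     |
| isolation.level      | read_uncommitted | read_committed    |
| processing.guarantee | at_least_once    | exactly_once_v2   |
| transactional.id     | N/A              | Must be set       |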

Performance Impact

Exactly-once semantics in Kafka come with a performance overhead: each transaction needs extra coordination with the transaction coordinator and writes commit or abort markers to the log, and read_committed consumers cannot see records until the transaction that wrote them has completed. This is the price of consistency, so it is worth paying where duplicate effects would corrupt state. On the consumer side, Java 21 virtual threads can improve the throughput of idempotent consumers that perform blocking I/O against the processed events log.
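The virtual-thread idea can be sketched as follows, assuming Java 21. Each event is handled on its own virtual thread, Thread.sleep stands in for the blocking lookup in the processed events log, and a concurrent set performs the deduplication. The class and method names are hypothetical and the timing value is arbitrary.

```java
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch: one virtual thread per event, with blocking I/O simulated by sleep.
class VirtualThreadConsumerDemo {
    private final Set<String> processedIds = ConcurrentHashMap.newKeySet(); // thread-safe dedup set
    private final AtomicInteger applied = new AtomicInteger();

    void handle(String eventId) {
        try {
            Thread.sleep(20); // simulated blocking I/O on the processed events log
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
            return;
        }
        // add() is atomic, so only one of several concurrent duplicates is applied.
        if (processedIds.add(eventId)) {
            applied.incrementAndGet();
        }
    }

    // Handles all events concurrently and returns how many were applied so far.
    int processAll(List<String> eventIds) {
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (String id : eventIds) {
                executor.submit(() -> handle(id));
            }
        } // close() waits for all submitted tasks to finish
        return applied.get();
    }

    public static void main(String[] args) {
        VirtualThreadConsumerDemo demo = new VirtualThreadConsumerDemo();
        int count = demo.processAll(List.of("evt-1", "evt-2", "evt-1", "evt-3", "evt-2"));
        System.out.println("applied=" + count); // 3 distinct events
    }
}
```

Handling events concurrently gives up the processing order within a batch, so a sketch like this fits only workloads where ordering between events does not matter or is restored elsewhere, for example by grouping events per key.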
