Skip to main content
reactive microservices architecture transactional outbox and event sourcing with java 21

The Fallacy of Distributed Transactions

3 min read Chapter 2 of 10
Summary

Distributed transactions face challenges in cloud-native environments. 2PC...

Distributed transactions face challenges in cloud-native environments. 2PC prioritizes consistency over availability, making it less suitable. Saga pattern and distributed locking offer alternatives.

The Fallacy of Distributed Transactions

Introduction to Distributed Transactions

Distributed transactions aim to ensure data consistency across multiple nodes in a distributed system. However, achieving this goal is fraught with challenges, particularly in high-throughput cloud-native environments. The Two-Phase Commit (2PC) protocol, a widely used approach for distributed transactions, prioritizes consistency over availability during failures, making it less suitable for modern distributed systems.

Limitations of Two-Phase Commit

The 2PC protocol requires a prepare phase and a commit or rollback phase, which can lead to significant latency, especially in Wide Area Networks (WANs) where round-trip latency can exceed 100-250 milliseconds [1]. Furthermore, the protocol’s blocking nature can result in ‘In-Doubt’ transactions, where a resource manager waits indefinitely if the coordinator fails after the prepare phase. This limitation makes 2PC impractical for global-scale applications.

Alternative Approaches: Saga Pattern and Distributed Locking

The Saga pattern offers an alternative approach, optimizing for availability and partition tolerance by utilizing asynchronous message-driven architecture. This pattern consists of a sequence of local transactions, where each step updates a service and publishes a message or event to trigger the next step. If a step fails, the saga executes compensating transactions to maintain data consistency. Distributed locking mechanisms, such as those using fencing tokens, can also be employed to coordinate access to shared resources, ensuring that only one process can modify the data at a time.

Performance Impact of Distributed Transactions

The choice of distributed transaction protocol significantly impacts the performance of a system. As shown in the following table, local locks offer low latency and high throughput, while distributed locks using Redis or Etcd introduce higher latency and lower throughput. The 2PC protocol, with its high latency and low throughput, is the least performant option.

MetricLocal LockDistributed (Redis/Etcd)2PC (XA)
Latency<1ms5-50ms100ms+ (WAN)
ThroughputVery HighMediumLow
ResilienceLowHighVery Low
ComplexityLowMediumHigh

Implementing Saga Pattern in Java 21

To implement the Saga pattern in Java 21, we can utilize the record feature to define a Saga state management class. The following code example demonstrates a simplified Java record for Saga state management using compensating logic.

public record OrderSaga(String orderId, List<Task> steps) {
    public void execute() {
        try {
            for (Task step : steps) step.process(); 
        } catch (Exception e) {
            for (int i = currentStep; i >= 0; i--) steps.get(i).compensate();
        }
    }
}

Conclusion

In conclusion, while distributed transactions are essential for maintaining data consistency in distributed systems, the traditional 2PC protocol is no longer suitable for high-throughput cloud-native environments. Alternative approaches, such as the Saga pattern and distributed locking, offer better performance and availability. By understanding the limitations and trade-offs of each approach, developers can design more efficient and scalable distributed systems.

Sources

[1] https://bool.dev/blog/detail/part-9-microservices-dotnet-interview-questions [2] Gilbert, S. and Lynch, N. (2002). ‘Brewer’s conjecture and the feasibility of consistent, available, partition-tolerant web services’. ACM SIGACT News. [3] https://www.mo4tech.com/take-you-through-a-distributed-transaction-tcc-with-go-a-nanny-level-tutorial.html