How Fiserv Optimized Payment Throughput by 25% Using Apache Kafka


These articles are AI-generated summaries. Please check the original sources for full details.

How We Improved Payment System Throughput by 25% Using Apache Kafka at a Fortune 500 FinTech

Java Backend Engineer Disha Sune details the architectural overhaul at Fiserv to handle millions of daily transactions for clients like Google and McDonald’s. By replacing synchronous REST calls with Apache Kafka, the team achieved a 25% throughput improvement and eliminated transaction loss.

Why This Matters

In high-volume financial systems, synchronous REST architectures couple services tightly: a single service failure can cause cascading timeouts across the entire transaction chain. An event-driven model replaces these fragile point-to-point connections with durable message streams, so services such as settlement and reporting can scale independently. The design assumes that distributed systems will fail. Because Kafka persists messages and, with manual offset commits, redelivers any event whose offset was never committed, a financial event is not lost during transient downstream outages or peak load spikes; it is processed once the consumer recovers.

Key Insights

  • Fiserv processed millions of daily transactions for 600+ enterprise clients using a legacy REST architecture that suffered from cascading failures.
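  • Setting ACKS_CONFIG to ‘all’ and ENABLE_IDEMPOTENCE_CONFIG to true makes the producer wait for every in-sync replica to acknowledge a write and prevents duplicate records caused by producer retries. Duplicate processing on the consumer side still has to be handled by idempotent consumers.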
  • Using transactionId as the message key routes all events for a specific transaction to the same Kafka partition, so they are processed in order, as long as the topic's partition count does not change.
  • The Dead Letter Queue (DLQ) pattern allows automatic retry of transient failures and manual review of non-recoverable errors, so failed events are retained for follow-up rather than dropped.
  • Monitoring consumer lag is critical; high lag in a payment system indicates delayed transactions, requiring real-time alerts when lag exceeds specific thresholds.
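
The lag check in the last point can be sketched as follows. This is a minimal illustration, not the article's monitoring setup: the class and method names are hypothetical, and the offsets are assumed to have been fetched elsewhere (for example from broker metrics). Lag per partition is taken as the log end offset minus the last committed offset.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper; offsets are assumed to be fetched elsewhere.
final class ConsumerLagSketch {

    // Lag for a partition = log end offset - last committed offset.
    // Assumption: a partition with no committed offset counts from offset 0.
    static Map<Integer, Long> computeLag(Map<Integer, Long> endOffsets,
                                         Map<Integer, Long> committedOffsets) {
        Map<Integer, Long> lag = new HashMap<>();
        for (Map.Entry<Integer, Long> e : endOffsets.entrySet()) {
            long committed = committedOffsets.getOrDefault(e.getKey(), 0L);
            lag.put(e.getKey(), Math.max(0L, e.getValue() - committed));
        }
        return lag;
    }

    // True when the summed lag across partitions is above the alert threshold.
    static boolean exceedsThreshold(Map<Integer, Long> lag, long threshold) {
        long total = lag.values().stream().mapToLong(Long::longValue).sum();
        return total > threshold;
    }

    public static void main(String[] args) {
        Map<Integer, Long> end = Map.of(0, 1_500L, 1, 900L);
        Map<Integer, Long> committed = Map.of(0, 1_450L, 1, 900L);
        Map<Integer, Long> lag = computeLag(end, committed);
        System.out.println("lag per partition: " + lag);
        System.out.println("alert: " + exceedsThreshold(lag, 40L));
    }
}
```

The threshold itself depends on the system; for payments it would likely be derived from how long a transaction may acceptably wait.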

Working Examples

Producer configuration for high-durability financial transactions.

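import java.util.HashMap;
import java.util.Map;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.common.serialization.StringSerializer;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.kafka.core.DefaultKafkaProducerFactory;
import org.springframework.kafka.core.ProducerFactory;
import org.springframework.kafka.support.serializer.JsonSerializer;

@Configuration
public class KafkaProducerConfig {
    @Bean
    public ProducerFactory<String, PaymentEvent> producerFactory() {
        Map<String, Object> config = new HashMap<>();
        config.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        config.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
        config.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, JsonSerializer.class);
        // Wait for all in-sync replicas to acknowledge each write.
        config.put(ProducerConfig.ACKS_CONFIG, "all");
        config.put(ProducerConfig.RETRIES_CONFIG, 3);
        // Lets the broker discard duplicates created by producer retries.
        config.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, true);
        return new DefaultKafkaProducerFactory<>(config);
    }
}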

Asynchronous event publishing using transactionId as the partition key.

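// Fragment of a publisher class; kafkaTemplate, TOPIC, and log are fields of that class.
public void publishPaymentAuthorized(PaymentEvent event) {
    // Spring Kafka 3.x returns a CompletableFuture here; 2.x returned a ListenableFuture.
    // The transactionId is the record key, so it determines the target partition.
    CompletableFuture<SendResult<String, PaymentEvent>> future =
            kafkaTemplate.send(TOPIC, event.getTransactionId(), event);
    future.whenComplete((result, ex) -> {
        if (ex != null) {
            // The failure is only logged here; the caller is not notified.
            log.error("Failed to publish payment event: {}", event.getTransactionId(), ex);
        } else {
            log.info("Published to partition {} offset {}",
                    result.getRecordMetadata().partition(), result.getRecordMetadata().offset());
        }
    });
}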
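
Why keying by transactionId keeps a transaction's events together can be sketched without a broker. Kafka's default partitioner hashes the serialized key (with murmur2) and takes the result modulo the partition count; the sketch below substitutes `Arrays.hashCode` purely to stay dependency-free, so the partition numbers it prints will not match a real cluster. The class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Simplified illustration of key-based routing; not Kafka's actual partitioner.
final class KeyRoutingSketch {

    // Stand-in for Kafka's murmur2-based hashing of the serialized key.
    static int partitionFor(String transactionId, int numPartitions) {
        byte[] keyBytes = transactionId.getBytes(StandardCharsets.UTF_8);
        // floorMod keeps the result in [0, numPartitions) even for negative hashes.
        return Math.floorMod(Arrays.hashCode(keyBytes), numPartitions);
    }

    public static void main(String[] args) {
        int partitions = 6; // assumed partition count for the topic
        String[] events = {"AUTHORIZED", "CAPTURED", "SETTLED"};
        for (String event : events) {
            // Every event of the same transaction maps to the same partition.
            System.out.println(event + " for txn-1001 goes to partition "
                    + partitionFor("txn-1001", partitions));
        }
    }
}
```

Because the mapping depends on the partition count, adding partitions to a topic can send later events for an existing key to a different partition.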

Consumer implementation with manual acknowledgement and DLQ routing.

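// The Acknowledgment argument requires the listener container to use a manual ack mode
// (AckMode.MANUAL or MANUAL_IMMEDIATE) with consumer auto-commit disabled.
@KafkaListener(topics = "payment-events", groupId = "settlement-service")
public void handlePaymentEvent(PaymentEvent event, Acknowledgment acknowledgment) {
    try {
        settlementService.processSettlement(event);
        // Commit the offset only after settlement has finished.
        acknowledgment.acknowledge();
    } catch (RecoverableException ex) {
        // Transient failure: hand the event to the retry topic, then move on.
        // This is only safe if the publisher confirms the send before returning.
        retryPublisher.publishToRetryTopic(event);
        acknowledgment.acknowledge();
    } catch (Exception ex) {
        // Non-recoverable failure: park the event in the DLQ for manual review.
        deadLetterPublisher.publishToDeadLetterTopic(event);
        acknowledgment.acknowledge();
    }
}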
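
The listener above only controls commits if the consumer itself does not auto-commit. A minimal sketch of the matching consumer properties follows, using the standard Kafka consumer property names; the class name is hypothetical, and the `auto.offset.reset` value is an assumption rather than something the article specifies.

```java
import java.util.Properties;

// Hypothetical builder for consumer properties that leave commits to the application.
final class ManualCommitConsumerProps {

    static Properties build(String bootstrapServers, String groupId) {
        Properties props = new Properties();
        props.put("bootstrap.servers", bootstrapServers);
        props.put("group.id", groupId);
        // Offsets are committed only when the listener acknowledges.
        props.put("enable.auto.commit", "false");
        // Assumption: a new consumer group should start from the oldest retained event.
        props.put("auto.offset.reset", "earliest");
        return props;
    }

    public static void main(String[] args) {
        Properties props = build("localhost:9092", "settlement-service");
        props.forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

In a Spring Kafka application the listener container's ack mode also has to be set to manual for the `Acknowledgment` parameter to be injected.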

Practical Applications

  • Fiserv implementation: Decoupled authorization from settlement, reporting, and chargeback services to allow independent scaling.
  • Pitfall: Using auto-commit in consumers can mark a message as processed before logic finishes, leading to lost transactions.
  • Use Case: Implementing idempotent consumers with a ‘processed_transactions’ table to prevent double-processing caused by network retries.
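
The idempotent-consumer use case can be sketched as below. This is an illustration under stated assumptions: an in-memory set stands in for the ‘processed_transactions’ table, and the class and method names are hypothetical. In a real service the table would carry a unique constraint on the transaction ID, and the insert would share a database transaction with the settlement write.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: skips events whose transactionId was already handled.
final class IdempotentHandlerSketch {

    // In-memory stand-in for a 'processed_transactions' table keyed by transactionId.
    private final Set<String> processed = ConcurrentHashMap.newKeySet();

    // Returns true if the business logic ran, false if the event was a duplicate.
    boolean handleOnce(String transactionId, Runnable businessLogic) {
        // add() returns false when the ID is already present, i.e. a redelivery.
        if (!processed.add(transactionId)) {
            return false;
        }
        try {
            businessLogic.run();
            return true;
        } catch (RuntimeException ex) {
            // Forget the ID so a later redelivery can retry the work.
            processed.remove(transactionId);
            throw ex;
        }
    }

    public static void main(String[] args) {
        IdempotentHandlerSketch handler = new IdempotentHandlerSketch();
        Runnable settle = () -> System.out.println("settling txn-42");
        System.out.println("first delivery ran: " + handler.handleOnce("txn-42", settle));
        System.out.println("redelivery ran: " + handler.handleOnce("txn-42", settle));
    }
}
```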
