Airbnb Adds Adaptive Traffic Control to Manage Key Value Store Spikes
These articles are AI-generated summaries. Please check the original sources for full details.
Airbnb upgraded Mussel, its multi-tenant key-value store, with an adaptive traffic control system. During a DDoS drill, the system cut a 1 million QPS spike to a trickle, preventing backend overload.
Why This Matters
Static rate limits, such as fixed QPS quotas enforced with Redis-backed counters, ignore what a request actually costs in latency, data size, or resource contention. Under high-variance traffic, that mismatch risks outages or a degraded user experience. Airbnb’s shift to resource-aware rate control aims to keep usage fair and the service resilient without cross-node coordination overhead.
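To make the contrast with a fixed QPS quota concrete, here is a minimal Python sketch of resource-aware rate control. It is not Airbnb’s implementation. It assumes a request’s cost in request units (RU) is a weighted sum of rows processed, payload size, and latency, and that the cost is charged against a per-tenant token bucket held in one node’s memory. The weights and the names `RequestStats`, `request_units`, and `RUTokenBucket` are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class RequestStats:
    rows: int            # rows processed by the request
    payload_bytes: int   # bytes read or written
    latency_ms: float    # observed service latency


def request_units(stats, base=1.0, per_row=0.5, per_kib=0.25, per_ms=0.125):
    """Cost of one request in RU. Every weight here is an illustrative assumption."""
    return (base
            + per_row * stats.rows
            + per_kib * stats.payload_bytes / 1024
            + per_ms * stats.latency_ms)


class RUTokenBucket:
    """Per-tenant RU budget kept in local memory, so no cross-node coordination."""

    def __init__(self, capacity, refill_per_sec, now=0.0):
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.tokens = capacity
        self.last = now

    def try_consume(self, cost, now):
        # Refill in proportion to elapsed time, capped at the bucket capacity.
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.last = now
        if cost <= self.tokens:
            self.tokens -= cost
            return True
        return False


point_read = RequestStats(rows=1, payload_bytes=1024, latency_ms=2.0)
wide_scan = RequestStats(rows=100, payload_bytes=100 * 1024, latency_ms=50.0)

bucket = RUTokenBucket(capacity=100.0, refill_per_sec=10.0)
print(request_units(point_read), request_units(wide_scan))
print(bucket.try_consume(request_units(wide_scan), now=0.0))  # fits the budget
print(bucket.try_consume(request_units(wide_scan), now=0.0))  # budget exhausted
```

A fixed QPS limit would count both requests as one each. With these assumed weights the scan costs roughly 41 times as much as the point read, so a tenant issuing scans exhausts its budget far sooner. Latency is only known once a request finishes, so a real system would have to charge the cost afterwards or estimate it up front; the sketch sidesteps that by passing in finished stats.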
Key Insights
- “Mussel’s adaptive rate control reduces DDoS impact from 1M QPS to a trickle” (2025): https://www.infoq.com/news/2025/11/airbnb-mussel-adaptive-traffic/
- “Resource-aware RU over fixed QPS for multi-tenancy”: Airbnb moved from static QPS limits to request units (RU), which factor in latency, data size, and rows processed.
- “Hot-key caching used by Airbnb to prevent backend overload”: In-memory top-k detection and LRU caches mitigate disproportionate traffic on specific keys.
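The hot-key idea in the last bullet can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Mussel’s code: it counts reads per key in memory, treats a key as hot once it is among the top `top_k` keys with at least `min_hits` reads, and serves hot keys from a small LRU cache. `LRUCache`, `HotKeyReader`, and both thresholds are hypothetical. The counter never decays and cached entries never expire, both of which a production system would need to handle.

```python
from collections import Counter, OrderedDict

_MISS = object()  # sentinel so that a cached None is not mistaken for a miss


class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self._data = OrderedDict()

    def get(self, key):
        if key not in self._data:
            return _MISS
        self._data.move_to_end(key)  # mark as most recently used
        return self._data[key]

    def put(self, key, value):
        self._data[key] = value
        self._data.move_to_end(key)
        if len(self._data) > self.capacity:
            self._data.popitem(last=False)  # evict the least recently used entry


class HotKeyReader:
    def __init__(self, backend, top_k=2, min_hits=3, cache_size=2):
        self.backend = backend    # callable: key -> value
        self.top_k = top_k        # assumed threshold, not from the source
        self.min_hits = min_hits  # assumed threshold, not from the source
        self.counts = Counter()
        self.cache = LRUCache(cache_size)

    def hot_keys(self):
        top = self.counts.most_common(self.top_k)
        return {key for key, hits in top if hits >= self.min_hits}

    def get(self, key):
        self.counts[key] += 1
        cached = self.cache.get(key)
        if cached is not _MISS:
            return cached  # served locally, the backend is not touched
        value = self.backend(key)
        if key in self.hot_keys():
            self.cache.put(key, value)
        return value


backend_calls = []


def slow_backend(key):
    backend_calls.append(key)
    return f"value-for-{key}"


reader = HotKeyReader(slow_backend)
for _ in range(5):
    reader.get("listing:42")
print(len(backend_calls))
```

In this run the key becomes hot on its third read, so only the first three of the five reads reach the backend and the rest are absorbed by the cache.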
Practical Applications
- Use Case: Airbnb’s Mussel handles terabyte-scale uploads and DDoS attacks via load shedding and hot-key caching.
- Pitfall: Static rate limits can’t adapt to variable workloads, risking service degradation during traffic spikes.
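This summary does not say how Mussel decides which requests to shed, so the following Python sketch shows only one plausible form of load shedding. It assumes each request carries a priority (0 is critical, higher numbers matter less) and that the node compares its recent average latency with a target. The `LoadShedder` class, the priority scheme, and the 1.0 and 2.0 ratio thresholds are all hypothetical.

```python
from collections import deque


class LoadShedder:
    def __init__(self, target_ms, window=4):
        self.target_ms = target_ms
        self.samples = deque(maxlen=window)  # most recent latency samples

    def record(self, latency_ms):
        self.samples.append(latency_ms)

    def latency_ratio(self):
        """Recent average latency divided by the target (0.0 with no samples)."""
        if not self.samples:
            return 0.0
        return (sum(self.samples) / len(self.samples)) / self.target_ms

    def admit(self, priority):
        ratio = self.latency_ratio()
        if ratio <= 1.0:
            return True           # healthy: admit everything
        if ratio <= 2.0:
            return priority <= 1  # stressed: drop background work
        return priority == 0      # overloaded: critical requests only


shedder = LoadShedder(target_ms=10.0)
for latency in (5.0, 5.0, 25.0, 25.0):
    shedder.record(latency)
print(shedder.latency_ratio(), shedder.admit(1), shedder.admit(2))
```

Unlike a static limit, the admission decision here follows observed backend health, so the same tenant is throttled harder during a spike and not at all when the node is idle.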