Cache Eviction and Checkpointing
WiredTiger is MongoDB’s default storage engine. It maintains an internal cache in memory that holds recently accessed data pages and index pages. Every read that hits the cache avoids a disk I/O. Every write modifies a page in cache, marking it dirty. Dirty pages must eventually be written to disk through two mechanisms: eviction and checkpoints.
Understanding these mechanisms explains the latency spikes that occur under sustained write loads and why increasing RAM does not always improve performance.
The Cache Architecture
WiredTiger’s internal cache is not the only memory cache. The operating system maintains a filesystem cache (page cache) that also caches recently read disk blocks. Data flows through both caches:
- Disk to filesystem cache: On a cache miss, the OS reads 4 KB pages from disk. These blocks are still in the compressed on-disk format.
- Filesystem cache to WiredTiger cache: WiredTiger decompresses and decodes the data into its internal B-tree page format.
- WiredTiger cache to application: The mongod process serves queries from WiredTiger’s internal cache and returns the results to the driver.
The default WiredTiger cache size is the larger of 50% of (RAM - 1 GB) and 256 MB. On a 32 GB server: (32 - 1) * 0.5 = 15.5 GB. The remaining memory is available for the filesystem cache, MongoDB’s own memory (connections, cursors, aggregation buffers), and the operating system.
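The sizing rule is easy to check for a given machine. The sketch below is illustrative only: the function name defaultCacheBytes is hypothetical, and the 256 MB floor is taken from MongoDB's documented default rather than from the 32 GB example above.

```javascript
// Sketch of the default sizing rule: 50% of (RAM - 1 GB), never below 256 MB.
// defaultCacheBytes is a hypothetical helper, not a MongoDB API.
const GIB = 1024 ** 3;
const MIB = 1024 ** 2;

function defaultCacheBytes(totalRamBytes) {
  const halfOfRemainder = 0.5 * (totalRamBytes - GIB);
  return Math.max(halfOfRemainder, 256 * MIB);
}

// The 32 GB server from the text.
const cacheGiB = defaultCacheBytes(32 * GIB) / GIB;
console.log(`32 GB RAM -> ${cacheGiB} GB WiredTiger cache`);
```

On small hosts the floor applies: a 1 GB machine gets 256 MB of cache, not 0.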
Eviction Mechanics
WiredTiger evicts pages from cache when memory pressure exceeds configured thresholds:
- eviction_target (default 80%): When cache usage exceeds this percentage, background eviction threads start removing clean pages.
- eviction_trigger (default 95%): When cache usage exceeds this, application threads are forced to perform eviction before they can proceed with their operation. This is the latency spike trigger.
- eviction_dirty_target (default 5%): Background eviction starts writing dirty pages when dirty pages exceed this percentage of cache.
- eviction_dirty_trigger (default 20%): Application threads perform dirty page eviction above this threshold.
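The four thresholds can be read as a small decision rule over the cache statistics. The following is a minimal sketch, assuming the default percentages listed above and the field names reported by db.serverStatus().wiredTiger.cache. The function classifyCachePressure and its state labels are hypothetical, and the sample numbers are illustrative.

```javascript
// Default thresholds, as fractions of the configured cache size.
const THRESHOLDS = {
  evictionTarget: 0.80,
  evictionTrigger: 0.95,
  evictionDirtyTarget: 0.05,
  evictionDirtyTrigger: 0.20,
};

// `cache` is shaped like db.serverStatus().wiredTiger.cache.
// Returns the usage ratios and which eviction regime they imply.
function classifyCachePressure(cache, t = THRESHOLDS) {
  const max = cache["maximum bytes configured"];
  const usedRatio = cache["bytes currently in the cache"] / max;
  const dirtyRatio = cache["tracked dirty bytes in the cache"] / max;

  let state = "below-targets";
  if (usedRatio > t.evictionTrigger || dirtyRatio > t.evictionDirtyTrigger) {
    state = "application-threads-evicting"; // operations stall here
  } else if (usedRatio > t.evictionTarget || dirtyRatio > t.evictionDirtyTarget) {
    state = "background-eviction";
  }
  return { usedRatio, dirtyRatio, state };
}

// Illustrative sample: 12.8 GB used of 15.5 GB, 310 MB dirty.
const sample = {
  "bytes currently in the cache": 12800000000,
  "maximum bytes configured": 15500000000,
  "tracked dirty bytes in the cache": 310000000,
};
console.log(classifyCachePressure(sample).state);
```

In this sample the cache is about 83% full with 2% dirty, so only background eviction is active and application threads are not yet forced to help.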
// Check cache pressure via serverStatus
db.serverStatus().wiredTiger.cache
{
"bytes currently in the cache": 12800000000, // 12.8 GB
"maximum bytes configured": 15500000000, // 15.5 GB
"tracked dirty bytes in the cache": 310000000, // 310 MB (2% of cache)
"pages evicted by application threads": 45000, // Bad: should be near zero
"unmodified pages evicted": 2800000,
"modified pages evicted": 120000
}
The critical metric is "pages evicted by application threads". The counter is cumulative since server start, so what matters is its rate of change: while it is increasing, application operations are stalling to perform eviction. Each stall adds 1-50 ms of latency to the operation.
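Because serverStatus counters are cumulative, a single reading says little. One way to watch the trend is to take two samples and compute a per-second rate. This is a sketch under assumptions: appEvictionRate and the { sampledAtMs, cache } sample shape are hypothetical, and the statistic name is the one shown in the output above, which may differ between server versions.

```javascript
// Pages evicted by application threads per second between two samples.
// Each sample is { sampledAtMs: <epoch ms>, cache: <serverStatus().wiredTiger.cache> }.
function appEvictionRate(prev, curr) {
  const key = "pages evicted by application threads";
  const seconds = (curr.sampledAtMs - prev.sampledAtMs) / 1000;
  if (seconds <= 0) {
    throw new RangeError("samples must be taken at increasing times");
  }
  return (curr.cache[key] - prev.cache[key]) / seconds;
}

// Illustrative samples taken 60 seconds apart.
const before = { sampledAtMs: 0, cache: { "pages evicted by application threads": 45000 } };
const after = { sampledAtMs: 60000, cache: { "pages evicted by application threads": 45600 } };
console.log(appEvictionRate(before, after), "pages/sec evicted by application threads");
```

In mongosh, a sample could be captured as { sampledAtMs: Date.now(), cache: db.serverStatus().wiredTiger.cache }. A rate that stays at zero means background eviction is keeping up.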
Checkpoint Mechanics
Every 60 seconds (default), WiredTiger performs a checkpoint. The checkpoint:
- Creates a consistent snapshot of all dirty pages.
- Writes all dirty pages to disk.
- Updates the root pages of all B-trees.
- Syncs the data files to durable storage.
During a checkpoint, WiredTiger holds a checkpoint lock that can conflict with page splits and eviction. On a write-heavy workload, the checkpoint may need to write hundreds of megabytes of dirty data. This I/O burst competes with normal read and write operations.
The telemetry platform at 2,000 writes/sec with 340-byte documents generates approximately 680 KB/sec of dirty data. Over 60 seconds, that is about 41 MB of dirty data per checkpoint, which is manageable. But if the workload includes bulk imports or large updates, dirty data can accumulate to gigabytes, making checkpoints I/O-intensive.
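The estimate above is plain arithmetic and can be rerun for other workloads. The sketch below follows the same simplification as the text, treating dirty data as roughly equal to the document bytes written and ignoring index updates and page-level overhead. The function name dirtyBytesPerCheckpoint is hypothetical.

```javascript
// Rough dirty-data volume accumulated between two checkpoints.
// Simplification: dirty bytes ~= document bytes written (no index or page overhead).
function dirtyBytesPerCheckpoint(writesPerSec, avgDocBytes, checkpointIntervalSec = 60) {
  return writesPerSec * avgDocBytes * checkpointIntervalSec;
}

const perSecond = 2000 * 340;                      // 680,000 bytes/sec
const perCheckpoint = dirtyBytesPerCheckpoint(2000, 340);
console.log(`${perSecond / 1e3} KB/sec, ${perCheckpoint / 1e6} MB per checkpoint`);
```

Raising the write rate or document size scales the result linearly, which is why a bulk import at tens of thousands of larger documents per second pushes the checkpoint volume into gigabytes.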