unbound mongodb at scale

Cache Eviction and Checkpointing

Chapter 37 of 72

WiredTiger is MongoDB’s default storage engine. It maintains an internal cache in memory that holds recently accessed data pages and index pages. Every read that hits the cache avoids a disk I/O. Every write modifies a page in cache, marking it dirty. Dirty pages must eventually be written to disk through two mechanisms: eviction and checkpoints.

Understanding these mechanisms explains why latency spikes occur under sustained write loads and why adding RAM does not always improve performance.

Figure: WiredTiger cache eviction state machine. Pages enter as CLEAN from disk reads, transition to DIRTY on writes, and move to EVICTION_QUEUE when cache pressure exceeds the thresholds (eviction_target=80%, eviction_trigger=95%, eviction_dirty_target=5%, eviction_dirty_trigger=20%). Application threads perform eviction above eviction_trigger. A checkpoint writes all dirty pages every 60 seconds.

The Cache Architecture

WiredTiger’s internal cache is not the only memory cache. The operating system maintains a filesystem cache (page cache) that also caches recently read disk blocks. Data flows through both caches:

  1. Disk to filesystem cache: OS reads 4 KB pages from disk on cache miss.
  2. Filesystem cache to WiredTiger cache: WiredTiger decompresses and decodes the data into its internal B-tree page format.
  3. WiredTiger cache to application: mongod serves the operation from WiredTiger’s internal cache and returns the result to the driver.

The default WiredTiger cache size is the larger of 50% of (RAM - 1 GB) and 256 MB. On a 32 GB server: (32 - 1) * 0.5 = 15.5 GB. The remaining memory is available for the filesystem cache, MongoDB’s own memory (connections, cursors, aggregation buffers), and the operating system.
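The sizing rule is easy to check by hand. The sketch below is only arithmetic: `defaultCacheGB` is a hypothetical helper written for illustration, not a MongoDB API, and it assumes RAM is given in GB.

```javascript
// Sketch of the default sizing rule: the larger of 50% of (RAM - 1 GB) and 256 MB.
// defaultCacheGB is a hypothetical helper, not part of MongoDB.
function defaultCacheGB(ramGB) {
  const halfOfRemainder = 0.5 * (ramGB - 1);
  const floorGB = 0.25; // 256 MB minimum
  return Math.max(halfOfRemainder, floorGB);
}

console.log(defaultCacheGB(32)); // 15.5
console.log(defaultCacheGB(1));  // 0.25 (the floor applies on very small hosts)
```

Whatever the function returns, the rest of RAM is what the filesystem cache, connections, and the operating system have to share.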

Eviction Mechanics

WiredTiger evicts pages from cache when memory pressure exceeds configured thresholds:

  • eviction_target (default 80%): When cache usage exceeds this percentage, background eviction threads start removing clean pages.
  • eviction_trigger (default 95%): When cache usage exceeds this, application threads are forced to perform eviction before they can proceed with their operation. This is the latency spike trigger.
  • eviction_dirty_target (default 5%): Background eviction starts writing dirty pages when dirty pages exceed this percentage of cache.
  • eviction_dirty_trigger (default 20%): Application threads perform dirty page eviction above this threshold.

// Check cache pressure via serverStatus
db.serverStatus().wiredTiger.cache
// Example output (abridged):
{
  "bytes currently in the cache": 12800000000,       // 12.8 GB
  "maximum bytes configured": 15500000000,           // 15.5 GB
  "tracked dirty bytes in the cache": 310000000,     // 310 MB (2% of cache)
  "pages evicted by application threads": 45000,      // Bad: should be near zero
  "unmodified pages evicted": 2800000,
  "modified pages evicted": 120000
}
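
To relate these numbers to the four thresholds, the fields can be converted to percentages of the configured cache size. This is a minimal sketch: `cachePressure` and `THRESHOLDS` are hypothetical names, the thresholds are the defaults quoted above, and the input is assumed to have the shape of the `wiredTiger.cache` sub-document.

```javascript
// Default thresholds from the text, as percentages of the configured cache size.
const THRESHOLDS = {
  evictionTarget: 80,
  evictionTrigger: 95,
  evictionDirtyTarget: 5,
  evictionDirtyTrigger: 20,
};

// Hypothetical helper: `cache` is the db.serverStatus().wiredTiger.cache sub-document.
function cachePressure(cache, t = THRESHOLDS) {
  const max = cache["maximum bytes configured"];
  const usedPct = (100 * cache["bytes currently in the cache"]) / max;
  const dirtyPct = (100 * cache["tracked dirty bytes in the cache"]) / max;

  let state = "ok";
  if (usedPct > t.evictionTrigger || dirtyPct > t.evictionDirtyTrigger) {
    state = "application-thread eviction";
  } else if (usedPct > t.evictionTarget || dirtyPct > t.evictionDirtyTarget) {
    state = "background eviction";
  }
  return { usedPct, dirtyPct, state };
}

// The sample values shown above.
const sample = {
  "bytes currently in the cache": 12800000000,
  "maximum bytes configured": 15500000000,
  "tracked dirty bytes in the cache": 310000000,
};
const pressure = cachePressure(sample);
console.log(pressure.state); // "background eviction"
```

With the sample values, overall usage is about 83% and dirty data is 2%, so the cache sits between eviction_target and eviction_trigger.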

The critical metric is pages evicted by application threads. The counter is cumulative, so what matters is its rate of change: when it is increasing, application operations are stalling to perform eviction. Each stall can add roughly 1-50 ms of latency to the operation.
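
Because the counter only ever grows, a single reading says little; comparing two readings gives a rate. The sketch below assumes each sample is an object holding a timestamp in milliseconds (`atMs`) and the `wiredTiger.cache` sub-document (`cache`). Both the sample shape and the `appEvictionRate` helper are hypothetical, and the field name is taken from the output above, so it may differ on other server versions.

```javascript
// Hypothetical helper: pages evicted by application threads per second,
// computed from two samples of the cumulative counter.
function appEvictionRate(prev, curr) {
  const key = "pages evicted by application threads";
  const elapsedSec = (curr.atMs - prev.atMs) / 1000;
  if (elapsedSec <= 0) {
    throw new RangeError("samples must be in increasing time order");
  }
  return (curr.cache[key] - prev.cache[key]) / elapsedSec;
}

// Two made-up samples taken 10 seconds apart.
const earlier = { atMs: 0, cache: { "pages evicted by application threads": 45000 } };
const later = { atMs: 10000, cache: { "pages evicted by application threads": 45600 } };
console.log(appEvictionRate(earlier, later)); // 60
```

In mongosh the samples would come from two calls to db.serverStatus().wiredTiger.cache a few seconds apart; a sustained nonzero rate is the signal worth alerting on.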

Checkpoint Mechanics

Every 60 seconds (default), WiredTiger performs a checkpoint. The checkpoint:

  1. Creates a consistent snapshot of all dirty pages.
  2. Writes all dirty pages to disk.
  3. Updates the root pages of all B-trees.
  4. Syncs the data files to durable storage.

During a checkpoint, WiredTiger holds a checkpoint lock that can conflict with page splits and eviction. On a write-heavy workload, the checkpoint may need to write hundreds of megabytes of dirty data. This I/O burst competes with normal read and write operations.

The telemetry platform at 2,000 writes/sec with 340-byte documents generates approximately 680 KB/sec of dirty data. Over 60 seconds, that is about 41 MB of dirty data per checkpoint, which is manageable. But if the workload includes bulk imports or large updates, dirty data can accumulate to gigabytes, making checkpoints I/O-intensive.
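
The same estimate can be written as a small function to try other workloads. This is a back-of-the-envelope sketch: `dirtyBytesPerCheckpoint` is a hypothetical helper, and it counts only document bytes, ignoring index updates and page-level overhead, so real dirty volume is likely higher.

```javascript
// Hypothetical helper: document bytes written between two checkpoints.
function dirtyBytesPerCheckpoint(writesPerSec, docBytes, intervalSec = 60) {
  return writesPerSec * docBytes * intervalSec;
}

// Telemetry workload from the text: 2,000 writes/sec, 340-byte documents.
const telemetryBytes = dirtyBytesPerCheckpoint(2000, 340);
console.log((telemetryBytes / 1e6).toFixed(1) + " MB"); // "40.8 MB"
```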