The In-Memory Sort Limit and How to Avoid It
The In-Memory Sort Limit and How to Avoid It
The Symptom
The activity feed query fails intermittently with this error:
com.mongodb.MongoQueryException: Query failed with error code 292
and error message 'Executor error during find command:
Sort exceeded memory limit of 104857600 bytes'
The query works for most users. It fails for users who follow more than 500 sensors. The error appears only during peak hours when multiple large sorts compete for memory.
The Cause
MongoDB’s query executor has a hard limit of 100 MB for in-memory sorts. When a query includes a sort that cannot be served by an index, all matching documents are loaded into memory and sorted. If the total size exceeds 100 MB, the query is aborted.
// SLOW: Query with sort on non-indexed field
MongoCollection<Document> activities = database.getCollection("activities");
FindIterable<Document> feed = activities.find(
Filters.in("sensorId", user.getFollowedSensors()) // 500+ sensors
)
.sort(Sorts.descending("timestamp"))
.limit(50);
The explain("executionStats") shows the problem:
{
"executionStats": {
"nReturned": 50,
"executionTimeMillis": 8420,
"totalDocsExamined": 2847000,
"executionStages": {
"stage": "LIMIT",
"inputStage": {
"stage": "SORT",
"sortPattern": {"timestamp": -1},
"memLimit": 104857600,
"memUsage": 98741200,
"inputStage": {
"stage": "IXSCAN",
"indexName": "sensorId_1",
"totalKeysExamined": 2847000
}
}
}
}
}
The IXSCAN on sensorId finds 2.8 million matching documents. All 2.8 million are loaded into the SORT stage. At 98.7 MB, this is dangerously close to the 100 MB limit. One more sensor followed and the query fails.
The sensorId_1 index helps with the filter but does nothing for the sort. MongoDB must load all matching documents to sort them by timestamp.
The Benchmark
// JMH benchmark: in-memory sort vs index-supported sort
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.MILLISECONDS)
@Warmup(iterations = 3, time = 5)
@Measurement(iterations = 5, time = 10)
@Fork(1)
@State(Scope.Benchmark)
public class SortBenchmark {
private MongoCollection<Document> activities;
@Param({"100", "1000", "5000"})
private int followedSensors;
@Setup
public void setup() {
MongoClient client = MongoClients.create("mongodb://localhost:27017");
activities = client.getDatabase("telemetry").getCollection("activities");
}
@Benchmark
public List<Document> inMemorySort() {
List<String> sensors = generateSensorIds(followedSensors);
return activities.find(Filters.in("sensorId", sensors))
.sort(Sorts.descending("timestamp"))
.limit(50)
.into(new ArrayList<>());
}
@Benchmark
public List<Document> indexSupportedSort() {
List<String> sensors = generateSensorIds(followedSensors);
return activities.find(Filters.in("sensorId", sensors))
.sort(Sorts.descending("timestamp"))
.hint(new Document("timestamp", -1).append("sensorId", 1))
.limit(50)
.into(new ArrayList<>());
}
}
JMH results:
Benchmark (followedSensors) Mode Cnt Score Error Units
SortBenchmark.inMemorySort 100 avgt 5 45.200 ± 3.100 ms/op
SortBenchmark.inMemorySort 1000 avgt 5 420.000 ± 28.000 ms/op
SortBenchmark.inMemorySort 5000 avgt 5 FAIL (OOM 292)
SortBenchmark.indexSupportedSort 100 avgt 5 4.100 ± 0.300 ms/op
SortBenchmark.indexSupportedSort 1000 avgt 5 8.200 ± 0.500 ms/op
SortBenchmark.indexSupportedSort 5000 avgt 5 12.400 ± 0.800 ms/op
At 1,000 followed sensors, in-memory sort takes 420ms. Index-supported sort takes 8.2ms. At 5,000 followed sensors, in-memory sort fails entirely.
The Fix
Two approaches, with different trade-offs.
Option 1: Create an index that supports the sort. The compound index {timestamp: -1, sensorId: 1} allows MongoDB to walk the index in sort order and filter by sensorId without loading all documents into memory.
// FAST: Index that supports both filter and sort
activities.createIndex(
Indexes.compoundIndex(
Indexes.descending("timestamp"),
Indexes.ascending("sensorId")
),
new IndexOptions().name("ts_desc_sensorId")
);
With this index, the execution plan changes:
{
"executionStats": {
"nReturned": 50,
"executionTimeMillis": 8,
"totalKeysExamined": 2200,
"totalDocsExamined": 50,
"executionStages": {
"stage": "LIMIT",
"inputStage": {
"stage": "FETCH",
"inputStage": {
"stage": "IXSCAN",
"indexName": "ts_desc_sensorId",
"direction": "forward",
"totalKeysExamined": 2200
}
}
}
}
}
The SORT stage is gone. MongoDB walks the index in timestamp-descending order, checking each key against the sensorId filter, and stops after finding 50 matches. It examined 2,200 keys to find 50 results (ratio 44:1), which is acceptable. The 44:1 ratio reflects the selectivity: the index walks past entries for sensors this user does not follow.
Option 2: allowDiskUse for analytical queries. If the query is for analytics and latency is less critical, you can allow MongoDB to spill the sort to disk:
// SLOW but survivable: allowDiskUse for large sorts
FindIterable<Document> results = activities.find(
Filters.in("sensorId", sensors)
)
.sort(Sorts.descending("timestamp"))
.limit(50)
.allowDiskUse(true); // Spill to /tmp if sort exceeds 100MB
This prevents the error code 292 failure but replaces in-memory sort with disk sort. Disk sort is 10-50x slower than in-memory sort. Use this only as a safety net for queries you cannot index.
The Proof
| Metric | In-memory sort | Index-supported sort | allowDiskUse |
|---|---|---|---|
| 100 sensors | 45ms | 4ms | N/A (fits in memory) |
| 1,000 sensors | 420ms | 8ms | 890ms |
| 5,000 sensors | FAIL (292) | 12ms | 2,400ms |
| Memory usage | Up to 100MB | 0 (index order) | Limited |
| SORT stage | Yes | No | Yes (disk) |
The Trade-off
The {timestamp: -1, sensorId: 1} index is large. On a collection with 280 million activity documents, it consumes approximately 8.4 GB. Every insert pays the index maintenance cost. For a write-heavy activity collection receiving 5,000 inserts per second, this adds measurable write latency. Chapter 10 covers how to design compound indexes that serve multiple query patterns with a single index, reducing the total index count and write amplification.
allowDiskUse is not free either. Disk-based sorts use temporary files in the configured dbPath. On cloud instances with limited IOPS (Chapter 23), this can saturate the storage throughput and degrade all operations, not just the slow query.