Multi-Key Indexes and the $in Selectivity Trap
Multi-Key Indexes and the $in Selectivity Trap
The Symptom
The alerting system queries readings for a set of sensors in a specific temperature range:
List<Document> results = collection.find(Filters.and(
Filters.in("sensorId", sensorIds), // 50-500 sensor IDs
Filters.gt("temperature", threshold),
Filters.gte("ts", Date.from(cutoff))
)).sort(Sorts.descending("ts"))
.limit(100)
.into(new ArrayList<>());
With the ESR-ordered index {sensorId: 1, ts: -1, temperature: 1}, this query takes 2ms when sensorIds has 5 entries but 850ms when it has 500 entries. The explain plan shows 500 separate index scans being merged.
The Cause
MongoDB treats $in as multiple equality matches. For a compound index {sensorId: 1, ts: -1, temperature: 1} and $in with 500 values, MongoDB performs 500 index scans (one per sensorId value), then merges the results. Each scan traverses the ts portion of the index to maintain sort order. With 500 scans, the merge operation becomes the bottleneck.
// explain() output for $in with 500 values
{
"winningPlan": {
"stage": "FETCH",
"inputStage": {
"stage": "SORT_MERGE", // Merging 500 index scans
"inputStages": [
// 500 IXSCAN stages, one per sensorId
{ "stage": "IXSCAN", "indexBounds": { "sensorId": ["[\"sensor-00001\", ...]"] } },
// ... 499 more
]
}
},
"executionStats": {
"totalKeysExamined": 125000,
"totalDocsExamined": 100,
"executionTimeMillis": 850
}
}
125,000 keys examined for 100 returned documents.
The Benchmark
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.MILLISECONDS)
@Warmup(iterations = 3, time = 5)
@Measurement(iterations = 5, time = 10)
@Fork(1)
@State(Scope.Benchmark)
public class InQueryBenchmark {
private MongoCollection<Document> collection;
@Param({"5", "50", "100", "500"})
private int inListSize;
@Setup
public void setup() {
MongoClient client = MongoClients.create("mongodb://localhost:27017");
collection = client.getDatabase("telemetry").getCollection("readings");
}
@Benchmark
public List<Document> inQueryWithSort() {
List<String> sensorIds = IntStream.range(0, inListSize)
.mapToObj(i -> String.format("sensor-%05d", i))
.toList();
return collection.find(Filters.and(
Filters.in("sensorId", sensorIds),
Filters.gt("temperature", 30.0),
Filters.gte("ts", Date.from(Instant.now().minus(1, ChronoUnit.HOURS)))
)).sort(Sorts.descending("ts"))
.limit(100)
.into(new ArrayList<>());
}
@Benchmark
public List<Document> inQueryWithoutSort() {
List<String> sensorIds = IntStream.range(0, inListSize)
.mapToObj(i -> String.format("sensor-%05d", i))
.toList();
return collection.find(Filters.and(
Filters.in("sensorId", sensorIds),
Filters.gt("temperature", 30.0),
Filters.gte("ts", Date.from(Instant.now().minus(1, ChronoUnit.HOURS)))
)).limit(100)
.into(new ArrayList<>());
}
}
Results:
Benchmark (inListSize) Mode Cnt Score Error Units
InQueryBenchmark.inQueryWithSort 5 avgt 5 2.000 ± 0.300 ms/op
InQueryBenchmark.inQueryWithSort 50 avgt 5 45.000 ± 5.000 ms/op
InQueryBenchmark.inQueryWithSort 100 avgt 5 180.000 ± 15.000 ms/op
InQueryBenchmark.inQueryWithSort 500 avgt 5 850.000 ± 60.000 ms/op
InQueryBenchmark.inQueryWithoutSort 5 avgt 5 1.500 ± 0.200 ms/op
InQueryBenchmark.inQueryWithoutSort 50 avgt 5 8.000 ± 1.000 ms/op
InQueryBenchmark.inQueryWithoutSort 100 avgt 5 15.000 ± 2.000 ms/op
InQueryBenchmark.inQueryWithoutSort 500 avgt 5 65.000 ± 8.000 ms/op
Without sort, the $in with 500 values takes 65ms (acceptable). With sort, it takes 850ms because the SORT_MERGE of 500 index scans dominates.
The Fix
Two approaches for large $in lists.
Approach 1: Batch the $in list and sort client-side.
// FAST: Batch $in queries and sort in application
public List<Document> getAlertReadings(List<String> sensorIds, double threshold,
Instant cutoff, int limit) {
int batchSize = 25;
List<Document> allResults = new ArrayList<>();
for (int i = 0; i < sensorIds.size(); i += batchSize) {
List<String> batch = sensorIds.subList(i, Math.min(i + batchSize, sensorIds.size()));
List<Document> batchResults = collection.find(Filters.and(
Filters.in("sensorId", batch),
Filters.gt("temperature", threshold),
Filters.gte("ts", Date.from(cutoff))
)).sort(Sorts.descending("ts"))
.limit(limit)
.into(new ArrayList<>());
allResults.addAll(batchResults);
}
// Client-side sort and limit
allResults.sort((a, b) -> b.getDate("ts").compareTo(a.getDate("ts")));
return allResults.subList(0, Math.min(limit, allResults.size()));
}
With batches of 25, each query performs a 25-way SORT_MERGE, which takes approximately 10ms. For 500 sensors, 20 batches execute in 200ms total instead of a single 850ms query.
Approach 2: Use an aggregation pipeline with $group to avoid SORT_MERGE.
// FAST: Aggregation avoids SORT_MERGE for large $in
List<Document> results = collection.aggregate(List.of(
Aggregates.match(Filters.and(
Filters.in("sensorId", sensorIds),
Filters.gt("temperature", threshold),
Filters.gte("ts", Date.from(cutoff))
)),
Aggregates.sort(Sorts.descending("ts")),
Aggregates.limit(limit)
)).into(new ArrayList<>());
The aggregation framework may choose a different execution strategy (block sort instead of SORT_MERGE) that performs better for large $in lists.
The Proof
| Query pattern | $in size | Before | After (batched) |
|---|---|---|---|
| With sort | 50 | 45ms | 20ms (2 batches) |
| With sort | 100 | 180ms | 40ms (4 batches) |
| With sort | 500 | 850ms | 200ms (20 batches) |
The Trade-off
Batching adds application complexity and increases the total number of queries. For 500 sensors in batches of 25, the application sends 20 queries instead of 1. Each query is fast, but the serial execution adds round trip overhead. In a cross-region deployment with 30ms round trip time, 20 queries add 600ms of network latency. Parallelize the batches with CompletableFuture to reduce the wall-clock time:
List<CompletableFuture<List<Document>>> futures = new ArrayList<>();
for (int i = 0; i < sensorIds.size(); i += batchSize) {
List<String> batch = sensorIds.subList(i, Math.min(i + batchSize, sensorIds.size()));
futures.add(CompletableFuture.supplyAsync(() ->
collection.find(Filters.and(
Filters.in("sensorId", batch),
Filters.gt("temperature", threshold),
Filters.gte("ts", Date.from(cutoff))
)).sort(Sorts.descending("ts")).limit(limit).into(new ArrayList<>()),
executor
));
}
This trades connection pool capacity for latency. Each parallel batch consumes a connection from the pool (sized in CH4).