Count Queries, Exists Checks, and the Coverage Gotchas

The Symptom

The monitoring service checks whether each sensor has reported in the last 5 minutes. With 10,000 sensors, this health check runs 10,000 countDocuments() calls every 30 seconds. The health check takes 8 seconds to complete, consuming significant CPU on both the application and the database.

The Cause

// SLOW: countDocuments() with unnecessary work
for (String sensorId : allSensorIds) {
    long count = collection.countDocuments(Filters.and(
        Filters.eq("sensorId", sensorId),
        Filters.gte("ts", Date.from(Instant.now().minus(5, ChronoUnit.MINUTES)))
    ));
    if (count == 0) {
        alertSensorDown(sensorId);
    }
}

The code only needs to know if any reading exists in the last 5 minutes. countDocuments() counts all matching documents. For an active sensor with 60 readings in 5 minutes, it counts all 60 when finding 1 would suffice.

The Benchmark

@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.MICROSECONDS)
@Warmup(iterations = 3, time = 5)
@Measurement(iterations = 5, time = 10)
@Fork(1)
@State(Scope.Benchmark)
public class ExistsCheckBenchmark {

    private MongoCollection<Document> collection;
    private Date cutoff;

    @Setup
    public void setup() {
        MongoClient client = MongoClients.create("mongodb://localhost:27017");
        collection = client.getDatabase("telemetry").getCollection("readings");
        cutoff = Date.from(Instant.now().minus(5, ChronoUnit.MINUTES));
    }

    @Benchmark
    public long countDocuments() {
        return collection.countDocuments(Filters.and(
            Filters.eq("sensorId", "sensor-00042"),
            Filters.gte("ts", cutoff)
        ));
    }

    @Benchmark
    public boolean existsWithLimit() {
        Document doc = collection.find(Filters.and(
            Filters.eq("sensorId", "sensor-00042"),
            Filters.gte("ts", cutoff)
        )).projection(Projections.fields(
            Projections.include("sensorId"),
            Projections.excludeId()
        )).limit(1).first();
        return doc != null;
    }

    @Benchmark
    public long estimatedCount() {
        // Note: estimatedDocumentCount() ignores filters, counts whole collection
        return collection.estimatedDocumentCount();
    }
}

Results:

Benchmark                                 Mode  Cnt    Score    Error  Units
ExistsCheckBenchmark.countDocuments       avgt    5  850.000 ± 45.000  us/op
ExistsCheckBenchmark.existsWithLimit      avgt    5   95.000 ±  8.000  us/op
ExistsCheckBenchmark.estimatedCount       avgt    5    5.000 ±  0.500  us/op

limit(1) is 8.9x faster than countDocuments() for exists checks. It scans exactly one index key and returns immediately. countDocuments() scans all 60 matching keys.

The Fix

Fix 1: Replace count-based exists with limit(1).

// FAST: Exists check with limit(1) and covered projection
public boolean isSensorActive(String sensorId, Instant cutoff) {
    return collection.find(Filters.and(
        Filters.eq("sensorId", sensorId),
        Filters.gte("ts", Date.from(cutoff))
    )).projection(Projections.fields(
        Projections.include("sensorId"),
        Projections.excludeId()
    )).limit(1).first() != null;
}

Fix 2: Batch the health check with a single aggregation.

// FAST: Single query replaces 10,000 individual queries
public Set<String> getActiveSensorIds(Instant cutoff) {
    List<Document> results = collection.aggregate(List.of(
        Aggregates.match(Filters.gte("ts", Date.from(cutoff))),
        Aggregates.group("$sensorId")  // Distinct sensorIds with recent readings
    )).into(new ArrayList<>());

    return results.stream()
        .map(doc -> doc.getString("_id"))
        .collect(Collectors.toSet());
}

public void checkSensorHealth(List<String> allSensorIds, Instant cutoff) {
    Set<String> activeSensors = getActiveSensorIds(cutoff);
    for (String sensorId : allSensorIds) {
        if (!activeSensors.contains(sensorId)) {
            alertSensorDown(sensorId);
        }
    }
}

One aggregation replaces 10,000 queries. The $group on sensorId with no accumulator performs a distinct operation using the {sensorId: 1, ts: -1} index.

Coverage gotchas to remember:

Queries involving array fields are never covered, even if the array field is indexed. Multi-key indexes store one index entry per array element, and MongoDB cannot reconstruct the array from the index without fetching the document.

// NOT coverable: tags is an array field
collection.find(Filters.eq("tags", "indoor"))
    .projection(Projections.fields(
        Projections.include("tags"),
        Projections.excludeId()
    ));
// Even with index {tags: 1}, this requires FETCH

Queries using $elemMatch on embedded arrays are never covered. Queries projecting dotted paths (sensor.building) are covered only if the entire embedded document path is in the index.

The Proof

After replacing 10,000 individual count queries with the batched aggregation:

Metric	Before (10K counts)	After (1 aggregation)
Health check duration	8,000ms	120ms
MongoDB ops per check	10,000	1
Server CPU during check	35% spike	2% blip
Network round trips	10,000	1

The Trade-off

The batched aggregation loads all active sensor IDs into application memory. With 10,000 sensors, this is approximately 150 KB of string data, negligible. If the sensor count grows to 1 million, the set would be 15 MB, which is still manageable but worth monitoring.

estimatedDocumentCount() is the fastest count operation because it reads the collection metadata directly, avoiding any index scan. But it returns the total document count for the collection, not a filtered count. Use it only when you need the total. For filtered counts where the exact number matters (pagination total), countDocuments() remains the correct choice. For filtered counts where you only need to know “at least one exists,” always use limit(1).