Bucket Granularity Selection and Write Amplification
The Symptom
The telemetry platform adopted hourly buckets as described in the chapter introduction. Write throughput is stable at 2,000 ops/sec for the first 30 minutes of each hour. Then, around minute 40, write latency spikes from 2ms to 45ms and throughput drops to 1,200 ops/sec. By minute 55, writes take 80ms and throughput is at 800 ops/sec.
The Cause
Write amplification. WiredTiger does not modify a document in place: every update, including a one-element $push, produces a complete new version of the document. The cost of each write therefore tracks the size of the whole bucket, not the size of the appended measurement. At minute 0, the bucket document is small (the initial upsert creates a document with one measurement, approximately 100 bytes). By minute 40, the document has 480 measurements and is 19 KB. By minute 55, it has 660 measurements and is 26 KB. The padding and document relocation often blamed for this pattern belong to the older MMAPv1 storage engine; under WiredTiger the full rewrite happens on every $push, and it gets more expensive as the bucket grows.
These rewrites are expensive. For each $push, the server must:
- Read the current version of the bucket document
- Build the updated document in memory, with the new measurement appended
- Keep the new version alongside the old one until no open snapshot needs the old one
- Write the page that holds the document to a new block at the next eviction or checkpoint
For a 26 KB document, this means handling 26 KB to add a single measurement. The _id index and the compound {sensorId, bucketStart} index are not rewritten, because neither key changes. At 2,000 writes per second across 10,000 sensors, every bucket grows in lockstep, so all sensors reach the expensive part of the hour simultaneously.
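To get a feel for the scale, the arithmetic can be sketched as below. This is a rough model, not a measurement: it assumes that every $push rewrites the whole bucket document and that each measurement adds about 40 bytes (the 19 KB for 480 readings quoted above). The class and method names are hypothetical.

```java
/** Rough model of the bytes rewritten while one bucket fills (illustrative only). */
final class WriteAmplificationModel {

    /** Total bytes written if the k-th push rewrites a document holding k measurements. */
    public static long bytesRewritten(int readingsPerBucket, int bytesPerReading) {
        long total = 0;
        for (int k = 1; k <= readingsPerBucket; k++) {
            total += (long) k * bytesPerReading;
        }
        return total;
    }

    /** Bytes rewritten divided by the bytes of new payload actually added. */
    public static double amplification(int readingsPerBucket, int bytesPerReading) {
        long payload = (long) readingsPerBucket * bytesPerReading;
        return (double) bytesRewritten(readingsPerBucket, bytesPerReading) / payload;
    }

    public static void main(String[] args) {
        int bytesPerReading = 40; // assumption: 19 KB / 480 readings
        System.out.printf("hourly:   %d bytes rewritten, %.1fx%n",
                bytesRewritten(720, bytesPerReading), amplification(720, bytesPerReading));
        System.out.printf("5-minute: %d bytes rewritten, %.1fx%n",
                bytesRewritten(60, bytesPerReading), amplification(60, bytesPerReading));
    }
}
```

Under these assumptions an hourly bucket rewrites about 10.4 MB to store 28.8 KB of readings (360.5x), while a 5-minute bucket rewrites about 73 KB to store 2.4 KB (30.5x). The amplification factor works out to (n + 1) / 2 for a bucket of n readings, which is why shortening the bucket helps so much.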
The Benchmark
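    import com.mongodb.client.MongoClient;
    import com.mongodb.client.MongoClients;
    import com.mongodb.client.MongoCollection;
    import com.mongodb.client.model.Filters;
    import com.mongodb.client.model.PushOptions;
    import com.mongodb.client.model.Updates;
    import com.mongodb.client.result.UpdateResult;
    import org.bson.Document;
    import org.openjdk.jmh.annotations.*;

    import java.util.ArrayList;
    import java.util.Date;
    import java.util.List;
    import java.util.concurrent.TimeUnit;

    @BenchmarkMode(Mode.AverageTime)
    @OutputTimeUnit(TimeUnit.MICROSECONDS)
    @Warmup(iterations = 2, time = 10)
    @Measurement(iterations = 3, time = 20)
    @Fork(1)
    @State(Scope.Benchmark)
    public class BucketGranularityBenchmark {
        private static final String SENSOR_ID = "sensor-bench";

        private MongoClient client;
        private MongoCollection<Document> minuteCollection;
        private MongoCollection<Document> fiveMinCollection;
        private MongoCollection<Document> hourCollection;

        @Setup
        public void setup() {
            client = MongoClients.create("mongodb://localhost:27017");
            var db = client.getDatabase("telemetry_bench");
            minuteCollection = db.getCollection("buckets_1min");
            fiveMinCollection = db.getCollection("buckets_5min");
            hourCollection = db.getCollection("buckets_1hour");
            // Pre-fill buckets to simulate mid-bucket state
            prefillBucket(minuteCollection, 12);   // 12 readings in 1-minute bucket
            prefillBucket(fiveMinCollection, 60);  // 60 readings in 5-minute bucket
            prefillBucket(hourCollection, 360);    // 360 readings in 1-hour bucket (half full)
        }

        @TearDown
        public void tearDown() {
            client.close();
        }

        @Benchmark
        public UpdateResult pushToMinuteBucket() {
            return pushMeasurement(minuteCollection, 12);
        }

        @Benchmark
        public UpdateResult pushToFiveMinBucket() {
            return pushMeasurement(fiveMinCollection, 60);
        }

        @Benchmark
        public UpdateResult pushToHourBucket() {
            return pushMeasurement(hourCollection, 360);
        }

        // Assumed implementation (the helper is not shown in the original listing):
        // replaces the collection's contents with one bucket holding `readings` measurements.
        private void prefillBucket(MongoCollection<Document> coll, int readings) {
            List<Document> measurements = new ArrayList<>();
            for (int i = 0; i < readings; i++) {
                measurements.add(newMeasurement());
            }
            coll.drop();
            coll.insertOne(new Document("sensorId", SENSOR_ID)
                    .append("measurements", measurements)
                    .append("count", readings));
        }

        private static Document newMeasurement() {
            return new Document("ts", new Date())
                    .append("t", 23.5).append("h", 65.0)
                    .append("p", 1013.0).append("v", 3.28);
        }

        // $slice caps the array at its pre-filled length so the document size stays
        // fixed for the whole run. A plain $push would grow the bucket on every
        // invocation, skewing the measurement and eventually hitting the 16 MB limit.
        private UpdateResult pushMeasurement(MongoCollection<Document> coll, int readings) {
            return coll.updateOne(
                    Filters.eq("sensorId", SENSOR_ID),
                    Updates.combine(
                            Updates.pushEach("measurements", List.of(newMeasurement()),
                                    new PushOptions().slice(-readings)),
                            Updates.inc("count", 1)
                    )
            );
        }
    }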
Results:
    Benchmark                                        Mode  Cnt    Score    Error  Units
    BucketGranularityBenchmark.pushToMinuteBucket    avgt    3   85.000 ± 12.000  us/op
    BucketGranularityBenchmark.pushToFiveMinBucket   avgt    3  180.000 ± 25.000  us/op
    BucketGranularityBenchmark.pushToHourBucket      avgt    3  650.000 ± 85.000  us/op
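Every benchmark appends exactly one measurement, yet the push into the hourly bucket with 360 entries takes 650us against 85us for the 1-minute bucket with 12 entries. That is 7.6x slower. The write cost scales with the size of the document being updated, not with the number of measurements being added.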
The Fix
Use 5-minute buckets instead of hourly buckets. This balances write amplification against document count:
| Granularity | Readings per bucket | Doc size at fill | Docs per day | Write latency |
|---|---|---|---|---|
| 1 minute | 12 | 0.5 KB | 14.4M | 85us |
| 5 minutes | 60 | 2.4 KB | 2.88M | 180us |
| 15 minutes | 180 | 7.2 KB | 960K | 350us |
| 1 hour | 720 | 28 KB | 240K | 650us (at half fill, 360 readings) |
Five-minute buckets keep write latency under 200us while reducing document count to 2.88M per day (60x fewer than storing one document per reading).
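    import com.mongodb.client.model.Filters;
    import com.mongodb.client.model.UpdateOptions;
    import com.mongodb.client.model.Updates;
    import org.bson.Document;

    import java.time.Instant;
    import java.time.ZoneOffset;
    import java.time.temporal.ChronoUnit;
    import java.util.Date;

    // FAST: 5-minute bucket granularity
    Instant timestamp = reading.getTimestamp();
    // Floor the timestamp to the enclosing 5-minute boundary (UTC)
    Instant bucketStart = timestamp.truncatedTo(ChronoUnit.MINUTES)
            .minusSeconds(timestamp.atZone(ZoneOffset.UTC).getMinute() % 5 * 60);

    // Upsert: the first reading in a window creates the bucket, later ones append to it
    collection.updateOne(
            Filters.and(
                    Filters.eq("sensorId", reading.getSensorId()),
                    Filters.eq("bucketStart", Date.from(bucketStart))
            ),
            Updates.combine(
                    Updates.push("measurements", new Document()
                            .append("ts", Date.from(timestamp))
                            .append("t", reading.getTemperature())
                            .append("h", reading.getHumidity())
                            .append("p", reading.getPressure())
                            .append("v", reading.getVoltage())
                    ),
                    Updates.inc("count", 1),
                    Updates.setOnInsert("sensorId", reading.getSensorId()),
                    Updates.setOnInsert("bucketStart", Date.from(bucketStart))
            ),
            new UpdateOptions().upsert(true)
    );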
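The truncation in the snippet above is specific to 5-minute buckets. If you expect to experiment with other granularities, the boundary calculation can be pulled into a small helper. The following is a minimal sketch, not part of the platform's code: the class and method names are hypothetical, and it assumes buckets are aligned to the Unix epoch, which matches wall-clock alignment in UTC for any granularity that divides a day evenly.

```java
import java.time.Duration;
import java.time.Instant;

/** Hypothetical helper: floors a timestamp to the start of its bucket. */
final class BucketBoundaries {

    /** Returns the start of the epoch-aligned bucket of the given size that contains the timestamp. */
    static Instant bucketStart(Instant timestamp, Duration granularity) {
        long size = granularity.getSeconds();
        if (size <= 0) {
            throw new IllegalArgumentException("granularity must be at least one second");
        }
        // floorDiv rounds toward negative infinity, so pre-1970 timestamps also floor correctly
        long floored = Math.floorDiv(timestamp.getEpochSecond(), size) * size;
        return Instant.ofEpochSecond(floored);
    }
}
```

With this helper, `BucketBoundaries.bucketStart(timestamp, Duration.ofMinutes(5))` yields the same value as the inline calculation, and switching granularity becomes a one-argument change.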
The Proof
After switching from hourly to 5-minute buckets:
| Metric | Hourly buckets | 5-minute buckets |
|---|---|---|
| Write p50 | 12ms (start) to 45ms (end) | 2ms (stable) |
| Write p99 | 30ms (start) to 120ms (end) | 8ms (stable) |
| Write throughput | 2,000 dropping to 800 ops/s | 2,000 stable ops/s |
| Docs per day | 240K | 2.88M |
| Storage per day | 6.72 GB | 8.2 GB |
| Index entries per day | 240K | 2.88M |
The Trade-off
Five-minute buckets produce 12x more documents and 12x more index entries than hourly buckets. Storage per day increases from 6.72 GB to 8.2 GB because of the additional per-document overhead. For a range query spanning a full day for one sensor, the query reads 288 documents instead of 24. If your primary query pattern is “give me all readings for sensor X for the last 24 hours,” hourly buckets serve that query with 12x fewer document reads.
The right granularity depends on your data rate and your query patterns. If readings arrive every 5 seconds and queries typically span 1-6 hours, 5-minute buckets are the sweet spot. If readings arrive every minute and queries span weeks, hourly or daily buckets are better.
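When weighing candidates, it can help to tabulate both sides of the trade-off for your own data rate. The sketch below does this for the sensor count and reading interval used in this chapter (10,000 sensors, one reading every 5 seconds). The class and method names are hypothetical, and the per-query figure assumes the query range starts on a bucket boundary; an unaligned range can touch one extra bucket.

```java
import java.time.Duration;

/** Back-of-the-envelope sizing for a candidate bucket granularity (illustrative only). */
final class GranularityEstimate {

    /** Readings stored in one full bucket. */
    static long readingsPerBucket(Duration bucket, Duration readingInterval) {
        return bucket.getSeconds() / readingInterval.getSeconds();
    }

    /** Bucket documents created per day across all sensors. */
    static long bucketsPerDay(long sensors, Duration bucket) {
        return sensors * (Duration.ofDays(1).getSeconds() / bucket.getSeconds());
    }

    /** Buckets read by a range query for one sensor, assuming a boundary-aligned range. */
    static long bucketsRead(Duration queryRange, Duration bucket) {
        long range = queryRange.getSeconds();
        long size = bucket.getSeconds();
        return (range + size - 1) / size; // ceiling division
    }

    public static void main(String[] args) {
        long sensors = 10_000;
        Duration interval = Duration.ofSeconds(5);
        Duration[] candidates = {
            Duration.ofMinutes(1), Duration.ofMinutes(5), Duration.ofMinutes(15), Duration.ofHours(1)
        };
        for (Duration bucket : candidates) {
            System.out.printf("%-6s readings/bucket=%d docs/day=%d docs per 24h query=%d%n",
                    bucket,
                    readingsPerBucket(bucket, interval),
                    bucketsPerDay(sensors, bucket),
                    bucketsRead(Duration.ofHours(24), bucket));
        }
    }
}
```

For a 24-hour query this gives 288 buckets at 5-minute granularity against 24 at hourly granularity, the 12x ratio discussed above. Pair these numbers with a write benchmark at the corresponding bucket size before committing to a granularity.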
MongoDB 5.0 introduced native time-series collections that handle bucketing automatically. For new deployments, consider time-series collections first: they manage granularity, compression, and summary statistics internally. The manual bucket pattern remains relevant for existing collections and for cases requiring custom summary computation.