Compression Algorithms: Gzip, Brotli, Zstandard Compared
The main chapter showed the headline numbers: Brotli-4 achieves 7.1x compression on the article list at 230us CPU cost. This section digs deeper: how do these algorithms perform across different payload sizes? Where are the diminishing returns? How does dictionary compression change the equation for small, repetitive payloads?
Test Methodology
All benchmarks run on the content platform’s production server hardware (AMD EPYC 7763, single core isolated via taskset). Each measurement is the median of 10,000 iterations after 1,000 warmup iterations. Payloads are real content platform responses:
// Payload corpus from content platform:
// 1. User preferences: 2,180 bytes (small JSON, repetitive keys)
// 2. Category list: 10,240 bytes (medium JSON, moderate entropy)
// 3. Article list (50): 37,450 bytes (large JSON, mixed content)
// 4. Full article HTML: 148,600 bytes (very large, natural language)
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.TimeUnit;
import java.util.zip.GZIPOutputStream;
import org.openjdk.jmh.annotations.*;
// Assumed bindings: Brotli4j (Encoder) and zstd-jni (Zstd)
import com.aayushatharva.brotli4j.Brotli4jLoader;
import com.aayushatharva.brotli4j.encoder.Encoder;
import com.github.luben.zstd.Zstd;

@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.MICROSECONDS)
@State(Scope.Benchmark)
@Warmup(iterations = 5, time = 2)
@Measurement(iterations = 10, time = 2)
@Fork(2)
public class CompressionBenchmark {
    private byte[] userPrefs;     // 2.1KB
    private byte[] categoryList;  // 10KB
    private byte[] articleList;   // 37KB
    private byte[] fullArticle;   // 148KB

    @Setup
    public void setup() throws Exception {
        Brotli4jLoader.ensureAvailability(); // load the native Brotli library once
        userPrefs = Files.readAllBytes(Path.of("test-data/user-prefs.json"));
        categoryList = Files.readAllBytes(Path.of("test-data/categories.json"));
        articleList = Files.readAllBytes(Path.of("test-data/article-list.json"));
        fullArticle = Files.readAllBytes(Path.of("test-data/full-article.html"));
    }

    @Benchmark
    public byte[] gzip6_articleList() {
        return compressGzip(articleList, 6);
    }

    @Benchmark
    public byte[] brotli4_articleList() {
        return compressBrotli(articleList, 4);
    }

    @Benchmark
    public byte[] zstd3_articleList() {
        return compressZstd(articleList, 3);
    }

    // ... all combinations benchmarked

    private byte[] compressGzip(byte[] input, int level) {
        var baos = new ByteArrayOutputStream(input.length / 4);
        // Set the level on the stream's own Deflater. Swapping in `new Deflater(level)`
        // would put zlib-wrapped data inside the gzip container and leak the original.
        try (var gos = new GZIPOutputStream(baos) {{
            def.setLevel(level);
        }}) {
            gos.write(input);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return baos.toByteArray();
    }

    private byte[] compressBrotli(byte[] input, int quality) {
        var params = new Encoder.Parameters().setQuality(quality);
        try {
            return Encoder.compress(input, params); // declares IOException
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    private byte[] compressZstd(byte[] input, int level) {
        return Zstd.compress(input, level);
    }
}
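Before trusting any timing, it is worth confirming that each compressor's output actually decompresses back to the input. The following is a minimal sketch of such a check for the gzip path, using only JDK classes; the class name GzipRoundTrip and the sample payload are hypothetical and not part of the benchmark harness.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

final class GzipRoundTrip {

    /** Gzip-compresses input at the given level (0-9) via the stream's own Deflater. */
    static byte[] gzip(byte[] input, int level) {
        var baos = new ByteArrayOutputStream(input.length / 4);
        // 'def' is the protected Deflater field inherited from DeflaterOutputStream.
        try (var gos = new GZIPOutputStream(baos) {{
            def.setLevel(level);
        }}) {
            gos.write(input);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return baos.toByteArray();
    }

    /** Decompresses a gzip stream fully into memory. */
    static byte[] gunzip(byte[] compressed) {
        try (var gis = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            return gis.readAllBytes();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical payload: a small JSON object repeated to mimic repetitive keys.
        byte[] payload = "{\"theme\":\"dark\",\"language\":\"en\"}".repeat(64)
                .getBytes(StandardCharsets.UTF_8);
        for (int level : new int[] {1, 6, 9}) {
            byte[] compressed = gzip(payload, level);
            boolean ok = Arrays.equals(payload, gunzip(compressed));
            System.out.printf("level %d: %d -> %d bytes, round-trip ok=%b%n",
                    level, payload.length, compressed.length, ok);
        }
    }
}
```

The same pattern applies to the Brotli and Zstandard helpers, with their own decoders in place of GZIPInputStream.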
Results: Small Payload (2KB User Preferences)
Algorithm   Level   Compressed size   Ratio   Compress time   Decompress time
Gzip        1       812 B             2.7x    28 us           12 us
Gzip        6       645 B             3.4x    52 us           12 us
Gzip        9       638 B             3.4x    105 us          12 us
Brotli      1       710 B             3.1x    32 us           9 us
Brotli      4       520 B             4.2x    78 us           9 us
Brotli      6       485 B             4.5x    165 us          9 us
Brotli      11      410 B             5.3x    2,800 us        9 us
Zstandard   1       680 B             3.2x    15 us           7 us
Zstandard   3       545 B             4.0x    22 us           7 us
Zstandard   9       488 B             4.5x    95 us           7 us
Zstandard   19      425 B             5.1x    1,200 us        7 us
For small payloads, Zstandard offers the best trade-off. At level 3, it achieves 4.0x compression in just 22us, while Gzip at level 6 takes 52us for only 3.4x. A likely explanation is that Zstandard stays efficient on short inputs, while Gzip’s DEFLATE algorithm needs more input before the cost of describing its Huffman tables is amortized.
Key insight: for the 2KB user preferences endpoint called on every page load, Zstandard-3 saves 1,635 bytes in 22us. Gzip-6 saves 1,535 bytes in 52us. Zstandard delivers marginally better compression at less than half the CPU cost.
Results: Medium Payload (10KB Category List)
Algorithm   Level   Compressed size   Ratio   Compress time   Decompress time
Gzip        1       3,210 B           3.2x    48 us           18 us
Gzip        6       2,450 B           4.2x    98 us           18 us
Gzip        9       2,380 B           4.3x    210 us          18 us
Brotli      1       2,820 B           3.6x    55 us           14 us
Brotli      4       1,880 B           5.4x    135 us          14 us
Brotli      6       1,720 B           6.0x    320 us          14 us
Brotli      11      1,480 B           6.9x    4,500 us        14 us
Zstandard   1       2,680 B           3.8x    28 us           11 us
Zstandard   3       2,020 B           5.1x    45 us           11 us
Zstandard   9       1,750 B           5.9x    185 us          11 us
Zstandard   19      1,520 B           6.7x    2,400 us        11 us
At 10KB, Brotli’s pre-built dictionary starts paying off. Brotli-4 achieves 5.4x vs Zstandard-3’s 5.1x. But Zstandard-3 is still 3x faster (45us vs 135us). The crossover point depends on whether CPU or bandwidth is the bottleneck.
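The CPU-versus-bandwidth crossover can be estimated with a deliberately simple model: total cost is compression time plus the time to push the compressed bytes over the link. The sketch below is an illustration under that assumption only (it ignores decompression time, round-trip latency, and packet boundaries). The class and method names are hypothetical, and the inputs are the 10KB figures from the table above.

```java
final class Crossover {

    /**
     * Link speed in Mbit/s below which the smaller-but-slower option has the lower
     * compress-plus-transfer time. Bits per microsecond is numerically equal to Mbit/s.
     */
    static double breakEvenMbps(int fastBytes, double fastMicros,
                                int smallBytes, double slowMicros) {
        double bitsSaved = (fastBytes - smallBytes) * 8.0;
        double extraMicros = slowMicros - fastMicros;
        return bitsSaved / extraMicros;
    }

    /** Compression time plus serialization time on a link of the given speed. */
    static double totalMicros(int compressedBytes, double compressMicros, double linkMbps) {
        return compressMicros + compressedBytes * 8.0 / linkMbps;
    }

    public static void main(String[] args) {
        // Zstandard-3: 2,020 B in 45us. Brotli-4: 1,880 B in 135us.
        System.out.printf("break-even: %.1f Mbit/s%n", breakEvenMbps(2020, 45, 1880, 135));
        for (double mbps : new double[] {5, 100}) {
            System.out.printf("%.0f Mbit/s: zstd-3 %.0fus, brotli-4 %.0fus%n",
                    mbps, totalMicros(2020, 45, mbps), totalMicros(1880, 135, mbps));
        }
    }
}
```

On these numbers the break-even sits at about 12.4 Mbit/s: on slower links the 140 bytes Brotli-4 saves are worth its extra 90us, and on faster links Zstandard-3 finishes sooner.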
Results: Large Payload (37KB Article List)
Algorithm   Level   Compressed size   Ratio   Compress time   Decompress time
Gzip        1       8,917 B           4.2x    82 us           34 us
Gzip        6       6,820 B           5.5x    185 us          35 us
Gzip        9       6,680 B           5.6x    420 us          35 us
Brotli      1       7,802 B           4.8x    95 us           28 us
Brotli      4       5,240 B           7.1x    230 us          29 us
Brotli      6       4,800 B           7.8x    580 us          29 us
Brotli      11      4,100 B           9.1x    8,200 us        28 us
Zstandard   1       7,343 B           5.1x    42 us           18 us
Zstandard   3       5,480 B           6.8x    68 us           18 us
Zstandard   9       4,740 B           7.9x    310 us          19 us
Zstandard   19      4,350 B           8.6x    4,100 us        18 us
At 37KB, the patterns are clear. Brotli’s dictionary gives it an edge on text/JSON content (9.1x at level 11). But for dynamic content where we cannot afford 8ms per response, the practical choices are:
- Budget < 100us: Zstandard-3 (6.8x, 68us)
- Budget < 250us: Brotli-4 (7.1x, 230us)
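- Budget < 600us: Zstandard-9 (7.9x, 310us), which beats Brotli-6 (7.8x, 580us) on both size and time; Brotli-6 remains the fallback for clients that accept Brotli but not Zstandard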
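The budget rule above amounts to "pick the highest ratio whose compress time fits". A minimal sketch of that selection follows, with the 37KB table transcribed as data; the class, record, and method names are hypothetical, and records require Java 16 or later.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

final class BudgetSelector {

    /** One row of a results table: name, compression ratio, compress time in microseconds. */
    record Option(String name, double ratio, int compressMicros) {}

    // Figures copied from the 37KB article-list table.
    static final List<Option> ARTICLE_LIST_37KB = List.of(
            new Option("Gzip-1", 4.2, 82),
            new Option("Gzip-6", 5.5, 185),
            new Option("Gzip-9", 5.6, 420),
            new Option("Brotli-1", 4.8, 95),
            new Option("Brotli-4", 7.1, 230),
            new Option("Brotli-6", 7.8, 580),
            new Option("Brotli-11", 9.1, 8200),
            new Option("Zstandard-1", 5.1, 42),
            new Option("Zstandard-3", 6.8, 68),
            new Option("Zstandard-9", 7.9, 310),
            new Option("Zstandard-19", 8.6, 4100));

    /** Highest-ratio option whose compress time fits the budget, if any. */
    static Optional<Option> bestWithin(List<Option> options, int budgetMicros) {
        return options.stream()
                .filter(o -> o.compressMicros() <= budgetMicros)
                .max(Comparator.comparingDouble(Option::ratio));
    }

    public static void main(String[] args) {
        for (int budget : new int[] {100, 250, 600}) {
            System.out.println(budget + "us -> " + bestWithin(ARTICLE_LIST_37KB, budget));
        }
    }
}
```

A rule that looks only at ratio and CPU time selects Zstandard-9 at the 600us budget, since the table shows it both smaller and faster than Brotli-6. Restricting the option list to what the client accepts (for example, Brotli and Gzip only) changes the answer.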
Results: Very Large Payload (148KB Full Article)
Algorithm   Level   Compressed size   Ratio   Compress time   Decompress time
Gzip        1       32,400 B          4.6x    280 us          95 us
Gzip        6       24,100 B          6.2x    620 us          96 us
Gzip        9       23,200 B          6.4x    1,450 us        96 us
Brotli      1       28,500 B          5.2x    320 us          78 us
Brotli      4       18,200 B          8.2x    780 us          80 us
Brotli      6       16,400 B          9.1x    1,950 us        79 us
Brotli      11      13,800 B          10.8x   28,000 us       78 us
Zstandard   1       26,800 B          5.5x    145 us          52 us
Zstandard   3       19,800 B          7.5x    235 us          52 us
Zstandard   9       16,900 B          8.8x    1,050 us        54 us
Zstandard   19      14,500 B          10.2x   14,200 us       53 us
For the full article HTML (natural language), Brotli-11 achieves 10.8x compression. This is the use case for pre-compression: compress once at build/publish time, serve the pre-compressed file from CDN. The 28ms CPU cost is paid once per article publish, not once per request.
Diminishing Returns Analysis
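For each algorithm, compute the ratio improvement per additional microsecond of CPU between adjacent levels, using the 37KB article list figures. The percentage in parentheses is the drop in efficiency relative to that algorithm's first transition: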
Gzip:
Level 1->6: +1.3x ratio for +103us = 0.0126 ratio/us
Level 6->9: +0.1x ratio for +235us = 0.0004 ratio/us (97% diminishing)
Verdict: Level 6 is the sweet spot. Level 9 wastes CPU.
Brotli:
Level 1->4: +2.3x ratio for +135us = 0.0170 ratio/us
Level 4->6: +0.7x ratio for +350us = 0.0020 ratio/us (88% diminishing)
Level 6->11: +1.3x ratio for +7620us = 0.0002 ratio/us (99% diminishing)
Verdict: Level 4 for dynamic, level 11 for static pre-compressed.
Zstandard:
Level 1->3: +1.7x ratio for +26us = 0.0654 ratio/us (best efficiency)
Level 3->9: +1.1x ratio for +242us = 0.0045 ratio/us (93% diminishing)
Level 9->19: +0.7x ratio for +3790us = 0.0002 ratio/us (99% diminishing)
Verdict: Level 3 for dynamic. Level 9 if CPU is abundant. Level 19 for static.
The analysis confirms: Zstandard level 1->3 has the highest efficiency of any transition (0.0654 ratio per microsecond). This is why Zstandard-3 is the optimal choice for high-throughput dynamic content.
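The efficiency figures are simple differences taken from the 37KB table: ratio gained divided by extra microseconds spent. A short sketch of the arithmetic, with a hypothetical class and method name:

```java
final class MarginalEfficiency {

    /** Ratio gained per extra microsecond when moving from one level to the next. */
    static double ratioPerMicro(double ratioFrom, double microsFrom,
                                double ratioTo, double microsTo) {
        return (ratioTo - ratioFrom) / (microsTo - microsFrom);
    }

    /** Percentage drop in efficiency of a later transition relative to the first one. */
    static double dropPercent(double firstEfficiency, double laterEfficiency) {
        return (1.0 - laterEfficiency / firstEfficiency) * 100.0;
    }

    public static void main(String[] args) {
        // Inputs are (ratio, compress time in us) pairs from the 37KB article list table.
        double gzip16 = ratioPerMicro(4.2, 82, 5.5, 185);
        double gzip69 = ratioPerMicro(5.5, 185, 5.6, 420);
        double zstd13 = ratioPerMicro(5.1, 42, 6.8, 68);
        double zstd39 = ratioPerMicro(6.8, 68, 7.9, 310);
        System.out.printf("Gzip 1->6: %.4f ratio/us%n", gzip16);
        System.out.printf("Gzip 6->9: %.4f ratio/us (%.0f%% lower)%n",
                gzip69, dropPercent(gzip16, gzip69));
        System.out.printf("Zstd 1->3: %.4f ratio/us%n", zstd13);
        System.out.printf("Zstd 3->9: %.4f ratio/us (%.0f%% lower)%n",
                zstd39, dropPercent(zstd13, zstd39));
    }
}
```

Because the inputs are ratios rounded to one decimal place, the last digit of each efficiency figure should be read as approximate.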
Dictionary Compression
The content platform’s API responses share significant structure: same JSON keys, same category names, same URL prefixes. Zstandard supports training a dictionary from sample data:
// Train a dictionary from 1000 sample API responses
// Dictionary captures repeated patterns across responses
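import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import com.github.luben.zstd.Zstd;
import com.github.luben.zstd.ZstdDictCompress;
import com.github.luben.zstd.ZstdDictDecompress;

public class DictionaryCompression {
    private final ZstdDictCompress compressDict;
    private final ZstdDictDecompress decompressDict;

    public DictionaryCompression(Path dictionaryPath) throws IOException {
        byte[] dictBytes = Files.readAllBytes(dictionaryPath);
        // Digest the dictionary once; level 3 is fixed at construction time
        this.compressDict = new ZstdDictCompress(dictBytes, 3);
        this.decompressDict = new ZstdDictDecompress(dictBytes);
    }

    public byte[] compress(byte[] input) {
        return Zstd.compress(input, compressDict);
    }

    // The caller must know the original size (e.g. sent alongside the payload)
    public byte[] decompress(byte[] compressed, int originalSize) {
        byte[] output = new byte[originalSize];
        Zstd.decompress(output, compressed, decompressDict);
        return output;
    }
}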
// Training the dictionary (offline, at deploy time):
// zstd --train samples/*.json -o api-response.dict --maxdict=32768
// Results with dictionary (37KB article list):
// Without dict, Zstd-3: 5,480 bytes (6.8x) 68us
// With dict, Zstd-3: 3,920 bytes (9.6x) 55us
//
// Dictionary gives a 41% higher ratio (28% fewer bytes) AND 19% faster compression
// (dictionary pre-computes frequent patterns, less work at runtime)
Dictionary compression is particularly effective for small payloads where the algorithm lacks sufficient context:
2KB user preferences:
Zstd-3 without dict: 545 bytes (4.0x) 22us
Zstd-3 with dict: 310 bytes (7.0x) 14us
Improvement: 75% higher ratio (43% fewer bytes), 36% faster
Why: The 2KB payload has keys like "theme", "language", "notifications"
that appear in every response. The dictionary pre-encodes these,
so the compressor outputs only the unique values.
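The effect can be reproduced without any third-party library. The sketch below swaps in a different technique, the DEFLATE preset dictionary exposed by the JDK's java.util.zip.Deflater, rather than a trained Zstandard dictionary; it illustrates the same principle, not the platform's implementation. The class name, the sample payload, and the hand-written dictionary are all hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

final class PresetDictionaryDemo {

    /** Deflates input, optionally priming the compressor with a shared dictionary. */
    static byte[] deflate(byte[] input, byte[] dictionary) {
        Deflater deflater = new Deflater(Deflater.DEFAULT_COMPRESSION);
        try {
            if (dictionary != null) deflater.setDictionary(dictionary);
            deflater.setInput(input);
            deflater.finish();
            var out = new ByteArrayOutputStream();
            byte[] buffer = new byte[1024];
            while (!deflater.finished()) {
                out.write(buffer, 0, deflater.deflate(buffer));
            }
            return out.toByteArray();
        } finally {
            deflater.end(); // release native memory
        }
    }

    /** Inflates data, supplying the dictionary when the stream asks for one. */
    static byte[] inflate(byte[] compressed, byte[] dictionary) {
        Inflater inflater = new Inflater();
        try {
            inflater.setInput(compressed);
            var out = new ByteArrayOutputStream();
            byte[] buffer = new byte[1024];
            while (!inflater.finished()) {
                int n = inflater.inflate(buffer);
                if (n == 0) {
                    if (inflater.needsDictionary() && dictionary != null) {
                        inflater.setDictionary(dictionary);
                    } else if (inflater.needsDictionary() || inflater.needsInput()) {
                        throw new IllegalArgumentException("missing dictionary or truncated stream");
                    }
                }
                out.write(buffer, 0, n);
            }
            return out.toByteArray();
        } catch (DataFormatException e) {
            throw new IllegalArgumentException(e);
        } finally {
            inflater.end();
        }
    }

    public static void main(String[] args) {
        // Both sides must hold the same dictionary bytes.
        byte[] dict = "{\"theme\":\"light\",\"language\":\"de\",\"notifications\":false}"
                .getBytes(StandardCharsets.UTF_8);
        byte[] msg = "{\"theme\":\"dark\",\"language\":\"en\",\"notifications\":true}"
                .getBytes(StandardCharsets.UTF_8);
        System.out.println("original:     " + msg.length + " bytes");
        System.out.println("without dict: " + deflate(msg, null).length + " bytes");
        System.out.println("with dict:    " + deflate(msg, dict).length + " bytes");
    }
}
```

With the dictionary in place, the repeated keys become back-references into the shared bytes, so only the values are emitted as literals. A trained Zstandard dictionary automates the choice of those shared bytes.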
Trade-off: both client and server must share the same dictionary. For browser clients, this requires a custom decompression library (not standard Content-Encoding). For internal gRPC services, dictionary compression is straightforward:
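// gRPC interceptor with dictionary compression for internal services
// Assumes grpc-java (io.grpc.*) and zstd-jni (ZstdDictCompress)
public class ZstdDictInterceptor implements ServerInterceptor {
    // Assumption: support is advertised in gRPC's standard grpc-accept-encoding header
    private static final Metadata.Key<String> ACCEPT_ENCODING_KEY =
        Metadata.Key.of("grpc-accept-encoding", Metadata.ASCII_STRING_MARSHALLER);

    private final ZstdDictCompress compressDict;

    public ZstdDictInterceptor(ZstdDictCompress compressDict) {
        this.compressDict = compressDict;
    }

    @Override
    public <ReqT, RespT> ServerCall.Listener<ReqT> interceptCall(
            ServerCall<ReqT, RespT> call,
            Metadata headers,
            ServerCallHandler<ReqT, RespT> next) {
        // Check if client supports dictionary compression
        String encoding = headers.get(ACCEPT_ENCODING_KEY);
        if (encoding != null && encoding.contains("zstd-dict")) {
            // CompressedServerCall: application-defined ServerCall wrapper (not shown)
            return next.startCall(new CompressedServerCall<>(call, compressDict), headers);
        }
        return next.startCall(call, headers);
    }
}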
Spring Boot Configuration
Production configuration for the content platform:
// application.yml:
// server:
//   compression:
//     enabled: true
//     min-response-size: 1024  # Don't compress < 1KB (overhead > savings)
//     mime-types:
//       - application/json
//       - text/html
//       - text/css
//       - application/javascript
//       - image/svg+xml
// Custom compression with algorithm selection:
import org.springframework.http.server.reactive.ServerHttpRequest;
import org.springframework.stereotype.Component;
import org.springframework.web.server.ServerWebExchange;
import org.springframework.web.server.WebFilter;
import org.springframework.web.server.WebFilterChain;
import reactor.core.publisher.Mono;

@Component
public class AdaptiveCompressionFilter implements WebFilter {
    private static final int MIN_SIZE = 1024;

    @Override
    public Mono<Void> filter(ServerWebExchange exchange, WebFilterChain chain) {
        // CompressingResponseDecorator: application-defined response decorator (not shown)
        return chain.filter(exchange.mutate()
            .response(new CompressingResponseDecorator(
                exchange.getResponse(),
                selectAlgorithm(exchange.getRequest())))
            .build());
    }

    private CompressionAlgorithm selectAlgorithm(ServerHttpRequest request) {
        String accept = request.getHeaders().getFirst("Accept-Encoding");
        if (accept == null) return CompressionAlgorithm.NONE;
        // Prefer Zstandard for internal services (custom header signals internal)
        if (request.getHeaders().containsKey("X-Internal-Service")) {
            if (accept.contains("zstd")) return CompressionAlgorithm.ZSTD_3;
        }
        // For browsers: Brotli > Gzip
        if (accept.contains("br")) return CompressionAlgorithm.BROTLI_4;
        if (accept.contains("gzip")) return CompressionAlgorithm.GZIP_6;
        return CompressionAlgorithm.NONE;
    }
}

enum CompressionAlgorithm {
    NONE, GZIP_6, BROTLI_4, ZSTD_3
}
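Substring checks on Accept-Encoding are convenient but loose: "gzip;q=0" explicitly refuses gzip yet still contains "gzip", and "zstd-dict" contains "zstd". A stricter variant tokenizes the header and honors q-values. The sketch below is one way to do that in plain Java; the class name AcceptEncoding and its nested enum are hypothetical, it keeps the same preference order as the filter, and it deliberately ignores the "*" wildcard.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

final class AcceptEncoding {

    enum Algorithm { NONE, GZIP_6, BROTLI_4, ZSTD_3 }

    /** Parses an Accept-Encoding header into coding -> q-value (q defaults to 1.0). */
    static Map<String, Double> parse(String header) {
        Map<String, Double> codings = new LinkedHashMap<>();
        if (header == null) return codings;
        for (String part : header.split(",")) {
            String[] pieces = part.trim().split(";");
            String coding = pieces[0].trim().toLowerCase(Locale.ROOT);
            if (coding.isEmpty()) continue;
            double q = 1.0;
            for (int i = 1; i < pieces.length; i++) {
                String param = pieces[i].trim().toLowerCase(Locale.ROOT);
                if (param.startsWith("q=")) {
                    try {
                        q = Double.parseDouble(param.substring(2).trim());
                    } catch (NumberFormatException e) {
                        q = 0.0; // treat a malformed q-value as "not acceptable"
                    }
                }
            }
            codings.put(coding, q);
        }
        return codings;
    }

    /** True when the coding is listed as an exact token with a q-value above zero. */
    static boolean accepts(Map<String, Double> codings, String coding) {
        Double q = codings.get(coding);
        return q != null && q > 0.0;
    }

    /** Same preference order as the filter: zstd for internal callers, then br, then gzip. */
    static Algorithm select(String header, boolean internalService) {
        Map<String, Double> codings = parse(header);
        if (internalService && accepts(codings, "zstd")) return Algorithm.ZSTD_3;
        if (accepts(codings, "br")) return Algorithm.BROTLI_4;
        if (accepts(codings, "gzip")) return Algorithm.GZIP_6;
        return Algorithm.NONE;
    }

    public static void main(String[] args) {
        System.out.println(select("gzip, deflate, br", false));
        System.out.println(select("gzip;q=1.0, br;q=0", false));
        System.out.println(select("zstd, br", true));
    }
}
```

Whichever selection logic is used, responses that vary by encoding should also carry "Vary: Accept-Encoding" so that shared caches do not serve a Brotli body to a gzip-only client.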
CDN Pre-Compression for Static Assets
Static assets (CSS, JS, SVG) compress once and serve millions of times. Use maximum compression:
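// Build script: pre-compress all static assets
// Runs as part of CI/CD pipeline after asset bundling
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.GZIPOutputStream;
// Assumed Brotli binding: Brotli4j
import com.aayushatharva.brotli4j.Brotli4jLoader;
import com.aayushatharva.brotli4j.encoder.Encoder;

public class StaticAssetCompressor {
    public static void compressStaticAssets(Path assetsDir) throws IOException {
        Brotli4jLoader.ensureAvailability(); // load the native Brotli library once
        try (var files = Files.walk(assetsDir)) {
            files.filter(Files::isRegularFile)
                 .filter(StaticAssetCompressor::isCompressible)
                 .forEach(StaticAssetCompressor::compressFile);
        }
    }

    private static void compressFile(Path file) {
        try {
            byte[] original = Files.readAllBytes(file);
            if (original.length < 1024) return; // Skip tiny files
            // Brotli level 11 (maximum compression, slow but one-time cost)
            byte[] brotli = Encoder.compress(original,
                new Encoder.Parameters().setQuality(11));
            Files.write(Path.of(file + ".br"), brotli);
            // Gzip level 9 (fallback for old clients)
            byte[] gzip = compressGzip(original, 9);
            Files.write(Path.of(file + ".gz"), gzip);
            double brotliRatio = (double) original.length / brotli.length;
            System.out.printf("%s: %d -> %d bytes (%.1fx Brotli)%n",
                file.getFileName(), original.length, brotli.length, brotliRatio);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Local gzip helper so this class compiles on its own
    private static byte[] compressGzip(byte[] input, int level) throws IOException {
        var baos = new ByteArrayOutputStream(input.length / 4);
        try (var gos = new GZIPOutputStream(baos) {{
            def.setLevel(level); // adjust the stream's own Deflater
        }}) {
            gos.write(input);
        }
        return baos.toByteArray();
    }

    private static boolean isCompressible(Path file) {
        String name = file.toString().toLowerCase();
        return name.endsWith(".js") || name.endsWith(".css")
            || name.endsWith(".html") || name.endsWith(".svg")
            || name.endsWith(".json") || name.endsWith(".xml");
    }
}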
// Nginx configuration to serve pre-compressed files:
// location /assets/ {
//     gzip_static on;    # Serve .gz files if Accept-Encoding: gzip
//     brotli_static on;  # Serve .br files if Accept-Encoding: br (requires the ngx_brotli module)
//     expires 1y;        # Static assets are versioned, cache forever
//     add_header Cache-Control "public, immutable";
// }
Compression Level Decision Matrix
Payload Type           Algorithm   Level    Rationale
Dynamic API (<5KB)     Zstandard   3        Best speed/ratio at small sizes
Dynamic API (5-50KB)   Brotli      4        Dictionary advantage outweighs speed gap
Dynamic API (>50KB)    Brotli      4        Ratio matters more for large payloads
Static assets          Brotli      11       Compress once, serve forever
Internal gRPC          Zstandard   3+dict   Dictionary unlocks 9x+ on structured data
WebSocket frames       Zstandard   1        Per-message compression, speed critical
SSE events             None        -        Too small (<200 bytes), overhead > savings
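At the content platform’s scale (50,000 requests/sec), the compression choice determines whether the servers spend 3.4 CPU-seconds per second (Zstandard-3 at 68us per response) or 9.25 CPU-seconds per second (Gzip-6 at 185us) on compression, taking the 37KB article list as the representative response. That difference of 5.85 CPU-seconds per second is almost six cores' worth of work returned to application logic, or roughly one core for every 8,500 requests/sec.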
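The capacity figure is a single multiplication: requests per second times CPU time per request gives CPU-seconds consumed per wall-clock second, which is the number of cores kept fully busy. A short sketch, assuming every response costs the 37KB article list's compression time (the class and method names are hypothetical):

```java
final class CompressionCapacity {

    /** CPU-seconds consumed per wall-clock second, i.e. the number of fully busy cores. */
    static double coresBusy(double requestsPerSecond, double microsPerRequest) {
        return requestsPerSecond * microsPerRequest / 1_000_000.0;
    }

    public static void main(String[] args) {
        double rps = 50_000;
        double zstd3 = coresBusy(rps, 68);   // Zstandard-3 on the 37KB article list
        double gzip6 = coresBusy(rps, 185);  // Gzip-6 on the same payload
        System.out.printf("Zstandard-3: %.2f cores%n", zstd3);
        System.out.printf("Gzip-6:      %.2f cores%n", gzip6);
        System.out.printf("Difference:  %.2f cores%n", gzip6 - zstd3);
    }
}
```

A real traffic mix would weight each endpoint's compression time by its share of requests, so these figures are an upper-level estimate rather than a measurement.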