Compression Algorithms: Gzip, Brotli, Zstandard Compared

The main chapter showed the headline numbers: Brotli-4 achieves 7.1x compression on the article list at 230us CPU cost. This section digs deeper: how do these algorithms perform across different payload sizes? Where are the diminishing returns? How does dictionary compression change the equation for small, repetitive payloads?

Test Methodology

All benchmarks run on the content platform’s production server hardware (AMD EPYC 7763, single core isolated via taskset). Each measurement is the median of 10,000 iterations after 1,000 warmup iterations. Payloads are real content platform responses:

// Payload corpus from content platform:
// 1. User preferences:    2,180 bytes (small JSON, repetitive keys)
// 2. Category list:      10,240 bytes (medium JSON, moderate entropy)
// 3. Article list (50):  37,450 bytes (large JSON, mixed content)
// 4. Full article HTML: 148,600 bytes (very large, natural language)

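import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.TimeUnit;
import java.util.zip.*;

import org.openjdk.jmh.annotations.*;

import com.github.luben.zstd.Zstd;                  // zstd-jni
import com.aayushatharva.brotli4j.encoder.Encoder;  // assumption: brotli4j; the Brotli binding is not named in this section

@BenchmarkMode(Mode.AverageTime)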
@OutputTimeUnit(TimeUnit.MICROSECONDS)
@State(Scope.Benchmark)
@Warmup(iterations = 5, time = 2)
@Measurement(iterations = 10, time = 2)
@Fork(2)
public class CompressionBenchmark {

    private byte[] userPrefs;      // 2.1KB
    private byte[] categoryList;   // 10KB
    private byte[] articleList;    // 37KB
    private byte[] fullArticle;    // 148KB

    @Setup
    public void setup() throws Exception {
        userPrefs = Files.readAllBytes(Path.of("test-data/user-prefs.json"));
        categoryList = Files.readAllBytes(Path.of("test-data/categories.json"));
        articleList = Files.readAllBytes(Path.of("test-data/article-list.json"));
        fullArticle = Files.readAllBytes(Path.of("test-data/full-article.html"));
    }

    @Benchmark
    public byte[] gzip6_articleList() {
        return compressGzip(articleList, 6);
    }

    @Benchmark
    public byte[] brotli4_articleList() {
        return compressBrotli(articleList, 4);
    }

    @Benchmark
    public byte[] zstd3_articleList() {
        return compressZstd(articleList, 3);
    }

    // ... all combinations benchmarked

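    private byte[] compressGzip(byte[] input, int level) {
        var baos = new ByteArrayOutputStream(input.length / 4);
        // GZIPOutputStream has no level parameter. Adjust the protected Deflater it
        // creates (raw DEFLATE, nowrap) instead of swapping in a new Deflater(level),
        // which would emit a zlib wrapper inside the gzip stream and leak the original.
        try (var gos = new GZIPOutputStream(baos) {{
            def.setLevel(level);
        }}) {
            gos.write(input);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return baos.toByteArray();
    }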

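    private byte[] compressBrotli(byte[] input, int quality) {
        var params = new Encoder.Parameters().setQuality(quality);
        try {
            // Encoder.compress declares IOException, so wrap it for the benchmark method
            return Encoder.compress(input, params);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }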

    private byte[] compressZstd(byte[] input, int level) {
        return Zstd.compress(input, level);
    }
}
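
The methodology describes a median over 10,000 timed iterations after 1,000 warmup iterations, while the JMH class above is configured for average time. As a rough illustration of the median procedure only, here is a minimal plain-JDK sketch. The class name MedianTimer, its helpers, and the synthetic payload are hypothetical, and it times default-level gzip rather than the full algorithm matrix.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.function.Supplier;
import java.util.zip.GZIPOutputStream;

final class MedianTimer {

    // Written after every run so the JIT cannot discard the compression work.
    static volatile long sink;

    /** Median of the samples (the upper middle value for even counts). */
    static long median(long[] samples) {
        long[] sorted = samples.clone();
        Arrays.sort(sorted);
        return sorted[sorted.length / 2];
    }

    /** Runs the warmup iterations untimed, then returns the median timed iteration in nanoseconds. */
    static long medianNanos(Supplier<byte[]> task, int warmup, int iterations) {
        long total = 0;
        for (int i = 0; i < warmup; i++) {
            total += task.get().length;
        }
        long[] samples = new long[iterations];
        for (int i = 0; i < iterations; i++) {
            long start = System.nanoTime();
            byte[] out = task.get();
            samples[i] = System.nanoTime() - start;
            total += out.length;
        }
        sink = total;
        return median(samples);
    }

    /** Gzip at the JDK default level; stands in for the real compressors. */
    static byte[] gzipDefault(byte[] input) {
        var baos = new ByteArrayOutputStream();
        try (var gos = new GZIPOutputStream(baos)) {
            gos.write(input);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return baos.toByteArray();
    }

    public static void main(String[] args) {
        // Synthetic stand-in for a small JSON response, not one of the real payloads.
        byte[] payload = "{\"theme\":\"dark\",\"language\":\"en\"}".repeat(64)
                .getBytes(StandardCharsets.UTF_8);
        long nanos = medianNanos(() -> gzipDefault(payload), 1_000, 10_000);
        System.out.printf("gzip (default level): median %.1f us%n", nanos / 1_000.0);
    }
}
```

A hand-rolled loop like this lacks JMH's fork isolation and dead-code protection, so treat its numbers as indicative at best.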

Results: Small Payload (2KB User Preferences)

Algorithm  Level  Compressed  Ratio  Compress  Decompress
Gzip        1      812 B     2.7x     28 us      12 us
Gzip        6      645 B     3.4x     52 us      12 us
Gzip        9      638 B     3.4x    105 us      12 us
Brotli      1      710 B     3.1x     32 us       9 us
Brotli      4      520 B     4.2x     78 us       9 us
Brotli      6      485 B     4.5x    165 us       9 us
Brotli     11      410 B     5.3x  2,800 us       9 us
Zstandard   1      680 B     3.2x     15 us       7 us
Zstandard   3      545 B     4.0x     22 us       7 us
Zstandard   9      488 B     4.5x     95 us       7 us
Zstandard  19      425 B     5.1x  1,200 us       7 us

For small payloads, Zstandard offers the best speed-to-ratio trade-off. At level 3, it achieves 4.0x compression in just 22us, while Gzip at level 6 takes 52us for only 3.4x. Brotli-4 compresses slightly better (4.2x) but needs 78us. The likely difference: Zstandard is tuned for fast compression of small-to-medium payloads, while Gzip’s DEFLATE algorithm needs more input to build an effective Huffman tree.

Key insight: for the 2KB user preferences endpoint called on every page load, Zstandard-3 saves 1,635 bytes in 22us. Gzip-6 saves 1,535 bytes in 52us. Zstandard delivers marginally better compression at less than half the CPU cost.

Results: Medium Payload (10KB Category List)

Algorithm  Level  Compressed  Ratio  Compress  Decompress
Gzip        1     3,210 B    3.2x     48 us      18 us
Gzip        6     2,450 B    4.2x     98 us      18 us
Gzip        9     2,380 B    4.3x    210 us      18 us
Brotli      1     2,820 B    3.6x     55 us      14 us
Brotli      4     1,880 B    5.4x    135 us      14 us
Brotli      6     1,720 B    6.0x    320 us      14 us
Brotli     11     1,480 B    6.9x  4,500 us      14 us
Zstandard   1     2,680 B    3.8x     28 us      11 us
Zstandard   3     2,020 B    5.1x     45 us      11 us
Zstandard   9     1,750 B    5.9x    185 us      11 us
Zstandard  19     1,520 B    6.7x  2,400 us      11 us

At 10KB, Brotli’s pre-built dictionary starts paying off. Brotli-4 achieves 5.4x vs Zstandard-3’s 5.1x. But Zstandard-3 is still 3x faster (45us vs 135us). The crossover point depends on whether CPU or bandwidth is the bottleneck.

Results: Large Payload (37KB Article List)

Algorithm  Level  Compressed  Ratio  Compress  Decompress
Gzip        1     8,917 B    4.2x     82 us      34 us
Gzip        6     6,820 B    5.5x    185 us      35 us
Gzip        9     6,680 B    5.6x    420 us      35 us
Brotli      1     7,802 B    4.8x     95 us      28 us
Brotli      4     5,240 B    7.1x    230 us      29 us
Brotli      6     4,800 B    7.8x    580 us      29 us
Brotli     11     4,100 B    9.1x  8,200 us      28 us
Zstandard   1     7,343 B    5.1x     42 us      18 us
Zstandard   3     5,480 B    6.8x     68 us      18 us
Zstandard   9     4,740 B    7.9x    310 us      19 us
Zstandard  19     4,350 B    8.6x  4,100 us      18 us

At 37KB, the patterns are clear. Brotli’s dictionary gives it an edge on text/JSON content (9.1x at level 11). But for dynamic content where we cannot afford 8ms per response, the practical choices are:

Budget < 100us: Zstandard-3 (6.8x, 68us)
Budget < 250us: Brotli-4 (7.1x, 230us)
Budget < 600us: Brotli-6 (7.8x, 580us), or Zstandard-9 (7.9x, 310us) where the client accepts zstd
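
The budget rule can be sketched as a lookup over the measured table: keep the options whose compress time fits under the budget, then take the highest ratio. The class and method names below are hypothetical, and the data is copied from the 37KB article list table. The selection looks only at ratio and CPU time, so for the 600us budget it returns Zstandard-9 (7.9x in 310us) rather than Brotli-6, since it knows nothing about which encodings a client accepts.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

final class BudgetSelector {

    /** One row of the results table: algorithm+level, ratio, compress time. */
    static final class Option {
        final String name;
        final double ratio;
        final int compressMicros;

        Option(String name, double ratio, int compressMicros) {
            this.name = name;
            this.ratio = ratio;
            this.compressMicros = compressMicros;
        }
    }

    // Values copied from the 37KB article list results.
    static final List<Option> ARTICLE_LIST_37KB = List.of(
        new Option("Gzip-1", 4.2, 82),
        new Option("Gzip-6", 5.5, 185),
        new Option("Gzip-9", 5.6, 420),
        new Option("Brotli-1", 4.8, 95),
        new Option("Brotli-4", 7.1, 230),
        new Option("Brotli-6", 7.8, 580),
        new Option("Brotli-11", 9.1, 8200),
        new Option("Zstandard-1", 5.1, 42),
        new Option("Zstandard-3", 6.8, 68),
        new Option("Zstandard-9", 7.9, 310),
        new Option("Zstandard-19", 8.6, 4100));

    /** Highest-ratio option whose compress time is strictly under the budget. */
    static Optional<Option> bestWithin(List<Option> options, int budgetMicros) {
        return options.stream()
            .filter(o -> o.compressMicros < budgetMicros)
            .max(Comparator.comparingDouble((Option o) -> o.ratio));
    }

    public static void main(String[] args) {
        for (int budget : new int[] {100, 250, 600}) {
            String pick = bestWithin(ARTICLE_LIST_37KB, budget)
                .map(o -> o.name + " (" + o.ratio + "x, " + o.compressMicros + "us)")
                .orElse("no option fits");
            System.out.println("Budget < " + budget + "us: " + pick);
        }
    }
}
```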

Results: Very Large Payload (148KB Full Article)

Algorithm  Level  Compressed  Ratio   Compress  Decompress
Gzip        1     32,400 B    4.6x    280 us      95 us
Gzip        6     24,100 B    6.2x    620 us      96 us
Gzip        9     23,200 B    6.4x  1,450 us      96 us
Brotli      1     28,500 B    5.2x    320 us      78 us
Brotli      4     18,200 B    8.2x    780 us      80 us
Brotli      6     16,400 B    9.1x  1,950 us      79 us
Brotli     11     13,800 B   10.8x 28,000 us      78 us
Zstandard   1     26,800 B    5.5x    145 us      52 us
Zstandard   3     19,800 B    7.5x    235 us      52 us
Zstandard   9     16,900 B    8.8x  1,050 us      54 us
Zstandard  19     14,500 B   10.2x 14,200 us      53 us

For the full article HTML (natural language), Brotli-11 achieves 10.8x compression. This is the use case for pre-compression: compress once at build/publish time, serve the pre-compressed file from CDN. The 28ms CPU cost is paid once per article publish, not once per request.

Diminishing Returns Analysis

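For each algorithm, compute the ratio gained per additional microsecond of CPU between levels, using the 37KB article list results. The percentage in parentheses is the drop in efficiency relative to that algorithm's first transition: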

Gzip:
  Level 1->6: +1.3x ratio for +103us  = 0.0126 ratio/us
  Level 6->9: +0.1x ratio for +235us  = 0.0004 ratio/us  (97% diminishing)
  Verdict: Level 6 is the sweet spot. Level 9 wastes CPU.

Brotli:
  Level 1->4: +2.3x ratio for +135us  = 0.0170 ratio/us
  Level 4->6: +0.7x ratio for +350us  = 0.0020 ratio/us  (88% diminishing)
  Level 6->11: +1.3x ratio for +7620us = 0.0002 ratio/us (99% diminishing)
  Verdict: Level 4 for dynamic, level 11 for static pre-compressed.

Zstandard:
  Level 1->3: +1.7x ratio for +26us   = 0.0654 ratio/us  (best efficiency)
  Level 3->9: +1.1x ratio for +242us  = 0.0045 ratio/us  (93% diminishing)
  Level 9->19: +0.7x ratio for +3790us = 0.0002 ratio/us (99% diminishing)
  Verdict: Level 3 for dynamic. Level 9 if CPU is abundant. Level 19 for static.

The analysis confirms: Zstandard level 1->3 has the highest efficiency of any transition (0.0654 ratio per microsecond). This is why Zstandard-3 is the optimal choice for high-throughput dynamic content.
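
The efficiency figures are simple differences over the 37KB article list table. A small sketch of the arithmetic follows; the class and method names are hypothetical, and the inputs are the table's ratio and compress-time columns.

```java
final class MarginalEfficiency {

    /** Ratio gained per extra microsecond when moving from one level to another. */
    static double ratioPerMicro(double ratioFrom, int microsFrom, double ratioTo, int microsTo) {
        return (ratioTo - ratioFrom) / (microsTo - microsFrom);
    }

    /** Percentage drop in efficiency of a later transition relative to the first one. */
    static long dropPercent(double firstEfficiency, double laterEfficiency) {
        return Math.round((1.0 - laterEfficiency / firstEfficiency) * 100.0);
    }

    public static void main(String[] args) {
        // Zstandard on the 37KB article list: level 1 -> 3 -> 9
        double zstd1to3 = ratioPerMicro(5.1, 42, 6.8, 68);
        double zstd3to9 = ratioPerMicro(6.8, 68, 7.9, 310);
        System.out.printf("Zstd 1->3: %.4f ratio/us%n", zstd1to3);
        System.out.printf("Zstd 3->9: %.4f ratio/us (%d%% lower)%n",
            zstd3to9, dropPercent(zstd1to3, zstd3to9));

        // Gzip on the same payload: level 1 -> 6 -> 9
        double gzip1to6 = ratioPerMicro(4.2, 82, 5.5, 185);
        double gzip6to9 = ratioPerMicro(5.5, 185, 5.6, 420);
        System.out.printf("Gzip 1->6: %.4f ratio/us%n", gzip1to6);
        System.out.printf("Gzip 6->9: %.4f ratio/us (%d%% lower)%n",
            gzip6to9, dropPercent(gzip1to6, gzip6to9));
    }
}
```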

Dictionary Compression

The content platform’s API responses share significant structure: same JSON keys, same category names, same URL prefixes. Zstandard supports training a dictionary from sample data:

// Train a dictionary from 1000 sample API responses
// Dictionary captures repeated patterns across responses

public class DictionaryCompression {

    private final ZstdDictCompress compressDict;
    private final ZstdDictDecompress decompressDict;

    public DictionaryCompression(Path dictionaryPath) throws IOException {
        byte[] dictBytes = Files.readAllBytes(dictionaryPath);
        this.compressDict = new ZstdDictCompress(dictBytes, 3);
        this.decompressDict = new ZstdDictDecompress(dictBytes);
    }

    public byte[] compress(byte[] input) {
        return Zstd.compress(input, compressDict);
    }

    public byte[] decompress(byte[] compressed, int originalSize) {
        byte[] output = new byte[originalSize];
        Zstd.decompress(output, compressed, decompressDict);
        return output;
    }
}

// Training the dictionary (offline, at deploy time):
// zstd --train samples/*.json -o api-response.dict --maxdict=32768

// Results with dictionary (37KB article list):
// Without dict, Zstd-3:  5,480 bytes (6.8x)  68us
// With dict, Zstd-3:     3,920 bytes (9.6x)  55us
//
// Dictionary provides 41% better compression AND 19% faster compression
// (dictionary pre-computes frequent patterns, less work at runtime)

Dictionary compression is particularly effective for small payloads where the algorithm lacks sufficient context:

2KB user preferences:
  Zstd-3 without dict:  545 bytes (4.0x)  22us
  Zstd-3 with dict:     310 bytes (7.0x)  14us
  Improvement: 75% better compression, 36% faster

Why: The 2KB payload has keys like "theme", "language", "notifications"
     that appear in every response. The dictionary pre-encodes these,
     so the compressor outputs only the unique values.
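
The same effect can be reproduced without any native library. The sketch below swaps in a different technique, DEFLATE's preset dictionary from java.util.zip, in place of a trained Zstandard dictionary, purely to illustrate why shared context helps small payloads. The class name, the dictionary contents, and the sample message are hypothetical. A real Zstandard dictionary is trained from samples rather than hand-written.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

final class PresetDictionaryDemo {

    /** Compresses with zlib/DEFLATE; a non-null dictionary is installed before any input. */
    static byte[] deflate(byte[] input, byte[] dictionary) {
        Deflater deflater = new Deflater(Deflater.DEFAULT_COMPRESSION);
        try {
            if (dictionary != null) {
                deflater.setDictionary(dictionary);
            }
            deflater.setInput(input);
            deflater.finish();
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] chunk = new byte[256];
            while (!deflater.finished()) {
                int n = deflater.deflate(chunk);
                out.write(chunk, 0, n);
            }
            return out.toByteArray();
        } finally {
            deflater.end();
        }
    }

    /** Decompresses, supplying the dictionary when the stream asks for one. */
    static byte[] inflate(byte[] compressed, byte[] dictionary) {
        Inflater inflater = new Inflater();
        try {
            inflater.setInput(compressed);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] chunk = new byte[256];
            while (!inflater.finished()) {
                int n = inflater.inflate(chunk);
                out.write(chunk, 0, n);
                if (n == 0 && !inflater.finished()) {
                    if (inflater.needsDictionary() && dictionary != null) {
                        inflater.setDictionary(dictionary);
                    } else {
                        throw new IllegalStateException("cannot make progress: missing dictionary or truncated input");
                    }
                }
            }
            return out.toByteArray();
        } catch (DataFormatException e) {
            throw new IllegalStateException(e);
        } finally {
            inflater.end();
        }
    }

    public static void main(String[] args) {
        // Hand-written dictionary holding the keys every response repeats.
        byte[] dict = "{\"theme\":\"light\",\"language\":\"fr\",\"notifications\":false}"
            .getBytes(StandardCharsets.UTF_8);
        byte[] message = "{\"theme\":\"dark\",\"language\":\"en\",\"notifications\":true}"
            .getBytes(StandardCharsets.UTF_8);

        byte[] plain = deflate(message, null);
        byte[] withDict = deflate(message, dict);
        System.out.printf("original %d, without dictionary %d, with dictionary %d bytes%n",
            message.length, plain.length, withDict.length);
        System.out.println(new String(inflate(withDict, dict), StandardCharsets.UTF_8));
    }
}
```

As with Zstandard, the receiver must hold the identical dictionary bytes; the inflater rejects a mismatched dictionary.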

Trade-off: both client and server must share the same dictionary. For browser clients, this requires a custom decompression library (not standard Content-Encoding). For internal gRPC services, dictionary compression is straightforward:

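import io.grpc.Metadata;
import io.grpc.ServerCall;
import io.grpc.ServerCallHandler;
import io.grpc.ServerInterceptor;
import com.github.luben.zstd.ZstdDictCompress;

// gRPC interceptor with dictionary compression for internal services.
// ACCEPT_ENCODING_KEY and CompressedServerCall are assumed to be defined
// elsewhere in the codebase; they are not shown in this section.
public class ZstdDictInterceptor implements ServerInterceptor {

    private final ZstdDictCompress compressDict;

    public ZstdDictInterceptor(ZstdDictCompress compressDict) {
        this.compressDict = compressDict;
    }

    @Override
    public <ReqT, RespT> ServerCall.Listener<ReqT> interceptCall(
            ServerCall<ReqT, RespT> call,
            Metadata headers,
            ServerCallHandler<ReqT, RespT> next) {

        // Check if client supports dictionary compression
        String encoding = headers.get(ACCEPT_ENCODING_KEY);
        if (encoding != null && encoding.contains("zstd-dict")) {
            return next.startCall(new CompressedServerCall<>(call, compressDict), headers);
        }
        return next.startCall(call, headers);
    }
}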

Spring Boot Configuration

Production configuration for the content platform:

// application.yml:
// server:
//   compression:
//     enabled: true
//     min-response-size: 1024  # Don't compress < 1KB (overhead > savings)
//     mime-types:
//       - application/json
//       - text/html
//       - text/css
//       - application/javascript
//       - image/svg+xml

// Custom compression with algorithm selection:
@Component
public class AdaptiveCompressionFilter implements WebFilter {

    private static final int MIN_SIZE = 1024;

    @Override
    public Mono<Void> filter(ServerWebExchange exchange, WebFilterChain chain) {
        return chain.filter(exchange.mutate()
            .response(new CompressingResponseDecorator(
                exchange.getResponse(),
                selectAlgorithm(exchange.getRequest())))
            .build());
    }

    private CompressionAlgorithm selectAlgorithm(ServerHttpRequest request) {
        String accept = request.getHeaders().getFirst("Accept-Encoding");
        if (accept == null) return CompressionAlgorithm.NONE;

        // Prefer Zstandard for internal services (custom header signals internal)
        if (request.getHeaders().containsKey("X-Internal-Service")) {
            if (accept.contains("zstd")) return CompressionAlgorithm.ZSTD_3;
        }

        // For browsers: Brotli > Gzip
        if (accept.contains("br")) return CompressionAlgorithm.BROTLI_4;
        if (accept.contains("gzip")) return CompressionAlgorithm.GZIP_6;

        return CompressionAlgorithm.NONE;
    }
}

enum CompressionAlgorithm {
    NONE, GZIP_6, BROTLI_4, ZSTD_3
}
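
The substring checks in selectAlgorithm ignore quality values, so a client sending "br;q=0" (an explicit refusal) would still get Brotli. A slightly stricter negotiation might look like the sketch below. The class, enum, and method names are hypothetical, and the server preference order (zstd for internal callers, then br, then gzip) is kept from the filter above. It treats any coding with q greater than 0 as acceptable and does not rank codings by q.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

final class AcceptEncodingNegotiator {

    enum Choice { NONE, GZIP_6, BROTLI_4, ZSTD_3 }

    /** Parses "gzip;q=0.8, br" into coding -> q (lower-cased). A malformed q counts as 0. */
    static Map<String, Double> parse(String header) {
        Map<String, Double> result = new LinkedHashMap<>();
        if (header == null) {
            return result;
        }
        for (String part : header.split(",")) {
            String[] pieces = part.trim().split(";");
            String coding = pieces[0].trim().toLowerCase(Locale.ROOT);
            if (coding.isEmpty()) {
                continue;
            }
            double q = 1.0; // a coding without a q parameter is fully acceptable
            for (int i = 1; i < pieces.length; i++) {
                String param = pieces[i].trim().toLowerCase(Locale.ROOT);
                if (param.startsWith("q=")) {
                    try {
                        q = Double.parseDouble(param.substring(2).trim());
                    } catch (NumberFormatException e) {
                        q = 0.0;
                    }
                }
            }
            result.put(coding, q);
        }
        return result;
    }

    /** Picks by server preference among codings the client has not refused. */
    static Choice select(String acceptEncoding, boolean internalCaller) {
        Map<String, Double> accepted = parse(acceptEncoding);
        if (internalCaller && accepted.getOrDefault("zstd", 0.0) > 0) {
            return Choice.ZSTD_3;
        }
        if (accepted.getOrDefault("br", 0.0) > 0) {
            return Choice.BROTLI_4;
        }
        if (accepted.getOrDefault("gzip", 0.0) > 0) {
            return Choice.GZIP_6;
        }
        return Choice.NONE;
    }

    public static void main(String[] args) {
        System.out.println(select("gzip, deflate, br", false));
        System.out.println(select("gzip;q=1.0, br;q=0", false));
        System.out.println(select("gzip, br, zstd", true));
    }
}
```

Whichever negotiation is used, responses that vary by encoding should also carry a Vary: Accept-Encoding header so shared caches keep the variants apart.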

CDN Pre-Compression for Static Assets

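Static assets (CSS, JS, HTML, SVG) compress once and serve millions of times. Use maximum compression for these text formats; WOFF2 fonts are already Brotli-compressed and gain nothing from a second pass: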

// Build script: pre-compress all static assets
// Runs as part of CI/CD pipeline after asset bundling

public class StaticAssetCompressor {

    public static void compressStaticAssets(Path assetsDir) throws IOException {
        try (var files = Files.walk(assetsDir)) {
            files.filter(Files::isRegularFile)
                .filter(StaticAssetCompressor::isCompressible)
                .forEach(StaticAssetCompressor::compressFile);
        }
    }

    private static void compressFile(Path file) {
        try {
            byte[] original = Files.readAllBytes(file);
            if (original.length < 1024) return;  // Skip tiny files

            // Brotli level 11 (maximum compression, slow but one-time cost)
            byte[] brotli = Encoder.compress(original,
                new Encoder.Parameters().setQuality(11));
            Files.write(Path.of(file + ".br"), brotli);

            // Gzip level 9 (fallback for old clients)
            byte[] gzip = compressGzip(original, 9);
            Files.write(Path.of(file + ".gz"), gzip);

            double brotliRatio = (double) original.length / brotli.length;
            System.out.printf("%s: %d -> %d bytes (%.1fx Brotli)%n",
                file.getFileName(), original.length, brotli.length, brotliRatio);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    private static boolean isCompressible(Path file) {
        String name = file.toString().toLowerCase();
        return name.endsWith(".js") || name.endsWith(".css")
            || name.endsWith(".html") || name.endsWith(".svg")
            || name.endsWith(".json") || name.endsWith(".xml");
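    }

    // Gzip helper for the fallback files. GZIPOutputStream has no level
    // parameter, so adjust the protected Deflater it creates.
    // Needs java.io.ByteArrayOutputStream and java.util.zip.GZIPOutputStream.
    private static byte[] compressGzip(byte[] input, int level) throws IOException {
        var baos = new ByteArrayOutputStream(input.length / 4);
        try (var gos = new GZIPOutputStream(baos) {{
            def.setLevel(level);
        }}) {
            gos.write(input);
        }
        return baos.toByteArray();
    }
}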

// Nginx configuration to serve pre-compressed files:
// location /assets/ {
//     gzip_static on;       # Serve .gz files if Accept-Encoding: gzip
//     brotli_static on;     # Serve .br files if Accept-Encoding: br
//     expires 1y;           # Static assets are versioned, cache forever
//     add_header Cache-Control "public, immutable";
// }

Compression Level Decision Matrix

Payload Type          Algorithm   Level   Rationale
Dynamic API (<5KB)    Zstandard   3       Best speed/ratio at small sizes
Dynamic API (5-50KB)  Brotli      4       Dictionary advantage outweighs speed gap
Dynamic API (>50KB)   Brotli      4       Ratio matters more for large payloads
Static assets         Brotli      11      Compress once, serve forever
Internal gRPC         Zstandard   3+dict  Dictionary unlocks 9x+ on structured data
WebSocket frames      Zstandard   1       Per-message compression, speed critical
SSE events            None        -       Too small (<200 bytes), overhead > savings
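
The "overhead > savings" entry is easy to check for gzip, whose wrapper alone costs 18 bytes (a 10-byte header plus an 8-byte CRC and length trailer). A minimal sketch follows, with a hypothetical class name and a made-up SSE event as input:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

final class TinyPayloadOverhead {

    /** Size in bytes of the input after gzip at the JDK default level. */
    static int gzippedSize(byte[] input) {
        var baos = new ByteArrayOutputStream();
        try (var gos = new GZIPOutputStream(baos)) {
            gos.write(input);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return baos.size();
    }

    public static void main(String[] args) {
        // Made-up SSE event; real events on the platform may differ.
        byte[] event = "data: {\"articleId\":42,\"likes\":7}\n\n".getBytes(StandardCharsets.UTF_8);
        System.out.printf("event: %d bytes raw, %d bytes gzipped%n",
            event.length, gzippedSize(event));
    }
}
```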

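At the content platform’s scale (50,000 requests/sec), and taking the 37KB article list timings as the per-response cost, the compression choice determines whether the servers spend 3.4 CPU-seconds per second (Zstandard-3 at 68us) or 9.25 CPU-seconds per second (Gzip-6 at 185us) on compression. That is roughly 3.4 versus 9.25 fully busy cores. The 5.85-core difference, about one core for every 8,500 requests/sec, goes back to application logic.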
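
The capacity figures come from multiplying the request rate by the per-request compression time. The sketch below shows that arithmetic, assuming every response costs what the 37KB article list does (68us for Zstandard-3, 185us for Gzip-6). The class and method names are hypothetical. The product is in CPU-seconds per wall-clock second, which reads directly as the number of busy cores.

```java
final class CompressionCapacity {

    /** Cores kept busy by compression: requests/sec x microseconds/request / 1,000,000. */
    static double coresBusy(long requestsPerSecond, double microsPerRequest) {
        return requestsPerSecond * microsPerRequest / 1_000_000.0;
    }

    /** Request rate at which the faster option saves exactly one core. */
    static double requestsPerSecondPerCoreSaved(double slowerMicros, double fasterMicros) {
        return 1_000_000.0 / (slowerMicros - fasterMicros);
    }

    public static void main(String[] args) {
        long rps = 50_000;
        double zstd3 = coresBusy(rps, 68);   // Zstandard-3 on the 37KB article list
        double gzip6 = coresBusy(rps, 185);  // Gzip-6 on the same payload
        System.out.printf("Zstandard-3: %.2f cores, Gzip-6: %.2f cores, difference: %.2f cores%n",
            zstd3, gzip6, gzip6 - zstd3);
        System.out.printf("One core saved per %.0f requests/sec%n",
            requestsPerSecondPerCoreSaved(185, 68));
    }
}
```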