Reducing Chattiness: Aggregation, Pagination, and Conditional Requests

The main chapter introduced the N+1 API problem and showed the aggregation endpoint cutting 8 requests to 1. This section covers the full anti-chattiness toolkit: BFF architecture for different clients, cursor pagination that eliminates COUNT(*), ETags that prevent redundant transfers, and batching patterns that collapse multiple resource fetches into single round trips.

Backend-For-Frontend Architecture

Different clients need different data shapes. The mobile app needs a compact home feed. The web app needs a richer layout with previews. The internal analytics dashboard needs raw metrics. Serving all three from generic CRUD endpoints forces each client into chatty patterns:

// Without BFF: Each client assembles its view from generic endpoints
//
// Mobile app (home screen):
//   GET /api/articles?page_size=10&fields=title,thumbnail_url
//   GET /api/articles/trending?limit=5
//   GET /api/recommendations?limit=10
//   GET /api/notifications/count
//   Total: 4 requests, 28KB transferred, 260ms P50
//
// Web app (home page):
//   GET /api/articles?page_size=20
//   GET /api/articles/trending?limit=10
//   GET /api/recommendations?limit=15
//   GET /api/categories/popular
//   GET /api/user/reading-progress
//   GET /api/articles/bookmarked?limit=5
//   Total: 6 requests, 68KB transferred, 380ms P50
//
// Analytics dashboard:
//   GET /api/metrics/views?period=24h
//   GET /api/metrics/engagement?period=24h
//   GET /api/metrics/top-articles?limit=50
//   GET /api/metrics/referrers
//   GET /api/metrics/geographic
//   Total: 5 requests, 125KB transferred, 450ms P50

// With BFF: One endpoint per client surface
@RestController
@RequestMapping("/api/bff")
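public class BffController {

    // Assemblers are injected; each builds one client-specific view.
    // The web and dashboard assembler type names are assumed here.
    private final MobileFeedAssembler mobileFeedAssembler;
    private final WebFeedAssembler webFeedAssembler;
    private final DashboardAssembler dashboardAssembler;

    public BffController(MobileFeedAssembler mobileFeedAssembler,
                         WebFeedAssembler webFeedAssembler,
                         DashboardAssembler dashboardAssembler) {
        this.mobileFeedAssembler = mobileFeedAssembler;
        this.webFeedAssembler = webFeedAssembler;
        this.dashboardAssembler = dashboardAssembler;
    }
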
    @GetMapping("/mobile/home")
    public ResponseEntity<MobileHomeFeed> mobileHome(
            @AuthenticationPrincipal UserPrincipal user) {

        var feed = mobileFeedAssembler.assemble(user.id());
        return ResponseEntity.ok()
            // private: the feed is per-user, so shared caches must not store it
            .cacheControl(CacheControl.maxAge(30, TimeUnit.SECONDS).cachePrivate())
            .eTag(feed.etag())
            .body(feed);
    }

    @GetMapping("/web/home")
    public ResponseEntity<WebHomeFeed> webHome(
            @AuthenticationPrincipal UserPrincipal user) {

        var feed = webFeedAssembler.assemble(user.id());
        return ResponseEntity.ok()
            // private: the feed is per-user, so shared caches must not store it
            .cacheControl(CacheControl.maxAge(30, TimeUnit.SECONDS).cachePrivate())
            .eTag(feed.etag())
            .body(feed);
    }

    @GetMapping("/dashboard/overview")
    public ResponseEntity<DashboardOverview> dashboardOverview() {
        var overview = dashboardAssembler.assemble();
        return ResponseEntity.ok()
            .cacheControl(CacheControl.maxAge(60, TimeUnit.SECONDS))
            .eTag(overview.etag())
            .body(overview);
    }
}

The assembler pattern keeps BFF endpoints thin:

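@Component
public class MobileFeedAssembler {

    private final ArticleService articleService;
    private final RecommendationService recommendationService;
    private final NotificationService notificationService;

    public MobileFeedAssembler(ArticleService articleService,
                               RecommendationService recommendationService,
                               NotificationService notificationService) {
        this.articleService = articleService;
        this.recommendationService = recommendationService;
        this.notificationService = notificationService;
    }

    public MobileHomeFeed assemble(String userId) {
        // Start all four internal calls (in-process or gRPC, not HTTP) concurrently.
        // supplyAsync uses the common pool; pass a dedicated Executor if the calls block.
        var articles = CompletableFuture.supplyAsync(() -> articleService.list(10, null,
            Set.of("title", "thumbnail_url", "view_count")));
        var trending = CompletableFuture.supplyAsync(() -> articleService.trending(5));
        var recs = CompletableFuture.supplyAsync(
            () -> recommendationService.forUser(userId, 10));
        var notifCount = CompletableFuture.supplyAsync(
            () -> notificationService.unreadCount(userId));

        // join() waits for each result; total latency is bounded by the slowest call
        return new MobileHomeFeed(
            articles.join(), trending.join(), recs.join(), notifCount.join());
    }
}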

// MobileHomeFeed: single response replacing 4 separate API calls
// Size: 18KB (vs 28KB from 4 separate responses with redundant headers)
// Latency: 75ms P50 (bounded by slowest subsystem, parallel execution)
// vs 260ms P50 (4 sequential calls over network)
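
The "bounded by slowest subsystem" claim can be checked in isolation. The following is a minimal sketch, not the platform's code: FanOut, inParallel, and slowCall are hypothetical names, and the sleep durations are made up to stand in for subsystem latency. It uses a dedicated thread pool so that no call queues behind another.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Supplier;

// Hypothetical helper: runs independent lookups concurrently and
// returns their results in the order the calls were given.
final class FanOut {

    static <T> List<T> inParallel(List<Supplier<T>> calls) {
        // One thread per call so no lookup waits behind another
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, calls.size()));
        try {
            List<CompletableFuture<T>> futures = calls.stream()
                .map(call -> CompletableFuture.supplyAsync(call, pool))
                .toList();
            // All calls are already running, so waiting on each in turn
            // takes roughly as long as the slowest one
            return futures.stream().map(CompletableFuture::join).toList();
        } finally {
            pool.shutdown();
        }
    }

    // Simulates a subsystem that needs `millis` to answer
    static Supplier<String> slowCall(String name, long millis) {
        return () -> {
            try {
                Thread.sleep(millis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
            return name;
        };
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        List<String> parts = inParallel(List.of(
            slowCall("articles", 60), slowCall("trending", 40),
            slowCall("recs", 75), slowCall("notifications", 20)));
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println(parts + " assembled in ~" + elapsedMs + "ms");
    }
}
```

Run sequentially, the four simulated calls would take about the sum of their delays; run this way, the elapsed time should sit close to the longest single delay.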

Cursor Pagination Implementation

Offset pagination is a hidden performance drain. The cost grows linearly with page depth:

// SLOW: Offset pagination performance degradation
// Page 1:   SELECT ... OFFSET 0   LIMIT 50  ->  1.2ms
// Page 10:  SELECT ... OFFSET 450 LIMIT 50  ->  3.8ms
// Page 100: SELECT ... OFFSET 4950 LIMIT 50 -> 28.4ms
// Page 1000: SELECT ... OFFSET 49950 LIMIT 50 -> 245ms
//
// Why: Database scans and discards OFFSET rows before returning LIMIT rows
// Plus: COUNT(*) for total_count adds 12-45ms depending on table size

// FAST: Cursor pagination with keyset (constant time regardless of depth)
// Any page: SELECT ... WHERE id < :cursor ORDER BY id DESC LIMIT 50 -> 1.1ms
// No COUNT(*) needed

@Repository
public class ArticleCursorRepository {

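    // Named parameters (:limit, :cursor_time, ...) need NamedParameterJdbcTemplate;
    // plain JdbcTemplate only understands positional '?' placeholders.
    private final NamedParameterJdbcTemplate jdbc;

    public ArticleCursorRepository(NamedParameterJdbcTemplate jdbc) {
        this.jdbc = jdbc;
    }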

    public CursorPage<ArticleSummary> findArticles(
            String cursor, int pageSize, List<String> categories) {

        var params = new MapSqlParameterSource();
        params.addValue("limit", pageSize + 1);  // Fetch one extra to detect hasMore

        StringBuilder sql = new StringBuilder("""
            SELECT id, title, excerpt, view_count, published_at, 
                   categories, author, thumbnail_url
            FROM articles
            WHERE 1=1
            """);

        if (cursor != null) {
            // Composite cursor: (published_at, id) for deterministic ordering
            CursorValue decoded = decodeCursor(cursor);
            sql.append("""
                AND (published_at, id) < (:cursor_time, :cursor_id)
                """);
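            // Bind as java.sql.Timestamp: some JDBC drivers (PostgreSQL's among them)
            // reject java.time.Instant passed through setObject
            params.addValue("cursor_time", Timestamp.from(decoded.publishedAt()));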
            params.addValue("cursor_id", decoded.id());
        }

        if (categories != null && !categories.isEmpty()) {
            sql.append("AND categories && :categories ");
            params.addValue("categories", categories.toArray(String[]::new));
        }

        sql.append("ORDER BY published_at DESC, id DESC LIMIT :limit");

        List<ArticleSummary> results = jdbc.query(sql.toString(), params, articleRowMapper);

        boolean hasMore = results.size() > pageSize;
        if (hasMore) {
            results = results.subList(0, pageSize);
        }

        String nextCursor = hasMore
            ? encodeCursor(results.getLast().publishedAt(), results.getLast().id())
            : null;

        return new CursorPage<>(results, nextCursor, hasMore);
    }

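    private String encodeCursor(Instant publishedAt, String id) {
        // Opaque to clients, but NOT tamper-evident: Base64 is an encoding, not a
        // signature. Append an HMAC of `raw` if forged cursors are a concern.
        // Epoch millis drops sub-millisecond precision; if published_at stores
        // microseconds, encode the full-precision value so no rows are skipped.
        String raw = publishedAt.toEpochMilli() + "|" + id;
        return Base64.getUrlEncoder().withoutPadding()
            .encodeToString(raw.getBytes(StandardCharsets.UTF_8));
    }

    private CursorValue decodeCursor(String cursor) {
        try {
            String raw = new String(
                Base64.getUrlDecoder().decode(cursor), StandardCharsets.UTF_8);
            String[] parts = raw.split("\\|", 2);
            if (parts.length != 2) {
                throw new IllegalArgumentException("Missing separator");
            }
            return new CursorValue(
                Instant.ofEpochMilli(Long.parseLong(parts[0])),
                parts[1]
            );
        } catch (IllegalArgumentException e) {
            // Covers bad Base64, a missing separator, and NumberFormatException.
            // Cursors are client input, so map this to HTTP 400 in the controller.
            throw new IllegalArgumentException("Invalid pagination cursor", e);
        }
    }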
}

record CursorValue(Instant publishedAt, String id) {}
record CursorPage<T>(List<T> items, String nextCursor, boolean hasMore) {}
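
A Base64 cursor is opaque to casual readers, but any client can decode it, change the values, and re-encode it. If that matters, one common option is to attach an HMAC so the server can detect altered cursors. The sketch below is an assumption about how that could look, not the platform's implementation: SignedCursor, its token layout (payload and signature joined by a dot), and the secret handling are all hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.util.Base64;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical helper: signs and verifies cursor payloads with HMAC-SHA256.
final class SignedCursor {

    private static final Base64.Encoder ENC = Base64.getUrlEncoder().withoutPadding();
    private static final Base64.Decoder DEC = Base64.getUrlDecoder();

    private final SecretKeySpec key;

    SignedCursor(byte[] secret) {
        this.key = new SecretKeySpec(secret, "HmacSHA256");
    }

    // Produces "<base64url(payload)>.<base64url(hmac)>"; '.' never occurs in base64url
    String encode(String payload) {
        byte[] raw = payload.getBytes(StandardCharsets.UTF_8);
        return ENC.encodeToString(raw) + "." + ENC.encodeToString(hmac(raw));
    }

    // Returns the payload, or throws if the cursor is malformed or was altered
    String decode(String cursor) {
        String[] parts = cursor.split("\\.", 2);
        if (parts.length != 2) {
            throw new IllegalArgumentException("Malformed cursor");
        }
        byte[] raw = DEC.decode(parts[0]);
        byte[] signature = DEC.decode(parts[1]);
        // MessageDigest.isEqual compares in constant time
        if (!MessageDigest.isEqual(signature, hmac(raw))) {
            throw new IllegalArgumentException("Cursor signature mismatch");
        }
        return new String(raw, StandardCharsets.UTF_8);
    }

    private byte[] hmac(byte[] data) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(key);
            return mac.doFinal(data);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("HmacSHA256 unavailable", e);
        }
    }

    public static void main(String[] args) {
        // Demo secret only; a real key would come from configuration
        SignedCursor cursors = new SignedCursor("demo-secret".getBytes(StandardCharsets.UTF_8));
        String token = cursors.encode("1700000000000|art-42");
        System.out.println(token);
        System.out.println(cursors.decode(token));
    }
}
```

The payload stays the same "millis|id" string used by encodeCursor, so the signing step can wrap the existing format without changing the query logic.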

Performance comparison at various page depths:

Page depth   Offset P50   Cursor P50   Offset DB time   Cursor DB time
1            8ms          7ms           1.2ms            1.1ms
10           12ms         7ms           3.8ms            1.1ms
100          38ms         7ms          28.4ms            1.1ms
1000        255ms         7ms         245.0ms            1.1ms

// Cursor pagination cost does not depend on page depth: one index seek plus
// pageSize rows, provided an index on (published_at DESC, id DESC) exists
// Offset pagination is O(n) where n = offset value
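
The two mechanics that make keyset pagination work, the composite (published_at, id) comparison and the fetch-one-extra trick, can be exercised without a database. This is a minimal in-memory sketch under stated assumptions: KeysetDemo, Row, and Page are illustrative names, and a sorted stream stands in for the index scan.

```java
import java.util.Comparator;
import java.util.List;

// In-memory stand-in for the articles table; names are illustrative only.
final class KeysetDemo {

    record Row(long publishedAt, String id) {}
    record Page(List<Row> items, Row nextCursor, boolean hasMore) {}

    // Newest first with id as tie-breaker: mirrors ORDER BY published_at DESC, id DESC
    static final Comparator<Row> NEWEST_FIRST =
        Comparator.comparingLong(Row::publishedAt).thenComparing(Row::id).reversed();

    static Page fetch(List<Row> table, Row cursor, int pageSize) {
        List<Row> matched = table.stream()
            .sorted(NEWEST_FIRST)
            // Equivalent of (published_at, id) < (:cursor_time, :cursor_id):
            // keep only rows that sort strictly after the cursor row
            .filter(row -> cursor == null || NEWEST_FIRST.compare(row, cursor) > 0)
            .limit(pageSize + 1L)  // one extra row reveals whether another page exists
            .toList();

        boolean hasMore = matched.size() > pageSize;
        List<Row> items = hasMore ? matched.subList(0, pageSize) : matched;
        Row next = hasMore ? items.get(items.size() - 1) : null;
        return new Page(items, next, hasMore);
    }

    public static void main(String[] args) {
        // Two pairs of rows share a timestamp to show why id is part of the cursor
        List<Row> table = List.of(
            new Row(100, "a"), new Row(100, "b"), new Row(90, "c"),
            new Row(80, "d"), new Row(80, "e"), new Row(70, "f"), new Row(60, "g"));
        Row cursor = null;
        Page page;
        do {
            page = fetch(table, cursor, 3);
            System.out.println(page.items());
            cursor = page.nextCursor();
        } while (page.hasMore());
    }
}
```

With a timestamp-only cursor, rows that share a published_at value at a page boundary would be skipped or repeated; the id tie-breaker is what makes the ordering total.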

ETag Implementation Patterns

ETags prevent transferring unchanged data. The content platform has three categories of data with different freshness patterns:

// Pattern 1: Content-based ETag (hash of response body)
// Use for: computed responses where you already have the data in memory
@GetMapping("/api/categories/popular")
public ResponseEntity<List<Category>> getPopularCategories(
        WebRequest request) {

    List<Category> categories = categoryService.getPopular(20);

    // Strong ETag: content hash guarantees byte-identical response
    String etag = "\"" + computeHash(categories) + "\"";

    if (request.checkNotModified(etag)) {
        return null;  // Spring returns 304 automatically
    }

    return ResponseEntity.ok()
        .eTag(etag)
        .cacheControl(CacheControl.maxAge(60, TimeUnit.SECONDS).mustRevalidate())
        .body(categories);
}

private String computeHash(Object obj) {
    try {
        byte[] json = objectMapper.writeValueAsBytes(obj);
        byte[] hash = MessageDigest.getInstance("SHA-256").digest(json);
        return HexFormat.of().formatHex(hash, 0, 8);  // First 8 bytes = 16 hex chars
    } catch (Exception e) {
        throw new RuntimeException(e);
    }
}

// Pattern 2: Version-based ETag (database version counter)
// Use for: single resources with a version column
@GetMapping("/api/articles/{id}")
public ResponseEntity<ArticleDetail> getArticle(
        @PathVariable String id, WebRequest request) {

    // Cheap version check (single column, indexed)
    long version = articleRepository.getVersion(id);
    String etag = "\"art-" + id + "-v" + version + "\"";

    if (request.checkNotModified(etag)) {
        return null;  // 304, no serialization or full query needed
    }

    // Full query only if ETag does not match
    ArticleDetail article = articleRepository.findById(id);
    return ResponseEntity.ok()
        .eTag(etag)
        .cacheControl(CacheControl.maxAge(300, TimeUnit.SECONDS).mustRevalidate())
        .body(article);
}

// Pattern 3: Time-based weak ETag (for aggregated/computed data)
// Use for: feeds that update periodically but exact byte match not needed
@GetMapping("/api/bff/mobile/home")
public ResponseEntity<MobileHomeFeed> mobileHome(
        @AuthenticationPrincipal UserPrincipal user, WebRequest request) {

    // Weak ETag: data may differ slightly but is semantically equivalent
    Instant lastUpdate = feedService.lastUpdateTime(user.id());
    String etag = "W/\"feed-" + lastUpdate.getEpochSecond() + "\"";

    if (request.checkNotModified(etag)) {
        return null;
    }

    var feed = mobileFeedAssembler.assemble(user.id());
    return ResponseEntity.ok()
        .eTag(etag)
        // private: the feed is per-user, so shared caches must not store it
        .cacheControl(CacheControl.maxAge(30, TimeUnit.SECONDS).mustRevalidate().cachePrivate())
        .body(feed);
}
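
All three patterns rely on the same comparison: the server checks the client's If-None-Match header against the current ETag and answers 304 on a match. Spring's checkNotModified does this internally. The sketch below only illustrates the weak comparison that HTTP prescribes for If-None-Match, in which a W/ prefix is ignored; EtagMatcher is a hypothetical name and the comma splitting is a simplification.

```java
import java.util.Arrays;

// Illustrative only: frameworks such as Spring already perform this check.
final class EtagMatcher {

    // Drops the weak prefix so W/"x" and "x" compare equal (weak comparison)
    static String opaqueTag(String etag) {
        String trimmed = etag.trim();
        return trimmed.startsWith("W/") ? trimmed.substring(2) : trimmed;
    }

    // True when the If-None-Match value matches the current ETag,
    // meaning the server may answer 304 Not Modified
    static boolean notModified(String ifNoneMatch, String currentEtag) {
        if (ifNoneMatch == null || ifNoneMatch.isBlank()) {
            return false;  // unconditional request: send the full response
        }
        if (ifNoneMatch.trim().equals("*")) {
            return true;   // "*" matches any current representation
        }
        String current = opaqueTag(currentEtag);
        // Simplification: splits on commas, so tags that contain commas are unsupported
        return Arrays.stream(ifNoneMatch.split(","))
            .map(EtagMatcher::opaqueTag)
            .anyMatch(current::equals);
    }

    public static void main(String[] args) {
        System.out.println(notModified("W/\"feed-1700000000\"", "W/\"feed-1700000000\""));
        System.out.println(notModified("\"art-123-v6\"", "\"art-123-v7\""));
    }
}
```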

Measured savings in production:

Endpoint                    Hit Rate   Avg Response   304 Response   Bandwidth Saved
/api/categories/popular     82%        10,240 B       98 B           82% * 10,142 = 8.3KB/req
/api/user/preferences       91%         2,180 B       98 B           91% * 2,082 = 1.9KB/req
/api/articles/{id}          45%       148,600 B       98 B           45% * 148,502 = 66.8KB/req
/api/bff/mobile/home        38%        18,400 B       98 B           38% * 18,302 = 7.0KB/req

Total bandwidth saved per user session (avg 25 requests):
  Without ETags:  412KB transferred
  With ETags:     198KB transferred (52% reduction)
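
The "Bandwidth Saved" column is an expected value: with hit rate h, a request costs the 304 size with probability h and the full size otherwise. A small sketch of that arithmetic, using inputs copied from the table (EtagSavings and its method names are made up for illustration):

```java
import java.util.Locale;

final class EtagSavings {

    // Average bytes sent per request when a fraction `hitRate` of requests get a 304
    static double expectedBytes(double hitRate, int fullBytes, int notModifiedBytes) {
        return hitRate * notModifiedBytes + (1 - hitRate) * fullBytes;
    }

    // Average bytes avoided per request; equals hitRate * (fullBytes - notModifiedBytes)
    static double savedBytes(double hitRate, int fullBytes, int notModifiedBytes) {
        return fullBytes - expectedBytes(hitRate, fullBytes, notModifiedBytes);
    }

    public static void main(String[] args) {
        // Rows: /api/categories/popular and /api/articles/{id}
        System.out.printf(Locale.ROOT, "categories: %.1f KB saved per request%n",
            savedBytes(0.82, 10_240, 98) / 1000);
        System.out.printf(Locale.ROOT, "article:    %.1f KB saved per request%n",
            savedBytes(0.45, 148_600, 98) / 1000);
    }
}
```

The formula also shows why /api/articles/{id} saves the most per request despite its lower hit rate: the saving scales with response size as well as hit rate.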

Request Batching

The article detail page shows 5 related articles. Without batching, that requires 5 separate requests to fetch metadata for each related article:

// SLOW: N+1 pattern for related articles
// Frontend fetches article, gets related IDs, then fetches each
//
// GET /api/articles/art-123         -> { ..., related: ["art-456","art-789",...] }
// GET /api/articles/art-456/summary -> { title, thumbnail_url, excerpt }
// GET /api/articles/art-789/summary -> { title, thumbnail_url, excerpt }
// GET /api/articles/art-012/summary -> ...
// GET /api/articles/art-345/summary -> ...
// GET /api/articles/art-678/summary -> ...
// Total: 6 requests; the 5 summary fetches cannot start until the article response
// supplies the IDs. HTTP/2 can multiplex them, but the server still handles 6 requests

// FAST: Batch endpoint
// GET /api/articles/batch?ids=art-456,art-789,art-012,art-345,art-678&fields=title,thumbnail_url,excerpt
// Replaces the 5 summary requests with 1 request, 1 round trip, and 1 DB query (IN clause)

@GetMapping("/api/articles/batch")
public ResponseEntity<Map<String, ArticleSummary>> batchGetArticles(
        @RequestParam List<String> ids,
        @RequestParam(required = false) Set<String> fields) {

    if (ids.size() > 100) {
        return ResponseEntity.badRequest().build();  // Prevent abuse
    }

    Map<String, ArticleSummary> articles = articleRepository.findByIds(ids);

    if (fields != null && !fields.isEmpty()) {
        articles = articles.entrySet().stream()
            .collect(Collectors.toMap(
                Map.Entry::getKey,
                e -> e.getValue().withFieldMask(fields)
            ));
    }

    return ResponseEntity.ok()
        .cacheControl(CacheControl.maxAge(300, TimeUnit.SECONDS))
        .body(articles);
}

// Database query: single IN clause (much faster than 5 separate queries)
// SELECT id, title, excerpt, thumbnail_url
// FROM articles
// WHERE id IN ('art-456','art-789','art-012','art-345','art-678')
//
// 1 query: 1.8ms (vs 5 queries: 5 * 1.2ms = 6ms + 5 round trips)
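
Two details tend to matter when turning a batch of IDs into a single query: duplicate IDs should not inflate the IN list, and the IDs should be bound as parameters, not concatenated into the SQL. The following is a minimal sketch of both steps. BatchIds and its methods are hypothetical helpers, and the cap of 100 simply mirrors the endpoint's limit.

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;

final class BatchIds {

    static final int MAX_BATCH = 100;  // same cap as the batch endpoint

    // Removes duplicates (keeping first-seen order) and enforces the batch cap
    static List<String> normalize(List<String> ids) {
        List<String> unique = List.copyOf(new LinkedHashSet<>(ids));
        if (unique.isEmpty() || unique.size() > MAX_BATCH) {
            throw new IllegalArgumentException(
                "Expected 1.." + MAX_BATCH + " distinct ids, got " + unique.size());
        }
        return unique;
    }

    // Builds "column IN (?, ?, ?)" with one placeholder per id; the ids are
    // bound as parameters later and never concatenated into the SQL text
    static String inClause(String column, int count) {
        return column + " IN (" + String.join(", ", Collections.nCopies(count, "?")) + ")";
    }

    public static void main(String[] args) {
        List<String> ids = normalize(List.of("art-456", "art-789", "art-456", "art-012"));
        String sql = "SELECT id, title, excerpt, thumbnail_url FROM articles WHERE "
            + inClause("id", ids.size());
        System.out.println(sql);
        System.out.println("bind values: " + ids);
    }
}
```

With Spring's named-parameter support a collection bound to a single named parameter expands in a similar way, so a helper like inClause is mainly useful with plain JDBC.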

Cache-Control Strategy

Different data types require different caching strategies. Over-caching causes stale data; under-caching wastes bandwidth:

// Cache-Control decision matrix for content platform:

@Component
public class CacheControlAdvice {

    // Immutable: versioned static assets (CSS, JS with hash in filename)
    // Cache-Control: public, max-age=31536000, immutable
    public CacheControl immutable() {
        return CacheControl.maxAge(365, TimeUnit.DAYS)
            .cachePublic()
            .immutable();
    }

    // Short-lived: personalized dynamic content (home feed)
    // Cache-Control: private, max-age=30, must-revalidate
    public CacheControl personalizedShortLived() {
        return CacheControl.maxAge(30, TimeUnit.SECONDS)
            .mustRevalidate()
            .cachePrivate();
    }

    // Medium-lived: shared reference data (categories, popular tags)
    // Cache-Control: public, max-age=300, stale-while-revalidate=60
    public CacheControl sharedReference() {
        return CacheControl.maxAge(300, TimeUnit.SECONDS)
            .cachePublic()
            .staleWhileRevalidate(60, TimeUnit.SECONDS);
    }

    // Article content: changes rarely after publication
    // Cache-Control: public, max-age=3600, stale-while-revalidate=300
    public CacheControl articleContent() {
        return CacheControl.maxAge(1, TimeUnit.HOURS)
            .cachePublic()
            .staleWhileRevalidate(300, TimeUnit.SECONDS);
    }

    // User-specific mutable data: preferences, reading progress
    // Cache-Control: private, no-cache (always revalidate with ETag)
    public CacheControl userMutable() {
        return CacheControl.noCache().cachePrivate();
    }
}

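The stale-while-revalidate directive matters for perceived performance. Once max-age expires, a cache that supports the directive serves the stale response immediately and revalidates in the background. It only takes effect when must-revalidate is absent, because must-revalidate forbids serving stale responses: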

Without stale-while-revalidate:
  User requests /api/categories (max-age expired)
  -> Browser waits for server response (180ms)
  -> User sees 180ms delay

With stale-while-revalidate=60:
  User requests /api/categories (max-age expired, within SWR window)
  -> Browser serves stale response immediately (0ms delay)
  -> Browser revalidates in background
  -> Next request gets fresh data
  -> User sees 0ms delay (stale data acceptable for categories)
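
The two timelines above reduce to a three-way decision based on the age of the cached response. The sketch below models that decision as a cache might make it. Freshness and Decision are illustrative names, and it assumes the response carries no must-revalidate or no-cache directive.

```java
// Models how a cache could treat a stored response of a given age.
// Assumes no must-revalidate or no-cache directive is present.
final class Freshness {

    enum Decision { SERVE_FRESH, SERVE_STALE_AND_REVALIDATE, REVALIDATE_FIRST }

    static Decision decide(long ageSeconds, long maxAge, long staleWhileRevalidate) {
        if (ageSeconds < maxAge) {
            return Decision.SERVE_FRESH;                 // still within max-age
        }
        if (ageSeconds < maxAge + staleWhileRevalidate) {
            return Decision.SERVE_STALE_AND_REVALIDATE;  // inside the SWR window
        }
        return Decision.REVALIDATE_FIRST;                // too old: the user waits
    }

    public static void main(String[] args) {
        // Shared reference data: max-age=300, stale-while-revalidate=60
        for (long age : new long[] {120, 320, 400}) {
            System.out.println("age " + age + "s -> " + decide(age, 300, 60));
        }
    }
}
```

Only the last branch puts the network round trip on the user's critical path, which is why a modest stale-while-revalidate window can hide most revalidation latency for data that tolerates brief staleness.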

Locust Benchmark: Chatty vs Optimized

Full benchmark comparing the chatty client pattern against the optimized BFF pattern:

# locust_chattiness_benchmark.py
from locust import HttpUser, task, between, events
import time

class ChattyMobileClient(HttpUser):
    """Simulates mobile app making separate API calls."""
    wait_time = between(2, 5)

    @task
    def home_screen(self):
        start = time.perf_counter()  # monotonic clock, unaffected by wall-clock adjustments

        # Chatty: 8 separate calls
        self.client.get("/api/articles?page_size=10", name="articles")
        self.client.get("/api/articles/trending?limit=5", name="trending")
        self.client.get("/api/recommendations?limit=10", name="recs")
        self.client.get("/api/user/reading-progress", name="progress")
        self.client.get("/api/categories/popular", name="categories")
        self.client.get("/api/user/preferences", name="prefs")
        self.client.get("/api/notifications/unread-count", name="notifs")
        self.client.get("/api/articles/bookmarked?limit=5", name="bookmarks")

        elapsed = time.perf_counter() - start
        events.request.fire(
            request_type="PAGE", name="chatty-home",
            response_time=elapsed * 1000, response_length=0,
            exception=None, context={}
        )


class OptimizedMobileClient(HttpUser):
    """Simulates mobile app using BFF aggregation + ETags."""
    wait_time = between(2, 5)
    etag = None

    @task
    def home_screen(self):
        start = time.perf_counter()  # monotonic clock, unaffected by wall-clock adjustments

        headers = {}
        if self.etag:
            headers["If-None-Match"] = self.etag

        # Single BFF call
        resp = self.client.get(
            "/api/bff/mobile/home",
            headers=headers,
            name="bff-home"
        )

        if resp.status_code == 200:
            self.etag = resp.headers.get("ETag")
        # 304: no body transferred, etag unchanged

        elapsed = time.perf_counter() - start
        events.request.fire(
            request_type="PAGE", name="optimized-home",
            response_time=elapsed * 1000, response_length=0,
            exception=None, context={}
        )

Results at 1000 concurrent mobile users, 80ms RTT:

Metric                    Chatty (8 calls)    Optimized (BFF+ETag)
Page load P50:            340ms               95ms
Page load P99:            890ms               220ms
Requests/sec (total):     24,000              3,200 (fewer requests needed)
Bandwidth per page view:  52.4KB              18.4KB (first load)
Bandwidth (repeat visit): 52.4KB              0.1KB (304 response)
Server CPU at 1000 users: 68%                 22%
DB queries per page view: 12                  4
Connection overhead:      8 * HPACK headers   1 * HPACK headers

Improvements:
  P50 latency:  3.6x faster
  P99 latency:  4.0x faster
  Bandwidth:    2.8x less (first visit), 524x less (repeat visit with 304)
  Server CPU:   3.1x less
  DB queries:   3.0x fewer

Combining All Techniques

The full optimization stack for the content platform mobile home screen:

// Layer 1: BFF aggregation (8 calls -> 1 call)
// Layer 2: Field selection (only fields mobile needs)
// Layer 3: Cursor pagination (no COUNT(*) overhead)
// Layer 4: ETag conditional request (304 on repeat)
// Layer 5: Zstandard compression (best speed/ratio for internal)
// Layer 6: Brotli compression (best ratio for browser client)
// Layer 7: Cache-Control with stale-while-revalidate

@GetMapping("/api/bff/mobile/home")
public ResponseEntity<MobileHomeFeed> mobileHome(
        @AuthenticationPrincipal UserPrincipal user,
        WebRequest request) {

    // Layer 4: Check ETag before doing any work
    String etag = feedVersionService.currentETag(user.id());
    if (request.checkNotModified(etag)) {
        return null;  // 304: zero body bytes, ~100 bytes headers
    }

    // Layer 1: Aggregate all data in one server-side operation
    // Layer 2: Field selection built into assembler
    // Layer 3: Cursor pagination used internally
    var feed = mobileFeedAssembler.assemble(user.id());

    // Layer 7: Cache headers
    return ResponseEntity.ok()
        .eTag(etag)
        // No must-revalidate here: it forbids serving stale responses and would
        // cancel out stale-while-revalidate
        .cacheControl(CacheControl.maxAge(30, TimeUnit.SECONDS)
            .cachePrivate()
            .staleWhileRevalidate(15, TimeUnit.SECONDS))
        .body(feed);
    // Layer 5/6: Compression applied by filter based on Accept-Encoding
}
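
The endpoint above depends on feedVersionService.currentETag being cheap, since it runs on every request before any real work. The section does not show that service, so the following is purely a hypothetical sketch of one way to back it: a per-user counter that is bumped whenever something feeding the home screen changes. FeedVersions and bump are assumed names, and the in-memory map would need to be replaced by a shared store in a multi-instance deployment.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical in-memory version tracker; single-instance only.
final class FeedVersions {

    private final ConcurrentHashMap<String, AtomicLong> versions = new ConcurrentHashMap<>();

    // Call whenever data that appears in the user's home feed changes
    void bump(String userId) {
        versions.computeIfAbsent(userId, id -> new AtomicLong()).incrementAndGet();
    }

    // Weak ETag: identifies the feed version, not the exact response bytes
    String currentETag(String userId) {
        long version = versions.computeIfAbsent(userId, id -> new AtomicLong()).get();
        return "W/\"feed-" + userId + "-v" + version + "\"";
    }

    public static void main(String[] args) {
        FeedVersions feedVersions = new FeedVersions();
        System.out.println(feedVersions.currentETag("user-1"));
        feedVersions.bump("user-1");  // e.g. a new recommendation arrived
        System.out.println(feedVersions.currentETag("user-1"));
    }
}
```

A counter lookup like this costs a map read, so the 304 path avoids both the assembler's fan-out and response serialization.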

End-to-end payload progression:

Starting point (naive):
  8 requests, 52.4KB, 340ms P50

After BFF aggregation:
  1 request, 38.2KB, 95ms P50

After field selection:
  1 request, 22.1KB, 92ms P50

After Brotli-4 compression:
  1 request, 3.1KB, 91ms P50

After ETag on repeat visit:
  1 request, 0.1KB, 12ms P50 (304 response)

Total improvement:
  Requests: 8x fewer
  Bandwidth (first): 16.9x less
  Bandwidth (repeat): 524x less
  Latency (first): 3.7x faster
  Latency (repeat): 28x faster

Each technique compounds. Compression works better on the aggregated payload (one large JSON compresses better than 8 small ones due to more repeated patterns). ETags eliminate the entire transfer on repeat visits. Field selection reduces what needs to be compressed. The result: a mobile user on 3G experiences the content platform as responsive rather than sluggish.
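
The claim that one aggregated payload compresses better than several small ones can be checked with the JDK alone. The sketch below uses gzip as a stand-in because the JDK ships no Brotli encoder, so absolute sizes will differ from the Brotli figures above. AggregateCompression and the sample JSON are made up for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.stream.IntStream;
import java.util.zip.GZIPOutputStream;

final class AggregateCompression {

    // Size in bytes of `text` after gzip compression
    static int gzipSize(String text) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(buffer)) {
            gzip.write(text.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.size();  // read after the stream is closed and fully flushed
    }

    // Eight made-up JSON fragments that share field names, like separate API responses
    static List<String> sampleResponses() {
        return IntStream.rangeClosed(1, 8)
            .mapToObj(i -> "{\"id\":\"art-" + i + "\",\"title\":\"Title " + i
                + "\",\"thumbnail_url\":\"https://cdn.example.com/t/" + i
                + ".jpg\",\"view_count\":" + (i * 100) + "}")
            .toList();
    }

    // Total bytes when every response is compressed on its own
    static int separateTotal(List<String> responses) {
        return responses.stream().mapToInt(AggregateCompression::gzipSize).sum();
    }

    // Bytes when the responses are merged into one JSON array and compressed once
    static int aggregated(List<String> responses) {
        return gzipSize("[" + String.join(",", responses) + "]");
    }

    public static void main(String[] args) {
        List<String> responses = sampleResponses();
        System.out.println("8 separate gzip bodies: " + separateTotal(responses) + " bytes");
        System.out.println("1 aggregated gzip body: " + aggregated(responses) + " bytes");
    }
}
```

Each separately compressed body pays its own container overhead and starts with an empty dictionary, while the aggregated body lets later objects reference field names already seen in earlier ones.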