Reducing Chattiness: Aggregation, Pagination, and Conditional Requests
The main chapter introduced the N+1 API problem and showed the aggregation endpoint cutting 8 requests to 1. This section covers the full anti-chattiness toolkit: BFF architecture for different clients, cursor pagination that eliminates COUNT(*), ETags that prevent redundant transfers, and batching patterns that collapse multiple resource fetches into single round trips.
Backend-For-Frontend Architecture
Different clients need different data shapes. The mobile app needs a compact home feed. The web app needs a richer layout with previews. The internal analytics dashboard needs raw metrics. Serving all three from generic CRUD endpoints forces each client into chatty patterns:
// Without BFF: Each client assembles its view from generic endpoints
//
// Mobile app (home screen):
// GET /api/articles?page_size=10&fields=title,thumbnail_url
// GET /api/articles/trending?limit=5
// GET /api/recommendations?limit=10
// GET /api/notifications/count
// Total: 4 requests, 28KB transferred, 260ms P50
//
// Web app (home page):
// GET /api/articles?page_size=20
// GET /api/articles/trending?limit=10
// GET /api/recommendations?limit=15
// GET /api/categories/popular
// GET /api/user/reading-progress
// GET /api/articles/bookmarked?limit=5
// Total: 6 requests, 68KB transferred, 380ms P50
//
// Analytics dashboard:
// GET /api/metrics/views?period=24h
// GET /api/metrics/engagement?period=24h
// GET /api/metrics/top-articles?limit=50
// GET /api/metrics/referrers
// GET /api/metrics/geographic
// Total: 5 requests, 125KB transferred, 450ms P50
// With BFF: One endpoint per client surface
@RestController
@RequestMapping("/api/bff")
public class BffController {

    @GetMapping("/mobile/home")
    public ResponseEntity<MobileHomeFeed> mobileHome(
            @AuthenticationPrincipal UserPrincipal user) {
        var feed = mobileFeedAssembler.assemble(user.id());
        return ResponseEntity.ok()
            // Personalized feed: private keeps it out of shared caches
            .cacheControl(CacheControl.maxAge(30, TimeUnit.SECONDS).cachePrivate())
            .eTag(feed.etag())
            .body(feed);
    }

    @GetMapping("/web/home")
    public ResponseEntity<WebHomeFeed> webHome(
            @AuthenticationPrincipal UserPrincipal user) {
        var feed = webFeedAssembler.assemble(user.id());
        return ResponseEntity.ok()
            // Personalized feed: private keeps it out of shared caches
            .cacheControl(CacheControl.maxAge(30, TimeUnit.SECONDS).cachePrivate())
            .eTag(feed.etag())
            .body(feed);
    }

    @GetMapping("/dashboard/overview")
    public ResponseEntity<DashboardOverview> dashboardOverview() {
        var overview = dashboardAssembler.assemble();
        return ResponseEntity.ok()
            .cacheControl(CacheControl.maxAge(60, TimeUnit.SECONDS))
            .eTag(overview.etag())
            .body(overview);
    }
}
The assembler pattern keeps BFF endpoints thin:
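@Component
public class MobileFeedAssembler {
    private final ArticleService articleService;
    private final RecommendationService recommendationService;
    private final NotificationService notificationService;
    private final Executor executor; // Assumed: a bounded pool dedicated to fan-out calls

    public MobileHomeFeed assemble(String userId) {
        // Start all four internal calls (in-process or gRPC, not HTTP) concurrently
        var articles = CompletableFuture.supplyAsync(() -> articleService.list(10, null,
            Set.of("title", "thumbnail_url", "view_count")), executor);
        var trending = CompletableFuture.supplyAsync(
            () -> articleService.trending(5), executor);
        var recs = CompletableFuture.supplyAsync(
            () -> recommendationService.forUser(userId, 10), executor);
        var notifCount = CompletableFuture.supplyAsync(
            () -> notificationService.unreadCount(userId), executor);
        // join() waits for each result, so total latency tracks the slowest call
        return new MobileHomeFeed(
            articles.join(), trending.join(), recs.join(), notifCount.join());
    }
}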
// MobileHomeFeed: single response replacing 4 separate API calls
// Size: 18KB (vs 28KB from 4 separate responses with redundant headers)
// Latency: 75ms P50 (bounded by slowest subsystem, parallel execution)
// vs 260ms P50 (4 sequential calls over network)
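Because a parallel fan-out is bounded by its slowest call, one stalled subsystem can hold up the whole feed. The following is a minimal sketch of one way to cap that, assuming a per-call deadline and an empty fallback are acceptable for non-critical sections such as recommendations. The class `FanOutWithDeadline`, the helper `withDeadline`, and the 50 ms and 500 ms budgets are hypothetical and not part of the platform's code.

```java
import java.time.Duration;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

final class FanOutWithDeadline {

    /** Runs the call asynchronously; yields the fallback if it fails or misses the budget. */
    static <T> CompletableFuture<T> withDeadline(Supplier<T> call, T fallback, Duration budget) {
        return CompletableFuture.supplyAsync(call)
            .completeOnTimeout(fallback, budget.toMillis(), TimeUnit.MILLISECONDS)
            .exceptionally(ex -> fallback);
    }

    /** Stand-in for a subsystem that is having a bad day. */
    static List<String> slowRecommendations() {
        try {
            Thread.sleep(2_000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return List.of("rec-1", "rec-2");
    }

    public static void main(String[] args) {
        var trending = withDeadline(
            () -> List.of("art-1", "art-2"), List.<String>of(), Duration.ofMillis(500));
        var recs = withDeadline(
            FanOutWithDeadline::slowRecommendations, List.<String>of(), Duration.ofMillis(50));
        // The fast call returns its data; the slow one degrades to an empty section
        System.out.println("trending=" + trending.join() + " recs=" + recs.join());
    }
}
```

The trade-off is that a section may be missing from the response, so this only suits data the client can render without.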
Cursor Pagination Implementation
Offset pagination is a hidden performance drain. The cost grows linearly with page depth:
// SLOW: Offset pagination performance degradation
// Page 1: SELECT ... OFFSET 0 LIMIT 50 -> 1.2ms
// Page 10: SELECT ... OFFSET 450 LIMIT 50 -> 3.8ms
// Page 100: SELECT ... OFFSET 4950 LIMIT 50 -> 28.4ms
// Page 1000: SELECT ... OFFSET 49950 LIMIT 50 -> 245ms
//
// Why: Database scans and discards OFFSET rows before returning LIMIT rows
// Plus: COUNT(*) for total_count adds 12-45ms depending on table size
// FAST: Cursor (keyset) pagination: cost stays flat at any depth, given an index on the sort key
// Any page: SELECT ... WHERE id < :cursor ORDER BY id DESC LIMIT 50 -> 1.1ms
// No COUNT(*) needed
@Repository
public class ArticleCursorRepository {
    // Named parameters (:limit, :cursor_time) need NamedParameterJdbcTemplate,
    // not plain JdbcTemplate
    private final NamedParameterJdbcTemplate jdbc;

    public CursorPage<ArticleSummary> findArticles(
            String cursor, int pageSize, List<String> categories) {
        var params = new MapSqlParameterSource();
        params.addValue("limit", pageSize + 1); // Fetch one extra to detect hasMore
        StringBuilder sql = new StringBuilder("""
            SELECT id, title, excerpt, view_count, published_at,
                   categories, author, thumbnail_url
            FROM articles
            WHERE 1=1
            """);
        if (cursor != null) {
            // Composite cursor: (published_at, id) for deterministic ordering
            CursorValue decoded = decodeCursor(cursor);
            sql.append("""
                AND (published_at, id) < (:cursor_time, :cursor_id)
                """);
            // Bind as java.sql.Timestamp; not every JDBC driver accepts Instant
            params.addValue("cursor_time", Timestamp.from(decoded.publishedAt()));
            params.addValue("cursor_id", decoded.id());
        }
        if (categories != null && !categories.isEmpty()) {
            sql.append("AND categories && :categories ");
            params.addValue("categories", categories.toArray(String[]::new));
        }
        sql.append("ORDER BY published_at DESC, id DESC LIMIT :limit");

        List<ArticleSummary> results = jdbc.query(sql.toString(), params, articleRowMapper);
        boolean hasMore = results.size() > pageSize;
        if (hasMore) {
            results = results.subList(0, pageSize);
        }
        String nextCursor = hasMore
            ? encodeCursor(results.getLast().publishedAt(), results.getLast().id())
            : null;
        return new CursorPage<>(results, nextCursor, hasMore);
    }

    private String encodeCursor(Instant publishedAt, String id) {
        // Opaque cursor. Base64 only hides the format: it is not tamper-evident
        // unless a signature is added. Assumes published_at is stored at
        // millisecond precision; finer precision would be truncated here.
        String raw = publishedAt.toEpochMilli() + "|" + id;
        return Base64.getUrlEncoder().withoutPadding()
            .encodeToString(raw.getBytes(StandardCharsets.UTF_8));
    }

    private CursorValue decodeCursor(String cursor) {
        String raw = new String(
            Base64.getUrlDecoder().decode(cursor), StandardCharsets.UTF_8);
        String[] parts = raw.split("\\|", 2);
        return new CursorValue(
            Instant.ofEpochMilli(Long.parseLong(parts[0])),
            parts[1]
        );
    }
}

record CursorValue(Instant publishedAt, String id) {}
record CursorPage<T>(List<T> items, String nextCursor, boolean hasMore) {}
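A Base64 cursor is opaque but a client can still decode, alter, and re-encode it. If tamper evidence matters, one option is to append an HMAC over the payload and reject cursors whose signature does not verify. The sketch below uses only the JDK; the class `SignedCursor`, its `Position` record, and the `payload.signature` layout are assumptions for illustration, not the platform's actual cursor format.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.time.Instant;
import java.util.Base64;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

final class SignedCursor {
    record Position(Instant publishedAt, String id) {}

    private final SecretKeySpec key;

    SignedCursor(byte[] secret) {
        this.key = new SecretKeySpec(secret, "HmacSHA256");
    }

    /** Produces base64url(payload) + "." + base64url(HMAC-SHA256(payload)). */
    String encode(Instant publishedAt, String id) {
        byte[] payload = (publishedAt.toEpochMilli() + "|" + id).getBytes(StandardCharsets.UTF_8);
        Base64.Encoder b64 = Base64.getUrlEncoder().withoutPadding();
        return b64.encodeToString(payload) + "." + b64.encodeToString(hmac(payload));
    }

    /** Verifies the signature before trusting any part of the payload. */
    Position decode(String cursor) {
        String[] halves = cursor.split("\\.", 2);
        if (halves.length != 2) {
            throw new IllegalArgumentException("Malformed cursor");
        }
        byte[] payload = Base64.getUrlDecoder().decode(halves[0]);
        byte[] signature = Base64.getUrlDecoder().decode(halves[1]);
        // MessageDigest.isEqual compares in constant time
        if (!MessageDigest.isEqual(hmac(payload), signature)) {
            throw new IllegalArgumentException("Cursor signature mismatch");
        }
        String[] parts = new String(payload, StandardCharsets.UTF_8).split("\\|", 2);
        return new Position(Instant.ofEpochMilli(Long.parseLong(parts[0])), parts[1]);
    }

    private byte[] hmac(byte[] payload) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(key);
            return mac.doFinal(payload);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("HmacSHA256 unavailable", e);
        }
    }
}
```

A controller would typically map the `IllegalArgumentException` to a 400 response rather than let it surface as a server error.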
Performance comparison at various page depths:
Page depth   Offset P50   Cursor P50   Offset DB time   Cursor DB time
1            8ms          7ms          1.2ms            1.1ms
10           12ms         7ms          3.8ms            1.1ms
100          38ms         7ms          28.4ms           1.1ms
1000         255ms        7ms          245.0ms          1.1ms
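// Cursor pagination: one index seek plus a page read, so cost stays flat at any page depth
// (assumes an index covering the sort key, here (published_at, id))
// Offset pagination is O(n) where n = offset value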
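The difference comes down to how many rows the database has to walk. A rough model, assuming the offset query steps over every skipped row and the keyset query seeks straight to the cursor through an index, can be sketched as follows. `PaginationCostModel` is a hypothetical name, and the in-memory list with a binary search only stands in for an indexed table.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class PaginationCostModel {

    /** Rows an OFFSET query walks: it steps over the offset, then reads one page. */
    static long rowsExaminedWithOffset(int page, int pageSize) {
        long offset = (long) (page - 1) * pageSize;
        return offset + pageSize;
    }

    /** Rows a keyset query reads after the index seek: one page plus the look-ahead row. */
    static long rowsExaminedWithKeyset(int pageSize) {
        return pageSize + 1L;
    }

    /** Keyset page over ids sorted descending; binary search stands in for the index seek. */
    static List<Integer> keysetPage(List<Integer> idsDescending, Integer cursor, int pageSize) {
        int start = 0;
        if (cursor != null) {
            int pos = Collections.binarySearch(idsDescending, cursor, Collections.reverseOrder());
            // Found: start just after the cursor row. Not found: start at the insertion point.
            start = pos >= 0 ? pos + 1 : -pos - 1;
        }
        int end = Math.min(start + pageSize, idsDescending.size());
        return new ArrayList<>(idsDescending.subList(start, end));
    }

    public static void main(String[] args) {
        System.out.println(rowsExaminedWithOffset(1000, 50)); // page 1000 => OFFSET 49950 LIMIT 50
        System.out.println(rowsExaminedWithKeyset(50));
        System.out.println(keysetPage(List.of(9, 8, 7, 6, 5, 4), 7, 2));
    }
}
```

Under this model page 1000 walks 50,000 rows with an offset and 51 with a cursor, which is consistent with the flat cursor timings measured above.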
ETag Implementation Patterns
ETags prevent transferring unchanged data. The content platform has three categories of data with different freshness patterns:
// Pattern 1: Content-based ETag (hash of response body)
// Use for: computed responses where you already have the data in memory
@GetMapping("/api/categories/popular")
public ResponseEntity<List<Category>> getPopularCategories(
        WebRequest request) {
    List<Category> categories = categoryService.getPopular(20);
    // Strong ETag: content hash guarantees byte-identical response
    String etag = "\"" + computeHash(categories) + "\"";
    if (request.checkNotModified(etag)) {
        return null; // Spring returns 304 automatically
    }
    return ResponseEntity.ok()
        .eTag(etag)
        .cacheControl(CacheControl.maxAge(60, TimeUnit.SECONDS).mustRevalidate())
        .body(categories);
}

private String computeHash(Object obj) {
    try {
        byte[] json = objectMapper.writeValueAsBytes(obj);
        byte[] hash = MessageDigest.getInstance("SHA-256").digest(json);
        return HexFormat.of().formatHex(hash, 0, 8); // First 8 bytes = 16 hex chars
    } catch (Exception e) {
        throw new RuntimeException(e);
    }
}
// Pattern 2: Version-based ETag (database version counter)
// Use for: single resources with a version column
@GetMapping("/api/articles/{id}")
public ResponseEntity<ArticleDetail> getArticle(
        @PathVariable String id, WebRequest request) {
    // Cheap version check (single column, indexed)
    long version = articleRepository.getVersion(id);
    String etag = "\"art-" + id + "-v" + version + "\"";
    if (request.checkNotModified(etag)) {
        return null; // 304, no serialization or full query needed
    }
    // Full query only if ETag does not match
    ArticleDetail article = articleRepository.findById(id);
    return ResponseEntity.ok()
        .eTag(etag)
        .cacheControl(CacheControl.maxAge(300, TimeUnit.SECONDS).mustRevalidate())
        .body(article);
}
// Pattern 3: Time-based weak ETag (for aggregated/computed data)
// Use for: feeds that update periodically but exact byte match not needed
@GetMapping("/api/bff/mobile/home")
public ResponseEntity<MobileHomeFeed> mobileHome(
        @AuthenticationPrincipal UserPrincipal user, WebRequest request) {
    // Weak ETag: data may differ slightly but is semantically equivalent
    Instant lastUpdate = feedService.lastUpdateTime(user.id());
    String etag = "W/\"feed-" + lastUpdate.getEpochSecond() + "\"";
    if (request.checkNotModified(etag)) {
        return null;
    }
    var feed = mobileFeedAssembler.assemble(user.id());
    return ResponseEntity.ok()
        .eTag(etag)
        // private: a per-user feed must not be stored by shared caches
        .cacheControl(CacheControl.maxAge(30, TimeUnit.SECONDS)
            .mustRevalidate()
            .cachePrivate())
        .body(feed);
}
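All three patterns rely on the same check: the server compares the client's If-None-Match value with the current ETag and answers 304 on a match. HTTP specifies weak comparison for If-None-Match, so `W/"x"` and `"x"` count as equal. The sketch below shows roughly what that comparison involves. `EtagMatcher` is a hypothetical helper, and splitting the header on commas is a simplification that assumes no ETag contains a comma; in the endpoints above, Spring's `checkNotModified` does this work.

```java
import java.util.Arrays;

final class EtagMatcher {

    /** Drops the weak prefix so that W/"x" and "x" compare equal (weak comparison). */
    static String opaqueTag(String etag) {
        String trimmed = etag.trim();
        return trimmed.startsWith("W/") ? trimmed.substring(2) : trimmed;
    }

    /** True when If-None-Match matches the current ETag, meaning a 304 can be returned. */
    static boolean notModified(String ifNoneMatch, String currentEtag) {
        if (ifNoneMatch == null || ifNoneMatch.isBlank()) {
            return false; // Unconditional request: send the full response
        }
        if (ifNoneMatch.trim().equals("*")) {
            return true; // "*" matches any existing representation
        }
        String current = opaqueTag(currentEtag);
        return Arrays.stream(ifNoneMatch.split(","))
            .map(EtagMatcher::opaqueTag)
            .anyMatch(current::equals);
    }

    public static void main(String[] args) {
        System.out.println(notModified("\"art-123-v7\"", "\"art-123-v7\""));
        System.out.println(notModified("W/\"feed-100\", \"other\"", "W/\"feed-100\""));
        System.out.println(notModified("\"art-123-v6\"", "\"art-123-v7\""));
    }
}
```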
Measured savings in production:
Endpoint                  Hit rate   Avg response   304 response   Bandwidth saved (hit rate * bytes avoided)
/api/categories/popular   82%        10,240 B       98 B           82% * 10,142 B = 8.3KB/req
/api/user/preferences     91%        2,180 B        98 B           91% * 2,082 B = 1.9KB/req
/api/articles/{id}        45%        148,600 B      98 B           45% * 148,502 B = 66.8KB/req
/api/bff/mobile/home      38%        18,400 B       98 B           38% * 18,302 B = 7.0KB/req
Total bandwidth saved per user session (avg 25 requests):
Without ETags: 412KB transferred
With ETags: 198KB transferred (52% reduction)
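The per-request figures follow from one formula: expected savings equal the hit rate multiplied by the bytes a 304 avoids sending (the full response size minus the 98-byte 304). A short sketch that reproduces the arithmetic from the numbers above; `EtagSavings` and its method names are hypothetical.

```java
import java.util.Locale;

final class EtagSavings {

    /** Expected bytes saved per request: hit rate times the bytes a 304 avoids sending. */
    static double savedBytesPerRequest(double hitRate, long fullResponseBytes, long notModifiedBytes) {
        return hitRate * (fullResponseBytes - notModifiedBytes);
    }

    /** Relative reduction in transferred volume, as a percentage. */
    static double reductionPercent(double before, double after) {
        return 100.0 * (before - after) / before;
    }

    public static void main(String[] args) {
        System.out.println(String.format(Locale.ROOT, "categories: %.1f KB/req",
            savedBytesPerRequest(0.82, 10_240, 98) / 1000.0));
        System.out.println(String.format(Locale.ROOT, "article:    %.1f KB/req",
            savedBytesPerRequest(0.45, 148_600, 98) / 1000.0));
        System.out.println(String.format(Locale.ROOT, "session:    %.0f%% less",
            reductionPercent(412, 198)));
    }
}
```

The formula also shows why the article endpoint dominates the savings despite its modest 45% hit rate: the bytes avoided per hit are an order of magnitude larger than for the other endpoints.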
Request Batching
The article detail page shows 5 related articles. Without batching, that requires 5 separate requests to fetch metadata for each related article:
// SLOW: N+1 pattern for related articles
// Frontend fetches article, gets related IDs, then fetches each
//
// GET /api/articles/art-123 -> { ..., related: ["art-456","art-789",...] }
// GET /api/articles/art-456/summary -> { title, thumbnail_url, excerpt }
// GET /api/articles/art-789/summary -> { title, thumbnail_url, excerpt }
// GET /api/articles/art-012/summary -> ...
// GET /api/articles/art-345/summary -> ...
// GET /api/articles/art-678/summary -> ...
// Total: 6 requests in 2 dependent waves (HTTP/2 can multiplex the 5 summary calls,
// but the server still handles 6 separate requests)
// FAST: Batch endpoint
// GET /api/articles/batch?ids=art-456,art-789,art-012,art-345,art-678&fields=title,thumbnail_url,excerpt
// Total: 1 request, 1 round trip, 1 DB query with IN clause
@GetMapping("/api/articles/batch")
public ResponseEntity<Map<String, ArticleSummary>> batchGetArticles(
        @RequestParam List<String> ids,
        @RequestParam(required = false) Set<String> fields) {
    if (ids.size() > 100) {
        return ResponseEntity.badRequest().build(); // Prevent abuse
    }
    Map<String, ArticleSummary> articles = articleRepository.findByIds(ids);
    if (fields != null && !fields.isEmpty()) {
        articles = articles.entrySet().stream()
            .collect(Collectors.toMap(
                Map.Entry::getKey,
                e -> e.getValue().withFieldMask(fields)
            ));
    }
    return ResponseEntity.ok()
        .cacheControl(CacheControl.maxAge(300, TimeUnit.SECONDS))
        .body(articles);
}
// Database query: single IN clause (much faster than 5 separate queries)
// SELECT id, title, excerpt, thumbnail_url
// FROM articles
// WHERE id IN ('art-456','art-789','art-012','art-345','art-678')
//
// 1 query: 1.8ms (vs 5 queries: 5 * 1.2ms = 6ms + 5 round trips)
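Since the batch endpoint rejects more than 100 IDs, a caller with a longer list has to split it first. A minimal sketch of that client-side step, assuming IDs are URL-safe strings like the `art-…` values above so no encoding is needed; `BatchIds`, `chunk`, and `toQuery` are hypothetical helpers.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;

final class BatchIds {

    /** Removes duplicate ids (keeping first-seen order) and splits them into request-sized chunks. */
    static List<List<String>> chunk(List<String> ids, int maxPerRequest) {
        if (maxPerRequest <= 0) {
            throw new IllegalArgumentException("maxPerRequest must be positive");
        }
        List<String> unique = new ArrayList<>(new LinkedHashSet<>(ids));
        List<List<String>> chunks = new ArrayList<>();
        for (int from = 0; from < unique.size(); from += maxPerRequest) {
            int to = Math.min(from + maxPerRequest, unique.size());
            chunks.add(List.copyOf(unique.subList(from, to)));
        }
        return chunks;
    }

    /** Builds the batch URL for one chunk; assumes ids need no URL encoding. */
    static String toQuery(List<String> chunk) {
        return "/api/articles/batch?ids=" + String.join(",", chunk);
    }

    public static void main(String[] args) {
        List<String> related = List.of("art-456", "art-789", "art-456", "art-012");
        for (List<String> part : chunk(related, 100)) {
            System.out.println(toQuery(part));
        }
    }
}
```

Deduplicating before chunking keeps the IN clause small and avoids spending part of the 100-ID budget on repeats.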
Cache-Control Strategy
Different data types require different caching strategies. Over-caching causes stale data; under-caching wastes bandwidth:
// Cache-Control decision matrix for content platform:
@Component
public class CacheControlAdvice {

    // Immutable: versioned static assets (CSS, JS with hash in filename)
    // Cache-Control: public, max-age=31536000, immutable
    public CacheControl immutable() {
        return CacheControl.maxAge(365, TimeUnit.DAYS)
            .cachePublic()
            .immutable();
    }

    // Short-lived: personalized dynamic content (home feed)
    // Cache-Control: private, max-age=30, must-revalidate
    public CacheControl personalizedShortLived() {
        return CacheControl.maxAge(30, TimeUnit.SECONDS)
            .mustRevalidate()
            .cachePrivate();
    }

    // Medium-lived: shared reference data (categories, popular tags)
    // Cache-Control: public, max-age=300, stale-while-revalidate=60
    public CacheControl sharedReference() {
        return CacheControl.maxAge(300, TimeUnit.SECONDS)
            .cachePublic()
            .staleWhileRevalidate(60, TimeUnit.SECONDS);
    }

    // Article content: changes rarely after publication
    // Cache-Control: public, max-age=3600, stale-while-revalidate=300
    public CacheControl articleContent() {
        return CacheControl.maxAge(1, TimeUnit.HOURS)
            .cachePublic()
            .staleWhileRevalidate(300, TimeUnit.SECONDS);
    }

    // User-specific mutable data: preferences, reading progress
    // Cache-Control: private, no-cache (always revalidate with ETag)
    public CacheControl userMutable() {
        return CacheControl.noCache().cachePrivate();
    }
}
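The stale-while-revalidate directive is critical for perceived performance. Once max-age expires, a cache that supports the directive serves the stale response immediately for up to the stated window and revalidates in the background. Do not pair it with must-revalidate, which forbids serving stale responses and so cancels the effect: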
Without stale-while-revalidate:
  User requests /api/categories (max-age expired)
    -> Browser waits for server response (180ms)
    -> User sees 180ms delay
With stale-while-revalidate=60:
  User requests /api/categories (max-age expired, within SWR window)
    -> Browser serves stale response immediately (0ms delay)
    -> Browser revalidates in background
    -> Next request gets fresh data
    -> User sees 0ms delay (stale data acceptable for categories)
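The cache's choice between these paths depends only on the response's age relative to max-age and the stale-while-revalidate window. A simplified sketch of that decision, ignoring other directives such as no-cache; `FreshnessPolicy` and its `Decision` names are hypothetical, and real caches take more inputs into account.

```java
final class FreshnessPolicy {

    enum Decision { SERVE_FRESH, SERVE_STALE_AND_REVALIDATE, FETCH_BEFORE_SERVING }

    /**
     * Decides how a cache may answer, given the stored response's age and the
     * max-age and stale-while-revalidate values from Cache-Control (all in seconds).
     */
    static Decision decide(long ageSeconds, long maxAgeSeconds, long staleWhileRevalidateSeconds) {
        if (ageSeconds < maxAgeSeconds) {
            return Decision.SERVE_FRESH;
        }
        if (ageSeconds < maxAgeSeconds + staleWhileRevalidateSeconds) {
            // Inside the SWR window: answer from cache now, refresh in the background
            return Decision.SERVE_STALE_AND_REVALIDATE;
        }
        // Too old: the user waits for the origin
        return Decision.FETCH_BEFORE_SERVING;
    }

    public static void main(String[] args) {
        // Shared reference data: max-age=300, stale-while-revalidate=60
        System.out.println(decide(120, 300, 60));
        System.out.println(decide(330, 300, 60));
        System.out.println(decide(400, 300, 60));
    }
}
```

With the shared-reference settings, only a request arriving more than 360 seconds after the response was stored pays the full origin latency.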
Locust Benchmark: Chatty vs Optimized
Full benchmark comparing the chatty client pattern against the optimized BFF pattern:
# locust_chattiness_benchmark.py
import time

from locust import HttpUser, task, between, events


class ChattyMobileClient(HttpUser):
    """Simulates mobile app making separate API calls."""
    wait_time = between(2, 5)

    @task
    def home_screen(self):
        start = time.perf_counter()  # Monotonic clock for elapsed time
        # Chatty: 8 separate calls
        self.client.get("/api/articles?page_size=10", name="articles")
        self.client.get("/api/articles/trending?limit=5", name="trending")
        self.client.get("/api/recommendations?limit=10", name="recs")
        self.client.get("/api/user/reading-progress", name="progress")
        self.client.get("/api/categories/popular", name="categories")
        self.client.get("/api/user/preferences", name="prefs")
        self.client.get("/api/notifications/unread-count", name="notifs")
        self.client.get("/api/articles/bookmarked?limit=5", name="bookmarks")
        elapsed = time.perf_counter() - start
        # Report the whole page load as one synthetic "PAGE" entry
        events.request.fire(
            request_type="PAGE", name="chatty-home",
            response_time=elapsed * 1000, response_length=0,
            exception=None, context={}
        )


class OptimizedMobileClient(HttpUser):
    """Simulates mobile app using BFF aggregation + ETags."""
    wait_time = between(2, 5)
    etag = None  # Last ETag seen by this simulated user

    @task
    def home_screen(self):
        start = time.perf_counter()
        headers = {}
        if self.etag:
            headers["If-None-Match"] = self.etag
        # Single BFF call
        resp = self.client.get(
            "/api/bff/mobile/home",
            headers=headers,
            name="bff-home"
        )
        if resp.status_code == 200:
            self.etag = resp.headers.get("ETag")
        # 304: no body transferred, etag unchanged
        elapsed = time.perf_counter() - start
        events.request.fire(
            request_type="PAGE", name="optimized-home",
            response_time=elapsed * 1000, response_length=0,
            exception=None, context={}
        )
Results at 1000 concurrent mobile users, 80ms RTT:
Metric                      Chatty (8 calls)    Optimized (BFF+ETag)
Page load P50:              340ms               95ms
Page load P99:              890ms               220ms
Requests/sec (total):       24,000              3,200 (fewer requests needed)
Bandwidth per page view:    52.4KB              18.4KB (first load)
Bandwidth (repeat visit):   52.4KB              0.1KB (304 response)
Server CPU at 1000 users:   68%                 22%
DB queries per page view:   12                  4
Connection overhead:        8 * HPACK headers   1 * HPACK headers
Improvements:
P50 latency: 3.6x faster
P99 latency: 4.0x faster
Bandwidth: 2.8x less (first visit), 524x less (repeat visit with 304)
Server CPU: 3.1x less
DB queries: 3.0x fewer
Combining All Techniques
The full optimization stack for the content platform mobile home screen:
// Layer 1: BFF aggregation (8 calls -> 1 call)
// Layer 2: Field selection (only fields mobile needs)
// Layer 3: Cursor pagination (no COUNT(*) overhead)
// Layer 4: ETag conditional request (304 on repeat)
// Layer 5: Zstandard compression (best speed/ratio for internal)
// Layer 6: Brotli compression (best ratio for browser client)
// Layer 7: Cache-Control with stale-while-revalidate
@GetMapping("/api/bff/mobile/home")
public ResponseEntity<MobileHomeFeed> mobileHome(
        @AuthenticationPrincipal UserPrincipal user,
        WebRequest request) {
    // Layer 4: Check ETag before doing any work
    String etag = feedVersionService.currentETag(user.id());
    if (request.checkNotModified(etag)) {
        return null; // 304: zero body bytes, ~100 bytes headers
    }
    // Layer 1: Aggregate all data in one server-side operation
    // Layer 2: Field selection built into assembler
    // Layer 3: Cursor pagination used internally
    var feed = mobileFeedAssembler.assemble(user.id());
    // Layer 5/6: Compression applied by filter based on Accept-Encoding
    // Layer 7: Cache headers. must-revalidate is left out on purpose: it forbids
    // serving stale responses and would cancel stale-while-revalidate
    return ResponseEntity.ok()
        .eTag(etag)
        .cacheControl(CacheControl.maxAge(30, TimeUnit.SECONDS)
            .cachePrivate()
            .staleWhileRevalidate(15, TimeUnit.SECONDS))
        .body(feed);
}
End-to-end payload progression:
Starting point (naive):
8 requests, 52.4KB, 340ms P50
After BFF aggregation:
1 request, 38.2KB, 95ms P50
After field selection:
1 request, 22.1KB, 92ms P50
After Brotli-4 compression:
1 request, 3.1KB, 91ms P50
After ETag on repeat visit:
1 request, 0.1KB, 12ms P50 (304 response)
Total improvement:
Requests: 8x fewer
Bandwidth (first): 16.9x less
Bandwidth (repeat): 524x less
Latency (first): 3.7x faster
Latency (repeat): 28x faster
Each technique compounds. Compression works better on the aggregated payload (one large JSON compresses better than 8 small ones due to more repeated patterns). ETags eliminate the entire transfer on repeat visits. Field selection reduces what needs to be compressed. The result: a mobile user on 3G experiences the content platform as responsive rather than sluggish.
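The claim that one aggregated payload compresses better than several small ones can be checked with a small experiment. The sketch below uses gzip from the JDK as a stand-in, since the JDK ships no Brotli encoder, and the eight JSON payloads are synthetic, so the absolute sizes will not match the figures above; only the direction of the effect carries over. `AggregateCompression` and its methods are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.GZIPOutputStream;

final class AggregateCompression {

    /** Size in bytes of the gzip-compressed UTF-8 encoding of the text. */
    static int gzipSize(String text) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (GZIPOutputStream gzip = new GZIPOutputStream(buffer)) {
                gzip.write(text.getBytes(StandardCharsets.UTF_8));
            }
            return buffer.size();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Eight synthetic JSON arrays standing in for the eight separate responses. */
    static List<String> samplePayloads() {
        List<String> payloads = new ArrayList<>();
        for (int section = 0; section < 8; section++) {
            StringBuilder json = new StringBuilder("[");
            for (int i = 0; i < 5; i++) {
                int id = section * 5 + i;
                if (i > 0) json.append(',');
                json.append("{\"id\":\"art-").append(id)
                    .append("\",\"title\":\"Article ").append(id)
                    .append("\",\"thumbnail_url\":\"https://cdn.example.com/t/")
                    .append(id).append(".jpg\"}");
            }
            payloads.add(json.append(']').toString());
        }
        return payloads;
    }

    /** Total bytes when every payload is compressed on its own. */
    static int separateTotal(List<String> payloads) {
        return payloads.stream().mapToInt(AggregateCompression::gzipSize).sum();
    }

    /** Bytes when the payloads are merged into one JSON array and compressed once. */
    static int aggregatedTotal(List<String> payloads) {
        return gzipSize("[" + String.join(",", payloads) + "]");
    }

    public static void main(String[] args) {
        List<String> payloads = samplePayloads();
        System.out.println("separate:   " + separateTotal(payloads) + " bytes");
        System.out.println("aggregated: " + aggregatedTotal(payloads) + " bytes");
    }
}
```

The aggregated stream pays the container overhead once and lets the compressor reuse field names and URL prefixes across all sections, which is where the gain comes from.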