gRPC, HTTP/2 Multiplexing, and Connection Reuse
The Black Box
The route optimizer creates a new HTTP connection for each request to the package service. At 50 requests per route and 100 routes computed per minute, the optimizer creates 5,000 TCP connections per minute. Each connection involves a TCP handshake (1 round trip), a TLS handshake (1 more round trip for a full TLS 1.3 handshake, 2 for TLS 1.2), and HTTP negotiation. The route optimizer spends more time establishing connections than using them.
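The setup cost can be put into numbers with a small calculation. This is a minimal sketch, not a measurement: the class and method names are hypothetical, the 0.5ms round trip is the figure used later in this section, and the TLS round-trip count is left as a parameter (1 for a full TLS 1.3 handshake, 2 for TLS 1.2).

```java
final class HandshakeCost {
    // Round trips spent on a fresh connection before the first request can be sent.
    static int setupRoundTrips(int tcpRoundTrips, int tlsRoundTrips) {
        return tcpRoundTrips + tlsRoundTrips;
    }

    // One new connection per request: requests per route times routes per minute.
    static long connectionsPerMinute(int requestsPerRoute, int routesPerMinute) {
        return (long) requestsPerRoute * routesPerMinute;
    }

    // Time spent purely on connection setup, in milliseconds.
    static double setupMillis(long connections, int setupRoundTrips, double rttMillis) {
        return connections * setupRoundTrips * rttMillis;
    }

    public static void main(String[] args) {
        long connections = connectionsPerMinute(50, 100);
        int setup = setupRoundTrips(1, 1); // TCP + full TLS 1.3 handshake (assumed)
        double rttMillis = 0.5;            // assumed round-trip time
        System.out.println("Connections per minute: " + connections);
        System.out.println("Handshake ms per minute, new connection per request: "
                + setupMillis(connections, setup, rttMillis));
        System.out.println("Handshake ms with one reused connection: "
                + setupMillis(1, setup, rttMillis));
    }
}
```

Under these assumptions the per-request connections spend seconds per minute on handshakes alone, while a single reused connection pays the setup cost once.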
The Mechanism
HTTP/1.1 Connection Model
HTTP/1.1 allows connection reuse via keep-alive, but each connection handles one request at a time. To send 50 concurrent requests, 50 connections are needed. Each connection consumes:
- A file descriptor on both client and server
- A TCP send/receive buffer (typically 128KB per direction)
- A TLS session (approximately 10KB of memory for session state)
50 connections: 50 file descriptors and ~13MB of buffer and TLS session memory per side (50 × 266KB), rounded up to 14MB in the comparison below.
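The per-side memory estimate follows directly from the per-connection figures listed above. The sketch below only redoes that arithmetic; the class and method names are hypothetical, and the buffer and session sizes are the typical values quoted in the list, not measurements. With those inputs the total comes to about 13MB per side, close to the rounded total used in this section.

```java
final class ConnectionMemory {
    static final long KB = 1024;

    // One send buffer, one receive buffer, plus TLS session state.
    static long bytesPerConnection(long bufferBytesPerDirection, long tlsSessionBytes) {
        return 2 * bufferBytesPerDirection + tlsSessionBytes;
    }

    static long totalBytes(int connections, long bytesPerConnection) {
        return connections * bytesPerConnection;
    }

    public static void main(String[] args) {
        long perConnection = bytesPerConnection(128 * KB, 10 * KB);
        System.out.println("Per connection: " + perConnection / KB + " KB");
        System.out.println("50 connections: " + totalBytes(50, perConnection) / KB + " KB");
        System.out.println("1 connection:   " + totalBytes(1, perConnection) / KB + " KB");
    }
}
```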
HTTP/2 Multiplexing
HTTP/2 uses a single TCP connection with multiple streams. Each stream carries one request/response pair. Streams are interleaved on the connection: bytes from stream 3 can be sent between bytes from stream 7. The connection is full-duplex.
HTTP/1.1 (50 requests, keep-alive, serial):
Conn 1: [req1 →] [← resp1] [req2 →] [← resp2] ... [req50 →] [← resp50]
Total: 50 round trips × 0.5ms = 25ms minimum
HTTP/2 (50 requests, multiplexed):
Conn 1: [req1,req2,...,req50 →→→] [←←← resp1,resp2,...,resp50]
Total: ~1 round trip × 0.5ms = 0.5ms minimum (plus server processing)
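Interleaving works because every HTTP/2 frame carries the identifier of the stream it belongs to, so the receiver can reassemble each stream independently. The toy model below illustrates that idea only: `Frame`, `interleave`, and `demultiplex` are hypothetical names, the payloads are plain strings rather than binary frames, and the round-robin order is an assumption (real implementations schedule frames using flow control and their own policies).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class StreamInterleaving {
    // One frame on the shared connection: a stream id plus a chunk of payload.
    static final class Frame {
        final int streamId;
        final String chunk;
        Frame(int streamId, String chunk) { this.streamId = streamId; this.chunk = chunk; }
    }

    // Sender side: emit one chunk per stream per pass until every stream is drained.
    static List<Frame> interleave(Map<Integer, List<String>> streams) {
        List<Frame> wire = new ArrayList<>();
        boolean more = true;
        for (int i = 0; more; i++) {
            more = false;
            for (Map.Entry<Integer, List<String>> e : streams.entrySet()) {
                if (i < e.getValue().size()) {
                    wire.add(new Frame(e.getKey(), e.getValue().get(i)));
                    more = true;
                }
            }
        }
        return wire;
    }

    // Receiver side: group chunks by stream id, preserving arrival order per stream.
    static Map<Integer, String> demultiplex(List<Frame> wire) {
        Map<Integer, StringBuilder> parts = new TreeMap<>();
        for (Frame f : wire) {
            parts.computeIfAbsent(f.streamId, k -> new StringBuilder()).append(f.chunk);
        }
        Map<Integer, String> out = new TreeMap<>();
        parts.forEach((id, sb) -> out.put(id, sb.toString()));
        return out;
    }

    public static void main(String[] args) {
        Map<Integer, List<String>> streams = new TreeMap<>();
        streams.put(3, List.of("GET ", "/pkg/", "1"));
        streams.put(7, List.of("GET ", "/pkg/2"));
        List<Frame> wire = interleave(streams);
        for (Frame f : wire) System.out.println("stream " + f.streamId + ": " + f.chunk);
        System.out.println(demultiplex(wire));
    }
}
```

Chunks from stream 7 travel between chunks from stream 3, yet each stream is reconstructed intact on the other side.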
gRPC uses HTTP/2 natively. Every gRPC call is an HTTP/2 stream. A single ManagedChannel in the gRPC Java client maintains one or more HTTP/2 connections and multiplexes all RPCs across them.
gRPC Channel Lifecycle
// Concept: gRPC channel configuration for the route optimizer
// One channel per target service. Reused across all RPC calls.
// Do NOT create a new channel per request.
import io.grpc.ManagedChannel;
import io.grpc.ManagedChannelBuilder;
import java.util.concurrent.TimeUnit;

// BLACK BOX: creating a channel per request
PackageServiceGrpc.PackageServiceBlockingStub getPackage(String host) {
    // A new channel is built for every call and never shut down, so each call
    // pays for TCP, TLS, and HTTP/2 setup and leaks the channel's resources.
    ManagedChannel channel = ManagedChannelBuilder.forAddress(host, 9090).build();
    return PackageServiceGrpc.newBlockingStub(channel);
}

// MECHANISM: shared channel, reused across calls
private final ManagedChannel channel = ManagedChannelBuilder
    .forAddress("package-service", 9090)
    .usePlaintext()                            // No TLS; remove this line to use TLS
    .keepAliveTime(30, TimeUnit.SECONDS)       // Ping after 30s without read activity
    .keepAliveTimeout(5, TimeUnit.SECONDS)     // Close if the ping is not acknowledged in 5s
    .maxInboundMessageSize(4 * 1024 * 1024)    // 4MB max inbound message
    .build();

private final PackageServiceGrpc.PackageServiceBlockingStub stub =
    PackageServiceGrpc.newBlockingStub(channel);

// The channel manages the HTTP/2 connection(s).
// All RPC calls through 'stub' are multiplexed on the same connection.
// The connection is established on first use and reused until the channel is shut down.
// Call channel.shutdown() when the application stops.
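When the optimizer talks to more than one backend, the one-channel-per-target rule can be enforced with a small registry. The sketch below is an illustration, not part of the gRPC API: `ChannelRegistry` is a hypothetical helper, and it is generic over the channel type so that it runs with only the JDK. In a real client the type parameter would presumably be `ManagedChannel` and the factory would wrap a `ManagedChannelBuilder` call.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;

final class ChannelRegistry<C> {
    private final ConcurrentMap<String, C> channels = new ConcurrentHashMap<>();
    private final Function<String, C> factory;

    ChannelRegistry(Function<String, C> factory) {
        this.factory = factory;
    }

    // Returns the shared channel for a target, creating it at most once
    // even when many threads ask for the same target concurrently.
    C channelFor(String target) {
        return channels.computeIfAbsent(target, factory);
    }

    int size() {
        return channels.size();
    }

    public static void main(String[] args) {
        // A String stands in for a real channel object in this demo.
        ChannelRegistry<String> registry = new ChannelRegistry<>(t -> "channel to " + t);
        String first = registry.channelFor("package-service:9090");
        String second = registry.channelFor("package-service:9090");
        System.out.println("Same instance reused: " + (first == second));
        System.out.println("Channels created: " + registry.size());
    }
}
```

`computeIfAbsent` is what makes the reuse safe under concurrency: callers racing on the same target all receive the one stored instance.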
Deadline Propagation
gRPC deadlines prevent requests from waiting indefinitely. A deadline propagates from the client through intermediate services: if the route optimizer sets a 500ms deadline, the package service knows it has 500ms total, not 500ms per hop.
// Concept: gRPC deadline to prevent unbounded waiting
PackageInfo result = stub
.withDeadlineAfter(200, TimeUnit.MILLISECONDS) // 200ms deadline
.getPackage(request);
// If the package service does not respond within 200ms:
// - Client receives StatusRuntimeException with Status.DEADLINE_EXCEEDED
// - Server is notified that the client has cancelled (server can stop processing)
// - No thread is blocked waiting for a response that will be discarded
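The "total, not per hop" behaviour comes from treating the deadline as a fixed point in time and giving each downstream call only what is left. The sketch below models that bookkeeping with plain JDK code; `DeadlineBudget` is a hypothetical class, not a gRPC type, and the clock value is passed in explicitly so the example stays deterministic.

```java
final class DeadlineBudget {
    private final long deadlineNanos; // absolute point on a monotonic clock

    private DeadlineBudget(long deadlineNanos) {
        this.deadlineNanos = deadlineNanos;
    }

    // Fix the deadline once, at the edge of the system.
    static DeadlineBudget after(long timeoutMillis, long nowNanos) {
        return new DeadlineBudget(nowNanos + timeoutMillis * 1_000_000L);
    }

    // Time left for this hop and everything downstream of it.
    long remainingMillis(long nowNanos) {
        return Math.max(0L, (deadlineNanos - nowNanos) / 1_000_000L);
    }

    boolean isExpired(long nowNanos) {
        return deadlineNanos - nowNanos <= 0;
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        DeadlineBudget budget = DeadlineBudget.after(500, start);
        // Pretend the first hop used 120ms before calling the next service.
        long afterFirstHop = start + 120 * 1_000_000L;
        System.out.println("Budget for next hop: " + budget.remainingMillis(afterFirstHop) + " ms");
        System.out.println("Expired after 600ms: " + budget.isExpired(start + 600 * 1_000_000L));
    }
}
```

A hop that sees a remaining budget of zero can fail fast instead of starting work whose result would be discarded.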
The Observable Consequence
Connection costs for the route optimizer’s 50-request batch:
| Metric | HTTP/1.1 (50 connections) | HTTP/1.1 (keep-alive, serial) | gRPC (1 connection) |
|---|---|---|---|
| TCP handshakes | 50 | 1 | 1 |
| Concurrent in-flight | 50 | 1 | 50 |
| Total round trips | 50 | 50 | 1 (batch RPC) |
| Wall clock time | 0.5ms + handshakes + processing | 25ms + processing | 0.5ms + processing |
| Memory (buffers) | 14 MB | 280 KB | 280 KB |
| File descriptors | 50 | 1 | 1 |
The gRPC approach with a batch RPC reduces network overhead from 25ms to 0.5ms and memory usage from 14MB to 280KB. For a route optimizer computing 100 routes/minute, the savings are 2,450ms of network latency and 1.4GB of connection buffer churn per minute.
The Decision Rule
Use gRPC for internal service-to-service communication when you control both the client and server, request volume is high (hundreds+ RPCs/second), and you benefit from compile-time type safety via Protobuf schemas.
Do not use gRPC directly for browser-facing APIs: browsers do not give page code access to HTTP/2 trailers, which gRPC requires, so browser clients need gRPC-Web and a translating proxy. Do not use gRPC for services that are called fewer than 10 times per minute. The tooling overhead (Protobuf codegen, gRPC stubs, channel management) is not justified for low-frequency calls. A plain HTTP/1.1 JSON endpoint is simpler to implement, debug, and monitor.
When adopting gRPC, create one ManagedChannel per target service and share it across all callers. Configure keepalive to detect dead connections. Set deadlines on every call to prevent unbounded waits.