The Web Layer: Threads and Dispatchers

The web layer in LogisticsCore is not a passive conduit for HTTP traffic—it is a critical performance and reliability boundary. Every request entering the system engages a chain of JVM and Spring Framework mechanics that must be understood at the level of thread scheduling, memory management, and I/O contention. This section dissects the request processing pipeline with a focus on operational reality: where threads block, how dispatchers route work, and what happens under load. The goal is not to describe abstractions but to expose the cost model of each design decision.

Servlet Container: The Foundation, Not the Solution

The Servlet container—Tomcat, Jetty, or Undertow—provides the execution environment for Spring Framework web applications. It manages thread pools, socket I/O, and servlet lifecycle events. In LogisticsCore, the container hosts the DispatcherServlet, which serves as the front controller. However, the container does not absolve the application of performance responsibility. Its default thread-per-request model ties one OS thread to each active request, a model that scales only as far as the thread pool allows.

Java’s platform threads are thin wrappers over OS threads (pthreads on Linux), and they are expensive: each reserves about 1 MB of stack by default on typical 64-bit JVMs (tunable with -Xss), and switching between them requires a kernel context switch. Under high concurrency, thread exhaustion is not an exotic failure mode; it is the expected outcome when blocking I/O dominates the call path. This is not a container limitation; it is a consequence of the 1:1 mapping between platform threads and OS threads.

Spring Boot configures these containers with opinionated defaults, but those defaults assume moderate load and fast dependencies. In LogisticsCore, where warehouse inventory checks and carrier rate lookups involve remote calls, those assumptions break down. The container is not the problem; the thread model it enforces is.

DispatcherServlet: Control Flow and Hidden Costs

The DispatcherServlet is the central orchestrator in Spring Framework’s MVC architecture. Its request flow is deterministic but not free:

Request Received: The container assigns a thread from its pool (e.g., Tomcat’s maxThreads=200).
Handler Mapping: The RequestMappingHandlerMapping resolves the request to a controller method. The mappings are built once at startup, so the per-request cost is an in-memory lookup and pattern match: fast, and with no classpath scanning involved.
Handler Adapter: The RequestMappingHandlerAdapter invokes the method. This component handles argument resolution (e.g., @RequestBody deserialization), which reads the request body from the Servlet input stream with blocking I/O. A slow client upload holds the thread for as long as the body takes to arrive.
Interceptors: HandlerInterceptors run pre- and post-handling. If an interceptor calls a metrics service or auth server synchronously, the thread blocks.
View Resolution and Rendering: Rare in LogisticsCore’s JSON-only APIs, but if used, template engines like Thymeleaf render synchronously on the same thread and load templates with blocking file I/O whenever they are not served from cache.

Every phase in this chain is synchronous and runs on the container thread. There is no backpressure. There is no automatic offloading. If any step blocks—database query, cache miss, external API—the thread is lost to useful work.
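
The point that every phase shares one thread can be made concrete with a toy front controller. This is a minimal sketch using only the JDK, not Spring's actual DispatcherServlet; the class MiniDispatcher, its method names, and the phase labels are hypothetical and exist only for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/** Toy front controller (hypothetical): every phase runs on the calling thread. */
class MiniDispatcher {
    // Stand-in for a handler mapping: path -> handler, built up front.
    private final Map<String, Function<String, String>> handlers = new HashMap<>();
    // Records "phase@threadName" so the caller can see where each phase ran.
    private final List<String> trace = new ArrayList<>();

    void register(String path, Function<String, String> handler) {
        handlers.put(path, handler);
    }

    String dispatch(String path, String body) {
        record("mapping");                        // in-memory lookup
        Function<String, String> handler = handlers.get(path);
        if (handler == null) {
            return "404";
        }
        record("preHandle");                      // interceptor phase
        record("handler");
        String result = handler.apply(body);      // if this blocks, the thread blocks
        record("postHandle");
        return result;
    }

    List<String> trace() {
        return trace;
    }

    private void record(String phase) {
        trace.add(phase + "@" + Thread.currentThread().getName());
    }

    public static void main(String[] args) {
        MiniDispatcher dispatcher = new MiniDispatcher();
        dispatcher.register("/inventory/check", sku -> {
            try {
                Thread.sleep(200);                // simulated 200 ms database round-trip
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            return "in-stock:" + sku;
        });
        System.out.println(dispatcher.dispatch("/inventory/check", "SKU-1"));
        dispatcher.trace().forEach(System.out::println);
    }
}
```

Every trace entry carries the same thread name, and the thread is unavailable for the full 200 ms that the simulated handler sleeps.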

This is not a flaw in Spring Framework; it is the contract. The framework provides extension points, but it does not change the underlying execution model. The burden is on the developer to ensure that no blocking call enters this path unchecked.

Blocking I/O: The Scalability Ceiling

Blocking I/O is the primary constraint on throughput in traditional Servlet applications. When a platform thread waits for a PostgreSQL query or a REST call to a shipping provider, the OS parks it: it consumes no CPU, but it stays bound to the request, keeps its stack reserved, and cannot serve anyone else until the call returns.

In LogisticsCore, a single /inventory/check endpoint that blocks on a 200ms database round-trip limits throughput to at most 5 requests per thread per second. With a 200-thread pool, maximum theoretical throughput is 1,000 RPS. Real-world contention reduces this further. This is not a database bottleneck—it is a thread exhaustion bottleneck.
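
The arithmetic above can be written down as a small helper. This is a back-of-the-envelope sketch: the class and method names are hypothetical, and it assumes every request holds its thread for the full latency with no other bottleneck.

```java
/** Back-of-the-envelope ceiling for a thread-per-request pool (hypothetical helper). */
class ThreadPoolCeiling {

    /** Upper bound on requests/second when each request holds a thread for latencyMillis. */
    static long maxRequestsPerSecond(int threads, long latencyMillis) {
        if (threads <= 0 || latencyMillis <= 0) {
            throw new IllegalArgumentException("threads and latencyMillis must be positive");
        }
        return threads * 1000L / latencyMillis;
    }

    /** Little's law: threads needed = arrival rate x time each request holds a thread. */
    static long threadsNeeded(long requestsPerSecond, long latencyMillis) {
        if (requestsPerSecond < 0 || latencyMillis <= 0) {
            throw new IllegalArgumentException("invalid arguments");
        }
        return (requestsPerSecond * latencyMillis + 999) / 1000; // round up
    }

    public static void main(String[] args) {
        System.out.println(maxRequestsPerSecond(200, 200)); // prints 1000
        System.out.println(threadsNeeded(5000, 200));       // prints 1000
    }
}
```

Read the other way round, sustaining 5,000 RPS at the same 200 ms latency would need about 1,000 platform threads, which is where the stack cost of each thread starts to dominate.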

Virtual Threads: A Mechanistic Fix

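Java 21’s virtual threads address the thread scarcity problem directly. Unlike platform threads, virtual threads are scheduled by the JVM, not the OS. By default the scheduler is a dedicated ForkJoinPool whose worker threads act as carriers, and many virtual threads are multiplexed onto that small set of carrier threads. A virtual thread's stack lives on the heap and grows as needed, so creating one costs roughly microseconds and kilobytes rather than a reserved megabyte of stack.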

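When the Servlet container is configured to use a virtual thread per request (e.g., via spring.threads.virtual.enabled=true in Spring Boot 3.2 or later, running on Java 21), each incoming request runs on its own virtual thread. Blocking operations still block, but now they block a lightweight JVM-managed thread, not an OS thread. The virtual thread is unmounted and its carrier thread is freed to run other virtual threads. One caveat on Java 21: a virtual thread that blocks inside a synchronized block or a native call is pinned to its carrier and does not release it.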

This is not magic. It does not eliminate latency. It does not make I/O faster. What it does is decouple concurrency from OS thread count. LogisticsCore can now hold 10,000 concurrent requests in flight on a carrier pool that defaults to one thread per available processor, because the blocking cost is shifted from the OS scheduler to the JVM’s scheduler. Downstream limits, such as the size of the JDBC connection pool, still apply.
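
The effect can be observed outside any container. The following is a minimal sketch that assumes Java 21 and uses only JDK APIs; the class name VirtualThreadDemo and the task counts are illustrative, and it does not show how Tomcat or Spring Boot wires virtual threads internally.

```java
import java.time.Duration;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

/** Runs many blocking tasks, one virtual thread each (requires Java 21). */
class VirtualThreadDemo {

    /** Returns how many of the blocking tasks ran to completion. */
    static int runBlockingTasks(int tasks, Duration blockFor) {
        AtomicInteger completed = new AtomicInteger();
        // try-with-resources: close() waits for all submitted tasks to finish.
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < tasks; i++) {
                executor.submit(() -> {
                    try {
                        Thread.sleep(blockFor); // parks the virtual thread, frees its carrier
                        completed.incrementAndGet();
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
        }
        return completed.get();
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        int done = runBlockingTasks(10_000, Duration.ofMillis(200));
        long elapsedMillis = (System.nanoTime() - start) / 1_000_000;
        System.out.println(done + " tasks finished in " + elapsedMillis + " ms");
    }
}
```

With platform threads, the same 10,000 sleeping tasks would need either 10,000 OS threads or a queue behind a bounded pool. Here they overlap on a handful of carriers, so the total wall time should stay far below 10,000 x 200 ms.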

The trade-off is CPU-bound work. Virtual threads excel at I/O-bound tasks but provide no benefit for CPU-heavy operations. In LogisticsCore, this means they are ideal for request handling, but not for batch inventory reconciliation.

Reactive Stack: A Different Execution Model

The Reactive Stack—built on Project Reactor and integrated via Spring WebFlux—is not an incremental improvement. It is a different programming model based on non-blocking, asynchronous data streams.

At its core is the event loop: a small, fixed number of threads (often one per CPU core) that never block. Instead of assigning a thread to a request, the DispatcherHandler (the reactive analog of DispatcherServlet) registers callbacks for I/O events. When data arrives from a socket, a callback is invoked on an event loop thread.

This model eliminates thread-per-request overhead entirely. LogisticsCore can handle tens of thousands of concurrent connections with minimal memory footprint. Backpressure is built in: slow consumers signal upstream to slow down production.
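
Backpressure can be illustrated without Reactor by using the JDK's own Reactive Streams interfaces in java.util.concurrent.Flow. This is a sketch of the demand-signalling idea only, not of WebFlux or Project Reactor; the class names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Flow;
import java.util.concurrent.SubmissionPublisher;
import java.util.concurrent.TimeUnit;

/** Demand-driven delivery with the JDK Flow API (illustrative, not Reactor). */
class BackpressureDemo {

    /** Subscriber that requests exactly one item at a time. */
    static final class OneAtATime implements Flow.Subscriber<Integer> {
        final List<Integer> received = new ArrayList<>();
        final CountDownLatch done = new CountDownLatch(1);
        private Flow.Subscription subscription;

        @Override public void onSubscribe(Flow.Subscription s) {
            subscription = s;
            s.request(1);                 // initial demand: one item
        }

        @Override public void onNext(Integer item) {
            received.add(item);
            subscription.request(1);      // signal readiness for the next item
        }

        @Override public void onError(Throwable t) { done.countDown(); }

        @Override public void onComplete() { done.countDown(); }
    }

    /** Publishes 1..count and returns what the subscriber received, in order. */
    static List<Integer> publishAndCollect(int count) {
        OneAtATime subscriber = new OneAtATime();
        try (SubmissionPublisher<Integer> publisher = new SubmissionPublisher<>()) {
            publisher.subscribe(subscriber);
            for (int i = 1; i <= count; i++) {
                publisher.submit(i);      // blocks the producer while the buffer is full
            }
        }                                 // close() signals onComplete after pending items
        try {
            if (!subscriber.done.await(5, TimeUnit.SECONDS)) {
                throw new IllegalStateException("timed out waiting for completion");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return subscriber.received;
    }

    public static void main(String[] args) {
        System.out.println(publishAndCollect(1000).size() + " items delivered");
    }
}
```

The producer never outruns the consumer by more than the publisher's buffer: items are delivered only against outstanding demand, and submit() stalls the producer once the buffer fills.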

Key Components, Mechanistically

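DispatcherHandler: Routes requests in WebFlux. Unlike DispatcherServlet, it does not hold a thread for the duration of the request. It is invoked from the server's event loop: Reactor Netty by default, with Undertow and Servlet containers in non-blocking mode also supported.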
RouterFunctions or @Controller: Define request mappings. RouterFunctions offer a functional, composable alternative to annotation-based routing.
WebClient: A reactive HTTP client that uses Reactor Netty by default for non-blocking I/O. Each call returns a Mono or Flux, and nothing is sent until something subscribes to it.

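The Reactive Stack does not rely on virtual threads. It avoids blocking altogether. This is a stricter contract: no synchronous calls are allowed anywhere in the chain. Reactor rejects block() on its own non-blocking threads with an IllegalStateException, but it cannot detect other blocking calls: a single JDBC query or Thread.sleep() on an event loop thread stalls every connection that loop serves.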
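
The cost of one blocking call on an event loop can be measured with a stand-in. In this sketch a single-threaded executor plays the role of one event loop thread; it is not Netty or Reactor, and the class name EventLoopStall is hypothetical.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

/** Shows how one blocking task delays everything queued behind it on a single thread. */
class EventLoopStall {

    /** Returns how long (ms) a trivial task waited behind a task that blocks for blockMillis. */
    static long queueDelayMillis(long blockMillis) {
        ExecutorService eventLoop = Executors.newSingleThreadExecutor(); // stand-in event loop
        try {
            Runnable offender = () -> {
                try {
                    Thread.sleep(blockMillis);   // the blocking call that should not be here
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            };
            Callable<Long> innocent = System::nanoTime; // records when it finally gets to run

            long submittedAt = System.nanoTime();
            eventLoop.submit(offender);
            Future<Long> startedAt = eventLoop.submit(innocent);
            return TimeUnit.NANOSECONDS.toMillis(startedAt.get() - submittedAt);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            eventLoop.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("innocent task waited ~" + queueDelayMillis(200) + " ms");
    }
}
```

The second task does no I/O at all, yet it inherits the full latency of the blocking task ahead of it. On a real event loop the same delay applies to every connection multiplexed onto that thread.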

Choosing the Right Stack: A Prescriptive Guide

The choice between Servlet with Virtual Threads and Reactive is not about preference. It is about alignment with the application’s I/O profile and team capability.

Use Servlet + Virtual Threads if:
- LogisticsCore’s dependencies are predominantly blocking (e.g., JDBC, legacy REST APIs).
- The team is experienced with imperative programming.
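- Migration cost must be minimized. Virtual threads typically require no application code changes, only configuration, although synchronized blocks that wrap blocking calls are worth auditing for pinning on Java 21.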
- Throughput is limited by I/O concurrency, not CPU.

Use Reactive Stack if:
- The system can adopt reactive clients (e.g., R2DBC, WebClient).
- The team can enforce a non-blocking discipline.
- The application must scale to very high concurrency with minimal resources.
- Backpressure is a requirement, not a nice-to-have.

Do not adopt Reactive “to be modern.” If you make blocking calls inside the reactive chain, or fall back to block() wherever a Mono appears, you have gained nothing and lost clarity. The Reactive Stack is not a performance panacea. It is a constraint system that forces correct behavior at the cost of complexity.

Conclusion: Enforce the Right Contract

In LogisticsCore, the web layer must be designed around the dominant cost: I/O latency. The Servlet stack with Virtual Threads is the pragmatic upgrade path, leveraging Java 21’s runtime improvements to extend the life of imperative code under load. It requires no rewrite, only an upgrade to Java 21, a Spring Boot version that supports virtual threads (3.2 or later), and a configuration change.

The Reactive Stack is superior in resource efficiency but demands a full-stack commitment to asynchrony. It is not suitable for piecemeal adoption. Mixing blocking and non-blocking code creates hidden failure modes.

Choose Virtual Threads for evolutionary scalability. Choose Reactive for revolutionary efficiency. Do not leave this decision to defaults. Audit every I/O operation. Measure thread utilization. Enforce the contract at the architecture level—because the JVM will not do it for you.
