spring boot the mechanics of magic

Tomcat, Threads, and Blocking I/O

5 min read Chapter 14 of 24
Summary

This section dissects the thread-per-request model in Apache Tomcat and its scalability limitations under blocking I/O. It explains how traditional servlet containers assign each HTTP request a dedicated platform thread from a fixed pool, leading to thread starvation when all threads are blocked on I/O operations like database calls or external API requests. Key Tomcat configuration parameters (maxThreads, minSpareThreads, acceptCount) are analyzed in this context. The section then introduces Java 21's virtual threads as a solution, explaining how they are lightweight JVM-managed threads that park when blocked, freeing carrier threads to handle other work. Configuration via `spring.threads.virtual.enabled=true` is demonstrated, along with programmatic alternatives. Thought experiments (warehouse picking queue, connection pool mismatch) illustrate the practical implications. The draft includes code examples simulating thread starvation, controller endpoints for testing, and configuration examples. Performance comparisons and pitfalls (synchronized blocks pinning, ThreadLocal considerations) are discussed. The section maintains focus on the LogisticsCore application throughout, showing how virtual threads transform request handling without code changes.

Tomcat, Threads, and Blocking I/O

The web layer in LogisticsCore is not a passive conduit for HTTP traffic—it is a critical performance and reliability boundary. Every request entering the system engages a chain of JVM and Spring Framework mechanics that must be understood at the level of thread scheduling, memory management, and I/O contention. This section dissects the request processing pipeline with a focus on operational reality: where threads block, how dispatchers route work, and what happens under load.

The Thread-per-Request Model

A traditional servlet container such as Apache Tomcat assigns each HTTP request a dedicated platform thread from a bounded pool and holds that thread for the request's entire duration, including any time spent blocked on I/O. The model is straightforward to program against, but it scales poorly under blocking I/O: once every pooled thread is busy or waiting on a database call or remote API, new requests have no thread to run on, so they queue and, once the queue is full, are rejected.
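
The effect can be reproduced outside Tomcat with a plain fixed-size executor. The sketch below is illustrative only: the class name `StarvationDemo`, the pool sizes, and the use of `Thread.sleep` as a stand-in for a blocking database or HTTP call are all assumptions, not part of LogisticsCore.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class StarvationDemo {

    /** Runs `requests` blocking tasks on `poolSize` platform threads and returns the elapsed milliseconds. */
    static long elapsedMillis(int poolSize, int requests, long blockMillis) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        long start = System.nanoTime();
        try {
            List<Future<?>> pending = new ArrayList<>();
            for (int i = 0; i < requests; i++) {
                pending.add(pool.submit(() -> {
                    try {
                        // Stand-in for a blocking call; the pooled thread is held the whole time.
                        Thread.sleep(blockMillis);
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                }));
            }
            for (Future<?> f : pending) {
                f.get(); // wait for every simulated request to finish
            }
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        // 12 requests on 4 threads run in three waves; on 12 threads they run in one.
        System.out.println("pool=4:  " + elapsedMillis(4, 12, 100) + " ms");
        System.out.println("pool=12: " + elapsedMillis(12, 12, 100) + " ms");
    }
}
```

With the smaller pool the requests are served in waves, so total time grows with the ratio of requests to threads even though the CPU is idle throughout.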

Configuration Parameters

Spring Boot exposes the Tomcat settings that govern the thread-per-request model as configuration properties:

  • server.tomcat.threads.max: The maximum number of threads in the thread pool.
  • server.tomcat.threads.min-spare: The minimum number of idle threads to keep in the pool.
  • server.tomcat.accept-count: The maximum queue length for incoming connection requests when all request processing threads are busy.

# Traditional platform thread configuration (before Java 21)
server.tomcat.threads.max=200
server.tomcat.threads.min-spare=10
server.tomcat.accept-count=100
server.tomcat.connection-timeout=20000

Virtual Threads with Java 21

Java 21 finalizes virtual threads (JEP 444): lightweight threads scheduled by the JVM rather than by the operating system. When a virtual thread blocks on I/O, the JVM parks it and frees the underlying platform thread (its carrier) to run another virtual thread. This suits I/O-bound work well: the thread-per-request model keeps its simple, blocking programming style but no longer needs a large platform-thread pool to scale.
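
As a rough illustration outside Spring, the sketch below starts one virtual thread per task using the JDK's `Executors.newVirtualThreadPerTaskExecutor()`. The class name `VirtualThreadDemo`, the task count, and the sleep that stands in for blocking I/O are assumptions chosen for the example; it requires Java 21 or later.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

class VirtualThreadDemo {

    /** Runs `tasks` blocking tasks, each on its own virtual thread; returns how many completed. */
    static int runBlockingTasks(int tasks, long blockMillis) {
        AtomicInteger completed = new AtomicInteger();
        // try-with-resources: close() waits for all submitted tasks to finish.
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < tasks; i++) {
                executor.submit(() -> {
                    try {
                        // While sleeping, the virtual thread is parked and its carrier is free.
                        Thread.sleep(blockMillis);
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                        return;
                    }
                    completed.incrementAndGet();
                });
            }
        }
        return completed.get();
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        int done = runBlockingTasks(10_000, 100);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println(done + " tasks finished in " + elapsedMs + " ms");
    }
}
```

Because every task blocks at the same time rather than in waves, the total run time should stay close to the duration of a single task plus scheduling overhead, rather than growing with the task count.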

Enabling Virtual Threads in Spring Boot

To enable virtual threads in a Spring Boot application, you can use the spring.threads.virtual.enabled property:

# Virtual thread configuration (Java 21+, Spring Boot 3.2+)
spring.threads.virtual.enabled=true

With virtual threads enabled, Spring Boot runs each request on its own virtual thread instead of borrowing one from Tomcat's platform-thread pool, so the pool-sizing parameters above no longer determine how many requests can be in flight at once.
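
For setups where the property is not available, one commonly used programmatic alternative is to replace Tomcat's request executor directly. The fragment below is a sketch under assumptions: it targets Spring Boot 3.x package names, the class and bean method names are hypothetical, and it needs Spring Boot's embedded Tomcat on the classpath, so it is not runnable on its own.

```java
import java.util.concurrent.Executors;

import org.springframework.boot.web.embedded.tomcat.TomcatProtocolHandlerCustomizer;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
class VirtualThreadTomcatConfig {

    // Hypothetical bean name. Swaps Tomcat's request executor for one that
    // starts a new virtual thread for every submitted task (Java 21+).
    @Bean
    TomcatProtocolHandlerCustomizer<?> virtualThreadExecutorCustomizer() {
        return protocolHandler ->
                protocolHandler.setExecutor(Executors.newVirtualThreadPerTaskExecutor());
    }
}
```

Where the property is supported, it is usually the simpler choice, since it also covers other Spring-managed executors rather than only the Tomcat connector.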

Comparison of Thread Models

The following diagram illustrates the difference between the traditional thread-per-request model and the virtual thread model:

THREAD-PER-REQUEST MODEL (PLATFORM THREADS):

+----------------+     +----------------+     +----------------+
|  Request 1     |     |  Request 2     |     |  Request N     |
|  Thread T1     |     |  Thread T2     |     |  Thread TN     |
|                |     |                |     |                |
|  CPU           |     |  CPU           |     |  CPU           |
|  I/O Block     |     |  I/O Block     |     |  I/O Block     |
|  (Thread Held) |     |  (Thread Held) |     |  (Thread Held) |
+----------------+     +----------------+     +----------------+
       ↑                      ↑                      ↑
       └──── Bounded Pool (e.g., 200 threads) ───────┘

Requests beyond the pool size: Queue or Rejection

VIRTUAL THREAD MODEL (JAVA 21+):

+----------------+     +----------------+     +----------------+
|  Request 1     |     |  Request 2     |     |  Request N     |
|  Virtual V1    |     |  Virtual V2    |     |  Virtual VN    |
|                |     |                |     |                |
|  CPU           |     |  CPU           |     |  CPU           |
|  I/O Block     |     |  I/O Block     |     |  I/O Block     |
|  (Suspended)   |     |  (Suspended)   |     |  (Suspended)   |
+----------------+     +----------------+     +----------------+
        │                      │                      │
        ▼                      ▼                      ▼
+--------------------------------------------------------------+
| Carrier Thread Pool (default: one thread per CPU core)       |
| Dynamically schedules runnable virtual threads               |
+--------------------------------------------------------------+

Thought Experiments

The Warehouse Picking Queue Analogy

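Imagine LogisticsCore's warehouse has 200 human workers (platform threads). Each worker receives a picking list (HTTP request), walks to the shelves (CPU processing), waits at the loading dock for an external truck (blocking I/O, 2 seconds), and returns with the items (response). With 1000 picking requests arriving at once, the first 200 start immediately, the next 100 wait in the break room (the accept-count queue), and the remaining 700 are turned away (rejected). This is a simplified picture that treats accept-count as the only buffer; in practice Tomcat also holds accepted connections, up to server.tomcat.max-connections, before the accept queue comes into play.

Now introduce robotic assistants (virtual threads). Each picking list is handed to its own robot, and a small crew of human operators (carrier threads, by default one per CPU core) drives whichever robots are ready to move. When a robot is waiting at the loading dock, its operator walks away and drives another robot. All 1000 requests are in flight at once and none is queued or rejected for lack of a worker, although the loading dock itself (the downstream resource) still limits how fast trucks can be served.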
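
The numbers in the platform-thread half of the analogy follow from simple arithmetic, sketched below. The class and method names are hypothetical, and the model is deliberately simplified: it treats the thread pool and accept queue as the only two buffers and assumes every request holds its thread for the full blocking time.

```java
import java.util.Arrays;

class AdmissionMath {

    /** Returns {running, queued, rejected} for a burst of simultaneous requests. */
    static int[] admit(int burst, int maxThreads, int acceptCount) {
        int running = Math.min(burst, maxThreads);
        int queued = Math.min(burst - running, acceptCount);
        int rejected = burst - running - queued;
        return new int[] {running, queued, rejected};
    }

    /** Upper bound on requests per second when each request holds a thread for holdSeconds. */
    static double maxThroughput(int maxThreads, double holdSeconds) {
        return maxThreads / holdSeconds;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(admit(1000, 200, 100))); // [200, 100, 700]
        System.out.println(maxThroughput(200, 2.0));                // 100.0
    }
}
```

The second figure is the more important one operationally: with 200 threads and a 2-second blocking call, this model cannot exceed 100 requests per second no matter how idle the CPU is.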

The Connection Pool Mismatch

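In the traditional model, sizing is mechanical: 200 request threads that may each need a database connection suggest a pool of up to 200 connections. With virtual threads, 1000 requests may be in flight at once, each making short DB calls. Does that call for 1000 connections? No. A connection is held only for the duration of a DB call, not for the whole request, so what matters is the peak number of concurrent DB calls, which might be closer to 50. Pool sizing becomes a function of the concurrency the database can sustain, not of the thread count.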
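
The decoupling of request concurrency from database concurrency can be sketched with a semaphore standing in for the connection pool. Everything here is an assumption for illustration: the class name, the 50-permit limit, and the sleep that simulates a query. A real pool such as HikariCP enforces the same kind of cap by making callers wait for a free connection. The sketch requires Java 21 or later.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

class BoundedDbAccess {

    /** Runs one virtual thread per request; returns the peak number of simultaneous simulated DB calls. */
    static int peakConcurrentCalls(int requests, int permits, long callMillis) {
        Semaphore dbPermits = new Semaphore(permits); // stand-in for the connection pool
        AtomicInteger inFlight = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < requests; i++) {
                executor.submit(() -> {
                    try {
                        dbPermits.acquire(); // parks the virtual thread until a "connection" is free
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                        return;
                    }
                    try {
                        int now = inFlight.incrementAndGet();
                        peak.accumulateAndGet(now, Math::max);
                        Thread.sleep(callMillis); // simulated query
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    } finally {
                        inFlight.decrementAndGet();
                        dbPermits.release();
                    }
                });
            }
        }
        return peak.get();
    }

    public static void main(String[] args) {
        int peak = peakConcurrentCalls(1000, 50, 10);
        System.out.println("peak concurrent DB calls: " + peak + " (never more than 50)");
    }
}
```

All 1000 requests are admitted immediately, yet the database side never sees more than the permitted number of concurrent calls; the waiting happens cheaply in parked virtual threads.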

Data Tables

The following table compares configuration parameters and their purposes in traditional and virtual thread models:

| Configuration Parameter | Traditional (Platform Threads) | Virtual Threads (Java 21+) | Purpose |
| --- | --- | --- | --- |
| server.tomcat.threads.max | Critical (e.g., 200) | Less critical (defaults used) | Maximum concurrent request-handling threads |
| server.tomcat.threads.min-spare | Important for startup performance | Not applicable | Minimum idle threads kept ready |
| server.tomcat.accept-count | Critical buffer (e.g., 100) | Much less important | Queue size when all threads busy |
| spring.threads.virtual.enabled | false (or not set) | true | Enables virtual thread support |

Conclusion

Do not treat virtual threads as a transparent performance upgrade. They shift the bottleneck from thread exhaustion to downstream resource pressure, particularly database connection pools and remote service rate limits. In LogisticsCore, enabling virtual threads without revisiting the HikariCP pool size means that, under load, far more requests can contend for connections at once than the pool was sized for, and the excess shows up as waits and timeouts when acquiring a connection. The correct migration path is: (1) enable virtual threads, (2) monitor DB connection utilization, (3) cap the connection pool at the database's sustainable concurrency level, and (4) implement structured concurrency with StructuredTaskScope (still a preview API in Java 21) to bound fan-out. The thread-per-request model remains, but the cost of a thread is no longer the constraint on request concurrency.