Jagdish Salgotra
Jul 13, 2025 · 15 min read · Project Loom
Learn to build ultra-scalable web services using Java's virtual threads. This guide covers implementing high-performance HTTP servers that handle massive concurrent connections by efficiently managing blocking I/O operations.
Note: This series uses Java 21 as the baseline. Virtual threads are stable in Java 21 (JEP 444). Structured concurrency snippets in this part (StructuredTaskScope, JEP 453) use preview APIs and require --enable-preview.
In 1999, Dan Kegel posed the C10K problem: how can a web server handle 10,000 concurrent connections efficiently? For over two decades, this challenge has driven the evolution of web server architectures, from Apache's process-per-request model to Nginx's event-driven architecture, and later to Node.js's async I/O and reactive programming frameworks.
Today, with virtual threads, we can handle C10K-class traffic with simple blocking code and lower thread-management overhead.
In this part, we’ll build a production-style service, run it under load, and look at practical patterns that hold up in real systems.
Traditional Java web servers face a fundamental limitation: platform threads are expensive. Each platform thread typically consumes around 1-2MB of stack memory and is mapped 1:1 to an OS thread. This creates an early scalability ceiling for I/O-heavy services.
// From PlatformThreadPoolServer (feature/java-21)
ExecutorService threadPoolExecutor = Executors.newFixedThreadPool(THREAD_POOL_SIZE);
HttpServer server = HttpServer.create(new InetSocketAddress(PORT), 0);
server.createContext("/api", exchange -> {
    threadPoolExecutor.submit(() -> {
        try {
            String currentThreadName = Thread.currentThread().getName();
            System.out.println("Received request on platform thread pool: " + currentThreadName);
            Thread.sleep(BLOCKING_SIMULATION_TIME);
            String response = "Platform Thread Ok\n";
            exchange.sendResponseHeaders(200, response.length());
            exchange.getResponseBody().write(response.getBytes());
            exchange.close();
        } catch (Exception e) {
            e.printStackTrace();
        }
    });
});
Practical constraints with platform threads:
- Memory: each thread's stack (typically 1-2MB) caps how many threads a JVM can host
- Scheduling: OS context switching adds overhead as thread counts grow
- Pool sizing: a fixed pool caps concurrency, so threads blocked on I/O starve new requests
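The pool-size ceiling can be made concrete with Little's law: a fixed pool's maximum throughput is roughly pool size divided by time-per-request. A minimal sketch, assuming a hypothetical 200-thread pool where each request blocks for about 300ms:

```java
// Hypothetical numbers: a 200-thread pool, each request holding a thread ~300 ms.
public class PoolCeiling {
    // Little's law: max throughput = concurrency / time-in-system
    static double ceiling(int poolSize, double blockingSeconds) {
        return poolSize / blockingSeconds;
    }

    public static void main(String[] args) {
        System.out.printf("Theoretical ceiling: %.0f req/s%n", ceiling(200, 0.3));
    }
}
```

No matter how fast the CPU is, that pool cannot exceed roughly 667 requests per second while requests block for 300ms.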
To overcome platform thread limitations, the industry moved toward asynchronous, non-blocking architectures:
// Async HTTP style from AsyncHttpClient (same codebase)
public CompletableFuture<String> getAsync(String url) {
    HttpRequest request = HttpRequest.newBuilder()
            .uri(URI.create(url))
            .timeout(Duration.ofSeconds(30))
            .GET()
            .build();
    return httpClient.sendAsync(request, HttpResponse.BodyHandlers.ofString())
            .thenApply(HttpResponse::body);
}
Trade-offs with async/reactive approaches:
- Logic fragments into callbacks and operator chains, hurting readability
- Stack traces and debugging become harder to follow
- Blocking libraries (JDBC, many SDKs) don't fit the model without workarounds
Virtual threads reduce the trade-off between simplicity and scalability for blocking I/O. In practice they provide:
- Familiar, sequential blocking code
- Cheap threads: one per request or task, with no pool tuning
- Standard debugging, profiling, and stack traces
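The "cheap threads" claim is easy to verify yourself. A minimal sketch (task count and sleep duration are illustrative, not from the article's benchmarks) that runs 10,000 blocking tasks on virtual threads:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch: 10,000 concurrent blocking tasks, each on its own virtual thread.
public class ManyVirtualThreads {
    static int runTasks(int count) {
        AtomicInteger completed = new AtomicInteger();
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < count; i++) {
                executor.submit(() -> {
                    try {
                        Thread.sleep(10); // simulated blocking I/O
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                    completed.incrementAndGet();
                });
            }
        } // close() waits for all submitted tasks to finish
        return completed.get();
    }

    public static void main(String[] args) {
        System.out.println("Completed: " + runTasks(10_000));
    }
}
```

The same experiment with `Executors.newFixedThreadPool` either needs thousands of platform threads or serializes the sleeps across the pool.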
Here is the same style of server with virtual threads:
public class VirtualThreadHttpServer {
    public static void main(String[] args) throws IOException {
        System.out.println("Virtual Thread HTTP Server");
        HttpServer server = HttpServer.create(new InetSocketAddress(8080), 0);
        System.out.println("Listening on port 8080");
        server.createContext("/hello", exchange -> {
            try {
                System.out.println("Received request on virtual thread: " + Thread.currentThread().getName());
                Thread.sleep(100); // simulated blocking I/O
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                System.err.println("Thread interrupted: " + e.getMessage());
            }
            String response = "Hello from virtual thread!";
            byte[] responseBytes = response.getBytes();
            exchange.sendResponseHeaders(200, responseBytes.length);
            try (OutputStream os = exchange.getResponseBody()) {
                os.write(responseBytes);
            }
            exchange.close();
        });
        server.setExecutor(Executors.newVirtualThreadPerTaskExecutor());
        server.start();
        System.out.println("Press Ctrl+C to stop the server.");
    }
}
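You can confirm that tasks submitted through this executor actually run on virtual threads with `Thread.isVirtual()` (Java 21). A standalone check, independent of the HTTP server:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Sketch: verify the per-task executor schedules work onto virtual threads.
public class VirtualCheck {
    static boolean runsOnVirtualThread() throws Exception {
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            Future<Boolean> result = executor.submit(() -> Thread.currentThread().isVirtual());
            return result.get(); // true when the task ran on a virtual thread
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("virtual = " + runsOnVirtualThread());
    }
}
```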
What changes in this implementation: the server's executor is set to Executors.newVirtualThreadPerTaskExecutor(), so every request gets its own virtual thread and the handler can block freely.

Traditional connection pooling becomes a bottleneck with high concurrency:
// Platform-thread request handling from PlatformThreadMicroservice
server.createContext("/block", exchange -> {
    handleRequest(exchange, "BLOCK", () -> {
        try {
            Thread.sleep(300);
            return "DB call completed";
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException("Interrupted", e);
        }
    });
});
// Virtual-thread equivalent from VirtualThreadMicroservice
HttpServer server = HttpServer.create(new InetSocketAddress(PORT), 0);
server.setExecutor(Executors.newVirtualThreadPerTaskExecutor());
server.createContext("/block", exchange -> handleRequest(exchange, "BLOCK", () -> {
    try {
        Thread.sleep(300);
        return "DB call completed";
    } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        throw new RuntimeException("Interrupted", e);
    }
}));
Key advantages:
- The blocking handler code is identical; only the executor changes
- No pool tuning: each request's virtual thread is created and discarded per task
- Blocking calls like Thread.sleep yield the carrier thread instead of holding it

When one request fans out to several backends, structured concurrency (preview) keeps the related tasks in a single scope:
private static String aggregateWithStructuredConcurrency() throws Exception {
    long startTime = System.currentTimeMillis();
    try (var scope = new StructuredTaskScope.ShutdownOnFailure()) {
        var blockFuture = scope.fork(() -> fetchBlock());
        var fileFuture = scope.fork(() -> fetchFile());
        scope.join();
        scope.throwIfFailed();
        long duration = System.currentTimeMillis() - startTime;
        return String.format("StructuredTaskScope Combined: %s | %s (Total: %dms)",
                blockFuture.get(), fileFuture.get(), duration);
    }
}
The structured concurrency advantage:
- Child tasks live inside a scope with a clear lifetime; none can leak past the try block
- ShutdownOnFailure cancels the remaining tasks as soon as one fails
- Errors surface through throwIfFailed() instead of being silently dropped
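If you can't enable preview APIs, the same fan-out-and-fail-fast shape can be approximated with plain futures on virtual threads. A sketch under that assumption; the fetchBlock/fetchFile bodies here are stand-ins for the article's helpers, not its actual implementations:

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Non-preview approximation of the fork/join aggregation pattern.
public class ManualAggregation {
    static String fetchBlock() throws InterruptedException {
        Thread.sleep(50); // stand-in for a DB call
        return "DB call completed";
    }

    static String fetchFile() throws InterruptedException {
        Thread.sleep(50); // stand-in for a file read
        return "File read completed";
    }

    static String aggregate() throws Exception {
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            Future<String> block = executor.submit(ManualAggregation::fetchBlock);
            Future<String> file = executor.submit(ManualAggregation::fetchFile);
            try {
                return block.get() + " | " + file.get();
            } catch (ExecutionException e) {
                // Fail fast: cancel the sibling, as ShutdownOnFailure would
                block.cancel(true);
                file.cancel(true);
                throw e;
            }
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(aggregate());
    }
}
```

The scope-based version is still preferable once JEP 453 stabilizes: cancellation and error propagation come for free instead of being hand-rolled.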
You can't improve what you can't measure. Here's how to monitor virtual thread web services properly.
private static String generateMetrics() {
    updateCpuUsage();
    long usedMemory = runtime.totalMemory() - runtime.freeMemory();
    return String.format("""
            Virtual Thread Microservice Metrics:
            =====================================
            Active Requests: %d
            Total Requests: %d
            Average Response Time: %.2fms
            CPU Usage: %.2f%%
            Memory Usage: %.2fMB / %.2fMB
            JVM Uptime: %d seconds
            Thread Type: Virtual Threads
            """,
            activeRequests.get(),
            totalRequests.get(),
            totalRequests.get() > 0 ? (double) totalResponseTime.get() / totalRequests.get() : 0,
            cpuUsage,
            usedMemory / 1024.0 / 1024.0,
            runtime.totalMemory() / 1024.0 / 1024.0,
            runtimeBean.getUptime() / 1000);
}
These outputs are from one test run in a specific setup. Treat them as illustrative; your numbers will vary by hardware, JVM settings, and downstream behavior.
Test Scenario: E-commerce order processing with database calls, external API calls, and business logic processing.
Traditional Thread Pool Server:
wrk -t8 -c1000 -d30s http://localhost:8080/process-order
Running 30s test @ http://localhost:8080/process-order
8 threads and 1000 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 2.45s 1.20s 8.91s 68.25%
Req/Sec 12.34 8.92 45.00 78.26%
Requests/sec: 98.73
Traditional: OutOfMemoryError under sustained load
Virtual Thread Server:
wrk -t8 -c1000 -d30s http://localhost:8080/process-order
Running 30s test @ http://localhost:8080/process-order
8 threads and 1000 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 245.67ms 45.23ms 892.12ms 89.23%
Req/Sec 502.34 23.45 567.00 82.34%
Requests/sec: 4,018.72
Stable performance throughout entire test
How to read these results:
| Metric | Traditional Threads | Virtual Threads | Improvement | Notes |
|---|---|---|---|---|
| Requests/Second | 98.73 | 4,018.72 | 40x | One test environment; validate your workload |
| Average Latency | 2.45s | 245.67ms | ~10x lower | One test environment; validate your workload |
| Stability in this run | OutOfMemoryError under sustained load | Stayed stable | Environment-specific | One test environment; validate your workload |
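As a sanity check on the virtual-thread numbers, Little's law relates the measured values: throughput is approximately open connections divided by average latency. A sketch using the figures from the table above:

```java
// Little's law sanity check on the reported run:
// ~1000 open connections at ~245.67 ms average latency.
public class LittlesLaw {
    static double predictedThroughput(int concurrency, double latencySeconds) {
        return concurrency / latencySeconds;
    }

    public static void main(String[] args) {
        double predicted = predictedThroughput(1000, 0.24567);
        // Measured throughput in the run above was ~4,018.72 req/s.
        System.out.printf("Predicted: %.0f req/s%n", predicted);
    }
}
```

The prediction (about 4,070 req/s) lands within a couple of percent of the measured 4,018.72 req/s, which suggests the virtual-thread server was latency-bound rather than resource-bound.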
Pitfalls to watch for:
- Pinning: avoid synchronized in hot paths and long native calls, since pinning can erase virtual-thread gains
- Record jdk.VirtualThreadPinned JFR events for pinning diagnosis

// DON'T: put synchronized blocks around blocking work (pinning risk)
private static final Object SYNC_LOCK = new Object();

private static void synchronizedWork(int taskId) {
    synchronized (SYNC_LOCK) {
        try {
            System.out.printf("Synchronized task %d on thread %s%n",
                    taskId, Thread.currentThread().getName());
            Thread.sleep(100);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}

// DO: use a virtual-thread-per-task executor
ExecutorService loomExecutor = Executors.newVirtualThreadPerTaskExecutor();

// DO: prefer ReentrantLock for contended sections
private static final ReentrantLock REENTRANT_LOCK = new ReentrantLock();

private static void reentrantLockWork(int taskId) {
    REENTRANT_LOCK.lock();
    try {
        System.out.printf("ReentrantLock task %d on thread %s%n",
                taskId, Thread.currentThread().getName());
        Thread.sleep(100);
    } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
    } finally {
        REENTRANT_LOCK.unlock();
    }
}
A practical migration sequence:
// Before (platform thread pool)
ExecutorService threadPoolExecutor = Executors.newFixedThreadPool(THREAD_POOL_SIZE);

// After (virtual thread per task)
ExecutorService loomExecutor = Executors.newVirtualThreadPerTaskExecutor();

// Before: aggregate with CompletableFuture
private static String aggregateWithCompletableFuture() throws Exception {
    CompletableFuture<String> blockFuture = CompletableFuture.supplyAsync(() -> {
        try {
            return fetchBlock();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    });
    CompletableFuture<String> fileFuture = CompletableFuture.supplyAsync(() -> {
        try {
            return fetchFile();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    });
    CompletableFuture.allOf(blockFuture, fileFuture).join();
    return blockFuture.get() + " | " + fileFuture.get();
}

// After: same flow with StructuredTaskScope
private static String aggregateWithStructuredConcurrency() throws Exception {
    try (var scope = new StructuredTaskScope.ShutdownOnFailure()) {
        var blockFuture = scope.fork(() -> fetchBlock());
        var fileFuture = scope.fork(() -> fetchFile());
        scope.join();
        scope.throwIfFailed();
        return blockFuture.get() + " | " + fileFuture.get();
    }
}
- Enable pinning diagnostics (-Djdk.tracePinnedThreads=full)

Virtual threads don't make CPU-bound work faster; a compute endpoint from the same service, for comparison:

HttpServer server = HttpServer.create(new InetSocketAddress(PORT), 0);
server.setExecutor(Executors.newVirtualThreadPerTaskExecutor());
server.createContext("/compute", exchange -> handleRequest(exchange, "COMPUTE", () -> {
    long result = 0;
    for (int i = 2; i <= 50_000; i++) {
        if (isPrime(i)) {
            result += i;
        }
    }
    return "CPU Task completed. Result: " + result;
}));
private static void handleRequest(HttpExchange exchange, String endpoint, RequestHandler handler) {
    long requestId = requestCounter.incrementAndGet();
    long startTime = System.currentTimeMillis();
    activeRequests.incrementAndGet();
    try {
        String result = handler.handle();
        long duration = System.currentTimeMillis() - startTime;
        totalRequests.incrementAndGet();
        totalResponseTime.addAndGet(duration);
        String response = String.format("%s (Duration: %dms, Thread: %s, Request: #%d)",
                result, duration, Thread.currentThread().getName(), requestId);
        sendResponse(exchange, response);
    } catch (Exception e) {
        sendErrorResponse(exchange, "Error: " + e.getMessage());
    } finally {
        activeRequests.decrementAndGet();
    }
}
Pre-rollout checklist:
- Switch executors to newVirtualThreadPerTaskExecutor()
- Run with -Djdk.tracePinnedThreads=full in staging
- Monitor pinning events (jdk.VirtualThreadPinned) before rollout

In Part 3, we'll move into real-world microservice patterns with virtual threads.
We'll cover:
Code for this part:
- src/main/java/app/js/VirtualThreadPoolServer.java
- src/main/java/app/js/PlatformThreadPoolServer.java
- src/main/java/app/js/microservices/VirtualThreadMicroservice.java

Use wrk or Apache Bench to test your implementations.

Part 2 complete. We moved from core concepts to a service you can run and pressure-test.
Series Navigation: