Jagdish Salgotra
Jul 20, 2025 · 28 min read · Project Loom
Deep dive into production-grade microservices built with virtual threads. Explore practical implementation patterns, advanced observability, and graceful shutdown strategies to ensure resilience and scalability in distributed systems.
Note: This series uses Java 21 as the baseline. Virtual threads are stable in Java 21 (JEP 444). Structured concurrency snippets in this part (`StructuredTaskScope`, JEP 453) use preview APIs and require `--enable-preview`.
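Assuming the sources live in files named after their classes (an assumption for illustration, not from the article's repo), a typical compile-and-run invocation with preview features enabled looks like:

```shell
# Compile against Java 21 with preview APIs (needed for StructuredTaskScope)
javac --release 21 --enable-preview VirtualThreadMicroservice.java

# The preview flag is required at runtime as well
java --enable-preview VirtualThreadMicroservice
```

Forgetting the runtime flag fails fast with `Preview features are not enabled`, so it is worth wiring both flags into your build from the start.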
Concurrency limits usually appear under realistic traffic, not happy-path demos.
Traditional Java microservices can hit this wall sooner than teams plan for:
```java
// From PlatformThreadMicroservice (feature/java-21)
ExecutorService executor = Executors.newFixedThreadPool(THREAD_POOL_SIZE);
HttpServer server = HttpServer.create(new InetSocketAddress(PORT), 0);
server.setExecutor(executor);

server.createContext("/block", exchange -> {
    handleRequest(exchange, "BLOCK", () -> {
        try {
            Thread.sleep(300); // simulates a blocking downstream/DB call
            return "DB call completed";
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException("Interrupted", e);
        }
    });
});
```
Common failure patterns in production:

- Thread-pool exhaustion: with every worker blocked on a 300ms downstream call, a fixed pool of N threads caps throughput near N / 0.3 requests per second
- Queue buildup: once the pool saturates, requests queue behind it and latency climbs until clients time out
- Memory pressure: oversizing the pool to compensate trades exhaustion for roughly 1MB of stack per platform thread and, under sustained load, OutOfMemoryError
Virtual threads reduce the trade-off between readability and concurrency for blocking I/O. Here is the same style of service with virtual threads:
```java
public class VirtualThreadMicroservice {
    public static void main(String[] args) throws IOException {
        createTestFile();
        startMetricsLogger();
        HttpServer server = HttpServer.create(new InetSocketAddress(PORT), 0);
        server.setExecutor(Executors.newVirtualThreadPerTaskExecutor());

        server.createContext("/block", exchange -> handleRequest(exchange, "BLOCK", () -> {
            try {
                Thread.sleep(300);
                return "DB call completed";
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new RuntimeException("Interrupted", e);
            }
        }));
        server.createContext("/aggregate", exchange -> handleRequest(
                exchange, "AGGREGATE", VirtualThreadMicroservice::aggregateWithStructuredConcurrency));
        server.createContext("/aggregate-old", exchange -> handleRequest(
                exchange, "AGGREGATE_OLD", VirtualThreadMicroservice::aggregateWithCompletableFuture));

        server.start();
        logger.info("Virtual Thread Microservice started on port " + PORT);
    }
}
```
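The executor swap is all the service needs, but virtual threads can also be created directly. A minimal, self-contained sketch (class and method names are mine, not from the article's repo):

```java
import java.util.concurrent.atomic.AtomicReference;

public class VirtualThreadDemo {
    // Starts a task on a dedicated virtual thread and reports what kind
    // of thread actually ran it.
    static String runOnVirtualThread() throws InterruptedException {
        AtomicReference<String> result = new AtomicReference<>();
        Thread vt = Thread.ofVirtual().name("demo-vt").start(() ->
                result.set(Thread.currentThread().isVirtual() ? "virtual" : "platform"));
        vt.join(); // cheap: joining a virtual thread does not block a carrier for long
        return result.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(runOnVirtualThread()); // prints "virtual"
    }
}
```

`Thread.ofVirtual()` is the low-level building block; `Executors.newVirtualThreadPerTaskExecutor()` used above is the same idea packaged as an `ExecutorService`.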
What changed in practice:

- `Executors.newVirtualThreadPerTaskExecutor()` gives every request its own virtual thread, so blocking inside a handler no longer ties up a pooled platform thread
- The handler code itself stays in the same readable, blocking style

The `/aggregate` handler uses structured concurrency:

```java
private static String aggregateWithStructuredConcurrency() throws Exception {
    long startTime = System.currentTimeMillis();
    try (var scope = new StructuredTaskScope.ShutdownOnFailure()) {
        var blockFuture = scope.fork(() -> fetchBlock());
        var fileFuture = scope.fork(() -> fetchFile());
        scope.join();
        scope.throwIfFailed();
        long duration = System.currentTimeMillis() - startTime;
        return String.format("StructuredTaskScope Combined: %s | %s (Total: %dms)",
                blockFuture.get(), fileFuture.get(), duration);
    }
}
```
Compare with the CompletableFuture baseline:
```java
private static String aggregateWithCompletableFuture() throws Exception {
    long startTime = System.currentTimeMillis();
    CompletableFuture<String> blockFuture = CompletableFuture.supplyAsync(() -> {
        try {
            return fetchBlock();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    });
    CompletableFuture<String> fileFuture = CompletableFuture.supplyAsync(() -> {
        try {
            return fetchFile();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    });
    CompletableFuture.allOf(blockFuture, fileFuture).join();
    long duration = System.currentTimeMillis() - startTime;
    return String.format("CompletableFuture Combined: %s | %s (Total: %dms)",
            blockFuture.get(), fileFuture.get(), duration);
}
```
The structured concurrency advantage:

- Failure propagation: `ShutdownOnFailure` cancels the remaining subtasks as soon as one fails, instead of leaving them running
- Scoped lifetimes: subtasks cannot outlive the try-with-resources scope, so there are no orphaned threads
- One exit path: `join()` plus `throwIfFailed()` centralizes error handling that `CompletableFuture` scatters across callbacks
Virtual-thread microservices can expose useful operational metrics with straightforward built-in endpoints:
```java
private static String generateMetrics() {
    updateCpuUsage();
    long usedMemory = runtime.totalMemory() - runtime.freeMemory();
    return String.format("""
            Virtual Thread Microservice Metrics:
            =====================================
            Active Requests: %d
            Total Requests: %d
            Average Response Time: %.2fms
            CPU Usage: %.2f%%
            Memory Usage: %.2fMB / %.2fMB
            JVM Uptime: %d seconds
            Thread Type: Virtual Threads
            """,
            activeRequests.get(),
            totalRequests.get(),
            totalRequests.get() > 0 ? (double) totalResponseTime.get() / totalRequests.get() : 0,
            cpuUsage,
            usedMemory / 1024.0 / 1024.0,
            runtime.totalMemory() / 1024.0 / 1024.0,
            runtimeBean.getUptime() / 1000);
}
```
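The `updateCpuUsage()` helper above is not shown; one way such a helper could obtain process CPU load is via the HotSpot extension of `OperatingSystemMXBean` (a hypothetical sketch, assuming a HotSpot-based JVM):

```java
import java.lang.management.ManagementFactory;

public class CpuProbe {
    // Returns process CPU usage as a percentage in [0, 100].
    // getProcessCpuLoad() reports -1.0 when the value is not yet available,
    // which we clamp to 0.
    static double processCpuPercent() {
        var os = (com.sun.management.OperatingSystemMXBean)
                ManagementFactory.getOperatingSystemMXBean();
        double load = os.getProcessCpuLoad();
        return load < 0 ? 0.0 : load * 100.0;
    }

    public static void main(String[] args) {
        System.out.printf("CPU: %.2f%%%n", processCpuPercent());
    }
}
```

The cast to `com.sun.management.OperatingSystemMXBean` is JDK-specific but works on the stock OpenJDK builds most services run on.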
Built-in monitoring signals:

- Active and total request counts (`activeRequests`, `totalRequests`)
- Average response time derived from accumulated totals
- Process CPU usage and heap usage
- JVM uptime via `RuntimeMXBean`
These outputs are from one test run in a simplified, specific setup. Treat them as illustrative: your numbers will vary with hardware, JVM settings, and downstream behavior.
Traditional Thread Pool Service:

```
# wrk load test results - traditional approach
wrk -t8 -c1000 -d30s http://localhost:8080/aggregate-old

Running 30s test @ http://localhost:8080/aggregate-old
  8 threads and 1000 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency      2.45s    1.20s    8.91s   68.25%
    Req/Sec     12.34     8.92    45.00    78.26%
Requests/sec:     98.73
Transfer/sec:     15.24KB
```

Result: OutOfMemoryError under sustained load.
Virtual Thread Service:

```
# Same test - virtual threads
wrk -t8 -c1000 -d30s http://localhost:8080/aggregate

Running 30s test @ http://localhost:8080/aggregate
  8 threads and 1000 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    245.67ms  45.23ms  892.12ms  89.23%
    Req/Sec    502.34    23.45    567.00    82.34%
Requests/sec:   4,018.72
Transfer/sec:   621.45KB
```

Result: stable performance for the entire test duration.
How to read these results:
| Metric | Traditional Threads | Virtual Threads | Improvement |
|---|---|---|---|
| Requests/Second | 98.73 | 4,018.72 | ~40x higher |
| Average Latency | 2.45s | 245.67ms | ~10x lower |
| Stability in this run | OutOfMemoryError under sustained load | Stayed stable | Environment-specific |

All figures come from one environment; always validate against your specific workload.
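Before trusting numbers like these, it is worth a quick sanity check that your own environment actually handles massive virtual-thread fan-out. A hypothetical micro-check (not the article's wrk test; names are mine):

```java
import java.time.Duration;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

public class ScaleCheck {
    // Submits n tasks, each sleeping briefly on its own virtual thread,
    // and returns how many completed. With platform threads, n = 10_000
    // concurrent sleepers would be prohibitively expensive.
    static int runTasks(int n) {
        AtomicInteger done = new AtomicInteger();
        try (var executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < n; i++) {
                executor.submit(() -> {
                    try {
                        Thread.sleep(Duration.ofMillis(10));
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                    done.incrementAndGet();
                });
            }
        } // close() waits for all submitted tasks to finish
        return done.get();
    }

    public static void main(String[] args) {
        System.out.println(runTasks(10_000)); // prints "10000"
    }
}
```

Since Java 19, `ExecutorService` is `AutoCloseable` and `close()` awaits task completion, which keeps this check free of manual `shutdown`/`awaitTermination` boilerplate.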
Watch for virtual-thread pinning before relying on these gains:

- Audit blocking calls inside `synchronized` hot paths and long native calls, since pinning can erase gains
- Record `jdk.VirtualThreadPinned` JFR events for diagnosis
- Validate pinning behavior (via `jdk.VirtualThreadPinned`) before production rollout

A "first successful result" pattern with `ShutdownOnSuccess`:

```java
private static String firstSuccessWithStructuredConcurrency() throws Exception {
    long startTime = System.currentTimeMillis();
    try (var scope = new StructuredTaskScope.ShutdownOnSuccess<String>()) {
        scope.fork(() -> slowService("Cache-1", 500));
        scope.fork(() -> slowService("Cache-2", 200));
        scope.fork(() -> slowService("Database", 800));
        scope.join();
        long duration = System.currentTimeMillis() - startTime;
        return String.format("First successful result: %s (Duration: %dms)",
                scope.result(), duration);
    }
}
```
A fallback variant catches a failed branch and degrades gracefully instead of surfacing the error:

```java
private static String aggregateWithFallback() {
    long startTime = System.currentTimeMillis();
    try (var scope = new StructuredTaskScope.ShutdownOnFailure()) {
        var blockFuture = scope.fork(() -> fetchBlock());
        var fileFuture = scope.fork(() -> fetchFileWithPossibleError());
        scope.join();
        scope.throwIfFailed();
        long duration = System.currentTimeMillis() - startTime;
        return String.format("Aggregate with fallback: %s | %s (Duration: %dms)",
                blockFuture.get(), fileFuture.get(), duration);
    } catch (Exception e) {
        long duration = System.currentTimeMillis() - startTime;
        return String.format("Fallback response: One service failed (%s), but we handled it gracefully (Duration: %dms)",
                e.getMessage(), duration);
    }
}
```
Fanning out across four services needs only one scope:

```java
private static String multiServiceAggregation() throws Exception {
    try (var scope = new StructuredTaskScope.ShutdownOnFailure()) {
        var blockFuture = scope.fork(() -> fetchBlock());
        var fileFuture = scope.fork(() -> fetchFile());
        var computeFuture = scope.fork(() -> fetchCompute());
        var cacheFuture = scope.fork(() -> slowService("Cache", 150));
        scope.join();
        scope.throwIfFailed();
        return String.format("Multi-service result: Block[%s] | File[%s] | Compute[%s] | Cache[%s]",
                blockFuture.get(), fileFuture.get(), computeFuture.get(), cacheFuture.get());
    }
}
```
The full service wires these handlers together and registers a shutdown hook for graceful termination:

```java
public static void main(String[] args) throws IOException {
    createTestFile();
    startMetricsLogger();
    HttpServer server = HttpServer.create(new InetSocketAddress(PORT), 0);
    server.setExecutor(Executors.newVirtualThreadPerTaskExecutor());
    server.createContext("/aggregate", exchange ->
            handleRequest(exchange, "AGGREGATE", VirtualThreadMicroservice::aggregateWithStructuredConcurrency));
    server.createContext("/metrics", exchange -> sendResponse(exchange, generateMetrics()));
    server.createContext("/health", exchange -> sendResponse(exchange, "Virtual Thread Microservice is running!"));
    server.start();
    logger.info("Virtual Thread Microservice started on port " + PORT);

    Runtime.getRuntime().addShutdownHook(new Thread(() -> {
        logger.info("Shutting down Virtual Thread Microservice...");
        server.stop(2); // allow up to 2 seconds for in-flight exchanges
        cleanupTestFile();
    }));
}
```
Why this matters in production:

**DO:**

- Create one virtual thread per task; they are cheap to create and discard
- Keep writing simple, blocking-style code; that readability is the point
- Monitor pinning with `jdk.VirtualThreadPinned` JFR events before rollout

**DON'T:**

- Pool virtual threads; pooling defeats their purpose
- Hold `synchronized` locks across blocking calls on hot paths; prefer `ReentrantLock` where contention matters
- Assume the gains transfer to CPU-bound work; virtual threads help with blocking I/O, not computation
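The `synchronized`-versus-`ReentrantLock` point deserves a concrete shape. On Java 21, blocking while holding a monitor pins the virtual thread to its carrier, whereas `ReentrantLock` lets it unmount. A minimal sketch of the lock-based style (names are mine, for illustration):

```java
import java.util.concurrent.locks.ReentrantLock;

public class LockChoice {
    private static final ReentrantLock LOCK = new ReentrantLock();
    private static int counter = 0;

    // On Java 21, blocking inside a synchronized block pins the carrier
    // thread; ReentrantLock allows the virtual thread to unmount while
    // waiting, keeping carriers free for other virtual threads.
    static void increment() {
        LOCK.lock();
        try {
            counter++;
        } finally {
            LOCK.unlock();
        }
    }

    static int run(int threads) throws InterruptedException {
        counter = 0; // reset so the check is repeatable
        Thread[] ts = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            ts[i] = Thread.ofVirtual().start(LockChoice::increment);
        }
        for (Thread t : ts) t.join();
        return counter;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(run(100)); // prints "100"
    }
}
```

The lock here protects a trivial counter, so pinning would not actually hurt; the pattern matters when the critical section performs blocking I/O.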
A practical migration sequence:

- Swap `Executors.newFixedThreadPool()` for `Executors.newVirtualThreadPerTaskExecutor()`
- Re-run your load test and compare against the platform-thread baseline:

```
wrk -t8 -c1000 -d30s http://localhost:8080/aggregate
```

In Part 4, we'll explore advanced structured concurrency patterns: timeout handling, conditional cancellation, and fault-tolerant orchestration.

We'll cover:

- Timeout handling with scope deadlines
- Conditional cancellation of in-flight subtasks
- Fault-tolerant orchestration across multiple services

Part 3 complete. This one focused on production trade-offs, not just sample-code wins.
Series Navigation: