Note
This series uses Java 21 as the baseline. Virtual threads are stable in Java 21 (JEP 444). Structured concurrency snippets in this part (StructuredTaskScope, JEP 453) use preview APIs and require --enable-preview.
TL;DR
- Build microservices with higher concurrency headroom for blocking I/O workloads
- Replace complex async orchestration where simpler blocking flows are clearer
- Built-in monitoring and observability without external dependencies
- Structured concurrency reduces the risk of resource leaks and improves reliability
- Performance gains can be significant for I/O-heavy paths, but must be validated per workload
- The existing thread-per-request programming model remains usable
The Microservices Reality Check
Concurrency limits usually appear under realistic traffic, not happy-path demos.
Traditional Java microservices can hit this wall sooner than teams plan for:
The Classic Failure Pattern
// Fixed pool: every request that blocks on I/O holds a scarce platform thread
ExecutorService executor = Executors.newFixedThreadPool(THREAD_POOL_SIZE);
HttpServer server = HttpServer.create(new InetSocketAddress(PORT), 0);
server.setExecutor(executor);
server.createContext("/block", exchange -> {
    handleRequest(exchange, "BLOCK", () -> {
        try {
            Thread.sleep(300); // stand-in for a blocking DB call
            return "DB call completed";
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException("Interrupted", e);
        }
    });
});
Common failure patterns in production:
- Thread Pool Exhaustion: a 200-thread pool with ~450ms of blocking time per request caps throughput at roughly 444 requests/second in this simplified model
- Resource Waste: Threads sitting idle waiting for I/O responses
- Potential Cascading Latency: One slow dependency can propagate latency across services
- Scaling Cost: Adding instances can become expensive quickly
- Complex Async Code: CompletableFuture-heavy flows can be harder to debug and maintain
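The thread-pool ceiling above is just Little's law applied to a saturated pool. A minimal sketch, using the simplified numbers from this example (200 threads, ~450ms of blocking per request):

```java
public class ThreadPoolCeiling {
    public static void main(String[] args) {
        int poolSize = 200;            // fixed platform-thread pool
        double requestSeconds = 0.45;  // blocking time per request (DB + downstream I/O)

        // With every thread busy, throughput = threads / service time
        double maxRequestsPerSecond = poolSize / requestSeconds;

        System.out.printf("Max throughput: ~%.0f req/s%n", maxRequestsPerSecond);
        // Arrival rates above this queue or time out, regardless of idle CPU
    }
}
```

Note that this ceiling is set entirely by pool size and blocking time; adding CPU does not move it.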
Virtual Thread Approach for Microservices
Virtual threads reduce the trade-off between readability and concurrency for blocking I/O. Here is the same style of service with virtual threads:
public class VirtualThreadMicroservice {

    public static void main(String[] args) throws IOException {
        createTestFile();
        startMetricsLogger();

        HttpServer server = HttpServer.create(new InetSocketAddress(PORT), 0);
        // The one-line change: a new virtual thread per task instead of a fixed pool
        server.setExecutor(Executors.newVirtualThreadPerTaskExecutor());

        server.createContext("/block", exchange -> handleRequest(exchange, "BLOCK", () -> {
            try {
                Thread.sleep(300); // same blocking style as before
                return "DB call completed";
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new RuntimeException("Interrupted", e);
            }
        }));
        server.createContext("/aggregate", exchange -> handleRequest(
                exchange, "AGGREGATE", VirtualThreadMicroservice::aggregateWithStructuredConcurrency));
        server.createContext("/aggregate-old", exchange -> handleRequest(
                exchange, "AGGREGATE_OLD", VirtualThreadMicroservice::aggregateWithCompletableFuture));
        server.start();
        logger.info("Virtual Thread Microservice started on port " + PORT);
    }
}
What changed in practice:
- One-line change: swap the executor to Executors.newVirtualThreadPerTaskExecutor()
- Same blocking code: less async orchestration overhead in application code
- Higher concurrency headroom for I/O-heavy endpoints
- Built-in metrics: Production-ready monitoring from day one
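The "one line change" is easiest to see in isolation. A minimal sketch: the same blocking tasks, submitted to a virtual-thread-per-task executor at a concurrency level that would exhaust a typical fixed pool (the task body and counts are illustrative):

```java
import java.time.Duration;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

public class ExecutorSwap {
    public static void main(String[] args) {
        AtomicInteger completed = new AtomicInteger();

        // Before: ExecutorService executor = Executors.newFixedThreadPool(200);
        // After: one virtual thread per task, no pool sizing to tune
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < 10_000; i++) {
                executor.submit(() -> {
                    try {
                        Thread.sleep(Duration.ofMillis(50)); // blocking I/O stand-in
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                    completed.incrementAndGet();
                });
            }
        } // close() waits for all submitted tasks to finish

        System.out.println("Completed: " + completed.get()); // prints "Completed: 10000"
    }
}
```

All 10,000 sleeps overlap, so the whole run takes on the order of one sleep interval rather than 10,000 of them.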
Deep Dive: Production-Ready Microservices Patterns
1. Service Aggregation with Structured Concurrency
private static String aggregateWithStructuredConcurrency() throws Exception {
    long startTime = System.currentTimeMillis();
    try (var scope = new StructuredTaskScope.ShutdownOnFailure()) {
        var blockFuture = scope.fork(() -> fetchBlock());
        var fileFuture = scope.fork(() -> fetchFile());
        scope.join();           // wait for both subtasks
        scope.throwIfFailed();  // propagate the first failure, cancelling the sibling
        long duration = System.currentTimeMillis() - startTime;
        return String.format("StructuredTaskScope Combined: %s | %s (Total: %dms)",
                blockFuture.get(), fileFuture.get(), duration);
    }
}
Compare with the CompletableFuture baseline:
private static String aggregateWithCompletableFuture() throws Exception {
    long startTime = System.currentTimeMillis();
    CompletableFuture<String> blockFuture = CompletableFuture.supplyAsync(() -> {
        try {
            return fetchBlock();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    });
    CompletableFuture<String> fileFuture = CompletableFuture.supplyAsync(() -> {
        try {
            return fetchFile();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    });
    CompletableFuture.allOf(blockFuture, fileFuture).join();
    long duration = System.currentTimeMillis() - startTime;
    return String.format("CompletableFuture Combined: %s | %s (Total: %dms)",
            blockFuture.get(), fileFuture.get(), duration);
}
The structured concurrency advantage:
- Automatic cleanup: helps prevent request-scoped resource leaks
- Exception safety: one failure cancels related subtasks
- Readable code: try-with-resources keeps orchestration localized
- Linear control flow: blocking style without callback chains
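The "one failure cancels related subtasks" point is easiest to see by contrast. With plain CompletableFuture, a failing task does not cancel its sibling; the sibling runs to completion and its work is wasted. A minimal, self-contained demonstration (the task bodies are illustrative):

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicBoolean;

public class NoAutoCancel {
    public static void main(String[] args) {
        AtomicBoolean siblingFinished = new AtomicBoolean();

        CompletableFuture<String> failing = CompletableFuture.supplyAsync(() -> {
            throw new RuntimeException("service down"); // fails immediately
        });
        CompletableFuture<String> sibling = CompletableFuture.supplyAsync(() -> {
            try {
                Thread.sleep(200); // keeps working despite the failure next door
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            siblingFinished.set(true);
            return "done";
        });

        try {
            CompletableFuture.allOf(failing, sibling).join();
        } catch (Exception expected) {
            // allOf surfaced the failure, but only after the sibling finished too
        }
        System.out.println("Sibling still completed: " + siblingFinished.get());
    }
}
```

With StructuredTaskScope.ShutdownOnFailure, the first failure shuts the scope down and the sibling is interrupted instead of running on.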
Production Monitoring
Virtual-thread microservices can expose useful operational metrics with straightforward built-in endpoints:
private static String generateMetrics() {
    updateCpuUsage();
    long usedMemory = runtime.totalMemory() - runtime.freeMemory();
    return String.format("""
            Virtual Thread Microservice Metrics:
            =====================================
            Active Requests: %d
            Total Requests: %d
            Average Response Time: %.2fms
            CPU Usage: %.2f%%
            Memory Usage: %.2fMB / %.2fMB
            JVM Uptime: %d seconds
            Thread Type: Virtual Threads
            """,
            activeRequests.get(),
            totalRequests.get(),
            totalRequests.get() > 0 ? (double) totalResponseTime.get() / totalRequests.get() : 0,
            cpuUsage,
            usedMemory / 1024.0 / 1024.0,
            runtime.totalMemory() / 1024.0 / 1024.0,
            runtimeBean.getUptime() / 1000);
}
Built-in monitoring signals:
- Real-time metrics: request counts, response times, memory usage
- Baseline visibility without extra libraries for this sample service
- CPU tracking: automatic CPU usage monitoring
- Memory insights: heap and non-heap memory tracking
- Live updates: metrics endpoint updates in real time
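The counters behind these metrics can be a pair of AtomicLong fields updated by each request handler. A minimal sketch (class and method names are illustrative, not from the service above):

```java
import java.util.concurrent.atomic.AtomicLong;

public class RequestMetrics {
    private final AtomicLong totalRequests = new AtomicLong();
    private final AtomicLong totalResponseTime = new AtomicLong();

    // Called by the request handler after each completed request
    void record(long elapsedMillis) {
        totalRequests.incrementAndGet();
        totalResponseTime.addAndGet(elapsedMillis);
    }

    double averageResponseMillis() {
        long count = totalRequests.get();
        // Guard against divide-by-zero before the first request arrives
        return count > 0 ? (double) totalResponseTime.get() / count : 0;
    }

    public static void main(String[] args) {
        RequestMetrics metrics = new RequestMetrics();
        metrics.record(100);
        metrics.record(300);
        System.out.println(metrics.averageResponseMillis()); // prints 200.0
    }
}
```

AtomicLong keeps the counters correct under thousands of concurrent virtual threads without any locking in the hot path.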
Example Load-Test Output
These outputs are from one test run of this simplified service in a specific setup. Treat them as illustrative; your numbers will vary widely with hardware, JVM settings, and downstream behavior.
Traditional Thread Pool Service:
wrk -t8 -c1000 -d30s http://localhost:8080/aggregate-old
Running 30s test @ http://localhost:8080/aggregate-old
8 threads and 1000 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 2.45s 1.20s 8.91s 68.25%
Req/Sec 12.34 8.92 45.00 78.26%
Requests/sec: 98.73
Transfer/sec: 15.24KB
Traditional: OutOfMemoryError under sustained load
Virtual Thread Service:
wrk -t8 -c1000 -d30s http://localhost:8080/aggregate
Running 30s test @ http://localhost:8080/aggregate
8 threads and 1000 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 245.67ms 45.23ms 892.12ms 89.23%
Req/Sec 502.34 23.45 567.00 82.34%
Requests/sec: 4,018.72
Transfer/sec: 621.45KB
Stable performance for entire test duration
How to read these results:
| Metric | Traditional Threads | Virtual Threads | Improvement | Notes |
|---|---|---|---|---|
| Requests/Second | 98.73 | 4,018.72 | ~40x | One environment; validate your specific workload |
| Average Latency | 2.45s | 245.67ms | ~10x lower | One environment; validate your specific workload |
| Stability in this run | OutOfMemoryError under sustained load | Stayed stable | Environment-specific | One environment; validate your specific workload |
Caveats: End-to-End Limits Still Apply
- Virtual threads improve request concurrency, but they do not increase downstream capacity by themselves
- DB pools, remote API limits, socket/file descriptor limits, and queue capacity still set hard ceilings
- For CPU-bound sections, gains are usually smaller than for blocking I/O
- Monitor and reduce pinning (synchronized hot paths, long native calls), since pinning can erase gains
- Use JFR jdk.VirtualThreadPinned events for diagnosis
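On Java 21, one common mitigation for pinning is replacing synchronized on hot blocking paths with ReentrantLock, which parks the virtual thread instead of pinning its carrier. A minimal sketch (the cache scenario and names are illustrative):

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.locks.ReentrantLock;

public class PinningSafeCache {
    private final ReentrantLock lock = new ReentrantLock();
    private String cached;

    // On Java 21, blocking inside a synchronized block pins the carrier thread;
    // blocking while holding a ReentrantLock lets the virtual thread unmount.
    String get() throws InterruptedException {
        lock.lock();
        try {
            if (cached == null) {
                Thread.sleep(50); // stand-in for a blocking refresh (DB/remote call)
                cached = "value";
            }
            return cached;
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) throws Exception {
        PinningSafeCache cache = new PinningSafeCache();
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < 100; i++) {
                executor.submit(cache::get); // 100 virtual threads contend safely
            }
        }
        System.out.println(cache.get()); // prints "value"
    }
}
```

Confirm the effect by watching jdk.VirtualThreadPinned events disappear from JFR recordings after the change.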
Validate Gains in Your Environment
- Re-run load tests with realistic traffic shapes and concurrency ramps
- Compare p50/p95/p99 latency, throughput, and error rates across sustained runs
- Measure downstream saturation points (DB pool usage, API quotas, queue depth)
- Inspect pinning with JFR (jdk.VirtualThreadPinned) before production rollout
Advanced Production Patterns
1. First-Success Pattern (Circuit Breaker Alternative)
private static String firstSuccessWithStructuredConcurrency() throws Exception {
    long startTime = System.currentTimeMillis();
    try (var scope = new StructuredTaskScope.ShutdownOnSuccess<String>()) {
        scope.fork(() -> slowService("Cache-1", 500));
        scope.fork(() -> slowService("Cache-2", 200));
        scope.fork(() -> slowService("Database", 800));
        scope.join(); // returns once the first subtask succeeds; the rest are cancelled
        long duration = System.currentTimeMillis() - startTime;
        return String.format("First successful result: %s (Duration: %dms)",
                scope.result(), duration);
    }
}
2. Fallback with Structured Concurrency
private static String aggregateWithFallback() {
    long startTime = System.currentTimeMillis();
    try (var scope = new StructuredTaskScope.ShutdownOnFailure()) {
        var blockFuture = scope.fork(() -> fetchBlock());
        var fileFuture = scope.fork(() -> fetchFileWithPossibleError());
        scope.join();
        scope.throwIfFailed();
        long duration = System.currentTimeMillis() - startTime;
        return String.format("Aggregate with fallback: %s | %s (Duration: %dms)",
                blockFuture.get(), fileFuture.get(), duration);
    } catch (Exception e) {
        long duration = System.currentTimeMillis() - startTime;
        return String.format("Fallback response: One service failed (%s), but we handled it gracefully (Duration: %dms)",
                e.getMessage(), duration);
    }
}
3. Multi-Service Orchestration
private static String multiServiceAggregation() throws Exception {
    try (var scope = new StructuredTaskScope.ShutdownOnFailure()) {
        var blockFuture = scope.fork(() -> fetchBlock());
        var fileFuture = scope.fork(() -> fetchFile());
        var computeFuture = scope.fork(() -> fetchCompute());
        var cacheFuture = scope.fork(() -> slowService("Cache", 150));
        scope.join();
        scope.throwIfFailed();
        return String.format("Multi-service result: Block[%s] | File[%s] | Compute[%s] | Cache[%s]",
                blockFuture.get(), fileFuture.get(), computeFuture.get(), cacheFuture.get());
    }
}
Graceful Shutdown: The Production Necessity
public static void main(String[] args) throws IOException {
    createTestFile();
    startMetricsLogger();

    HttpServer server = HttpServer.create(new InetSocketAddress(PORT), 0);
    server.setExecutor(Executors.newVirtualThreadPerTaskExecutor());

    server.createContext("/aggregate", exchange ->
            handleRequest(exchange, "AGGREGATE", VirtualThreadMicroservice::aggregateWithStructuredConcurrency));
    server.createContext("/metrics", exchange -> sendResponse(exchange, generateMetrics()));
    server.createContext("/health", exchange -> sendResponse(exchange, "Virtual Thread Microservice is running!"));
    server.start();
    logger.info("Virtual Thread Microservice started on port " + PORT);

    Runtime.getRuntime().addShutdownHook(new Thread(() -> {
        logger.info("Shutting down Virtual Thread Microservice...");
        server.stop(2); // allow up to 2 seconds for in-flight exchanges to finish
        cleanupTestFile();
    }));
}
Why this matters in production:
- Fewer dropped in-flight requests during shutdown windows
- Metrics preservation: Final statistics before shutdown
- Resource cleanup: No resource leaks in container environments
- Audit trail: Clear logging of shutdown process
- Kubernetes friendly: Respects termination grace periods
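The drain step can also be made explicit: poll an in-flight request counter until it reaches zero or a deadline passes, then stop the server. A minimal sketch, assuming an activeRequests counter like the one in the metrics section (names are illustrative):

```java
import java.util.concurrent.atomic.AtomicInteger;

public class GracefulDrain {
    static final AtomicInteger activeRequests = new AtomicInteger();

    // Wait up to maxWaitMillis for in-flight requests to finish before stopping
    static boolean drain(long maxWaitMillis) throws InterruptedException {
        long deadline = System.currentTimeMillis() + maxWaitMillis;
        while (activeRequests.get() > 0 && System.currentTimeMillis() < deadline) {
            Thread.sleep(10); // cheap on a virtual thread
        }
        return activeRequests.get() == 0;
    }

    public static void main(String[] args) throws Exception {
        activeRequests.set(3);
        // Simulate three in-flight requests completing shortly after shutdown begins
        Thread.ofVirtual().start(() -> {
            try {
                Thread.sleep(100);
            } catch (InterruptedException ignored) { }
            activeRequests.set(0);
        });
        System.out.println("Drained cleanly: " + drain(2_000));
    }
}
```

In a shutdown hook, call drain() before server.stop() and log the boolean so dropped requests show up in the audit trail.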
Best Practices for Production Microservices
Do's and Don'ts from Production Use
**DO:**
- Use blocking I/O intentionally: It is often a good fit with virtual threads
- Use structured concurrency for request-scoped orchestration where it improves clarity
- Monitor with built-in metrics: Simple HTTP endpoints can provide strong baseline visibility
- Design for failure: Use timeout patterns and fallback mechanisms
- Test with realistic load: 1000+ concurrent connections minimum
**DON'T:**
- Pool virtual threads: They are cheap to create, so prefer per-task creation
- Assume reactive is obsolete: choose based on workload, ecosystem, and team constraints
- Ignore pinning: Monitor for synchronized blocks that pin threads
- Overcomplicate: keep flows simple and observable
- Skip load testing: Virtual threads change performance characteristics
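The "don't pool" rule in practice: create a fresh virtual thread per task instead of reusing pooled workers. A minimal sketch:

```java
public class PerTaskThreads {
    public static void main(String[] args) throws InterruptedException {
        // No pool: start a fresh virtual thread for each unit of work
        Thread t = Thread.ofVirtual()
                .name("request-handler-1")
                .start(() -> System.out.println("Running on: " + Thread.currentThread()));
        t.join();
        System.out.println("isVirtual: " + t.isVirtual()); // prints "isVirtual: true"
    }
}
```

Pooling would defeat thread-local cleanup and adds contention for no benefit, since virtual thread creation costs roughly as much as allocating a small object.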
Migration Strategy
A practical migration sequence:
- Start small: Pick a non-critical service for your first migration
- Replace executors: change Executors.newFixedThreadPool() to Executors.newVirtualThreadPerTaskExecutor()
- Simplify async code: Replace CompletableFuture chains where a blocking flow is clearer
- Add monitoring: Implement metrics endpoints from day one
- Load test everything: Virtual threads have different performance characteristics
- Monitor pinning: Use JFR to identify carrier thread pinning
- Gradual rollout: Blue-green deployment with traffic shifting
What's Next?
In Part 4, we'll explore advanced structured concurrency patterns: timeout handling, conditional cancellation, and fault-tolerant orchestration.
We'll cover:
- Advanced timeout patterns that prevent cascading failures
- Conditional cancellation for complex business workflows
- Building circuit breakers with structured concurrency
- Distributed tracing and observability at scale
- Practical rollout and validation guidance
Resources
Part 3 complete. This one focused on production trade-offs, not just sample-code wins.
Series Navigation: