Jagdish Salgotra
Aug 3, 2025 · 32 min read · Project Loom
Master advanced structured concurrency patterns for production microservices. Learn circuit breakers, fallback patterns, and resilient service orchestration with StructuredTaskScope.
Note This series uses Java 21 as the baseline. Structured concurrency snippets in this part (`StructuredTaskScope`, JEP 453) use preview APIs and require `--enable-preview`.
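As a reminder, the flag has to be passed at both compile and run time. A minimal sketch of the invocation (the file and class names here are illustrative, not from this series):

```shell
# Compile and run a Java 21 source file that uses preview APIs such as StructuredTaskScope.
javac --release 21 --enable-preview -d out src/Main.java
java --enable-preview -cp out Main
```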
Basic structured concurrency gets you far. The gaps show up when dependencies get slow or flaky.
This is usually where teams need the next layer: retries, circuit breakers, partial-result handling, and clear failure boundaries between services.
Checkout and dashboard paths are usually where this shows up first: one slow dependency starts to affect unrelated requests.
Here's what most of us end up writing when we need resilient service orchestration:
// From StructuredConcurrencyComparison (CompletableFuture baseline)
CompletableFuture<String> cf1 = CompletableFuture.supplyAsync(() -> simulateService("Service-A", 200));
CompletableFuture<String> cf2 = CompletableFuture.supplyAsync(() -> simulateService("Service-B", 300));
CompletableFuture<String> cf3 = CompletableFuture.supplyAsync(() -> simulateService("Service-C", 100));
CompletableFuture.allOf(cf1, cf2, cf3).join();
String result = String.format("Results: %s, %s, %s", cf1.get(), cf2.get(), cf3.get());
logger.info(result);
These chains can work, but they carry common operational issues: a failure in one future does not cancel its siblings, timeouts have to be wired per stage, and cancellation and cleanup logic grows with every dependency you add.
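The sibling-cancellation gap is easy to demonstrate. In the sketch below (a stand-alone example, not from the article's repo), one task fails almost immediately, yet `allOf(...).join()` still waits for the slow sibling to finish because nothing cancels it:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;

// Demonstrates that CompletableFuture.allOf waits for slow siblings even after
// another task has already failed - cancellation must be wired up by hand.
public class CancellationGap {

    public static long elapsedMsUntilAllOfCompletes() {
        CompletableFuture<String> slow = CompletableFuture.supplyAsync(() -> {
            try { Thread.sleep(300); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
            return "slow";
        });
        CompletableFuture<String> failing = CompletableFuture.supplyAsync(() -> {
            throw new RuntimeException("boom"); // fails almost immediately
        });

        long start = System.nanoTime();
        try {
            CompletableFuture.allOf(slow, failing).join();
        } catch (CompletionException expected) {
            // join() still waited for the 300 ms sibling; no automatic cancellation happened.
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        System.out.println("allOf completed after ~" + elapsedMsUntilAllOfCompletes()
                + " ms despite the instant failure");
    }
}
```

A `StructuredTaskScope.ShutdownOnFailure` scope inverts this default: the first failure shuts down the scope and interrupts the remaining subtasks.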
These patterns package resilience behavior into reusable scope-based helpers while keeping control flow readable.
public class ScopedRequestHandler {

    private static final Logger logger = LoggerFactory.getLogger(ScopedRequestHandler.class);

    public <T> T runInScope(Callable<T> task) throws Exception {
        try (var scope = new StructuredTaskScope.ShutdownOnFailure()) {
            var subtask = scope.fork(task);
            scope.join();
            scope.throwIfFailed();
            return subtask.get();
        }
    }

    public <T> T runInScopeWithTimeout(Callable<T> task, Duration timeout) throws Exception {
        try (var scope = new StructuredTaskScope.ShutdownOnFailure()) {
            var subtask = scope.fork(task);
            // joinUntil throws TimeoutException at the deadline, and the scope's close()
            // then interrupts any still-running subtasks. Checking the clock after a
            // plain join() would let the task overrun the timeout unobserved.
            scope.joinUntil(Instant.now().plus(timeout));
            scope.throwIfFailed();
            return subtask.get();
        }
    }

    public <T1, T2, T3> TripleResult<T1, T2, T3> runInParallel(
            Callable<T1> task1,
            Callable<T2> task2,
            Callable<T3> task3) throws Exception {
        try (var scope = new StructuredTaskScope.ShutdownOnFailure()) {
            var subtask1 = scope.fork(task1);
            var subtask2 = scope.fork(task2);
            var subtask3 = scope.fork(task3);
            scope.join();
            scope.throwIfFailed();
            return new TripleResult<>(subtask1.get(), subtask2.get(), subtask3.get());
        }
    }
}
What this abstraction gives you: scope lifecycle and error propagation handled in one place, a timeout variant with the same signature, and a fail-fast three-way fan-out, all behind plain method calls.
public <T> T runWithFallback(Callable<T> primary, Callable<T> fallback) throws Exception {
    try (var scope = new StructuredTaskScope.ShutdownOnFailure()) {
        var primaryFuture = scope.fork(primary);
        scope.join();
        try {
            scope.throwIfFailed();
            return primaryFuture.get();
        } catch (Exception e) {
            logger.warn("Primary task failed, using fallback: {}", e.getMessage());
            return fallback.call();
        }
    }
}
Real-world usage:
// Primary -> secondary database fallback
public String getDataWithFallback(String key) throws Exception {
    return scopedHandler.runWithFallback(
        () -> getFromPrimaryDatabase(key),
        () -> getFromSecondaryDatabase(key)
    );
}
public <T> T runWithCircuitBreaker(Callable<T> task, CircuitBreakerConfig config) throws Exception {
    if (config.isOpen()) {
        throw new RuntimeException("Circuit breaker is OPEN - failing fast");
    }
    try {
        T result = runInScope(task);
        config.onSuccess();
        return result;
    } catch (Exception e) {
        config.onFailure();
        throw e;
    }
}
public static class CircuitBreakerConfig {

    private int failureCount = 0;
    private final int threshold;
    private final Duration timeout;
    private Instant lastFailureTime = Instant.MIN;

    public CircuitBreakerConfig(int threshold, Duration timeout) {
        this.threshold = threshold;
        this.timeout = timeout;
    }

    public boolean isOpen() {
        if (failureCount >= threshold) {
            return Instant.now().isBefore(lastFailureTime.plus(timeout));
        }
        return false;
    }

    public void onFailure() {
        failureCount++;
        lastFailureTime = Instant.now();
    }

    public void onSuccess() {
        failureCount = 0;
    }
}
Note This circuit breaker is illustrative and not thread-safe. For production, prefer libraries like Resilience4j for mature state handling and metrics support.
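The state transitions are easy to check in isolation. This stand-alone demo copies the sketch's counters inline so it compiles by itself; the constants and class name are illustrative:

```java
import java.time.Duration;
import java.time.Instant;

// Minimal stand-alone walk-through of the circuit breaker state machine above.
public class BreakerDemo {

    static int failureCount = 0;
    static final int THRESHOLD = 3;
    static final Duration OPEN_FOR = Duration.ofSeconds(30);
    static Instant lastFailure = Instant.MIN;

    // Open while the failure count has hit the threshold and the cool-down hasn't elapsed.
    public static boolean isOpen() {
        return failureCount >= THRESHOLD && Instant.now().isBefore(lastFailure.plus(OPEN_FOR));
    }

    public static void onFailure() { failureCount++; lastFailure = Instant.now(); }

    public static void onSuccess() { failureCount = 0; }

    public static void main(String[] args) {
        for (int i = 0; i < 3; i++) onFailure();   // three consecutive failures trip the breaker
        System.out.println("open after 3 failures: " + isOpen());
        onSuccess();                               // a later success resets the counter
        System.out.println("open after success: " + isOpen());
    }
}
```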
public <T> T runWithRetry(Callable<T> task, int maxRetries, Duration retryDelay) throws Exception {
    Exception lastException = null;
    for (int attempt = 1; attempt <= maxRetries; attempt++) {
        try {
            return runInScope(task);
        } catch (Exception e) {
            lastException = e;
            logger.warn("Attempt {} failed: {}", attempt, e.getMessage());
            if (attempt < maxRetries) {
                Thread.sleep(retryDelay.toMillis());
            }
        }
    }
    throw new RuntimeException("All " + maxRetries + " attempts failed", lastException);
}
This is useful for transient failure windows where a later attempt can succeed.
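With a fixed delay, synchronized retries from many callers can hammer a recovering dependency in lockstep. A common refinement (my addition, not from the article) is exponential backoff with jitter; here `runWithBackoff` calls the task directly so the sketch is self-contained:

```java
import java.time.Duration;
import java.util.concurrent.Callable;
import java.util.concurrent.ThreadLocalRandom;

// Retry with exponential backoff plus random jitter, so retry storms decorrelate.
public class BackoffRetry {

    public static <T> T runWithBackoff(Callable<T> task, int maxRetries, Duration baseDelay)
            throws Exception {
        Exception last = null;
        for (int attempt = 1; attempt <= maxRetries; attempt++) {
            try {
                return task.call();
            } catch (Exception e) {
                last = e;
                if (attempt < maxRetries) {
                    long backoff = baseDelay.toMillis() * (1L << (attempt - 1)); // 1x, 2x, 4x...
                    long jitter = ThreadLocalRandom.current().nextLong(backoff / 2 + 1);
                    Thread.sleep(backoff + jitter);
                }
            }
        }
        throw new RuntimeException("All " + maxRetries + " attempts failed", last);
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Fails twice, then succeeds on the third attempt.
        String result = runWithBackoff(() -> {
            if (++calls[0] < 3) throw new RuntimeException("transient");
            return "ok";
        }, 5, Duration.ofMillis(10));
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

Swapping `task.call()` for the article's `runInScope(task)` gives the same behavior inside a structured scope.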
This shows one way to compose these patterns in service-layer code.
public class BusinessService {

    private static final Logger logger = LoggerFactory.getLogger(BusinessService.class);

    private final ScopedRequestHandler scopedHandler = new ScopedRequestHandler();
    private final ScopedRequestHandler.CircuitBreakerConfig dbCircuitBreaker =
            new ScopedRequestHandler.CircuitBreakerConfig(3, Duration.ofSeconds(30));

    public String buildDashboard(String userId) throws Exception {
        logger.info("Building dashboard for: {}", userId);
        var result = scopedHandler.runInParallel(
                () -> fetchUserStats(userId),
                () -> fetchRecentActivity(userId),
                () -> fetchNotifications(userId)
        );
        return String.format("Dashboard: Stats[%s] | Activity[%s] | Notifications[%s]",
                result.result1(), result.result2(), result.result3());
    }

    public String aggregateServices() throws Exception {
        logger.info("Aggregating multiple services");
        var request = ScopedRequestHandler.AggregateRequest.of(
                () -> callAuthService(),
                () -> callUserService(),
                () -> callNotificationService(),
                () -> callAnalyticsService()
        );
        var result = scopedHandler.aggregate(request);
        return String.format("Aggregated %d services in %dms: %s",
                result.results().size(), result.durationMs(), result.results());
    }

    public String getCachedData(String key) throws Exception {
        logger.info("Getting cached data for: {}", key);
        return scopedHandler.runFirstSuccess(
                () -> getFromL1Cache(key),
                () -> getFromL2Cache(key),
                () -> getFromDatabase(key)
        );
    }

    public String callProtectedService(String request) throws Exception {
        logger.info("Calling protected service with request: {}", request);
        return scopedHandler.runWithCircuitBreaker(
                () -> callUnreliableService(request),
                dbCircuitBreaker
        );
    }

    public String processOrder(String orderId) throws Exception {
        logger.info("Processing order: {}", orderId);
        var validationResult = scopedHandler.runInParallel(
                () -> validatePayment(orderId),
                () -> validateInventory(orderId),
                () -> validateShipping(orderId)
        );
        if (allValidationsPassed(validationResult)) {
            return scopedHandler.runInParallel(
                    () -> chargePayment(orderId),
                    () -> reserveInventory(orderId),
                    () -> scheduleShipping(orderId)
            ).toString();
        }
        return "Order validation failed: " + validationResult;
    }
}
This BusinessService keeps resilience logic close to the business flow while using standard Java concurrency primitives: parallel fan-out for dashboards, first-success lookups across cache tiers, circuit-breaker protection for unreliable dependencies, and validate-then-execute order processing.
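The `aggregate`/`AggregateRequest` helpers used above aren't shown in this part. Here is one plausible shape, consistent with how `results()` and `durationMs()` are called; it uses a virtual-thread executor rather than the preview scope API so it compiles without `--enable-preview` (names and API shape are my assumptions):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// A sketch of service aggregation: run all tasks in parallel on virtual threads,
// collect every result, and report the wall-clock duration.
public class Aggregator {

    public record AggregateResult(List<String> results, long durationMs) {}

    @SafeVarargs
    public static AggregateResult aggregate(Callable<String>... tasks) throws Exception {
        long start = System.nanoTime();
        try (var executor = Executors.newVirtualThreadPerTaskExecutor()) {
            List<Future<String>> futures = new ArrayList<>();
            for (var task : tasks) {
                futures.add(executor.submit(task));
            }
            List<String> results = new ArrayList<>();
            for (var future : futures) {
                results.add(future.get()); // propagates the first failure
            }
            return new AggregateResult(results, (System.nanoTime() - start) / 1_000_000);
        }
    }

    public static void main(String[] args) throws Exception {
        var result = aggregate(() -> "auth-ok", () -> "user-ok");
        System.out.println(result.results().size() + " services: " + result.results());
    }
}
```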
These outputs come from one test environment with induced failures and one failure model. Treat them as illustrative; real results vary widely, so validate with your own traffic shape and dependencies.
Traditional Approach (CompletableFuture + Libraries):
# Load test with 50% service failures
wrk -t8 -c1000 -d30s http://localhost:8080/traditional-dashboard
Running 30s test @ http://localhost:8080/traditional-dashboard
Thread Stats Avg Stdev Max +/- Stdev
Latency 1.45s 0.89s 5.21s 67.82%
Req/Sec 89.34 67.23 234.00 71.24%
Requests/sec: 715.23
Memory usage: 1.2GB
ERROR: 23% requests failed due to resource exhaustion
Advanced Structured Concurrency:
# Same test with structured concurrency patterns
wrk -t8 -c1000 -d30s http://localhost:8080/structured-dashboard
Running 30s test @ http://localhost:8080/structured-dashboard
Thread Stats Avg Stdev Max +/- Stdev
Latency 892.34ms 234.56ms 2.1s 82.45%
Req/Sec 123.67 12.34 145.00 89.23%
Requests/sec: 989.36
Memory usage: 785MB
SUCCESS: 0.2% error rate (all due to actual service failures)
How to read this run:
| Metric | Traditional + Libraries | Structured Concurrency | Improvement | Notes |
|---|---|---|---|---|
| Requests/Second | 715 | 989 | ~38% higher in this run | Results from one environment; validate your workload |
| Memory Usage | 1.2GB | 785MB | ~35% lower in this run | Results from one environment; validate your workload |
| Error Rate | 23% | 0.2% | Lower in this run | Results from one environment; validate your workload |
| P95 Latency | 3.2s | 1.4s | ~56% lower in this run | Results from one environment; validate your workload |
| Resource Cleanup | Manual | Scope-managed | Model difference | Results from one environment; validate your workload |
Runtime runtime = Runtime.getRuntime();
System.gc();
Thread.sleep(100);
long beforeStructured = runtime.totalMemory() - runtime.freeMemory();

for (int i = 0; i < 100; i++) {
    final int taskId = i;
    try (var scope = new StructuredTaskScope.ShutdownOnFailure()) {
        var t1 = scope.fork(() -> timedTask("Task-" + taskId, 1));
        var t2 = scope.fork(() -> timedTask("Task-" + taskId, 1));
        scope.join();
        scope.throwIfFailed();
        t1.get();
        t2.get();
    }
}

System.gc();
Thread.sleep(100);
long afterStructured = runtime.totalMemory() - runtime.freeMemory();
This micro-benchmark focuses on allocation behavior around scope-managed tasks; validate with sustained workload tests before drawing production conclusions.
public class StructuredMicroservice {

    private final BusinessService businessService = new BusinessService();

    private void handleUserDashboard(HttpExchange exchange) {
        handleRequest(exchange, "USER_DASHBOARD", () -> {
            String userId = getQueryParam(exchange, "userId", "default-user");
            return businessService.buildDashboard(userId);
        });
    }

    private void handleDataWithFallback(HttpExchange exchange) {
        handleRequest(exchange, "DATA_WITH_FALLBACK", () -> {
            String key = getQueryParam(exchange, "key", "default-key");
            return businessService.getDataWithFallback(key);
        });
    }

    private void handleOrderProcessing(HttpExchange exchange) {
        handleRequest(exchange, "ORDER_PROCESSING", () -> {
            String orderId = getQueryParam(exchange, "orderId", "default-order");
            return businessService.processOrder(orderId);
        });
    }
}
The practical value of this pattern is graceful degradation: when one dependency is slow or unavailable, the affected endpoint fails fast or falls back instead of dragging down unrelated requests.
Use Circuit Breakers when a dependency fails persistently and continuing to call it slows its recovery; failing fast protects both sides.
Use Retry Patterns when failures are transient (timeouts, brief network blips) and the operation is safe to repeat.
Use Fallback Patterns when a degraded or cached answer is better than an error.
Use First-Success when several sources can answer the same question and you only need one of them.
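The `runFirstSuccess` helper used in `getCachedData` isn't shown in this part. A minimal sequential sketch that matches the L1 cache → L2 cache → database usage is below; the article's version may instead race the sources concurrently with `StructuredTaskScope.ShutdownOnSuccess` (this shape is my assumption):

```java
import java.util.concurrent.Callable;

// Try each source in order and return the first result that succeeds.
public class FirstSuccess {

    @SafeVarargs
    public static <T> T runFirstSuccess(Callable<T>... sources) throws Exception {
        Exception last = null;
        for (Callable<T> source : sources) {
            try {
                return source.call(); // first source that succeeds wins
            } catch (Exception e) {
                last = e; // remember the failure and fall through to the next source
            }
        }
        throw new RuntimeException("All sources failed", last);
    }

    public static void main(String[] args) throws Exception {
        String value = runFirstSuccess(
            () -> { throw new RuntimeException("L1 cache miss"); },
            () -> "from-L2-cache"
        );
        System.out.println(value);
    }
}
```

Sequential ordering suits tiered caches, where cheaper sources should be tried first; racing all sources in parallel trades extra load for lower tail latency.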
A practical rollout sequence:
Phase 1: Replace Core Orchestration
// Before: Complex CompletableFuture chains
CompletableFuture<String> cf1 = CompletableFuture.supplyAsync(() -> simulateService("A", 1));
CompletableFuture<String> cf2 = CompletableFuture.supplyAsync(() -> simulateService("B", 1));
CompletableFuture.allOf(cf1, cf2).join();
cf1.get();
cf2.get();
// After: Simple structured concurrency
try (var scope = new StructuredTaskScope.ShutdownOnFailure()) {
    var task1 = scope.fork(() -> simulateService("A", 1));
    var task2 = scope.fork(() -> simulateService("B", 1));
    scope.join();
    scope.throwIfFailed();
    task1.get();
    task2.get();
}
Phase 2: Add Resilience Patterns
// Wrap existing calls with resilience
scopedHandler.runWithCircuitBreaker(
    () -> callUnreliableService(request),
    dbCircuitBreaker
);
Phase 3: Optimize and Tune
// Fine-tune timeouts, retry counts, circuit breaker thresholds
new ScopedRequestHandler.CircuitBreakerConfig(3, Duration.ofSeconds(30));
scopedHandler.runWithRetry(
    () -> unstableExternalService(operation),
    3,
    Duration.ofMillis(500)
);
Do start with one critical service, validate operational behavior, then expand based on observed results. Don't wrap every call in every pattern at once; add resilience where failures actually hurt.
In Part 6, we'll dive into performance analysis and benchmarking, comparing virtual threads against platform threads and reactive frameworks across real workload patterns.
Part 5 complete. We focused on resilience patterns you can adopt incrementally in production systems.
If you implement these patterns, compare failure-handling behavior and latency tails first.