Heads up: Java version matters here
Code in this article targets Java 21 preview and uses StructuredTaskScope.ShutdownOnSuccess / ShutdownOnFailure directly. If you are on Java 25, this code will not compile as-is. The API was reshaped around StructuredTaskScope.open(...) with explicit Joiner strategies. The mental models below transfer cleanly; only the syntax changes. See Part 9 for the Java 21 to Java 25 migration walkthrough.
The snippets below come from the repository source, specifically the feature/java-21 branch, where the JDK 21 form lives. The main files are StructuredExampleWithSuccess, AdvancedStructuredPatterns, ConcurrentServiceLayer, and ScopedRequestHandler. (The repo's main already carries the JDK 25 migration; Part 9 walks through the diff.)
Why Advanced Patterns Matter
The failures that usually force you to reach for these patterns show up the same way: p50 looks healthy while p99 is painful. Every dependency claims it is "mostly fine" in isolation. One slow optional call punishes the whole request. One careless hedge doubles the load on a dependency that was already struggling.
These patterns are worth the complexity only when you can spell out the policy before forking anything: which result counts, how long you wait, what a degraded response looks like, and how much extra load you will accept downstream. If you can't answer those four, the structured concurrency machinery just hides the problem one layer deeper.
Pattern 1: First Successful Result
Java 21 preview ships StructuredTaskScope.ShutdownOnSuccess<T> for first-success wins. The first fork to return a result completes the scope; the other forks are interrupted.
Use this when multiple idempotent read sources are genuinely equivalent, such as replicated services or redundant endpoints where any successful answer is acceptable.
Where this fits: read-replica fan-out for a search or catalog query. Three replicas hold the same data, the fastest healthy one wins, and the others get cancelled before they finish work nobody will read.
```java
static void runWithShutdownOnSuccess() throws Exception {
    try (var scope = new StructuredTaskScope.ShutdownOnSuccess<String>()) {
        scope.fork(() -> slowService("Service-A", 1000));
        scope.fork(() -> slowService("Service-B", 500));
        scope.fork(() -> slowService("Service-C", 200));

        scope.join();

        // result() returns the first successful result;
        // it throws ExecutionException only if every fork failed.
        String result = scope.result();
        logger.info("First successful result: {}", result);
    }
}
```
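The snippet assumes a slowService helper and a logger from the repository. A minimal stand-in for slowService (hypothetical; the repo's version is authoritative) is enough to run the race locally:

```java
class SlowServiceStub {
    // Hypothetical stand-in for the repo's slowService helper: simulates a
    // dependency that takes delayMillis to answer. Thread.sleep honours
    // interruption, so a losing fork stops promptly when the scope shuts down.
    static String slowService(String name, long delayMillis) throws InterruptedException {
        Thread.sleep(delayMillis);
        return name + " responded in " + delayMillis + " ms";
    }
}
```

Because the helper observes interruption, cancelling the losers is cheap: Service-A and Service-B stop sleeping the moment Service-C's result closes the scope.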
Pattern 2: Bounded Partial Results
Emit results as they land and bound the wait. Slow tasks no longer punish fast ones.
Use this for responses with optional sections: recommendations, enrichment panels, diagnostics, or any page where partial data is still useful and the missing pieces are visible to callers.
Where this fits: a product or dashboard page that fans out to half a dozen widget services with a hard 500 to 800 ms budget. The fast widgets render, the slow ones surface as "loading" or get quietly dropped, and the page never blocks on the slowest dependency.
```java
public ProgressiveSummary<T> executeWithProgressCallback(
        List<Callable<T>> tasks,
        ProgressCallback<T> callback,
        Duration maxDuration) throws InterruptedException {
    long startTime = System.currentTimeMillis();
    List<T> results = new ArrayList<>(Collections.nCopies(tasks.size(), null));
    List<Exception> errors = new ArrayList<>();
    boolean[] completed = new boolean[tasks.size()];
    int totalCompleted = 0;

    // Base StructuredTaskScope: one failed fork must NOT cancel its siblings,
    // so ShutdownOnFailure would be the wrong policy for partial results.
    try (var scope = new StructuredTaskScope<T>()) {
        List<StructuredTaskScope.Subtask<T>> subtasks = new ArrayList<>();
        for (Callable<T> task : tasks) {
            subtasks.add(scope.fork(task));
        }

        Instant maxTime = Instant.now().plus(maxDuration);
        while (totalCompleted < tasks.size() && Instant.now().isBefore(maxTime)) {
            try {
                // Poll in 50 ms slices so finished subtasks surface quickly.
                scope.joinUntil(Instant.now().plusMillis(50));
            } catch (TimeoutException e) {
                // Expected: some forks are still running; check states and loop.
            }
            for (int i = 0; i < subtasks.size(); i++) {
                if (completed[i]) continue;
                var subtask = subtasks.get(i);
                switch (subtask.state()) {
                    case SUCCESS -> {
                        T result = subtask.get();
                        results.set(i, result);
                        completed[i] = true;
                        totalCompleted++;
                        callback.onProgress(i, result);
                    }
                    case FAILED -> {
                        errors.add(new RuntimeException("Task " + i + " failed",
                                subtask.exception()));
                        completed[i] = true;
                        totalCompleted++;
                    }
                    case UNAVAILABLE -> { /* still running; check again next slice */ }
                }
            }
        }

        if (Instant.now().isAfter(maxTime)) {
            logger.warn("Progressive execution timed out after {} ms",
                    maxDuration.toMillis());
            scope.shutdown(); // cancel the stragglers; close() waits for them to exit
        }

        return new ProgressiveSummary<>(
                results.stream().filter(Objects::nonNull).collect(toList()),
                errors,
                totalCompleted,
                tasks.size(),
                System.currentTimeMillis() - startTime,
                Instant.now().isAfter(maxTime)
        );
    }
}
```
Slow optional sections drop out cleanly instead of holding up the whole response.
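The snippet leans on two supporting shapes from the repository, ProgressiveSummary and ProgressCallback. Their exact definitions live in the repo; a minimal version consistent with how the snippet uses them (an assumption, not the repository's code) might be:

```java
import java.util.List;

// Hypothetical minimal shapes matching how executeWithProgressCallback
// constructs and invokes them. Field names here are guesses; the
// repository's definitions are authoritative.
record ProgressiveSummary<T>(
        List<T> results,          // successful results only
        List<Exception> errors,   // one entry per failed task
        int completedCount,
        int totalCount,
        long elapsedMillis,
        boolean timedOut) {
}

@FunctionalInterface
interface ProgressCallback<T> {
    void onProgress(int taskIndex, T result); // fired as each task lands
}
```

Keeping the callback a single-method interface means callers can pass a lambda that streams each widget to the client the moment it completes, instead of waiting for the summary.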
Pattern 3: Hedged Read with Delay
The hedge fires only when the primary is slow, capping tail latency without doubling load.
Use this when one logical read has rare but painful tail spikes. The delay is the control. Without it, hedging quietly becomes an accidental load multiplier.
Where this fits: a single hot read that already meets p50 comfortably but blows p99. Think primary-key lookup against a sharded store, a cached profile fetch, or an authoritative pricing call. The dependency is healthy at p50 but occasionally takes five to ten times longer for one specific request, and that long tail dominates user-visible latency. Dean and Barroso's "The Tail at Scale" (CACM 2013) is the canonical writeup of why tail latency matters more than the mean.
The shape is simple. Fork a primary. Fork a hedge that sleeps for hedgeAfterMillis before doing real work. Race them under ShutdownOnSuccess. The key detail is what happens when the primary returns first: the still-sleeping hedge must be interrupted before it issues a duplicate request, otherwise the delay buys you nothing. On JDK 21 you lean on Thread.sleep honouring interruption inside the hedge fork. It works, but it reads awkwardly next to the surrounding race logic.
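Expressed without preview APIs, so it compiles on any recent JDK, the same race looks like the sketch below. This is a portable illustration, not the article's Java 21 form: the primary/hedge suppliers and the hedged method name are hypothetical, and CompletableFuture.anyOf resolves on first completion (success or failure), whereas ShutdownOnSuccess gives you first-success semantics on top.

```java
import java.util.concurrent.*;
import java.util.function.Supplier;

class HedgedRead {
    // Portable sketch of hedge-with-delay: race a primary against a hedge
    // that only starts after hedgeAfterMillis. Whichever completes first
    // wins; the loser is cancelled so it stops issuing duplicate work.
    static <T> T hedged(Supplier<T> primary, Supplier<T> hedge, long hedgeAfterMillis)
            throws ExecutionException, InterruptedException {
        ExecutorService pool = Executors.newCachedThreadPool();
        try {
            CompletableFuture<T> p = CompletableFuture.supplyAsync(primary, pool);
            // delayedExecutor keeps the hedge idle; if the primary wins before
            // the delay elapses, the hedge never issues its duplicate request.
            Executor delayed = CompletableFuture.delayedExecutor(
                    hedgeAfterMillis, TimeUnit.MILLISECONDS, pool);
            CompletableFuture<T> h = CompletableFuture.supplyAsync(hedge, delayed);

            @SuppressWarnings("unchecked")
            T winner = (T) CompletableFuture.anyOf(p, h).get();
            p.cancel(true); // cancelling the winner is a no-op; the loser stops
            h.cancel(true);
            return winner;
        } finally {
            pool.shutdownNow();
        }
    }
}
```

The delay is carried by the executor rather than a sleeping thread, which sidesteps the awkward interrupted-sleep dance the Java 21 preview version relies on.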
JDK 25 reshapes this around a Joiner strategy that returns the moment a subtask succeeds and cleanly cancels the rest. That makes hedge-with-delay read like a single composed pattern instead of a manual race. The working implementation, plus a deeper walkthrough of why the cancellation semantics matter for hedge load, lives in Part 9: Java 21 to Java 25 Migration.
The mental model still applies on Java 21. Tune hedgeAfterMillis to your p95-to-p99 envelope, and watch the hedge-fire rate in production to confirm the delay is actually keeping duplicate load capped.
Pattern 4: Controlled Degradation Contract
Use this when the fallback is part of the product contract, not a hidden way to mask a broken primary path. The fallback is deliberately sequential. It only starts once the primary has failed, so it never adds duplicate load while the primary is still trying.
Where this fits: a search request that falls back to a cached top-N when the live index is unavailable, or a recommendations API that falls back to a generic "popular this week" list when the personalised model times out. The user gets a clearly degraded but still-coherent response, and your dashboards can count fallback hits as a real metric instead of burying them inside a retry counter.
Honest framing. The structural part is the primary fork only: ShutdownOnFailure gives you cancellation and aggregation for that one concurrent operation. The fallback is plain sequential code in the catch handler, running outside the scope and only after the primary has already failed. Folding it into the same scope would turn the pattern into a race (Pattern 3), remove the "primary failed first" trigger, and create exactly the duplicate load this pattern was designed to avoid.
```java
public <T> T runWithFallback(Callable<T> primary, Callable<T> fallback) throws Exception {
    try (var scope = new StructuredTaskScope.ShutdownOnFailure()) {
        var primarySubtask = scope.fork(primary);
        scope.join();
        scope.throwIfFailed();      // propagate the primary's failure
        return primarySubtask.get();
    } catch (Exception e) {
        // The scope is already closed here: the fallback is plain sequential
        // code, never a concurrent race against the still-running primary.
        logger.warn("Primary task failed, using fallback: {}", e.getMessage());
        return fallback.call();
    }
}
```
Make sure callers can tell when they got the fallback path instead of the primary. A silent fallback is just a slow bug.
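One lightweight way to make the path visible (a sketch; the repository may expose this differently, and the record name is hypothetical) is to wrap every response in a small record that names the path that produced it:

```java
// Hypothetical wrapper so callers, and your metrics pipeline, can see
// which path answered: primary or the degraded fallback.
record ServiceResponse<T>(T value, boolean degraded) {
    static <T> ServiceResponse<T> primary(T value) {
        return new ServiceResponse<>(value, false);
    }
    static <T> ServiceResponse<T> fallback(T value) {
        return new ServiceResponse<>(value, true);
    }
}
```

Counting `degraded == true` responses then becomes a first-class dashboard metric rather than something buried in a retry counter.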
Pattern Selection Matrix
| If you have... | Use... | Do not use... |
|---|---|---|
| Equivalent idempotent read sources | Pattern 1: First Successful Result | Non-idempotent writes or sources with different semantics |
| Optional results with a deadline | Pattern 2: Bounded Partial Results | Workflows where every result is required for correctness |
| One logical read with painful p99 | Pattern 3: Hedged Read with Delay | Hedging without a delay budget or load monitoring |
| Primary path plus cheap degraded response | Pattern 4: Controlled Degradation Contract | Expensive, stale, or misleading fallback paths |
Java 21 Preview Notes
- First-success joiners can hide slower failures by design; this is fine for equivalent read race patterns, but monitor failure rates separately.
- Always bound hedging and partial-result paths with clear deadlines.
- Deadline behavior assumes unfinished work observes interruption promptly.
- Keep additional load from hedging measurable and capped.
- Monitor per-source success rates in racing patterns to detect silent degradations.
Testing Guidance
Test advanced patterns with:
- Healthy dependencies.
- One slow dependency.
- One failing dependency.
- Repeated degraded responses under load.
- Hedge path effectiveness vs extra dependency load.
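For the slow and failing cases, deterministic stubs keep those scenarios repeatable instead of depending on a flaky real dependency. A minimal set (hypothetical helpers; adapt the names to your test harness) might be:

```java
import java.util.concurrent.Callable;

// Hypothetical test stubs: deterministic stand-ins for a healthy, slow,
// or failing dependency, shaped as the Callable the patterns fork.
class DependencyStubs {
    static Callable<String> healthy(String value) {
        return () -> value;
    }
    static Callable<String> slow(String value, long millis) {
        // Thread.sleep observes interruption, so cancellation behaviour
        // under shutdown() can be asserted in tests too.
        return () -> { Thread.sleep(millis); return value; };
    }
    static Callable<String> failing(String message) {
        return () -> { throw new RuntimeException(message); };
    }
}
```

Mixing one slow(…) and one failing(…) stub into an otherwise healthy fan-out exercises exactly the deadline and partial-result branches that production traffic hits at p99.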
Build and Runtime Reminder
```shell
javac --release 21 --enable-preview ...
java --enable-preview ...
```
Resources