engnotes.dev
NotebookTopicsAbout

Subscribe

One email when a new post goes up. Nothing else.

one per post · no tracking · also on RSS

Site

  • Notebook
  • Topics
  • About
  • Contact

Topics

Project Loom9Structured Concurrency9Tail Latency & System Behavior4

Elsewhere

  • GitHub
  • X
  • LinkedIn
  • Email
engnotes.dev© 2026 Jagdish Salgotra · written on personal time. not on employer time.
PrivacyTermsCookies
blog/structured-concurrency/part 7
Structured Concurrency · Part 7 of 9

Three structured-concurrency patterns we run in a fan-out service

Structured concurrency patterns are worth the complexity only when the cancellation policy is decided before the first fork, not after the first timeout. The three we run: aggregation on quorum, bulkheading per tenant, and a deadline shape that protects against one slow upstream.

J
Jagdish Salgotra
2026-05-04·8 min read·~901 words

Series navigation

← PreviousComposing resilience policies as separable layersNext →Four operational checks we run on every StructuredTaskScope
Code repositoryproject-loom
#structured-concurrency
share
J

Written by

Jagdish Salgotra

Distributed systems, cloud-native architecture, and the JVM. mostly shipping, occasionally reading.

all posts

Keep reading · rest of the series

  • 2026-03-2215 min read
    Part 1
    What structured scopes actually catch
  • 2026-03-3010 min read
    Part 2
    What a missed deadline should do, and what it should not
  • 2026-04-0610 min read
    Part 3
    Cancelling siblings before they burn capacity
  • 2026-04-1210 min read
    Part 4
    Two workflow shapes that show up after fork-and-wait
Was this article helpful? or email →
anonymous · no account needed

On this page

Reading progress

0 min of 8 · ~8 left

Ask the post

Any answer points back at the paragraph it came from.

Java version note Code here targets Java 21 preview and uses ShutdownOnSuccess and ShutdownOnFailure directly. On Java 25 this will not compile. The mental models transfer cleanly. Only the syntax changes. Part 9 covers the migration.

The failures that force you to reach for these patterns show up the same way: p50 looks healthy while p99 is painful. Every dependency claims it is mostly fine in isolation. One slow optional call punishes the whole request.

These patterns are worth the complexity only when you can answer four questions before forking anything: which result counts, how long you wait, what a degraded response looks like, and how much extra load you will accept downstream. If you cannot answer those four, the structured concurrency machinery just hides the problem one layer deeper.


Pattern 1: First Successful Result

ShutdownOnSuccess gives you first-success semantics. The first fork to return a result completes the scope. The rest are interrupted.

Use this when multiple idempotent read sources are genuinely equivalent — replicated services or redundant endpoints where any successful answer is acceptable. Read-replica fan-out for a search or catalog query is the typical case: three replicas hold the same data, the fastest healthy one wins, and the others are cancelled before they finish work nobody will read.

java
static void runWithShutdownOnSuccess() throws Exception {
    try (var scope = new StructuredTaskScope.ShutdownOnSuccess<String>()) {
        scope.fork(() -> slowService("Service-A", 1000));
        scope.fork(() -> slowService("Service-B", 500));
        scope.fork(() -> slowService("Service-C", 200));

        scope.join();

        String result = scope.result();
        logger.info("First successful result: {}", result);
    }
}

Do not use this for non-idempotent operations or sources with different semantics. First-success joiners hide slower failures by design. Monitor failure rates per source separately or silent degradation will look like normal variance.


Pattern 2: Bounded partial results

Emit results as they land and bound the wait. Slow tasks no longer punish fast ones.

Use this for responses with optional sections: recommendations, enrichment panels, diagnostics, any page where partial data is still useful and the missing pieces are visible to callers.

A product or dashboard page that fans out to half a dozen widget services with a hard 500ms budget is the right fit. Fast widgets render. Slow ones surface as loading or get dropped. The page never blocks on the slowest dependency.

java
public ProgressiveSummary<T> executeWithProgressCallback(
        List<Callable<T>> tasks,
        ProgressCallback<T> callback,
        Duration maxDuration) throws InterruptedException {

    long startTime = System.currentTimeMillis();
    List<T> results = new ArrayList<>(Collections.nCopies(tasks.size(), null));
    List<Exception> errors = new ArrayList<>();
    boolean[] completed = new boolean[tasks.size()];
    int totalCompleted = 0;

    try (var scope = new StructuredTaskScope.ShutdownOnFailure()) {
        List<StructuredTaskScope.Subtask<T>> subtasks = new ArrayList<>();

        for (Callable<T> task : tasks) {
            subtasks.add(scope.fork(task));
        }

        Instant maxTime = Instant.now().plus(maxDuration);

        while (totalCompleted < tasks.size() && Instant.now().isBefore(maxTime)) {
            try {
                scope.joinUntil(Instant.now().plusMillis(50));
            } catch (TimeoutException e) {
                // 50ms polling tick elapsed before all tasks finished. Loop again.
            }

            for (int i = 0; i < subtasks.size(); i++) {
                if (!completed[i]) {
                    var subtask = subtasks.get(i);

                    switch (subtask.state()) {
                        case SUCCESS:
                            try {
                                T result = subtask.get();
                                results.set(i, result);
                                completed[i] = true;
                                totalCompleted++;
                                callback.onProgress(i, result);
                            } catch (Exception e) {
                                errors.add(new RuntimeException("Task " + i + " failed", e));
                                completed[i] = true;
                                totalCompleted++;
                            }
                            break;

                        case FAILED:
                            errors.add(new RuntimeException("Task " + i + " failed"));
                            completed[i] = true;
                            totalCompleted++;
                            break;

                        case UNAVAILABLE:
                            // Still running. Check again next pass.
                            break;
                    }
                }
            }
        }

        if (Instant.now().isAfter(maxTime)) {
            scope.shutdown();
        }

        return new ProgressiveSummary<>(
            results.stream().filter(Objects::nonNull).collect(toList()),
            errors,
            totalCompleted,
            tasks.size(),
            System.currentTimeMillis() - startTime,
            Instant.now().isAfter(maxTime)
        );

    } catch (Exception e) {
        throw new RuntimeException("Unexpected timeout in progressive results", e);
    }
}

The 50ms polling tick is deliberate. It gives completed subtasks a short window to be picked up without spinning. Tune it to your latency budget. Slow optional sections drop out cleanly instead of holding up the whole response.

Do not use this when every result is required for correctness. That is Pattern 4 territory.


Pattern 3: Hedged read with delay

The hedge fires only when the primary is slow, capping tail latency without doubling load.

Use this when one logical read has rare but painful tail spikes. The delay is the control. Without it, hedging becomes an accidental load multiplier.

Primary-key lookup against a sharded store, a cached profile fetch, an authoritative pricing call — these are the right fit. Healthy at p50, occasionally five to ten times longer for one specific request, and that long tail dominates user-visible latency.

The shape is simple. Fork a primary at t=0. Fork a hedge that sleeps for hedgeAfterMillis before doing real work. Race them under ShutdownOnSuccess. The key detail is what happens when the primary returns first: the still-sleeping hedge must be interrupted before it issues a duplicate request. On Java 21 this relies on Thread.sleep honouring interruption inside the hedge fork. It works. Java 25 makes the cancellation semantics cleaner via the Joiner API. Part 9 covers that diff.

Tune hedgeAfterMillis to your p95-to-p99 envelope. Watch the hedge-fire rate to confirm the delay is keeping duplicate load capped. If the hedge fires on more than a small percentage of requests, the delay is too short or the primary has a real problem worth investigating directly.

Do not hedge without a delay budget and load monitoring in place.


Pattern 4: Controlled degradation contract

Use this when the fallback is part of the product contract, not a way to mask a broken primary path.

Search falling back to a cached top-N when the live index is unavailable. A recommendations API falling back to a generic list when the personalised model times out. The user gets a clearly degraded but coherent response. Your dashboards can count fallback hits as a real signal instead of burying them in a retry counter.

The structure matters here. The fallback runs outside the scope, sequentially, only after the primary has failed. Folding it into the same scope turns this into Pattern 3, removes the "primary failed first" trigger, and creates exactly the duplicate load this pattern was designed to avoid.

java
public <T> T runWithFallback(Callable<T> primary, Callable<T> fallback) throws Exception {
    try (var scope = new StructuredTaskScope.ShutdownOnFailure()) {
        var primaryFuture = scope.fork(primary);

        scope.join();

        try {
            scope.throwIfFailed();
            return primaryFuture.get();
        } catch (Exception e) {
            logger.warn("Primary failed, using fallback: {}", e.getMessage());
            return fallback.call();
        }
    }
}

Make sure callers can tell when they got the fallback path. A silent fallback is just a slow bug.


Pattern selection

If you haveUseDo not use
Equivalent idempotent read sourcesPattern 1: First successful resultNon-idempotent writes or sources with different semantics
Optional results with a deadlinePattern 2: Bounded partial resultsWorkflows where every result is required
One logical read with painful p99Pattern 3: Hedged read with delayHedging without a delay budget or load monitoring
Primary path plus cheap degraded responsePattern 4: Controlled degradation contractExpensive, stale, or misleading fallback paths

Resources

  • JEP 453: Structured Concurrency (Preview, JDK 21)
  • JDK 21 API: StructuredTaskScope (Preview)