engnotes.dev
Structured Concurrency · Part 6 of 9

Composing resilience policies as separable layers

Resilience composition in structured concurrency comes down to one rule: keep each policy a separable layer that fires visibly in logs, or rebuild the complexity you were trying to remove.

Jagdish Salgotra
2026-04-26 · 10 min read · ~473 words

Code repository: project-loom
Note: This article targets Java 21 preview structured concurrency APIs (StructuredTaskScope, JEP 453). Part 9 covers migration to newer preview APIs in Java 25.

When the Dashboard Lies

I sat in a debugging session where the question was embarrassingly simple: did the dependency recover, or did we serve fallback? We had retries. We had a timeout. We had a fallback to cache. The dashboard said: clean success.

It took two engineers and forty minutes of log tracing to figure out that "clean success" meant the fallback had been serving cached responses for twenty minutes while upstream recovered. Clean success rate. Zero alerts. One completely invisible failure.

That is the composition problem. Once timeout, retry, and fallback all live in the same handler, the code becomes harder to reason about than the failure itself. Worse, the metrics lie to you.

The rest of this article is the pattern that fixes it.

Why Composition Is the Hard Part

Most teams can adopt a single pattern (for example, timeout + fail-fast). Complexity usually appears when patterns are combined:

  • timeout + fallback,
  • retry + circuit breaker,
  • partial results + admission control.

The goal is to keep policy explicit and layered, not hidden in nested lambdas.

Composition stays readable when each policy layer has one clear job.

Composition Principle: Keep Layers Separable

A practical order for policy composition:

  1. scope lifecycle (fork/join/failure),
  2. timeout budget,
  3. retry policy,
  4. fallback/degradation policy,
  5. admission/bulkhead control.

Each layer should be testable independently.
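One way to keep layers separable is to model each policy as a small wrapper around a task, so that each wrapper can be unit-tested alone and the composition order is visible at the call site. The sketch below is illustrative only: `Layers`, `Layer`, `retry`, `fallback`, `bulkhead`, and `LayersDemo` are hypothetical names, not code from the series repository, and the timeout layer is left out to keep the example free of preview APIs.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical names throughout; a sketch of "one layer, one job". */
final class Layers {
    /** A layer wraps a task and returns a task with one extra policy applied. */
    interface Layer<T> {
        Callable<T> wrap(Callable<T> task);
    }

    /** Retry layer: up to maxAttempts calls in total, never retrying an interrupt. */
    static <T> Layer<T> retry(int maxAttempts) {
        return task -> () -> {
            Exception last = null;
            for (int attempt = 1; attempt <= maxAttempts; attempt++) {
                try {
                    return task.call();
                } catch (InterruptedException e) {
                    throw e; // cancellation is not a retryable failure
                } catch (Exception e) {
                    last = e;
                }
            }
            throw new IllegalStateException("all " + maxAttempts + " attempts failed", last);
        };
    }

    /** Fallback layer: a failure of the wrapped task is answered by fallbackTask. */
    static <T> Layer<T> fallback(Callable<T> fallbackTask) {
        return task -> () -> {
            try {
                return task.call();
            } catch (InterruptedException e) {
                throw e;
            } catch (Exception e) {
                return fallbackTask.call();
            }
        };
    }

    /** Admission layer: reject immediately when no permit is free. */
    static <T> Layer<T> bulkhead(Semaphore permits) {
        return task -> () -> {
            if (!permits.tryAcquire()) {
                throw new IllegalStateException("bulkhead full");
            }
            try {
                return task.call();
            } finally {
                permits.release();
            }
        };
    }
}

final class LayersDemo {
    /** The dependency fails twice, then answers "live". */
    static String run(int maxAttempts) throws Exception {
        AtomicInteger calls = new AtomicInteger();
        Callable<String> flaky = () -> {
            if (calls.incrementAndGet() < 3) {
                throw new IllegalStateException("boom");
            }
            return "live";
        };
        // Read outside-in: admission, then fallback, then retry, then the call itself.
        Callable<String> composed =
            Layers.<String>bulkhead(new Semaphore(1)).wrap(
                Layers.<String>fallback(() -> "cached").wrap(
                    Layers.<String>retry(maxAttempts).wrap(flaky)));
        return composed.call();
    }
}
```

With three attempts the retry layer absorbs both failures and `LayersDemo.run(3)` returns the live value; with two attempts the retry layer gives up and the fallback layer answers instead. Each wrapper can be exercised in a test without the others.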

Example: Timeout + Retry + Fallback

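The structured path makes success, retry, and fallback distinguishable in metrics and logs. The listing covers the retry and fallback layers; a timeout layer would bound each attempt with joinUntil(...) in place of join().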

java
public String performRetryableOperation(String operation) throws Exception {
    logger.info("Performing retryable operation: {}", operation);

    // Retry is the outer layer here: each attempt runs in a fresh scope via runInScope (not shown).
    return scopedHandler.runWithRetry(
        () -> unstableExternalService(operation),
        3,
        Duration.ofMillis(500)
    );
}

public <T> T runWithRetry(Callable<T> task, int maxRetries, Duration retryDelay) throws Exception {
    Exception lastException = null;

    // maxRetries counts total attempts, including the first call.
    for (int attempt = 1; attempt <= maxRetries; attempt++) {
        try {
            return runInScope(task);
        } catch (InterruptedException e) {
            // Interruption means the caller cancelled this work; do not retry it.
            Thread.currentThread().interrupt();
            throw e;
        } catch (Exception e) {
            lastException = e;
            logger.warn("Attempt {} failed: {}", attempt, e.getMessage());

            if (attempt < maxRetries) {
                Thread.sleep(retryDelay.toMillis());
            }
        }
    }

    throw new RuntimeException("All " + maxRetries + " attempts failed", lastException);
}

public <T> T runWithFallback(Callable<T> primary, Callable<T> fallback) throws Exception {
    try (var scope = new StructuredTaskScope.ShutdownOnFailure()) {
        // In Java 21, fork returns a Subtask rather than a Future.
        var primarySubtask = scope.fork(primary);

        scope.join();

        try {
            scope.throwIfFailed();
            return primarySubtask.get();
        } catch (ExecutionException e) {
            // Only a failed primary triggers the fallback; interruption from join() propagates.
            logger.warn("Primary task failed, using fallback: {}", e.getMessage());
            return fallback.call();
        }
    }
}

Keep fallback bounded and observable. Unbounded fallback chains are hard to operate.
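To make "bounded and observable" concrete, the sketch below returns a result tagged with the path that produced it and counts each path separately, so a dashboard can tell a live answer from a cached one. It is a minimal illustration under assumptions: `ObservableFallback`, `Outcome`, `Source`, and the two counters are hypothetical names, and the `LongAdder` fields stand in for whatever metrics library the service actually uses.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical types: a result that records which path produced it. */
final class ObservableFallback {
    enum Source { PRIMARY, FALLBACK }

    /** The value plus the path that produced it, so callers and metrics can tell them apart. */
    record Outcome<T>(T value, Source source) {}

    // Stand-ins for real metrics counters.
    static final LongAdder primaryCount = new LongAdder();
    static final LongAdder fallbackCount = new LongAdder();

    /** Bounded: exactly one fallback level. If the fallback also fails, the error propagates. */
    static <T> Outcome<T> callWithFallback(Callable<T> primary, Callable<T> fallback)
            throws Exception {
        try {
            T value = primary.call();
            primaryCount.increment();
            return new Outcome<>(value, Source.PRIMARY);
        } catch (InterruptedException e) {
            throw e; // cancellation is not a reason to degrade
        } catch (Exception e) {
            // Counted before the fallback runs, so failed fallback attempts are visible too.
            fallbackCount.increment();
            return new Outcome<>(fallback.call(), Source.FALLBACK);
        }
    }
}
```

In the debugging session described at the top of this article, a counter like `fallbackCount` climbing for twenty minutes is the signal that was missing: the success rate stays clean, but the fallback path is no longer invisible.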

Example: Shared Scope for Correlated Calls

java
public <T1, T2> ParallelResult<T1, T2> runInParallel(
        Callable<T1> task1,
        Callable<T2> task2) throws Exception {

    // One scope per logical unit prevents fragmented cancellation.
    try (var scope = new StructuredTaskScope.ShutdownOnFailure()) {
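        // In Java 21, fork returns a Subtask rather than a Future.
        var subtask1 = scope.fork(task1);
        var subtask2 = scope.fork(task2);

        scope.join();
        // Throws if any subtask failed; ShutdownOnFailure has by then cancelled the sibling.
        scope.throwIfFailed();

        return new ParallelResult<>(subtask1.get(), subtask2.get());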
    }
}

Use one scope per logical unit of work; avoid spreading a single request across unrelated scopes.

Best Practices Checklist

  • Keep one ownership boundary per request step.
  • Use explicit timeout budgets, not implicit defaults.
  • Call throwIfFailed() after join() when using ShutdownOnFailure on Java 21; join() alone does not rethrow subtask failures.
  • Keep retry counts low, and retry only operations that are idempotent.
  • Instrument policy paths (success, timeout, retry, fallback).
  • Make degradation decisions explicit in response contracts.
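As an illustration of an explicit timeout budget, the sketch below carries one deadline across all attempts of a request, so that retries and their delays cannot add up to more than the total the caller agreed to. `TimeoutBudget` and its methods are hypothetical names introduced here; the current time is passed in as a parameter so the arithmetic can be tested without sleeping.

```java
import java.time.Duration;
import java.time.Instant;

/** Hypothetical helper: one deadline shared by every attempt of a request. */
final class TimeoutBudget {
    private final Instant deadline;

    TimeoutBudget(Instant start, Duration total) {
        this.deadline = start.plus(total);
    }

    /** Time left at the given instant, never negative. */
    Duration remaining(Instant now) {
        Duration left = Duration.between(now, deadline);
        return left.isNegative() ? Duration.ZERO : left;
    }

    /** True once the budget is spent; a retry loop should stop instead of starting another attempt. */
    boolean exhausted(Instant now) {
        return remaining(now).isZero();
    }

    /** Retry delay capped so that the loop never sleeps past the deadline. */
    Duration cappedDelay(Duration retryDelay, Instant now) {
        Duration left = remaining(now);
        return left.compareTo(retryDelay) < 0 ? left : retryDelay;
    }

    /** Absolute deadline, suitable for passing to joinUntil(Instant). */
    Instant deadline() {
        return deadline;
    }
}
```

For example, with a 1200 ms budget and a 500 ms retry delay, an attempt that fails after 1000 ms leaves only 200 ms, so `cappedDelay` returns 200 ms rather than the configured 500 ms.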

Testing Matrix

For composed policies, test combinations, not only single failures:

  • Slow + retry success.
  • Slow + retry exhausted + fallback success.
  • Fast failures + breaker open.
  • Timeout while one subtask succeeds.
  • Cancellation cleanup on partial completion.
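The first two rows of the matrix can be driven by a scripted test double that replays a fixed sequence of failures and results. The sketch below is one possible shape, not the series' test suite: `ScriptedDependency`, `PolicyMatrix`, and `retryThenFallback` are hypothetical, and the policy under test is a deliberately tiny stand-in that reports which path produced the answer.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.TimeoutException;

/** Hypothetical test double: replays a scripted sequence of results and failures. */
final class ScriptedDependency implements Callable<String> {
    private final Deque<Object> script;
    int calls = 0;

    /** Each step is either a String to return or an Exception to throw. */
    ScriptedDependency(Object... steps) {
        this.script = new ArrayDeque<>(List.of(steps));
    }

    @Override
    public String call() throws Exception {
        calls++;
        Object step = script.isEmpty()
            ? new IllegalStateException("script exhausted")
            : script.poll();
        if (step instanceof Exception e) {
            throw e;
        }
        return (String) step;
    }
}

final class PolicyMatrix {
    /** Minimal policy under test; the prefix names the path that produced the value. */
    static String retryThenFallback(Callable<String> primary, int maxAttempts,
                                    Callable<String> fallback) throws Exception {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                String value = primary.call();
                return (attempt == 1 ? "success:" : "retry:") + value;
            } catch (InterruptedException e) {
                throw e;
            } catch (Exception e) {
                // fall through to the next attempt
            }
        }
        return "fallback:" + fallback.call();
    }

    /** Runs two rows of the matrix and throws AssertionError on a mismatch. */
    static void verify() throws Exception {
        // Row: slow + retry success.
        var recovers = new ScriptedDependency(new TimeoutException("slow"), "live");
        expect("retry:live", retryThenFallback(recovers, 3, () -> "cached"));

        // Row: slow + retry exhausted + fallback success.
        var down = new ScriptedDependency(
            new TimeoutException("slow"),
            new TimeoutException("slow"),
            new TimeoutException("slow"));
        expect("fallback:cached", retryThenFallback(down, 3, () -> "cached"));
    }

    private static void expect(String want, String got) {
        if (!want.equals(got)) {
            throw new AssertionError("expected " + want + " but got " + got);
        }
    }
}
```

Asserting on the path label, not only on the returned value, is what stops a fallback answer from passing as a clean success in tests.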

Java 21 Preview Limitations

  • joinUntil(...) throws TimeoutException; define deterministic behavior afterwards.
  • Mixed retry logic inside subtasks can hide root causes; correlation IDs across retries and fallbacks help trace failures in logs.
  • Over-composition can recreate the same complexity structured concurrency aims to remove.
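For the first limitation, one deterministic choice is to let the TimeoutException propagate as its own failure type and rely on the scope's close() to cancel unfinished work. The sketch below assumes Java 21 with `--enable-preview`; `TimeoutLayer` and `runWithTimeout` are hypothetical names, and other choices (such as mapping the timeout to a fallback value in a separate layer) are equally valid.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.concurrent.Callable;
import java.util.concurrent.StructuredTaskScope;

/** Hypothetical timeout layer for Java 21 preview APIs (compile and run with --enable-preview). */
final class TimeoutLayer {
    /**
     * Runs the task in its own scope with a deadline.
     * On timeout, joinUntil throws TimeoutException; leaving the try-with-resources
     * block closes the scope, which shuts it down and interrupts the unfinished subtask.
     */
    static <T> T runWithTimeout(Callable<T> task, Duration timeout) throws Exception {
        try (var scope = new StructuredTaskScope.ShutdownOnFailure()) {
            var subtask = scope.fork(task);

            // Throws TimeoutException if the deadline passes first.
            scope.joinUntil(Instant.now().plus(timeout));

            // Reached only when the subtask finished in time; surfaces its failure, if any.
            scope.throwIfFailed();
            return subtask.get();
        }
    }
}
```

Because the timeout surfaces as a distinct exception type, an outer retry or fallback layer can decide what to do with it and log it as a timeout rather than as a generic failure.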

Build and Runtime Reminder

bash
javac --release 21 --enable-preview ...
java --enable-preview ...

Resources

  • JEP 453: Structured Concurrency (Preview)
  • Java 21 API: StructuredTaskScope (Preview)
  • Resilience4j