engnotes.dev
Structured Concurrency

Four operational checks we run on every StructuredTaskScope

Before a fan-out service can be trusted under load, four things need to be true: outcomes counted per scope, deadlines propagated, bulkheads in place, and pinning watched in JFR. What each one looks like in code.

Jagdish Salgotra
2026-05-10·10 min read·~649 words

Series navigation

← Previous: Three structured-concurrency patterns we run in a fan-out service
Next: Migrating our fan-out service from Java 21 to Java 25 →
Code repository: project-loom

Note: This part focuses on operating the Java 21 preview structured concurrency APIs (StructuredTaskScope, JEP 453) in real services. Part 9 is a migration-focused appendix for Java 21 -> Java 25 preview API changes.

The first thing that caught my attention in testing was a metric that would not come down. The client had already timed out. The scope had returned. But something was still running, still writing to logs, still holding a thread. That is the zombie-subtask problem, and it is the reason this article exists.

Structured concurrency is still preview in Java 21. This is not production guidance in the sense of battle-tested at scale. It is what I found running this seriously in testing, with an eye on what would have to be true before I trusted it in prod.

The zombie-subtask problem

Without scope-level cancellation, a slow subtask keeps running after the response is gone. The client sees a timeout. Your metrics do not.

The client outcome is the same with or without cancellation. The cost is not: without it, threads keep running work nobody is waiting for anymore.

java
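import java.time.Instant;
import java.util.concurrent.StructuredTaskScope;

public String shortTimeoutExample() throws Exception {
    // Hard ceiling for the whole fan-out: 350 ms from now.
    Instant deadline = Instant.now().plusMillis(350);

    try (var scope = new StructuredTaskScope.ShutdownOnFailure()) {
        var slowTask = scope.fork(() -> simulateService("slow-service", 500));
        var fastTask = scope.fork(() -> simulateService("fast-service", 100));

        // Throws TimeoutException once the deadline passes. Leaving the try
        // block then closes the scope, which interrupts unfinished subtasks.
        scope.joinUntil(deadline);
        scope.throwIfFailed();

        // Reached only if both subtasks finish before the deadline.
        return String.format("Timeout Results: %s, %s",
                slowTask.get(), fastTask.get());
    }
}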

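joinUntil sets the ceiling. Here the 500 ms subtask misses the 350 ms deadline, so joinUntil throws TimeoutException and closing the scope cancels whatever is still running. Without joinUntil, the scope waits as long as the slowest subtask wants to run.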

For this to work, subtasks have to handle interruption. Code that swallows InterruptedException will not cancel cleanly regardless of what the scope does:

java
private static String simulateService(String name, long delayMs) {
    try {
        Thread.sleep(delayMs);
        return name + "-OK";
    } catch (InterruptedException e) {
        // Propagate interrupt for graceful client cleanup.
        Thread.currentThread().interrupt();
        throw new RuntimeException("Interrupted: " + name, e);
    }
}

If your downstream clients do not propagate the interrupt, the deadline is set but cancellation never lands. The metric stays elevated. The logs keep writing. Same picture as before.
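A minimal sketch of that difference, using plain threads instead of the preview API so it runs without --enable-preview. The names swallowingCall, cooperativeCall and survivesInterrupt are hypothetical stand-ins for a downstream client and a small test harness; they are not taken from the series repository.

```java
class InterruptHandlingDemo {

    /** Hypothetical client that swallows the interrupt and carries on working. */
    static void swallowingCall(int steps) {
        for (int i = 0; i < steps; i++) {
            try {
                Thread.sleep(20);
            } catch (InterruptedException e) {
                // Bug on purpose: the interrupt is dropped and the loop continues.
            }
        }
    }

    /** Hypothetical client that restores the interrupt flag and stops. */
    static void cooperativeCall(int steps) {
        for (int i = 0; i < steps; i++) {
            try {
                Thread.sleep(20);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
        }
    }

    /** Interrupts the work after 50 ms and reports whether it is still alive after a 200 ms grace period. */
    static boolean survivesInterrupt(Runnable work) {
        Thread worker = new Thread(work);
        worker.setDaemon(true); // do not keep the JVM alive for a zombie
        worker.start();
        try {
            Thread.sleep(50);   // let the work get going
            worker.interrupt(); // roughly what scope shutdown does to an unfinished subtask
            worker.join(200);   // grace period for the work to stop
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("harness interrupted", e);
        }
        return worker.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("swallowing survives:  " + survivesInterrupt(() -> swallowingCall(50)));
        System.out.println("cooperative survives: " + survivesInterrupt(() -> cooperativeCall(50)));
    }
}
```

With these values the swallowing variant should still be alive after the grace period and the cooperative one should not. A check of this shape is one way to validate interrupt handling explicitly rather than assume it.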

What to measure

Track outcomes per scope, not just per request:

  • success
  • timeout
  • failure
  • cancellation

The cancellation count is the one most teams skip. It is also the one that tells you whether your deadline propagation is actually working. If cancellations are zero while timeouts are non-zero, something is swallowing the interrupt.

Wire these into Micrometer. Without scope-level counters, a zombie-subtask problem looks like normal latency variance until it does not.
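One way to keep the four outcomes per scope, sketched with java.util.concurrent.atomic.LongAdder so it has no dependencies. ScopeOutcomes and its method names are assumptions for illustration; in a real service each record call would presumably increment a Micrometer counter tagged with the outcome instead.

```java
import java.util.EnumMap;
import java.util.Map;
import java.util.concurrent.atomic.LongAdder;

class ScopeOutcomes {

    enum Outcome { SUCCESS, TIMEOUT, FAILURE, CANCELLATION }

    // Populated once in the constructor, only read afterwards.
    private final Map<Outcome, LongAdder> counts = new EnumMap<>(Outcome.class);

    ScopeOutcomes() {
        for (Outcome outcome : Outcome.values()) {
            counts.put(outcome, new LongAdder());
        }
    }

    /** Call once per scope (or per subtask for CANCELLATION) when the outcome is known. */
    void record(Outcome outcome) {
        counts.get(outcome).increment();
    }

    long count(Outcome outcome) {
        return counts.get(outcome).sum();
    }

    /** Timeouts with no observed cancellation suggest something is swallowing the interrupt. */
    boolean interruptLikelySwallowed() {
        return count(Outcome.TIMEOUT) > 0 && count(Outcome.CANCELLATION) == 0;
    }
}
```

The interruptLikelySwallowed check is the dashboard rule from the paragraph above expressed as code; as an alert it would need a time window, which this sketch leaves out.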

Downstream protection still matters

Virtual threads do not remove downstream limits. A constrained dependency has a capacity ceiling regardless of how cheaply you can create threads. Semaphores or bulkheads around constrained downstreams, bounded retries, circuit breakers on unstable endpoints: none of that goes away with structured concurrency.
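A semaphore bulkhead can be sketched in a few lines. This is an illustration under assumptions, not the implementation from the series: the class name Bulkhead, the permit count and the wait limit are all hypothetical, and a resilience library would normally provide this.

```java
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

class Bulkhead {

    private final Semaphore permits;
    private final long maxWaitMs;

    Bulkhead(int maxConcurrentCalls, long maxWaitMs) {
        this.permits = new Semaphore(maxConcurrentCalls);
        this.maxWaitMs = maxWaitMs;
    }

    /** Runs the call only if a permit is free within maxWaitMs; otherwise rejects it. */
    <T> T call(Supplier<T> downstreamCall) {
        boolean acquired;
        try {
            acquired = permits.tryAcquire(maxWaitMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            // Keep the interrupt visible so scope cancellation still lands.
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for a permit", e);
        }
        if (!acquired) {
            throw new RejectedExecutionException("bulkhead full");
        }
        try {
            return downstreamCall.get();
        } finally {
            permits.release();
        }
    }
}
```

Each subtask would wrap its downstream call, for example `bulkhead.call(() -> client.fetch(id))` with a hypothetical client. Rejecting after a bounded wait keeps a saturated dependency from absorbing the whole timeout budget.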

Keep timeout budgets explicit and path-specific. One global timeout for all fan-out paths will either be too tight for the slow paths or too loose for the fast ones.
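As a sketch of what explicit and path-specific can look like, the budgets can live in one table and be turned into the deadline passed to joinUntil. The path names and durations below are invented for illustration, as is the TimeoutBudgets class.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Map;

class TimeoutBudgets {

    // Hypothetical fan-out paths and budgets; real values come from measurement.
    private static final Map<String, Duration> BUDGETS = Map.of(
            "profile", Duration.ofMillis(150),
            "recommendations", Duration.ofMillis(400));

    /** Fails loudly for an unknown path instead of falling back to a global default. */
    static Duration budgetFor(String path) {
        Duration budget = BUDGETS.get(path);
        if (budget == null) {
            throw new IllegalArgumentException("no timeout budget configured for path: " + path);
        }
        return budget;
    }

    /** Deadline for a scope on this path, measured from when the request started. */
    static Instant deadlineFor(String path, Instant requestStart) {
        return requestStart.plus(budgetFor(path));
    }
}
```

Throwing on a missing entry is a deliberate choice in this sketch: a silent global fallback is exactly the one-size-fits-all timeout the paragraph above argues against.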

Preview risk in practice

Structured concurrency is still preview in Java 21 and still preview in Java 25. The API has evolved between versions. Practically:

  • Isolate preview usage behind service-layer boundaries so a future API change does not propagate everywhere
  • Keep the previous orchestration implementation until you have validated the structured concurrency path under degraded dependencies
  • Run load tests with slow and failing downstreams before any rollout; gradual traffic shifting in early Java 21 testing surfaced interruption edge cases that normal load tests missed
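The first two points can be sketched as a small interface with the previous orchestration kept behind it. FanOut and ExecutorFanOut are hypothetical names, and the executor-based version below is only a stand-in for whatever the earlier implementation was; the StructuredTaskScope version would be a second class implementing the same interface, which is the only place --enable-preview code would need to live.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

/** Service-layer boundary: callers never see which orchestration runs underneath. */
interface FanOut {
    List<String> fetchAll(List<Supplier<String>> calls, long timeoutMs);
}

/** Fallback path built only on stable APIs. */
class ExecutorFanOut implements FanOut {

    @Override
    public List<String> fetchAll(List<Supplier<String>> calls, long timeoutMs) {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, calls.size()));
        try {
            List<Callable<String>> tasks = new ArrayList<>();
            for (Supplier<String> call : calls) {
                tasks.add(call::get);
            }
            // Tasks still running at the timeout are cancelled by invokeAll.
            List<Future<String>> futures = pool.invokeAll(tasks, timeoutMs, TimeUnit.MILLISECONDS);
            List<String> results = new ArrayList<>();
            for (Future<String> future : futures) {
                results.add(future.get()); // CancellationException if the task timed out
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("fan-out interrupted", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("downstream call failed", e.getCause());
        } finally {
            pool.shutdownNow();
        }
    }
}
```

Choosing between the two implementations with a configuration flag would also give the rollback path something concrete to validate.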

What to check before trusting this in production

  • --enable-preview in both compile and runtime paths
  • Scope-level outcome metrics wired to dashboards before any traffic
  • Cancellation count non-zero under timeout conditions
  • Subtask interrupt handling validated explicitly, not assumed
  • Load tests include degraded dependency scenarios
  • Rollback path to previous orchestration implementation validated

Build and Runtime Reminder

bash
javac --release 21 --enable-preview ...
java --enable-preview ...

Series Wrap-Up

Across Parts 1-8, the consistent theme is not "more threads"; it is better lifecycle control for concurrent work.

In Java 21 preview, structured concurrency can already reduce orchestration complexity when used with clear policies for timeout, cancellation, and degradation.

Part 9 provides the migration appendix for teams moving this code to Java 25 preview APIs.

Resources

  • JEP 453: Structured Concurrency (Preview)
  • JEP 444: Virtual Threads
  • Java 21 API: StructuredTaskScope (Preview)