engnotes.dev© 2026 Jagdish Salgotra · written on personal time. not on employer time.
Project Loom · Part 4 of 9

Structured Concurrency in Practice

Long CompletableFuture chains look fine in code review and painful in incident calls because a failed call leaves its siblings still running, and a structured scope is what cancels them on the way out.

Jagdish Salgotra
2025-07-27 · 25 min read · ~1,200 words

Code repository: project-loom

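Note: This series uses Java 21 as the baseline. The structured concurrency snippets in this part use the preview StructuredTaskScope API and require --enable-preview. They are quoted from the Java 25 build (JEP 505); the Java 21 form of the API (JEP 453) is on the feature/java-21 branch.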

Structured concurrency is not mainly a performance feature. It is an ownership feature.

When a request forks three pieces of work, those pieces of work should have a visible parent. If one branch fails, the sibling branches should not drift away as orphaned tasks. If the caller only needs the first successful answer, the code should say that locally. If every branch is required, the parent should join the whole scope before it reads results.

That is the useful shift. Structured concurrency makes the lifetime of concurrent work look like the lifetime of ordinary block-scoped code.

This article is learning material. The main branch now builds with OpenJDK 25.0.2 and uses the Java 25 preview structured-concurrency API; the Java 21 version is maintained separately on the feature/java-21 branch. The snippets below quote the main-branch code that generated the measurements. Part 9 covers the Java 21 to Java 25 migration.

The example shape

The smallest example is StructuredExample.java. It forks three service calls and joins them inside one scope:

java
try (var scope = StructuredTaskScope.open(StructuredTaskScope.Joiner.awaitAllSuccessfulOrThrow())) {
    Subtask<String> fetch1 = scope.fork(() -> fetchFromService1());
    Subtask<String> fetch2 = scope.fork(() -> fetchFromService2());
    Subtask<String> fetch3 = scope.fork(() -> fetchFromService3());

    scope.join();

    String result = fetch1.get() + fetch2.get() + fetch3.get();
    logger.info("Combined result: {}", result);
}

The service delays are fixed:

java
static String fetchFromService1() throws Exception {
    Thread.sleep(300);
    return "Service1 ";
}

static String fetchFromService2() throws Exception {
    Thread.sleep(200);
    return "Service2 ";
}

static String fetchFromService3() throws Exception {
    Thread.sleep(100);
    return "Service3 ";
}

When I ran it, the services completed in order of increasing delay, and the combined result printed after the 300ms branch finished:

text
Service3 completed on thread:
Service2 completed on thread:
Service1 completed on thread:
Combined result: Service1 Service2 Service3

The blank thread name is the same virtual-thread naming detail from Parts 2 and 3. The logging pattern includes Thread.currentThread().getName(), but a virtual thread has an empty name unless one is set explicitly, and these virtual threads were not created with explicit names.

The parent owns the children

The important line is not scope.fork(...) by itself. The important shape is fork, join, then read:

java
var task1 = scope.fork(() -> simulateService("Service-A", 200));
var task2 = scope.fork(() -> simulateService("Service-B", 300));
var task3 = scope.fork(() -> simulateService("Service-C", 100));

scope.join();

String result = String.format("Results: %s, %s, %s",
    task1.get(), task2.get(), task3.get());

That pattern is visible in StructuredConcurrencyComparison.java. The same file also has the CompletableFuture version:

java
CompletableFuture<String> cf1 = CompletableFuture.supplyAsync(() -> simulateService("Service-A", 200));
CompletableFuture<String> cf2 = CompletableFuture.supplyAsync(() -> simulateService("Service-B", 300));
CompletableFuture<String> cf3 = CompletableFuture.supplyAsync(() -> simulateService("Service-C", 100));

CompletableFuture.allOf(cf1, cf2, cf3).join();

String result = String.format("Results: %s, %s, %s",
    cf1.get(), cf2.get(), cf3.get());

Both versions produce the right happy-path result. The difference is where the ownership lives. In the structured version, the task lifetime is tied to the scope block. In the CompletableFuture version, the caller has to keep hold of the futures and remember the cancellation and cleanup policy separately.
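To make "remember the cancellation and cleanup policy separately" concrete, here is a minimal sketch of the bookkeeping a caller takes on without a scope. It uses plain ExecutorService and Future rather than the repository's CompletableFuture code, and the names ManualOwnership, fetchAll, and slowCall are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration, not code from the repository.
final class ManualOwnership {

    // Runs every call concurrently and returns the results in submission order.
    // Nothing in the language ties the tasks' lifetime to this method, so the
    // cleanup policy has to be written out by hand in the finally block.
    static List<String> fetchAll(List<Callable<String>> calls)
            throws InterruptedException, ExecutionException {
        ExecutorService executor = Executors.newFixedThreadPool(Math.max(1, calls.size()));
        List<Future<String>> futures = new ArrayList<>();
        try {
            for (Callable<String> call : calls) {
                futures.add(executor.submit(call));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> future : futures) {
                results.add(future.get()); // a task failure surfaces as ExecutionException
            }
            return results;
        } finally {
            // Manual policy: interrupt whatever is still running, then release the threads.
            for (Future<String> future : futures) {
                future.cancel(true); // no effect on tasks that already finished
            }
            executor.shutdown();
        }
    }

    // Simulated service call that sleeps and then returns its own name.
    static Callable<String> slowCall(String name, long delayMs) {
        return () -> {
            Thread.sleep(delayMs);
            return name;
        };
    }
}
```

Even this version only handles cleanup. Because it reads results in submission order, a failure in a later task is not seen until the earlier get() calls return, which is the same late-failure shape the tester measures later in this article.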

What I ran

The measurements below were generated with OpenJDK 25.0.2 and Maven 3.9.12:

bash
mvn clean compile -DskipTests
mvn dependency:build-classpath -Dmdep.outputFile=cp.txt
export CP="$(cat cp.txt):target/classes"

The build succeeded and compiled 35 source files.

Then I ran the standalone examples:

bash
java --enable-preview -cp "$CP" app.js.concurrent.StructuredExample
java --enable-preview -cp "$CP" app.js.concurrent.StructuredExampleWithSuccess
java --enable-preview -cp "$CP" app.js.concurrent.StructuredExampleWithErrors
java --enable-preview -cp "$CP" app.js.concurrent.StructuredConcurrencyComparison
java --enable-preview -cp "$CP" app.js.concurrent.StructuredConcurrencyTester

The results below come from those commands.

Happy-path timing is not the headline

The comparison class runs three simulated service calls with 200ms, 300ms, and 100ms delays. Both the structured version and the CompletableFuture version should complete near 300ms because the slowest required branch sleeps for 300ms.

That is what happened:

text
StructuredTaskScope took: 310ms
CompletableFuture took: 313ms

The longer tester reported the same shape:

text
Expected time: ~300ms
StructuredTaskScope: 310ms (overhead: 10ms)
CompletableFuture: 306ms (overhead: 6ms)
StructuredTaskScope timing: ACCURATE
CompletableFuture timing: ACCURATE

That result is important because it prevents the wrong article from being written. This run does not support a claim that structured concurrency is faster on the happy path. Both approaches waited for the slowest required branch, and both landed near the expected duration.
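The expectation behind those numbers, that a fan-out takes about as long as its slowest branch and not the sum of all branches, can be checked without preview APIs. This is a rough sketch using ExecutorService.invokeAll; FanOutTiming and elapsedMillis are hypothetical names, and this is not the repository's tester.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical timing check, not the repository's tester.
final class FanOutTiming {

    // Runs one sleeping task per delay, all concurrently, and returns the
    // wall-clock time in milliseconds for the whole group to finish.
    static long elapsedMillis(long... delaysMs) throws Exception {
        // One thread per task so no task waits for a free worker.
        ExecutorService executor = Executors.newFixedThreadPool(Math.max(1, delaysMs.length));
        try {
            List<Callable<Long>> tasks = new ArrayList<>();
            for (long delay : delaysMs) {
                tasks.add(() -> {
                    Thread.sleep(delay);
                    return delay;
                });
            }
            long start = System.nanoTime();
            // invokeAll returns once every task has completed.
            List<Future<Long>> done = executor.invokeAll(tasks);
            for (Future<Long> future : done) {
                future.get(); // surface any task failure
            }
            return (System.nanoTime() - start) / 1_000_000;
        } finally {
            executor.shutdown();
        }
    }
}
```

Calling FanOutTiming.elapsedMillis(200, 300, 100) should land a little above 300ms and well below the 600ms sum. That is the same bound both measured versions respected.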

The better claim is smaller: structured concurrency makes the ownership and failure policy easier to see in the code.

First success is a different policy

StructuredExampleWithSuccess.java shows two policies in one class. The first part waits for all services. The second part returns the first successful result:

java
try (var scope = StructuredTaskScope.open(
        StructuredTaskScope.Joiner.<String>allUntil(s -> s.state() == Subtask.State.SUCCESS)
)) {
    scope.fork(() -> slowService("Service-A", 1000));
    scope.fork(() -> slowService("Service-B", 500));
    scope.fork(() -> slowService("Service-C", 200));

    Stream<Subtask<String>> results = scope.join();

    String result = results
        .filter(s -> s.state() == Subtask.State.SUCCESS)
        .findFirst()
        .map(Subtask::get)
        .orElseThrow(() -> new Exception("No successful result"));
    logger.info("First successful result: {}", result);
}

The run produced:

text
=== ShutdownOnFailure Example ===
Service3 completed on thread:
Service2 completed on thread:
Service1 completed on thread:
All services completed: Service1 Service2 Service3

=== ShutdownOnSuccess Example ===
Service-C completed on thread:
First successful result: Service-C result

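The headers in that output use the Java 21 class names ShutdownOnFailure and ShutdownOnSuccess; in the Java 25 API the same two policies are expressed through the Joiner passed to open(...).

The comparison class measured the same first-success shape: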

text
StructuredTaskScope first result: Fast-Service-OK (took 207ms)
CompletableFuture first result: Fast-Service-OK (took 201ms)

Again, the headline is not raw speed. The headline is locality. The structured version says "join until one subtask succeeds" at the scope boundary. A reader does not have to reconstruct the policy from anyOf(...) plus separate cancellation calls.
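For one more point of comparison outside CompletableFuture, the older ExecutorService.invokeAny also states a first-success policy in a single call: it returns the result of a task that completed successfully and cancels the tasks that have not completed. The sketch below assumes that API only; FirstSuccess, firstResult, and this slowService helper are hypothetical stand-ins, not the repository's code.

```java
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical illustration, not code from the repository.
final class FirstSuccess {

    // Returns the result of a call that completed successfully.
    // Throws ExecutionException if every call fails.
    static String firstResult(List<Callable<String>> calls) throws Exception {
        ExecutorService executor = Executors.newFixedThreadPool(Math.max(1, calls.size()));
        try {
            // invokeAny cancels the tasks that have not completed when it returns.
            return executor.invokeAny(calls);
        } finally {
            executor.shutdownNow(); // the pool itself still has to be released by hand
        }
    }

    // Simulated service call that sleeps and then returns "<name> result".
    static Callable<String> slowService(String name, long delayMs) {
        return () -> {
            Thread.sleep(delayMs);
            return name + " result";
        };
    }
}
```

The policy is local here too, but the executor's lifetime is still a separate concern that the caller manages in the finally block. In the structured version the scope block covers both.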

Failure behavior is where the shape pays off

StructuredExampleWithErrors.java forks three calls. One branch fails:

java
try (var scope = StructuredTaskScope.open(StructuredTaskScope.Joiner.awaitAllSuccessfulOrThrow())) {
    Subtask<String> fetch1 = scope.fork(() -> fetchFromService("Service-1", 300, false));
    Subtask<String> fetch2 = scope.fork(() -> fetchFromService("Service-2", 200, true));
    Subtask<String> fetch3 = scope.fork(() -> fetchFromService("Service-3", 100, false));

    scope.join();

    String result = fetch1.get() + fetch2.get() + fetch3.get();
    logger.info("Combined result: {}", result);
} catch (Exception e) {
    logger.error("One or more services failed: {}", e.getMessage());
    logger.error("All remaining tasks were cancelled automatically");
}

The run showed the 100ms branch completing and then the failure being reported when the failing 200ms branch threw:

text
Service-3 completed successfully on thread:
One or more services failed: java.lang.RuntimeException: Service-2 failed!
All remaining tasks were cancelled automatically

Notice what is missing: there is no Service-1 completed successfully line. That branch had a 300ms delay, and the failure happened earlier. The scope did not need to keep waiting for it.

The dedicated tester made the timing difference clearer:

text
Testing StructuredTaskScope error handling...
StructuredTaskScope caught error in: 57ms
Error message: java.lang.RuntimeException: Bad-Task failed!

Testing CompletableFuture error handling...
CompletableFuture caught error in: 204ms
Error: Bad-Task failed!

In that tester, the bad task fails after 50ms. The structured path reported the error in 57ms. The CompletableFuture.allOf(...) path reported it in 204ms because the all-of join waited for the other branches to finish before surfacing the failure.

That is the strongest practical result in this article. The join policy decides how soon a failure is allowed to end the group.
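The late report from allOf(...) follows from its contract: the combined future completes only when all of its inputs have completed. A CompletableFuture caller who wants the earlier report has to build it. The sketch below is one way to do that, under the assumption that reporting the first failure is enough; FailFast, allOrFirstFailure, and millisToObserveFailure are hypothetical names, and this is not how the repository's tester is written.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical sketch, not the repository's tester.
final class FailFast {

    // Completes normally when every future succeeds, or exceptionally as soon as any one fails.
    static CompletableFuture<Void> allOrFirstFailure(List<CompletableFuture<String>> futures) {
        CompletableFuture<Void> result = new CompletableFuture<>();
        for (CompletableFuture<String> future : futures) {
            future.whenComplete((value, error) -> {
                if (error != null) {
                    result.completeExceptionally(error); // first failure wins, later ones are ignored
                }
            });
        }
        CompletableFuture.allOf(futures.toArray(new CompletableFuture<?>[0]))
            .whenComplete((ignored, error) -> {
                if (error == null) {
                    result.complete(null);
                }
            });
        return result;
    }

    // Starts one failing task and one slow task, then measures how long join() takes to surface the failure.
    static long millisToObserveFailure(boolean failFast, long failAfterMs, long slowMs) {
        ExecutorService executor = Executors.newFixedThreadPool(2);
        try {
            long start = System.nanoTime();
            CompletableFuture<String> bad = CompletableFuture.supplyAsync(() -> {
                sleep(failAfterMs);
                throw new IllegalStateException("Bad-Task failed!");
            }, executor);
            CompletableFuture<String> slow = CompletableFuture.supplyAsync(() -> {
                sleep(slowMs);
                return "Slow-Task-OK";
            }, executor);
            CompletableFuture<Void> combined = failFast
                ? allOrFirstFailure(List.of(bad, slow))
                : CompletableFuture.allOf(bad, slow);
            try {
                combined.join();
            } catch (CompletionException expected) {
                // the failure from the bad task
            }
            return (System.nanoTime() - start) / 1_000_000;
        } finally {
            executor.shutdownNow(); // release the pool; the slow task may still be sleeping
        }
    }

    private static void sleep(long ms) {
        try {
            Thread.sleep(ms);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted", e);
        }
    }
}
```

Note what the wrapper does not do: it reports the failure early, but the slow sibling keeps running until something else stops it. Reporting and cancellation remain two separate pieces of policy on the CompletableFuture side.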

Cancellation is testable

The same tester includes a cancellation case. Two long tasks start, then a third task fails after 100ms:

java
try (var scope = StructuredTaskScope.open(StructuredTaskScope.Joiner.awaitAllSuccessfulOrThrow())) {
    scope.fork(() -> cancellableTask("Task-1", 1000, structuredCancelled));
    scope.fork(() -> cancellableTask("Task-2", 2000, structuredCancelled));
    scope.fork(() -> failingTask("Failing-Task", 100));

    scope.join();
} catch (Exception e) {
    logger.info("StructuredTaskScope cancelled " + structuredCancelled.get() + " tasks");
}

The measured output:

text
StructuredTaskScope cancelled 2 tasks
CompletableFuture requires manual cancellation
StructuredTaskScope auto-cancelled: 2 tasks
CompletableFuture auto-cancelled: 0 tasks

This is not an abstract cleanup claim. The checked-in task increments a counter only when it receives InterruptedException:

java
private static String cancellableTask(String name, long delayMs, AtomicInteger cancelledCounter) {
    try {
        Thread.sleep(delayMs);
        return name + "-COMPLETED";
    } catch (InterruptedException e) {
        cancelledCounter.incrementAndGet();
        Thread.currentThread().interrupt();
        return name + "-CANCELLED";
    }
}

The structured run incremented the counter twice. The CompletableFuture run did not.
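A related detail is worth knowing when adding manual cancellation to the CompletableFuture side: the Javadoc for CompletableFuture.cancel states that mayInterruptIfRunning has no effect, because interrupts are not used to control processing. So even an explicit cancel(true) would not raise the counter above. The sketch below contrasts that with a Future from ExecutorService.submit, where cancel(true) does interrupt the running task. CancelSemantics and interruptsAfterCancel are hypothetical names.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration, not code from the repository.
final class CancelSemantics {

    // Starts one sleeping task, cancels it with cancel(true), and returns
    // how many times the task observed InterruptedException (0 or 1).
    static int interruptsAfterCancel(boolean useCompletableFuture) throws InterruptedException {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        AtomicInteger interrupted = new AtomicInteger();
        CountDownLatch started = new CountDownLatch(1);
        Runnable task = () -> {
            started.countDown();
            try {
                Thread.sleep(200);
            } catch (InterruptedException e) {
                interrupted.incrementAndGet();
                Thread.currentThread().interrupt();
            }
        };
        Future<?> future = useCompletableFuture
            ? CompletableFuture.runAsync(task, executor)
            : executor.submit(task);
        started.await();     // make sure the task is running before cancelling
        future.cancel(true); // request cancellation with interruption
        executor.shutdown(); // not shutdownNow: that would interrupt the worker by itself
        executor.awaitTermination(5, TimeUnit.SECONDS);
        return interrupted.get();
    }
}
```

In this sketch the CompletableFuture task sleeps its full 200ms after being cancelled, while the submitted Future's task is interrupted. A manual CompletableFuture policy therefore needs its own way to reach the running work, such as cancelling the underlying executor tasks.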

The performance micro-test is not a general benchmark

The current run does not support a structured-concurrency performance claim.

The comparison class reported:

text
StructuredTaskScope: 1336ms
CompletableFuture: 1310ms
Difference: -26ms

The longer tester reported:

text
StructuredTaskScope average: 1.01 ms
CompletableFuture average: 1.01 ms
Difference: 0.00 ms
StructuredTaskScope is 1.0x faster

That is effectively a tie for this tiny 1ms task loop: the noise is larger than any difference worth reporting. It should not be turned into a performance claim either way.
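One crude way to keep a micro-test honest is to repeat it and compare the gap between medians with the run-to-run spread. The helper below is a sketch of that idea only. NoiseCheck is a hypothetical name, the max-minus-min spread is a deliberately simple stand-in for a proper statistical test, and the arrays are assumed to be non-empty.

```java
import java.util.Arrays;

// Hypothetical helper for judging repeated timing samples; not part of the repository.
final class NoiseCheck {

    // Median of a non-empty sample set.
    static double median(double[] samples) {
        double[] sorted = samples.clone();
        Arrays.sort(sorted);
        int mid = sorted.length / 2;
        return sorted.length % 2 == 1
            ? sorted[mid]
            : (sorted[mid - 1] + sorted[mid]) / 2.0;
    }

    // Spread of a non-empty sample set: largest value minus smallest value.
    static double range(double[] samples) {
        double min = samples[0];
        double max = samples[0];
        for (double sample : samples) {
            min = Math.min(min, sample);
            max = Math.max(max, sample);
        }
        return max - min;
    }

    // True when the gap between the two medians is no larger than the
    // bigger of the two spreads, meaning the runs cannot be told apart.
    static boolean withinNoise(double[] a, double[] b) {
        double gap = Math.abs(median(a) - median(b));
        return gap <= Math.max(range(a), range(b));
    }
}
```

A single pair of numbers, like the ones printed above, cannot be fed into this at all, which is part of the point: one run per variant gives no estimate of the spread.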

The memory section is even noisier:

text
StructuredTaskScope: -489 KB
CompletableFuture: -26 KB
Difference: 462 KB

Negative allocation deltas after explicit GC are not article evidence. They are a signal that this local test is too rough for memory claims. Keep the memory output in the testing guide if it helps inspect behavior locally, but do not use it to claim that one approach allocates less.

The architecture in one picture

In text, the diagram shows a parent scope owning several child tasks. The parent starts the children, waits according to the join policy, and exits the scope with cleanup tied to that block.

That is the mental model to keep. Structured concurrency gives concurrent work a visible parent.

How to test this kind of code

Structured-concurrency tests should prove ownership policy. For all-required fan-out, verify that the result includes every required branch and that total time follows the slowest branch. For first-success fan-out, verify which branch wins and that the result returns near that branch's delay. For failure paths, assert both the time to observe failure and whether slower sibling tasks were interrupted. A test that only checks the final exception message can miss the main behavior.

Cancellation deserves its own check. Use tasks that record interruption, as StructuredConcurrencyTester does, so the test can distinguish "the parent returned" from "the siblings actually stopped." For performance, avoid turning 1ms local task loops into broad claims. Use them as smoke checks only, and measure real request paths separately.
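The difference between "the parent returned" and "the siblings actually stopped" can be turned into two separate assertions. The sketch below does that with standard executor APIs so it runs without preview flags; it is not StructuredConcurrencyTester, and SiblingStopCheck, Outcome, and run are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletionService;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical test sketch, not the repository's tester.
final class SiblingStopCheck {

    static final class Outcome {
        final boolean parentSawFailure; // the caller observed the failing task first
        final boolean siblingsStopped;  // every sibling recorded an interrupt

        Outcome(boolean parentSawFailure, boolean siblingsStopped) {
            this.parentSawFailure = parentSawFailure;
            this.siblingsStopped = siblingsStopped;
        }
    }

    static Outcome run(long failAfterMs, long siblingMs, int siblings) throws InterruptedException {
        ExecutorService executor = Executors.newFixedThreadPool(siblings + 1);
        CompletionService<String> completions = new ExecutorCompletionService<>(executor);
        CountDownLatch interrupted = new CountDownLatch(siblings);
        List<Future<String>> futures = new ArrayList<>();
        try {
            for (int i = 0; i < siblings; i++) {
                futures.add(completions.submit(() -> {
                    try {
                        Thread.sleep(siblingMs);
                        return "COMPLETED";
                    } catch (InterruptedException e) {
                        interrupted.countDown(); // record that cancellation reached this task
                        Thread.currentThread().interrupt();
                        return "CANCELLED";
                    }
                }));
            }
            futures.add(completions.submit(() -> {
                Thread.sleep(failAfterMs);
                throw new IllegalStateException("Failing-Task failed!");
            }));

            boolean parentSawFailure = false;
            try {
                completions.take().get(); // first task to finish, in completion order
            } catch (ExecutionException e) {
                parentSawFailure = true;
                for (Future<String> future : futures) {
                    future.cancel(true); // manual stand-in for what a scope does on failure
                }
            }
            // Separate check: did the siblings observe the interrupt within a second?
            boolean siblingsStopped = interrupted.await(1, TimeUnit.SECONDS);
            return new Outcome(parentSawFailure, siblingsStopped);
        } finally {
            executor.shutdownNow();
        }
    }
}
```

A test built on this would assert both fields of the Outcome. Dropping the cancel loop makes parentSawFailure stay true while siblingsStopped turns false, which is exactly the gap a message-only assertion would miss.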

What comes next

Part 4 showed why structured concurrency is about local ownership, not automatic speed. The strongest evidence was the failure and cancellation behavior: a 50ms failing task surfaced in 57ms under the structured scope, and the scope interrupted two slower sibling tasks. The CompletableFuture examples can be made correct, but the policy is spread across futures, joins, and manual cancellation calls.

Part 5 moves from the core scope model into advanced patterns: timeouts, conditional cancellation, progressive result collection, and resource-aware orchestration.


Resources

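  • Try It Yourself: Run the examples with --enable-preview, on Java 25 for the main branch or Java 21 for the feature/java-21 branch
  • Official Documentation: JEP 453: Structured Concurrency (the Java 21 preview); JEP 505 covers the Java 25 preview used on the main branch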
  • Performance Tests: Run repeated tests and compare behavior under failure scenarios