engnotes.dev © 2026 Jagdish Salgotra · written on personal time, not on employer time.
Structured Concurrency · Part 5 of 9

Why downstream capacity is the real ceiling on fan-out

The real ceiling on a fan-out request is downstream capacity, not the thread count. One semaphore per dependency type stops a hot path from draining the pool everyone else shares. This part covers how we set those limits and what changes when the traffic shape shifts.

Jagdish Salgotra
2026-04-19 · 10 min read · ~332 words

Code repository: project-loom
Note: This article uses the Java 21 preview StructuredTaskScope APIs (JEP 453). API changes in later previews are covered in Part 9. Compile and run with --enable-preview.

Why Resource Awareness Matters

Virtual threads and structured scopes improve concurrency ergonomics, but they do not remove hard limits:

  • DB connection pools,
  • HTTP client connection limits,
  • CPU core availability,
  • memory pressure under burst load.

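Without an explicit resource policy, request fan-out can overload dependencies: cheap threads make it easy to issue more concurrent calls than a pool or connection limit can serve.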

Pattern 1: Bulkhead by Dependency Type

Use semaphores (or equivalent admission controls) to match real downstream capacity.
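
The guards this pattern relies on can be sketched roughly as follows. This is a minimal sketch, not the repository's code: `DependencyGuards`, its `Dependency` enum, `BulkheadDemo`, and the limits are hypothetical names chosen for illustration, and the demo uses a plain thread pool so it runs without `--enable-preview`. The idea is one `Semaphore` per dependency type, sized to the real downstream limit and held only for the duration of the call.

```java
import java.util.ArrayList;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical guard registry: one semaphore per dependency type. */
final class DependencyGuards {
    /** Illustrative dependency types (assumed names). */
    enum Dependency { DB, HTTP, CPU }

    private final Map<Dependency, Semaphore> permits = new EnumMap<>(Dependency.class);

    /** Limits should mirror real settings, e.g. the DB limit equals the pool size. */
    DependencyGuards(int dbLimit, int httpLimit, int cpuLimit) {
        permits.put(Dependency.DB, new Semaphore(dbLimit));
        permits.put(Dependency.HTTP, new Semaphore(httpLimit));
        permits.put(Dependency.CPU, new Semaphore(cpuLimit));
    }

    /** Runs the call while holding one permit for the given dependency. */
    <T> T call(Dependency dependency, Callable<T> work) throws Exception {
        Semaphore guard = permits.get(dependency);
        guard.acquire();          // waits while the dependency is at capacity
        try {
            return work.call();
        } finally {
            guard.release();      // always return the permit, even on failure
        }
    }

    int available(Dependency dependency) {
        return permits.get(dependency).availablePermits();
    }
}

final class BulkheadDemo {
    /** Pushes `tasks` concurrent calls through the DB guard and reports the peak in flight. */
    static int peakInFlight(int dbLimit, int tasks) {
        DependencyGuards guards = new DependencyGuards(dbLimit, 1, 1);
        AtomicInteger inFlight = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(tasks);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (int i = 0; i < tasks; i++) {
                Callable<String> guarded = () -> guards.call(DependencyGuards.Dependency.DB, () -> {
                    int now = inFlight.incrementAndGet();
                    peak.accumulateAndGet(now, Math::max);
                    Thread.sleep(10);             // stand-in for a query
                    inFlight.decrementAndGet();
                    return "ok";
                });
                futures.add(pool.submit(guarded));
            }
            for (Future<String> f : futures) {
                f.get();
            }
            return peak.get();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdownNow();
        }
    }
}
```

Inside a scope, the body of each `fork` would wrap its downstream call in `guards.call(...)`. With a limit of 3, `BulkheadDemo.peakInFlight(3, 20)` should never observe more than 3 calls in flight, however many callers there are.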

java
public List<String> executeResourceAware(List<ResourceTask> tasks) throws Exception {
    // Partition by the resource each task contends for, so each group can be limited separately.
    var cpuTasks = tasks.stream().filter(t -> t.getType() == ResourceType.CPU).toList();
    var memoryTasks = tasks.stream().filter(t -> t.getType() == ResourceType.MEMORY).toList();
    var ioTasks = tasks.stream().filter(t -> t.getType() == ResourceType.IO).toList();

    try (var scope = new StructuredTaskScope.ShutdownOnFailure()) {

        // One subtask per resource group; a failure in any group cancels the others.
        var cpuResult = scope.fork(() -> executeResourceGroup(cpuTasks));
        var memoryResult = scope.fork(() -> executeResourceGroup(memoryTasks));
        var ioResult = scope.fork(() -> executeResourceGroup(ioTasks));

        scope.join();           // wait for all three groups
        scope.throwIfFailed();  // rethrow the first failure, if any

        // get() is only valid after join() and throwIfFailed() have succeeded.
        List<String> allResults = new ArrayList<>();
        allResults.addAll(cpuResult.get());
        allResults.addAll(memoryResult.get());
        allResults.addAll(ioResult.get());

        return allResults;
    }
}

Then apply the matching guard inside each group's scope subtasks.

Pattern 2: Scoped Orchestration with Resource Guards

java
private List<String> executeResourceGroup(List<ResourceTask> tasks) throws Exception {
    List<String> results = new ArrayList<>();

    try (var scope = new StructuredTaskScope.ShutdownOnFailure()) {
        List<StructuredTaskScope.Subtask<String>> subtasks = new ArrayList<>();

        for (ResourceTask task : tasks) {
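            subtasks.add(scope.fork(() -> {
                // Simulated work. A guarded subtask would acquire the dependency's
                // permit here and release it in a finally block.
                Thread.sleep(task.getDuration() * 100);
                return task.getName() + " completed";
            }));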
        }

        scope.join();
        scope.throwIfFailed();

        for (var subtask : subtasks) {
            results.add(subtask.get());
        }
    }

    return results;
}

Acquiring the guard inside each forked subtask keeps the orchestration readable while the permit count enforces the dependency's capacity boundary.

Pattern 3: Separate CPU-Bound Phases

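For CPU-heavy transforms, bound the parallelism explicitly, with a fixed-size pool or, as in this example, with batches whose size adapts to how long the previous batch took.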

java
public List<T> executeAdaptive(List<Callable<T>> tasks) throws Exception {
    int batchSize = Math.min(5, tasks.size());
    List<T> allResults = new ArrayList<>();

    int i = 0;
    while (i < tasks.size()) {
        int end = Math.min(i + batchSize, tasks.size());
        List<Callable<T>> batch = tasks.subList(i, end);

        // nanoTime is monotonic, so the measurement survives wall-clock adjustments.
        long startNanos = System.nanoTime();
        List<T> batchResults = executeBatch(batch);
        long durationMs = (System.nanoTime() - startNanos) / 1_000_000;

        allResults.addAll(batchResults);

        // Advance by what was actually processed. Stepping by batchSize after it has
        // been resized would skip tasks (after doubling) or repeat them (after halving).
        i = end;

        // Fast batch: widen up to 10. Slow batch: narrow down to 2.
        if (durationMs < 100) {
            batchSize = Math.min(batchSize * 2, 10);
        } else if (durationMs > 500) {
            batchSize = Math.max(batchSize / 2, 2);
        }
    }

    return allResults;
}

Structured concurrency is most valuable for lifecycle and I/O orchestration; CPU saturation still requires bounded execution design.
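
For the fixed-pool side of this pattern, a minimal sketch might look like the following. `BoundedCpuStage` and its methods are hypothetical names, not part of the article's repository, and sizing the pool to `availableProcessors()` is an assumption that may need tuning on shared or containerized hosts. I/O orchestration hands its CPU-heavy transforms to the stage and waits for the results, so CPU parallelism stays capped no matter how many callers fan out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

/** Hypothetical helper: runs CPU-heavy transforms on a pool capped at a fixed size. */
final class BoundedCpuStage implements AutoCloseable {
    private final ExecutorService cpuPool;

    BoundedCpuStage(int parallelism) {
        // Platform threads on purpose: CPU work gains nothing from more threads than cores.
        this.cpuPool = Executors.newFixedThreadPool(parallelism);
    }

    /** Assumed default: one worker per available processor. */
    static BoundedCpuStage forThisMachine() {
        return new BoundedCpuStage(Runtime.getRuntime().availableProcessors());
    }

    /** Applies the transform to every input; at most `parallelism` run at once. */
    <I, O> List<O> transformAll(List<I> inputs, Function<I, O> transform) {
        List<Future<O>> pending = new ArrayList<>();
        for (I input : inputs) {
            Callable<O> job = () -> transform.apply(input);
            pending.add(cpuPool.submit(job));
        }
        List<O> results = new ArrayList<>();
        try {
            for (Future<O> f : pending) {
                results.add(f.get());          // collected in input order
            }
        } catch (Exception e) {
            if (e instanceof InterruptedException) {
                Thread.currentThread().interrupt();
            }
            for (Future<O> f : pending) {
                f.cancel(true);                // stop work nobody will read
            }
            throw new IllegalStateException("CPU stage failed", e);
        }
        return results;
    }

    @Override
    public void close() {
        cpuPool.shutdown();
    }
}
```

A caller blocked in `transformAll` waits on futures rather than burning a core, which is the property that matters when the callers are virtual-thread subtasks.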

Pattern 4: Budget-Aware Admission

Reject or degrade requests early when resource pressure is high.

java
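public String bulkheadPattern() throws Exception {
    try (var criticalScope = new StructuredTaskScope.ShutdownOnFailure();
         var nonCriticalScope = new StructuredTaskScope.ShutdownOnFailure()) {

        var criticalService1 = criticalScope.fork(() -> simulateServiceCall("critical-auth", 100));
        var criticalService2 = criticalScope.fork(() -> simulateServiceCall("critical-payment", 150));

        var nonCriticalService1 = nonCriticalScope.fork(() -> simulateServiceCall("analytics", 200));
        var nonCriticalService2 = nonCriticalScope.fork(() -> simulateServiceCall("logging", 50));

        // Critical failures fail the whole request.
        criticalScope.join();
        criticalScope.throwIfFailed();

        // Non-critical failures only degrade it. join() stays outside the try so an
        // interrupt still propagates instead of being logged and swallowed.
        nonCriticalScope.join();
        try {
            nonCriticalScope.throwIfFailed();
        } catch (java.util.concurrent.ExecutionException e) {
            logger.warn("Non-critical services failed: {}", e.getMessage());
        }

        // Report what actually happened; get() is only valid on a successful subtask.
        var success = StructuredTaskScope.Subtask.State.SUCCESS;
        var analytics = nonCriticalService1.state() == success
            ? nonCriticalService1.get() : "analytics-degraded";
        var logging = nonCriticalService2.state() == success
            ? nonCriticalService2.get() : "logging-degraded";

        return String.format("Bulkhead Pattern: Critical[%s, %s] Non-Critical[%s, %s]",
            criticalService1.get(), criticalService2.get(), analytics, logging);
    }
}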

Early admission control often protects p99 better than deep queueing.
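
Admission itself can be sketched with a bounded wait on the guard. The following is an illustration under assumptions: `AdmissionGate`, `callOrDegrade`, and the wait budget are hypothetical, and returning the fallback on interruption is one possible policy rather than the only reasonable one. A caller waits a short, fixed time for a permit; if none arrives, it returns a degraded answer immediately instead of joining a deep queue.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical admission gate: wait briefly for a permit, otherwise degrade. */
final class AdmissionGate {
    private final Semaphore permits;
    private final long maxWaitMillis;
    private final LongAdder rejected = new LongAdder();

    AdmissionGate(int limit, long maxWaitMillis) {
        this.permits = new Semaphore(limit);
        this.maxWaitMillis = maxWaitMillis;
    }

    /** Runs `work` if a permit arrives within the wait budget; otherwise returns `fallback`. */
    <T> T callOrDegrade(Callable<T> work, T fallback) {
        boolean admitted = false;
        try {
            admitted = permits.tryAcquire(maxWaitMillis, TimeUnit.MILLISECONDS);
            if (!admitted) {
                rejected.increment();        // feeds the rejected/degraded rate metric
                return fallback;
            }
            return work.call();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();   // keep the cancellation signal visible
            return fallback;
        } catch (Exception e) {
            throw new IllegalStateException("admitted call failed", e);
        } finally {
            if (admitted) {
                permits.release();
            }
        }
    }

    long rejectedCount() {
        return rejected.sum();
    }
}
```

The wait budget is the knob that trades queueing for rejection: a budget of zero rejects as soon as the dependency is full, while a budget close to the request deadline behaves like a plain blocking guard.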

Java 21 Preview Limitations

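  1. Virtual threads let incoming concurrency grow faster than downstream capacity can absorb.
  2. Guard values must match the actual pool and driver settings; a permit count above the pool size only moves the queue into the pool.
  3. Acquire guards in a consistent order wherever a task needs more than one, to avoid deadlock.
  4. Call throwIfFailed() after every join() in ShutdownOnFailure scopes; otherwise a failed subtask surfaces later as an IllegalStateException from get().
  5. Semaphore acquisition blocks the virtual thread. This is acceptable for I/O protection, since a virtual thread parked on a semaphore normally releases its carrier, but long permit hold times still starve other callers.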
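
The ordering rule in item 3 can be made mechanical rather than left to convention. A minimal sketch, assuming a hypothetical `OrderedGuards` class whose `Dependency` enum declaration order doubles as the global acquisition order: callers name the guards they need in any order, and the helper always acquires them sorted and releases them in reverse.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.Callable;
import java.util.concurrent.Semaphore;

/** Hypothetical multi-guard helper with a single global acquisition order. */
final class OrderedGuards {
    /** Declaration order is the acquisition order (assumed names). */
    enum Dependency { DB, CACHE, HTTP }

    private final Map<Dependency, Semaphore> permits = new EnumMap<>(Dependency.class);

    OrderedGuards(int permitsEach) {
        for (Dependency d : Dependency.values()) {
            permits.put(d, new Semaphore(permitsEach));
        }
    }

    /** Sorts the requested guards into the global order; enums compare by declaration order. */
    static List<Dependency> lockOrder(Set<Dependency> needed) {
        List<Dependency> ordered = new ArrayList<>(needed);
        Collections.sort(ordered);
        return ordered;
    }

    /** Acquires every needed guard in global order, runs the work, releases in reverse. */
    <T> T callWith(Set<Dependency> needed, Callable<T> work) throws Exception {
        Deque<Semaphore> held = new ArrayDeque<>();
        try {
            for (Dependency d : lockOrder(needed)) {
                Semaphore guard = permits.get(d);
                guard.acquire();
                held.push(guard);              // track only what was really acquired
            }
            return work.call();
        } finally {
            while (!held.isEmpty()) {
                held.pop().release();          // most recently acquired first
            }
        }
    }

    int available(Dependency d) {
        return permits.get(d).availablePermits();
    }
}
```

Because two tasks asking for `{HTTP, DB}` and `{DB, HTTP}` both take `DB` first, neither can hold one guard while waiting on the other in the opposite order.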

Operational Metrics to Track

  • Semaphore contention and wait time.
  • DB pool utilization and wait queue depth.
  • Downstream timeout and error rates.
  • p95/p99 latency during spikes.
  • Rejected/degraded request rate.
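
Semaphore contention and wait time are not exposed by `Semaphore` directly, so they have to be measured around it. One way to do that, sketched with a hypothetical `MeteredGuard` wrapper (the class, counter names, and the choice of `LongAdder` are assumptions; a real service would likely publish these through its metrics library):

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical wrapper that counts contended acquisitions and time spent waiting. */
final class MeteredGuard {
    private final Semaphore permits;
    private final LongAdder attempts = new LongAdder();
    private final LongAdder contended = new LongAdder();
    private final LongAdder waitNanos = new LongAdder();

    MeteredGuard(int limit) {
        this.permits = new Semaphore(limit);
    }

    void acquire() throws InterruptedException {
        attempts.increment();
        if (permits.tryAcquire()) {
            return;                            // fast path: a permit was free
        }
        contended.increment();
        long start = System.nanoTime();
        permits.acquire();                     // slow path: wait for a release
        waitNanos.add(System.nanoTime() - start);
    }

    void release() {
        permits.release();
    }

    long attempts() { return attempts.sum(); }

    long contendedAcquisitions() { return contended.sum(); }

    long totalWaitNanos() { return waitNanos.sum(); }

    /** Estimate of threads currently waiting; getQueueLength() is documented as approximate. */
    int waiting() { return permits.getQueueLength(); }
}
```

The ratio of contended acquisitions to attempts shows how often the guard is the bottleneck, and total wait time divided by contended acquisitions gives the average queueing delay the guard adds.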

Testing Guidance

For resource-aware scheduling, test:

  • High contention on DB permits.
  • Rejection/degraded response paths when queues grow.
  • CPU overload behavior with and without bounded CPU pools.
  • Dependency recovery after temporary saturation.
  • Tail latency behavior during burst traffic.
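
As one concrete shape for the contention and recovery cases above, a test can release a burst of callers at once, make some of them fail, and then check that every permit came back. The sketch below is illustrative only: `SaturationRecoveryCheck`, the failure pattern, and the timings are hypothetical, and it uses a plain thread pool so it runs without preview flags.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

/** Hypothetical check: permits must all return after a burst that includes failures. */
final class SaturationRecoveryCheck {
    /** Returns how many permits are still missing once the burst has drained. */
    static int leakedPermitsAfterBurst(int limit, int callers) {
        Semaphore dbPermits = new Semaphore(limit);
        CountDownLatch startGun = new CountDownLatch(1);
        ExecutorService pool = Executors.newFixedThreadPool(callers);
        try {
            List<Future<?>> pending = new ArrayList<>();
            for (int i = 0; i < callers; i++) {
                final boolean fails = (i % 3 == 0);    // every third caller fails
                Callable<Void> job = () -> {
                    startGun.await();                  // all callers start together
                    dbPermits.acquire();
                    try {
                        Thread.sleep(5);               // stand-in for a query
                        if (fails) {
                            throw new IllegalStateException("simulated timeout");
                        }
                        return null;
                    } finally {
                        dbPermits.release();
                    }
                };
                pending.add(pool.submit(job));
            }
            startGun.countDown();
            for (Future<?> f : pending) {
                try {
                    f.get(10, TimeUnit.SECONDS);
                } catch (ExecutionException expected) {
                    // simulated failures are part of the scenario
                }
            }
            return limit - dbPermits.availablePermits();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdownNow();
        }
    }
}
```

A non-zero result would point at a code path that acquires a permit without releasing it, which in production shows up as capacity that never recovers after a spike.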

Build and Runtime Reminder

bash
javac --release 21 --enable-preview ...
java --enable-preview ...

Resources

  • JEP 453: Structured Concurrency (Preview)
  • JEP 444: Virtual Threads