Structured Concurrency · Part 5 of 9

Why downstream capacity is the real ceiling on fan-out

The real ceiling on a fan-out request is downstream capacity, not the thread count. One semaphore per dependency type stops a hot path from draining the pool everyone else shares. How we set them and what changes when traffic shape shifts.

Jagdish Salgotra

2026-04-19·10 min read·~332 words

← Previous: Two workflow shapes that show up after fork-and-wait · Next →: Composing resilience policies as separable layers

Code repository: project-loom

#structured-concurrency

Written by

Jagdish Salgotra

Distributed systems, cloud-native architecture, and the JVM. mostly shipping, occasionally reading.

Note: This article uses Java 21 preview StructuredTaskScope APIs (JEP 453). API changes in later previews are covered in Part 9. Compile and run with --enable-preview.

Why Resource Awareness Matters

Virtual threads and structured scopes improve concurrency ergonomics, but they do not remove hard limits:

DB connection pools,
HTTP client connection limits,
CPU core availability,
memory pressure under burst load.

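Without an explicit resource policy, a single request's fan-out can issue more concurrent calls than a dependency can serve, and every other request sharing that dependency pays for it.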

Pattern 1: Bulkhead by Dependency Type

Use semaphores (or equivalent admission controls) to match real downstream capacity.

java

public List<String> executeResourceAware(List<ResourceTask> tasks) throws Exception {
    // Partition the work by the resource each task stresses.
    var cpuTasks = tasks.stream().filter(t -> t.getType() == ResourceType.CPU).toList();
    var memoryTasks = tasks.stream().filter(t -> t.getType() == ResourceType.MEMORY).toList();
    var ioTasks = tasks.stream().filter(t -> t.getType() == ResourceType.IO).toList();

    try (var scope = new StructuredTaskScope.ShutdownOnFailure()) {

        // One subtask per resource group; each group runs in its own nested scope.
        var cpuResult = scope.fork(() -> executeResourceGroup(cpuTasks));
        var memoryResult = scope.fork(() -> executeResourceGroup(memoryTasks));
        var ioResult = scope.fork(() -> executeResourceGroup(ioTasks));

        scope.join();           // wait for all three groups
        scope.throwIfFailed();  // propagate the first failure, if any

        // get() is safe here: every subtask completed successfully.
        List<String> allResults = new ArrayList<>();
        allResults.addAll(cpuResult.get());
        allResults.addAll(memoryResult.get());
        allResults.addAll(ioResult.get());

        return allResults;
    }
}

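The method above only partitions work by resource type. The guards themselves (semaphores sized to each dependency's real capacity) are acquired inside the scope subtasks.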
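The listing above groups tasks but does not show the guards. A minimal sketch of one semaphore per dependency type could look like the following. The class name `DependencyBulkheads`, the `Dependency` enum, and the constructor parameters are assumptions made for illustration; they are not taken from the article's repository.

```java
import java.util.EnumMap;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.Semaphore;

// Hypothetical helper: one semaphore per dependency type, sized to downstream capacity.
final class DependencyBulkheads {

    // Illustrative dependency types (an assumption, not the article's ResourceType).
    enum Dependency { DATABASE, HTTP_CLIENT }

    private final Map<Dependency, Semaphore> permits = new EnumMap<>(Dependency.class);

    DependencyBulkheads(int dbPoolSize, int httpMaxConnections) {
        // Permit counts should mirror the real pool and client limits.
        permits.put(Dependency.DATABASE, new Semaphore(dbPoolSize));
        permits.put(Dependency.HTTP_CLIENT, new Semaphore(httpMaxConnections));
    }

    // Runs the call while holding one permit for the given dependency.
    <T> T call(Dependency dependency, Callable<T> work) throws Exception {
        Semaphore guard = permits.get(dependency);
        guard.acquire();          // blocks until a permit is free; interruptible
        try {
            return work.call();
        } finally {
            guard.release();      // always return the permit, even on failure
        }
    }

    int available(Dependency dependency) {
        return permits.get(dependency).availablePermits();
    }
}
```

Because each dependency has its own semaphore, a hot path that saturates the database permits cannot consume the HTTP client's capacity.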

Pattern 2: Scoped Orchestration with Resource Guards

java

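private List<String> executeResourceGroup(List<ResourceTask> tasks) throws Exception {
    List<String> results = new ArrayList<>();

    try (var scope = new StructuredTaskScope.ShutdownOnFailure()) {
        List<StructuredTaskScope.Subtask<String>> subtasks = new ArrayList<>();

        // Fork one subtask per task in this resource group.
        for (ResourceTask task : tasks) {
            subtasks.add(scope.fork(() -> {
                // Simulated work. A real subtask would acquire the guard for
                // its dependency here and release it in a finally block.
                Thread.sleep(task.getDuration() * 100);
                return task.getName() + " completed";
            }));
        }

        scope.join();           // wait for the whole group
        scope.throwIfFailed();  // fail the group if any subtask failed

        // Collect results in fork order.
        for (var subtask : subtasks) {
            results.add(subtask.get());
        }
    }

    return results;
}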

This keeps orchestration readable. Capacity boundaries are enforced once each subtask body acquires its dependency's guard before doing real work.
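One way to put a guard inside a subtask without cluttering the fork loop is to wrap the task before forking it. The sketch below assumes a hypothetical `Guards.guarded` helper; it uses only `java.util.concurrent` types, so it does not depend on the preview API itself.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.Semaphore;

// Hypothetical helper for wrapping subtask bodies with a resource guard.
final class Guards {
    private Guards() {}

    // Returns a task that holds one permit only while the wrapped body runs.
    static <T> Callable<T> guarded(Semaphore permits, Callable<T> task) {
        return () -> {
            // acquire() is interruptible, so a scope shutdown can cancel
            // a subtask that is still waiting for a permit.
            permits.acquire();
            try {
                return task.call();
            } finally {
                permits.release();
            }
        };
    }
}
```

Inside the loop above, the fork would then read roughly `scope.fork(Guards.guarded(ioPermits, () -> ...))`, where `ioPermits` is an assumed semaphore sized to the I/O dependency.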

Pattern 3: Separate CPU-Bound Phases

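For CPU-heavy transforms, bound the parallelism explicitly, for example with a fixed-size pool or, as here, with batches whose size adapts to observed latency.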

java

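// T is assumed to be a type parameter of the enclosing class.
public List<T> executeAdaptive(List<Callable<T>> tasks) throws Exception {
    int batchSize = Math.min(5, tasks.size());
    List<T> allResults = new ArrayList<>();

    // Advance by the size of the batch that actually ran. Stepping by the
    // freshly adjusted batchSize would skip or repeat tasks.
    int i = 0;
    while (i < tasks.size()) {
        int end = Math.min(i + batchSize, tasks.size());
        List<Callable<T>> batch = tasks.subList(i, end);

        // nanoTime is monotonic, so it is safe for measuring elapsed time.
        long startTime = System.nanoTime();
        List<T> batchResults = executeBatch(batch);
        long durationMs = (System.nanoTime() - startTime) / 1_000_000;

        allResults.addAll(batchResults);
        i = end;

        // Grow the batch when it finishes quickly, shrink it when it runs slowly.
        if (durationMs < 100) {
            batchSize = Math.min(batchSize * 2, 10);
        } else if (durationMs > 500) {
            batchSize = Math.max(batchSize / 2, 2);
        }
    }

    return allResults;
}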

Structured concurrency is most valuable for lifecycle and I/O orchestration; CPU saturation still requires bounded execution design.
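For the bounded-pool variant mentioned above, a minimal sketch is a fixed-size platform-thread pool that the CPU phase runs on. `CpuPhase` and `runBounded` are hypothetical names. Creating the pool per call keeps the example self-contained; a service would more likely share one pool sized near the core count.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical helper: runs a CPU-bound phase on a pool with a hard thread cap.
final class CpuPhase {
    private CpuPhase() {}

    static <T> List<T> runBounded(List<Callable<T>> tasks, int parallelism) throws Exception {
        // At most `parallelism` tasks execute at once; the rest wait in the pool's queue.
        ExecutorService cpuPool = Executors.newFixedThreadPool(parallelism);
        try {
            List<T> results = new ArrayList<>();
            // invokeAll blocks until every task is done and returns futures in input order.
            for (Future<T> future : cpuPool.invokeAll(tasks)) {
                results.add(future.get()); // a task failure surfaces as ExecutionException
            }
            return results;
        } finally {
            cpuPool.shutdown();
        }
    }
}
```

A typical call would pass `Runtime.getRuntime().availableProcessors()` as the parallelism, so the CPU phase cannot oversubscribe the cores no matter how many virtual threads are waiting on I/O elsewhere.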

Pattern 4: Budget-Aware Admission

Reject or degrade requests early when resource pressure is high.

java

public String bulkheadPattern() throws Exception {
    // Resources close in reverse order, so the non-critical scope closes first.
    try (var criticalScope = new StructuredTaskScope.ShutdownOnFailure();
         var nonCriticalScope = new StructuredTaskScope.ShutdownOnFailure()) {

        var criticalService1 = criticalScope.fork(() -> simulateServiceCall("critical-auth", 100));
        var criticalService2 = criticalScope.fork(() -> simulateServiceCall("critical-payment", 150));

        var nonCriticalService1 = nonCriticalScope.fork(() -> simulateServiceCall("analytics", 200));
        var nonCriticalService2 = nonCriticalScope.fork(() -> simulateServiceCall("logging", 50));

        // Critical calls must succeed; any failure propagates to the caller.
        criticalScope.join();
        criticalScope.throwIfFailed();

        // Non-critical calls may fail: report them as degraded instead of
        // failing the request. Interruption is deliberately not swallowed.
        String analytics = "analytics-degraded";
        String logging = "logging-degraded";
        try {
            nonCriticalScope.join();
            nonCriticalScope.throwIfFailed();
            analytics = String.valueOf(nonCriticalService1.get());
            logging = String.valueOf(nonCriticalService2.get());
        } catch (ExecutionException e) {
            logger.warn("Non-critical services failed: {}", e.getMessage());
        }

        return String.format("Bulkhead Pattern: Critical[%s, %s] Non-Critical[%s, %s]",
            criticalService1.get(), criticalService2.get(),
            analytics, logging);
    }
}

Early admission control often protects p99 better than deep queueing.
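The reject-or-degrade decision itself can be sketched with a semaphore that is tried rather than awaited. `AdmissionGate`, `callOrDegrade`, and both constructor parameters are assumptions for illustration; the wait budget is whatever the request can afford before a fallback becomes the better answer.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

// Hypothetical admission gate: bounded in-flight work with a bounded wait.
final class AdmissionGate {
    private final Semaphore budget;
    private final long maxWaitMillis;

    AdmissionGate(int maxInFlight, long maxWaitMillis) {
        this.budget = new Semaphore(maxInFlight);
        this.maxWaitMillis = maxWaitMillis;
    }

    // Runs `work` if a permit frees up within the wait budget, otherwise returns `fallback`.
    <T> T callOrDegrade(Callable<T> work, T fallback) throws Exception {
        // A wait of zero or less means "do not wait at all".
        if (!budget.tryAcquire(maxWaitMillis, TimeUnit.MILLISECONDS)) {
            return fallback;      // shed load early instead of queueing deeper
        }
        try {
            return work.call();
        } finally {
            budget.release();
        }
    }
}
```

The design choice is that a caller never waits longer than the configured budget, so queueing delay is capped rather than left to grow with the backlog.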

Java 21 Preview Limitations

Virtual threads can increase incoming concurrency faster than downstream capacity.
Guard values must be aligned with actual pool/driver settings.
Keep guard acquisition order consistent to avoid deadlock patterns.
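Call throwIfFailed() after every join() on a ShutdownOnFailure scope; otherwise a failed subtask only surfaces later, as an IllegalStateException from Subtask.get().
Semaphore acquisition blocks the virtual thread until a permit is free; this is acceptable for I/O protection, but keep permit hold times short because every waiter queues behind the current holders.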

Operational Metrics to Track

Semaphore contention and wait time.
DB pool utilization and wait queue depth.
Downstream timeout and error rates.
p95/p99 latency during spikes.
Rejected/degraded request rate.
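Semaphore contention and wait time can be read off the guard itself. The sketch below wraps a semaphore with two counters; `MeteredGuard` and its method names are hypothetical, and a real service would export these values through its metrics library rather than expose getters.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical guard that records how long callers wait for a permit.
final class MeteredGuard {
    private final Semaphore permits;
    private final LongAdder acquisitions = new LongAdder();
    private final LongAdder totalWaitNanos = new LongAdder();

    MeteredGuard(int capacity) {
        this.permits = new Semaphore(capacity);
    }

    void acquire() throws InterruptedException {
        long start = System.nanoTime();
        permits.acquire();
        totalWaitNanos.add(System.nanoTime() - start);
        acquisitions.increment();
    }

    void release() {
        permits.release();
    }

    // Estimate of threads currently waiting for a permit (contention gauge).
    int queuedThreads() {
        return permits.getQueueLength();
    }

    int availablePermits() {
        return permits.availablePermits();
    }

    long acquisitionCount() {
        return acquisitions.sum();
    }

    // Mean time spent waiting per successful acquisition, in milliseconds.
    double meanWaitMillis() {
        long count = acquisitions.sum();
        return count == 0 ? 0.0 : totalWaitNanos.sum() / 1_000_000.0 / count;
    }
}
```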

Testing Guidance

For resource-aware scheduling, test:

High contention on DB permits.
Rejection/degraded response paths when queues grow.
CPU overload behavior with and without bounded CPU pools.
Dependency recovery after temporary saturation.
Tail latency behavior during burst traffic.
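A contention test for the first item can be as small as driving more callers than permits through a guard and checking the peak. This is a sketch under assumptions: `ContentionProbe` is a hypothetical name, and it uses a fixed platform-thread pool as the load generator, which is enough to exercise the semaphore even though production callers would be virtual threads.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical test helper: measures peak concurrency inside a guarded section.
final class ContentionProbe {
    private ContentionProbe() {}

    static int peakConcurrency(Semaphore permits, int callers, long holdMillis) throws Exception {
        AtomicInteger inFlight = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        CountDownLatch start = new CountDownLatch(1);
        ExecutorService pool = Executors.newFixedThreadPool(callers);
        try {
            List<Future<?>> futures = new ArrayList<>();
            for (int i = 0; i < callers; i++) {
                futures.add(pool.submit(() -> {
                    start.await();             // release all callers at once
                    permits.acquire();
                    try {
                        peak.accumulateAndGet(inFlight.incrementAndGet(), Math::max);
                        Thread.sleep(holdMillis); // simulate holding the dependency
                    } finally {
                        inFlight.decrementAndGet();
                        permits.release();
                    }
                    return null;
                }));
            }
            start.countDown();
            for (Future<?> future : futures) {
                future.get();                  // surface any caller failure
            }
            return peak.get();
        } finally {
            pool.shutdownNow();
        }
    }
}
```

A test would assert that the returned peak never exceeds the permit count and that all permits are back afterwards.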

Build and Runtime Reminder

bash

javac --release 21 --enable-preview ...
java --enable-preview ...
