
Mastering Concurrency In Java - Part 3: Execution Models

Yash Sachdeva
Software Engineer | Turning Complex Problems into Simple Solutions
Mastering Concurrency in Java - This article is part of a series.

In Part 1 and Part 2, we covered the fundamentals and building blocks of concurrency. In this part, we will discuss the execution models of concurrency - classic thread pools, task queues, Future/Callable, CompletableFuture, and, from Java 21+, virtual threads and virtual-thread-per-task executors. Choosing among them is ultimately about latency, throughput, and operational simplicity, not about syntax.

The core questions when choosing among these execution models are -

  • How many concurrent units of work can the service handle before it degrades?
  • What is the dominant cost per unit (CPU, remote I/O, memory)?
  • How much complexity can the team safely own in production?

Traditional platform threads are expensive because each is backed by an OS thread with significant memory and scheduling overhead. Virtual threads in Java 21 are much cheaper: millions of virtual threads can share a small pool of carrier (OS) threads, blocking freely on I/O while the JVM transparently mounts and unmounts them. This shifts the default from “avoid blocking at all costs” to “write simple blocking code unless there is a concrete reason to go async or reactive.”

With the above, let’s discuss the execution models -

1. Thread Pools
#

A thread pool is an ExecutorService that owns a bounded number of worker threads and a task queue. Tasks (typically Runnable or Callable instances) are submitted to the pool and executed by available worker threads -

  • Bounded concurrency - Limits the number of concurrent tasks to protect the CPU and downstream dependencies.
  • Amortized thread creation cost - Reuses threads across tasks instead of paying creation / destruction cost per task.
  • Centralized policy - You can tune pool size, queue capacity, and rejection policy based on workload.
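These knobs can be set explicitly by constructing a ThreadPoolExecutor directly instead of using an Executors factory method. A minimal sketch - the pool sizes, queue capacity, and choice of CallerRunsPolicy here are illustrative, not recommendations:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class TunedPoolExample {
    public static void main(String[] args) throws InterruptedException {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                4, 8,                           // core and maximum pool size
                60, TimeUnit.SECONDS,           // idle timeout for threads above core size
                new ArrayBlockingQueue<>(100),  // bounded queue caps buffered work
                new ThreadPoolExecutor.CallerRunsPolicy()); // overflow runs on the caller

        for (int i = 0; i < 20; i++) {
            final int id = i;
            pool.submit(() -> System.out.println("task " + id));
        }
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
    }
}
```

With CallerRunsPolicy, a full queue slows producers down instead of throwing RejectedExecutionException, which acts as a simple built-in form of backpressure.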

When do you choose Thread Pools
#

  • You need strict control over concurrency to protect CPU or a fragile dependency (eg: database or 3rd party API) from overload.
  • The workload is CPU bound and you want a pool size tuned to your CPU cores (eg: 1-2 threads per core).
  • You are on pre-Java 21 or cannot use virtual threads (legacy runtime, compliance constraints), so each thread is still expensive.
  • You need prioritization or separate pools per traffic class (eg: user vs batch) to prevent starvation.

Example:

ExecutorService ioBoundPool = Executors.newFixedThreadPool(10);

for (int i = 0; i < 100; i++) {
    ioBoundPool.submit(() -> {
        // do I/O-bound work here
    });
}

// orderly shutdown: already-submitted tasks still run to completion
ioBoundPool.shutdown();

2. Task Queues
#

A task queue is typically a BlockingQueue (for example, LinkedBlockingQueue or ArrayBlockingQueue) that decouples producers from consumers inside a process.

When do you choose Task Queues
#

  • You want an internal buffer between request handling and slow work - eg: image processing or heavy db aggregations.
  • You want to smooth out bursts of work without overwhelming downstream consumers.
  • You are modelling the transition to a proper distributed queue later; starting with in-memory queues keeps the programming model similar.
import java.util.concurrent.*;

public class TaskQueueExample {
    private final BlockingQueue<Runnable> queue = new LinkedBlockingQueue<>(100);
    private final ExecutorService workerPool = Executors.newFixedThreadPool(5);

    public TaskQueueExample() {
        for (int i = 0; i < 5; i++) { // one worker loop per pool thread
            workerPool.submit(this::workerLoop);
        }
    }

    private void workerLoop() {
        try {
            while (!Thread.currentThread().isInterrupted()) {
                Runnable task = queue.take();
                task.run();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // queue tasks
    public void populateQueue(String username) {
        try {
            queue.put(() -> doHeavyIO(username));
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    private void doHeavyIO(String username) {
        // simulate heavy I/O
        try {
            Thread.sleep(1000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}

3. Future and Callable
#

Callable represents a task that returns a value and may throw checked exceptions. Submitting a Callable to an ExecutorService returns a Future, which is a handle for the task’s eventual result. Future allows:

  • Blocking waits with get().
  • Timed waits with get(timeout).
  • Cancellation with cancel().

However, Future has notable limitations: no built-in composition operators, no completion callbacks (without extra wrappers), and get() blocks, which makes complex pipelines awkward.
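The timed wait and cancellation above can be combined into a simple timeout guard. A small sketch - the slow Callable just simulates a laggy remote call:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class FutureTimeoutExample {
    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        Future<String> future = pool.submit(() -> {
            Thread.sleep(5_000); // simulate a slow remote call
            return "late-result";
        });
        try {
            // Timed wait: give up after 100 ms instead of blocking indefinitely
            System.out.println(future.get(100, TimeUnit.MILLISECONDS));
        } catch (TimeoutException e) {
            future.cancel(true); // true = also interrupt the running task
            System.out.println("timed out, cancelled=" + future.isCancelled());
        } finally {
            pool.shutdown();
        }
    }
}
```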

When to choose Future / Callable
#

  • Simple parallelism: fan out a small number of CPU-bound or I/O-bound operations, then join on the results.
  • You are on an older version of Java and cannot use CompletableFuture.

private final ExecutorService pool = Executors.newFixedThreadPool(10);

public List<String> searchAllShards(String query) throws InterruptedException {
    List<Callable<String>> tasks = new ArrayList<>();
    for (int shardId = 0; shardId < 3; shardId++) {
        int finalShardId = shardId;
        tasks.add(() -> searchShard(finalShardId, query));
    }

    // invokeAll cancels any task that has not completed within the timeout
    List<Future<String>> futures = pool.invokeAll(tasks, 50, TimeUnit.MILLISECONDS);

    List<String> results = new ArrayList<>();
    for (Future<String> f : futures) {
        try {
            if (!f.isCancelled()) {
                results.add(f.get());
            }
        } catch (ExecutionException e) {
            // log and skip this shard
        }
    }
    return results;
}

private String searchShard(int shardId, String query) {
    // blocking call to shard
    return "results-from-" + shardId + " for query " + query;
}

4. CompletableFuture
#

CompletableFuture (Java 8+) extends Future with the ability to complete manually, register callbacks, and compose stages without blocking. It is the foundation of Java’s fluent async API:

  • supplyAsync / runAsync to start async computations.
  • thenApply, thenCompose, thenAccept for transformations and side effects.
  • allOf, anyOf to combine multiple futures.
  • exceptionally, handle for explicit error handling.

By default, async stages run on ForkJoinPool.commonPool() unless a custom Executor is supplied. Using custom executors is important to avoid saturating the common pool with blocking I/O.
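A minimal sketch of both points - running a stage on a dedicated executor and recovering with exceptionally; the failing fetchRemote helper is a stand-in for a real remote call:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class CfErrorHandlingExample {
    private static String fetchRemote() {
        // stand-in for a remote call that fails
        throw new IllegalStateException("remote call failed");
    }

    public static void main(String[] args) {
        ExecutorService ioPool = Executors.newFixedThreadPool(4);

        String result = CompletableFuture
                .supplyAsync(CfErrorHandlingExample::fetchRemote, ioPool) // runs on ioPool, not commonPool
                .thenApply(String::toUpperCase)   // skipped: upstream completed exceptionally
                .exceptionally(ex -> "fallback")  // recover with a default value
                .join();

        System.out.println(result); // prints "fallback"
        ioPool.shutdown();
    }
}
```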

When do you choose CompletableFuture
#

Even in a world with virtual threads, CompletableFuture is valuable when:

  • You need true non-blocking behavior on a constrained thread pool, often in libraries that cannot assume virtual threads.
  • You are composing distributed I/O-heavy fan-out/fan-in workloads where overlap matters (e.g. hitting 5 microservices at once and timing out partial results).
  • You want to keep a reactive-ish style without adopting a full reactive stack (Project Reactor, RxJava).
  • You are building APIs or SDKs where consumers may or may not use virtual threads; returning CompletionStage keeps you neutral.

CompletableFuture gives resource efficiency and composability but at the cost of cognitive complexity and harder debugging.

import java.util.List;
import java.util.concurrent.*;
import java.util.stream.Collectors;

public class CompletableFutureExample {
    private final ExecutorService ioPool = Executors.newFixedThreadPool(64);

    public CompletableFuture<List<String>> fetchProfile(String userId) {
        CompletableFuture<String> basicFuture =
                CompletableFuture.supplyAsync(() -> fetchBasic(userId), ioPool);

        CompletableFuture<String> postsFuture =
                CompletableFuture.supplyAsync(() -> fetchPosts(userId), ioPool);

        CompletableFuture<String> friendsFuture =
                CompletableFuture.supplyAsync(() -> fetchFriends(userId), ioPool);

        return CompletableFuture
                .allOf(basicFuture, postsFuture, friendsFuture)
                .thenApply(v -> List.of(
                        basicFuture.join(),
                        postsFuture.join(),
                        friendsFuture.join()
                ));
    }

    private String fetchBasic(String userId) {
        // HTTP call to /basic
        return "basic-info";
    }

    private String fetchPosts(String userId) {
        // HTTP call to /posts
        return "posts";
    }

    private String fetchFriends(String userId) {
        // HTTP call to /friends
        return "friends";
    }
}

5. Virtual Threads (Java 21)
#

Virtual threads are lightweight threads managed by the JVM rather than the OS. Many virtual threads are multiplexed onto a smaller number of carrier (platform) threads via a scheduler built on ForkJoinPool.

Virtual threads block like normal threads in user code, but when they hit a blocking I/O operation (Socket.read, FileChannel, JDBC, etc.), they unmount from the carrier thread, allowing it to run other virtual threads.

Because stacks are stored in heap segments and can be parked/unparked cheaply, you can have hundreds of thousands or millions of virtual threads in a single JVM.

The standard factory Executors.newVirtualThreadPerTaskExecutor() creates an executor that spawns a new virtual thread for each submitted task and closes it when done.
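As a rough illustration of the scale this enables, the sketch below launches 100,000 virtual threads that each block for 100 ms - infeasible with platform threads (Java 21+; since Java 19 ExecutorService is AutoCloseable, so try-with-resources waits for the tasks):

```java
import java.time.Duration;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class VirtualThreadScaleExample {
    public static void main(String[] args) {
        // close() at the end of the try block waits for all submitted tasks
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < 100_000; i++) {
                executor.submit(() -> {
                    Thread.sleep(Duration.ofMillis(100)); // blocking is cheap: the carrier is released
                    return null;
                });
            }
        }
        System.out.println("done");
    }
}
```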

When should you choose Virtual-Thread-Per-Task
#

With Java 21+, the default answer for an I/O-heavy service is often “one virtual thread per request,” because:

  • Code stays simple and blocking: no callbacks, no async chains; each request handler reads like synchronous code.
  • You achieve massive concurrency as long as tasks are mostly waiting on I/O, not burning CPU.
  • You can often avoid complex async frameworks (reactive, actor models) and instead rely on cheaper blocking.
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class VirtualThreadPerTaskExample {
    private final ExecutorService virtualExecutor =
            Executors.newVirtualThreadPerTaskExecutor();

    public void handleIncomingRequest(String payload) {
        virtualExecutor.submit(() -> {
            String userId = parse(payload);
            // All of this code can block freely due to virtual threads
            String profile = callProfileService(userId);
            String timeline = callTimelineService(userId);
            writeResponse(profile, timeline);
        });
    }

    private String parse(String payload) {
        return payload;
    }

    private String callProfileService(String userId) {
        // blocking HTTP/JDBC
        return "profile";
    }

    private String callTimelineService(String userId) {
        // blocking HTTP/JDBC
        return "timeline";
    }

    private void writeResponse(String profile, String timeline) {
        // write to socket
    }
}

Under the hood, many such tasks share a much smaller number of carrier threads; when they block on I/O, they are unmounted, and the carrier runs other tasks.

When Async Pipelines Are Worse Than Blocking Code on Virtual Threads
#

Cost Model: Async Graphs vs. Cheap Blocking
#

Before virtual threads, the main motivations for complex async pipelines (CompletableFuture, reactive streams, actor systems) were:

  • OS threads were expensive; you wanted to minimize thread count.
  • Blocking I/O meant tying up a thread, so you used non-blocking APIs and callbacks to multiplex work.

Virtual threads change this cost model by making blocking cheap and letting the JVM multiplex many blocked virtual threads onto a small number of carriers. As a result, several downsides of async pipelines become more significant relative to their benefits -

  • Cognitive complexity: deep chains of thenCompose/thenCombine are harder to reason about than straight-line code.
  • Debugging and observability: stack traces in async code are fragmented across callbacks, making it harder to reconstruct the logical call stack.
  • Error handling: exception flows are non-local, with exceptionally and handle sprinkled through the graph.
  • Context propagation: propagating tracing, security context, or transaction context across async boundaries is fragile.

When blocking is cheap, these costs often outweigh the benefits of async pipelines for typical request–response microservices.

Example: Aggregator Service – Async vs. Virtual Threads
#

Async with CompletableFuture
#
import java.util.concurrent.*;

public class AggregatorWithCompletableFuture {
    private final ExecutorService ioPool = Executors.newFixedThreadPool(64);

    public CompletableFuture<AggregatedResult> aggregate(String userId) {
        CompletableFuture<String> profileFuture =
                CompletableFuture.supplyAsync(() -> callProfile(userId), ioPool);

        CompletableFuture<String> timelineFuture =
                CompletableFuture.supplyAsync(() -> callTimeline(userId), ioPool);

        CompletableFuture<String> notificationsFuture =
                CompletableFuture.supplyAsync(() -> callNotifications(userId), ioPool);

        return CompletableFuture
                .allOf(profileFuture, timelineFuture, notificationsFuture)
                .orTimeout(200, TimeUnit.MILLISECONDS)
                .thenApply(v -> new AggregatedResult(
                        profileFuture.join(),
                        timelineFuture.join(),
                        notificationsFuture.join()
                ))
                .exceptionally(ex -> fallback(userId, ex));
    }

    private String callProfile(String userId) { return "profile"; }

    private String callTimeline(String userId) { return "timeline"; }

    private String callNotifications(String userId) { return "notifications"; }

    record AggregatedResult(String profile, String timeline, String notifications) {}

    private AggregatedResult fallback(String userId, Throwable ex) {
        // log and return degraded result
        return new AggregatedResult("partial-profile", "empty-timeline", "empty-notifications");
    }
}

This is efficient on a limited pool, but the control flow (timeouts, fallback, composition) is non-trivial.

Blocking on Virtual Threads
#
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

public class AggregatorWithVirtualThreads {
    private final ExecutorService virtualExecutor =
            Executors.newVirtualThreadPerTaskExecutor();

    public AggregatedResult aggregate(String userId) {
        // Each aggregation runs on its own virtual thread
        Future<AggregatedResult> future = virtualExecutor.submit(() -> doAggregate(userId));
        try {
            return future.get(200, TimeUnit.MILLISECONDS);
        } catch (Exception e) {
            future.cancel(true); // on timeout, stop the still-running aggregation
            return fallback(userId, e);
        }
    }

    private AggregatedResult doAggregate(String userId) {
        String profile = callProfile(userId);   // blocking
        String timeline = callTimeline(userId); // blocking
        String notifications = callNotifications(userId); // blocking
        return new AggregatedResult(profile, timeline, notifications);
    }

    private String callProfile(String userId) { return "profile"; }

    private String callTimeline(String userId) { return "timeline"; }

    private String callNotifications(String userId) { return "notifications"; }

    record AggregatedResult(String profile, String timeline, String notifications) {}

    private AggregatedResult fallback(String userId, Throwable ex) {
        // log and return degraded result
        return new AggregatedResult("partial-profile", "empty-timeline", "empty-notifications");
    }
}

Control flow is now straightforward, with a single blocking call and standard try/catch handling, while virtual threads preserve high concurrency. One trade-off: doAggregate runs its three remote calls sequentially, so latency is their sum rather than their max.
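To keep the parallel fan-out of the async version while staying in blocking style, each remote call can run on its own virtual thread. A sketch using invokeAll, reusing the same stub methods:

```java
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelVirtualAggregate {
    record AggregatedResult(String profile, String timeline, String notifications) {}

    public static AggregatedResult aggregate(String userId)
            throws InterruptedException, ExecutionException {
        try (ExecutorService vt = Executors.newVirtualThreadPerTaskExecutor()) {
            // One virtual thread per call; invokeAll blocks until all three finish
            List<Callable<String>> tasks = List.of(
                    () -> callProfile(userId),
                    () -> callTimeline(userId),
                    () -> callNotifications(userId));
            List<Future<String>> fs = vt.invokeAll(tasks);
            return new AggregatedResult(fs.get(0).get(), fs.get(1).get(), fs.get(2).get());
        }
    }

    private static String callProfile(String userId) { return "profile"; }
    private static String callTimeline(String userId) { return "timeline"; }
    private static String callNotifications(String userId) { return "notifications"; }

    public static void main(String[] args) throws Exception {
        System.out.println(aggregate("u1"));
    }
}
```

The three calls now overlap in time, matching the latency profile of the CompletableFuture version while keeping straight-line code.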

When Async Pipelines Are Strictly Worse
#

  • I/O-bound request–response services where each request performs a handful of remote calls and modest CPU work. Virtual threads give you sufficient concurrency with far less complexity.
  • Teams without deep async expertise: subtle bugs in async graphs, lost exceptions, and context propagation issues are operationally expensive.
  • Debug-critical systems where clean stack traces and simple profiling are essential.

CompletableFuture-style async graphs still make sense when you:

  • Must run on pre–Java 21 runtimes or cannot enable virtual threads.
  • Are writing libraries that must be neutral about execution model and therefore expose CompletionStage.
  • Need tight integration with non-blocking I/O APIs or reactive clients where virtual threads would still block.
