Chapter 17 • Modern Practices • 14 min read • Updated May 16, 2026

Streams API

Filter, map, reduce — Java's functional pipeline for collections. When streams are clearer than loops, and when they aren't.

Loops, but declarative

Before streams, processing a collection in Java meant explicit loops:

Java

List<String> activeNames = new ArrayList<>();
for (User u : users) {
    if (u.isActive()) {
        activeNames.add(u.getName().toUpperCase());
    }
}
Collections.sort(activeNames);

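With streams (Java 8+; the .toList() call at the end needs Java 16+, so on older versions use .collect(Collectors.toList()) instead):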

Java

List<String> activeNames = users.stream()
    .filter(User::isActive)
    .map(User::getName)
    .map(String::toUpperCase)
    .sorted()
    .toList();

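The stream version describes WHAT you want: active users' names, uppercased, sorted. The loop version describes HOW, step by step. Both produce the same names in the same order; the one difference is that .toList() returns an unmodifiable list, while the loop fills a mutable ArrayList.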

Streams aren't always better. For simple operations on small collections, a plain loop is often clearer. But for multi-step pipelines, especially anything involving filter/transform/group/sort, streams win on readability once you're used to them.

This chapter covers the pipeline mental model, the main operations, and the practical limits.

The pipeline mental model

A stream pipeline has three parts: a source, zero or more intermediate operations (filter, map, sorted, …), and exactly one terminal operation (collect, count, forEach, …). Intermediate ops are lazy — they don't do anything until the terminal op triggers the actual work.

**Source.** A collection, an array, a file, a generator. Anything you can call .stream() on, plus Stream.of(...), Files.lines(...), IntStream.range(...), etc.

**Zero or more intermediate operations.** filter, map, sorted, distinct, limit, skip, flatMap, peek. Each returns a new Stream. They're **lazy** — they don't actually process anything until a terminal operation triggers the work.

**Exactly one terminal operation.** collect, count, forEach, reduce, findFirst, anyMatch, toList() (Java 16+). The terminal op consumes the stream and produces a result.
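Laziness is easiest to see by recording when a predicate actually runs. The following is a minimal sketch; the class name LazyDemo and the sample strings are made up for illustration. The filter notes every element it inspects, and nothing is noted until the terminal operation is called.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

class LazyDemo {
    // Returns three lists: what the filter had seen before the terminal op,
    // what it had seen afterwards, and the pipeline's result.
    static List<List<String>> run() {
        List<String> seen = new ArrayList<>();

        // Building the pipeline does not call the predicate.
        Stream<String> pipeline = Stream.of("a", "bb", "ccc")
            .filter(s -> { seen.add(s); return s.length() > 1; });

        List<String> before = List.copyOf(seen);   // still empty
        List<String> result = pipeline.toList();   // terminal op triggers the work
        List<String> after = List.copyOf(seen);

        return List.of(before, after, result);
    }

    public static void main(String[] args) {
        System.out.println(run());   // [[], [a, bb, ccc], [bb, ccc]]
    }
}
```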

**Streams are single-use.** Once you've called a terminal operation, the stream is exhausted. You can't go back to the start:

Java

Stream<String> s = list.stream();
s.count();         // 5
s.count();         // IllegalStateException — stream already used

If you need the same source twice, call .stream() twice on the original collection.

Common intermediate operations

**filter(Predicate)** — keep elements that match:

Java

users.stream().filter(u -> u.getAge() >= 18)

**map(Function)** — transform each element:

Java

users.stream().map(User::getName)   // Stream<User> → Stream<String>

**flatMap(Function)** — transform each element into a stream, then flatten:

Java

List<List<String>> roles = ...;
roles.stream().flatMap(List::stream)   // Stream<List<String>> → Stream<String>

orders.stream().flatMap(o -> o.getItems().stream())   // all items across all orders

flatMap is the operation people miss most often. Use it whenever you have a stream of collections and want a stream of their elements.
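As a runnable sketch of the orders example, assuming a hypothetical Order record with an items() accessor (the article's own Order type is not shown): map would give a Stream<List<String>>, while flatMap gives one flat Stream<String>.

```java
import java.util.List;

class FlatMapDemo {
    // Hypothetical domain type, for illustration only.
    record Order(String id, List<String> items) {}

    static List<String> allItems(List<Order> orders) {
        return orders.stream()
            .flatMap(o -> o.items().stream())   // Stream<Order> -> Stream<String>
            .toList();
    }

    public static void main(String[] args) {
        List<Order> orders = List.of(
            new Order("o1", List.of("pen", "ink")),
            new Order("o2", List.of()),          // empty lists simply contribute nothing
            new Order("o3", List.of("paper")));
        System.out.println(allItems(orders));    // [pen, ink, paper]
    }
}
```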

**distinct()** — remove duplicates (using .equals()).

**sorted()** / **sorted(Comparator)** — sort. With no argument, requires elements to be Comparable.

**limit(n) / skip(n)** — take first n / skip first n. Often used together for pagination.
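One way to sketch pagination with these two operations is below. The helper name page and its zero-based pageIndex parameter are assumptions made for this example. Note the order: skip first, then limit.

```java
import java.util.List;

class PageDemo {
    // pageIndex is zero-based; both parameter names are illustrative.
    static <T> List<T> page(List<T> source, int pageIndex, int pageSize) {
        return source.stream()
            .skip((long) pageIndex * pageSize)   // drop the earlier pages
            .limit(pageSize)                     // keep at most one page
            .toList();
    }

    public static void main(String[] args) {
        List<Integer> ids = List.of(1, 2, 3, 4, 5, 6, 7);
        System.out.println(page(ids, 1, 3));   // [4, 5, 6]
        System.out.println(page(ids, 2, 3));   // [7]
        System.out.println(page(ids, 3, 3));   // []
    }
}
```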

**peek(Consumer)** — perform a side effect without consuming. Mostly useful for debugging:

Java

users.stream()
    .filter(User::isActive)
    .peek(u -> log.debug("checking {}", u))
    .map(User::getName)
    .toList();

Don't use peek for real side effects in production code — the rules around when it runs are unintuitive.
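A small sketch of why peek is unreliable for real work (PeekDemo is a made-up name): peek only sees the elements the terminal operation actually pulls through the pipeline. With a short-circuiting terminal operation, that can be fewer than the source contains. Separately, the Javadoc for count() allows an implementation to skip the pipeline entirely when the size can be computed from the source, in which case peek would not run at all.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

class PeekDemo {
    // Returns the elements peek observed before anyMatch stopped the stream.
    static List<Integer> peekedBeforeMatch() {
        List<Integer> peeked = new ArrayList<>();
        boolean found = Stream.of(1, 2, 3, 4)
            .peek(peeked::add)
            .anyMatch(n -> n == 2);   // stops pulling elements after the match
        if (!found) {
            throw new IllegalStateException("expected a match");
        }
        return peeked;
    }

    public static void main(String[] args) {
        System.out.println(peekedBeforeMatch());   // [1, 2] (3 and 4 are never peeked)
    }
}
```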

Common terminal operations

**toList()** (Java 16+) — collect into an unmodifiable List:

Java

List<String> names = users.stream().map(User::getName).toList();

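This largely replaces the older .collect(Collectors.toList()) in modern code. The two are not identical: .toList() returns an unmodifiable list, while Collectors.toList() makes no guarantee about the mutability of the list it returns. Switch with care if callers modify the result.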
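The unmodifiable result is the detail most likely to bite when migrating older code. A minimal sketch, with a hypothetical helper rejectsAdd; Collectors.toCollection(ArrayList::new) is used here as one way to get a list that is guaranteed mutable.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class ToListDemo {
    // True if the list refuses modification. Note: adds "x" to a mutable list.
    static boolean rejectsAdd(List<String> list) {
        try {
            list.add("x");
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<String> viaToList = Stream.of("a", "b").toList();
        List<String> viaArrayList = Stream.of("a", "b")
            .collect(Collectors.toCollection(ArrayList::new));

        System.out.println(rejectsAdd(viaToList));      // true
        System.out.println(rejectsAdd(viaArrayList));   // false
    }
}
```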

**collect(Collector)** — flexible collection into various shapes:

Java

// Set
Set<String> nameSet = users.stream().map(User::getName).collect(Collectors.toSet());

// Map by key (throws IllegalStateException if two users share an id)
Map<Long, User> byId = users.stream().collect(Collectors.toMap(User::getId, u -> u));

// Group by attribute
Map<String, List<User>> byRole = users.stream()
    .collect(Collectors.groupingBy(User::getRole));

// Count by attribute
Map<String, Long> countByRole = users.stream()
    .collect(Collectors.groupingBy(User::getRole, Collectors.counting()));

// Join into a single string
String joined = users.stream().map(User::getName)
    .collect(Collectors.joining(", "));
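One thing to watch with toMap: the two-argument form throws IllegalStateException when two elements map to the same key. The sketch below, using a hypothetical User record, shows the three-argument form with a merge function that decides which value wins.

```java
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

class ToMapDemo {
    // Hypothetical domain type, for illustration only.
    record User(long id, String name, String role) {}

    // Keeps the first user seen for each role.
    static Map<String, User> firstUserPerRole(List<User> users) {
        return users.stream().collect(Collectors.toMap(
            User::role,
            Function.identity(),
            (first, second) -> first));   // merge function for duplicate keys
    }

    // True if the two-argument toMap rejects this input because of a duplicate key.
    static boolean rejectsDuplicateKeys(List<User> users) {
        try {
            users.stream().collect(Collectors.toMap(User::role, Function.identity()));
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<User> users = List.of(
            new User(1, "Ann", "ADMIN"),
            new User(2, "Bob", "DEV"),
            new User(3, "Cy", "DEV"));
        System.out.println(firstUserPerRole(users).get("DEV").name());   // Bob
        System.out.println(rejectsDuplicateKeys(users));                  // true
    }
}
```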

**count()** — number of elements:

Java

long activeCount = users.stream().filter(User::isActive).count();

**anyMatch / allMatch / noneMatch** — short-circuit boolean checks:

Java

boolean hasAdmin = users.stream().anyMatch(u -> u.hasRole("ADMIN"));
boolean allActive = users.stream().allMatch(User::isActive);

All three short-circuit: anyMatch and noneMatch stop at the first element that matches, and allMatch stops at the first element that doesn't.
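Short-circuiting is what makes these operations usable on an unbounded source. A minimal sketch (ShortCircuitDemo and hasValueAbove are illustrative names): the stream below is infinite, yet anyMatch returns as soon as one element satisfies the predicate.

```java
import java.util.stream.Stream;

class ShortCircuitDemo {
    static boolean hasValueAbove(int threshold) {
        // Stream.iterate without a limit never ends on its own;
        // anyMatch stops pulling elements at the first match.
        return Stream.iterate(1, n -> n + 1)
            .anyMatch(n -> n > threshold);
    }

    public static void main(String[] args) {
        System.out.println(hasValueAbove(100));   // true
    }
}
```

The reverse case does not terminate: allMatch over an infinite stream where every element matches has nothing to stop it.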

**findFirst / findAny** — return an Optional containing one element:

Java

Optional<User> admin = users.stream().filter(u -> u.hasRole("ADMIN")).findFirst();

**min / max** — extreme by comparator:

Java

Optional<User> oldest = users.stream().max(Comparator.comparingInt(User::getAge));

**reduce** — combine all elements into one result:

Java

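// sum() is a built-in reduction on IntStream
int totalAge = users.stream().mapToInt(User::getAge).sum();
// General form: identity value plus a combining function.
// Fine for a few strings; Collectors.joining() scales better for many.
String concat = words.stream().reduce("", (a, b) -> a + b);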

For numbers, prefer mapToInt/mapToLong/mapToDouble followed by sum() or average() — they avoid autoboxing.
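To make the comparison concrete, here is a sketch of the same sum written both ways, plus average(), which returns an OptionalDouble because an empty stream has no average. The class and method names are made up for this example.

```java
import java.util.List;
import java.util.OptionalDouble;

class ReduceDemo {
    // General reduce on a Stream<Integer>: works, but every step handles boxed Integers.
    static int sumBoxed(List<Integer> xs) {
        return xs.stream().reduce(0, Integer::sum);
    }

    // Primitive IntStream: unboxes each element once, then sums plain ints.
    static int sumPrimitive(List<Integer> xs) {
        return xs.stream().mapToInt(Integer::intValue).sum();
    }

    // Empty input yields an empty OptionalDouble rather than 0 or NaN.
    static OptionalDouble average(List<Integer> xs) {
        return xs.stream().mapToInt(Integer::intValue).average();
    }

    public static void main(String[] args) {
        List<Integer> xs = List.of(1, 2, 3);
        System.out.println(sumBoxed(xs));                    // 6
        System.out.println(sumPrimitive(xs));                // 6
        System.out.println(average(xs).getAsDouble());       // 2.0
        System.out.println(average(List.of()).isPresent());  // false
    }
}
```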

**forEach(Consumer)** — perform an action on each element. Use sparingly; if you're just doing side effects, a plain loop is often clearer.

Real-world patterns

**Group and count.** "How many orders per status?"

Java

Map<Status, Long> counts = orders.stream()
    .collect(Collectors.groupingBy(Order::getStatus, Collectors.counting()));

**Aggregate within groups.** "Total revenue per region."

Java

Map<String, BigDecimal> revenueByRegion = orders.stream()
    .collect(Collectors.groupingBy(
        Order::getRegion,
        Collectors.reducing(BigDecimal.ZERO, Order::getAmount, BigDecimal::add)
    ));
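A runnable version of the revenue grouping, assuming a hypothetical Order record with region() and amount() accessors in place of the getters used above. The sample data is invented.

```java
import java.math.BigDecimal;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class RevenueDemo {
    // Hypothetical domain type, for illustration only.
    record Order(String region, BigDecimal amount) {}

    static Map<String, BigDecimal> revenueByRegion(List<Order> orders) {
        return orders.stream().collect(Collectors.groupingBy(
            Order::region,
            // Within each group: start at zero, extract the amount, add it up.
            Collectors.reducing(BigDecimal.ZERO, Order::amount, BigDecimal::add)));
    }

    public static void main(String[] args) {
        List<Order> orders = List.of(
            new Order("EU", new BigDecimal("10.50")),
            new Order("US", new BigDecimal("7.00")),
            new Order("EU", new BigDecimal("4.50")));
        Map<String, BigDecimal> revenue = revenueByRegion(orders);
        System.out.println(revenue.get("EU"));   // 15.00
        System.out.println(revenue.get("US"));   // 7.00
    }
}
```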

**Top N.** "The five oldest users."

Java

List<User> oldest = users.stream()
    .sorted(Comparator.comparingInt(User::getAge).reversed())
    .limit(5)
    .toList();

**Index pairs with IntStream.range.** "Pair each item with its index."

Java

IntStream.range(0, names.size())
    .mapToObj(i -> i + ": " + names.get(i))
    .forEach(System.out::println);

**Read lines from a file.**

Java

try (Stream<String> lines = Files.lines(Path.of("data.txt"))) {
    long count = lines.filter(l -> !l.isBlank()).count();
}

Note the try-with-resources — file streams need closing.
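A self-contained sketch of the same pattern that returns the count from a method, so the value is usable outside the try block. The temporary file and the names LinesDemo, countNonBlank, and demo are assumptions made for this example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

class LinesDemo {
    static long countNonBlank(Path file) throws IOException {
        // try-with-resources closes the underlying file handle.
        try (Stream<String> lines = Files.lines(file)) {
            return lines.filter(l -> !l.isBlank()).count();
        }
    }

    // Writes a small temp file, counts its non-blank lines, then cleans up.
    static long demo() {
        try {
            Path tmp = Files.createTempFile("lines-demo", ".txt");
            try {
                Files.writeString(tmp, "alpha\n\n  \nbeta\n");
                return countNonBlank(tmp);
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());   // 2
    }
}
```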

**Filter then aggregate.** "Total revenue from active customers."

Java

BigDecimal total = orders.stream()
    .filter(o -> o.getCustomer().isActive())
    .map(Order::getAmount)
    .reduce(BigDecimal.ZERO, BigDecimal::add);

Parallel streams — and why you usually shouldn't

Calling .parallel() on a stream, or .parallelStream() on a collection, makes the stream process elements in parallel across multiple threads:

Java

long count = users.parallelStream().filter(User::isActive).count();

The JVM splits the work across threads in the common ForkJoinPool. For CPU-heavy operations on large collections, this can be a significant speedup.

**Why you usually shouldn't:**

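- **Most stream pipelines are I/O-bound or simple**, not CPU-bound. Parallelism adds overhead without gains.
- **Processing order is no longer guaranteed** in some operations. forEach no longer runs in encounter order (use forEachOrdered if order matters), and findAny may return any matching element. findFirst still returns the first element of an ordered stream, but enforcing that limits the speedup.
- **The common ForkJoinPool is shared.** All parallel streams in the JVM compete for the same threads. One badly-behaved stream affects everything.
- **Side effects become race conditions.** If your lambdas mutate shared state that isn't thread-safe, parallel streams will corrupt it.
- **Stateful operations like sorted and distinct still work**, but they have to buffer or coordinate across threads, which eats into the parallel speedup.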
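The side-effect problem has a standard way out: let a collector assemble the result instead of mutating a shared list from inside the lambda. The following is a minimal sketch with illustrative names; the unsafe variant appears only as a comment because its outcome is unpredictable.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class ParallelCollectDemo {
    // Unsafe alternative (not run here): creating an ArrayList and calling
    //   IntStream.range(0, n).parallel().forEach(i -> shared.add(i * i));
    // has several threads writing to a list that is not thread-safe.

    static List<Integer> squares(int n) {
        // No shared mutable state: the collector builds partial results
        // and merges them, preserving encounter order.
        return IntStream.range(0, n)
            .parallel()
            .map(i -> i * i)
            .boxed()
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> result = squares(10_000);
        System.out.println(result.size());   // 10000
        System.out.println(result.get(3));   // 9
    }
}
```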

**When parallel streams genuinely help:**
- The collection is large (thousands+ elements).
- The per-element work is CPU-heavy (parsing, encoding, calculations).
- The operations are stateless and order-independent.
- You've profiled and confirmed sequential streams are the bottleneck.

For most application code, leave .parallel() off. If you need real parallelism, prefer CompletableFuture for concurrent I/O or explicit ExecutorService for controlled CPU parallelism.

When NOT to use streams

Streams are a tool, not a religion. They make some code clearer, other code worse.

**Plain loops are often clearer for simple cases.**

Java

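// Stream
int max = values.stream().mapToInt(Integer::intValue).max().orElse(0);

// Loop: arguably clearer (an alternative to the line above, not a continuation).
// Starting from 0 matches orElse(0) for an empty list, but assumes no negative values.
int max = 0;
for (int v : values) {
    if (v > max) max = v;
}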

For one-step operations on a list, the loop is fine. Streams shine for multi-step pipelines.

**Performance-critical hot loops.** Streams have overhead — pipeline construction, lambda dispatch, possibly boxing. For tight numeric loops, a plain for is faster. Profile if it matters.

**Side-effect-heavy logic.** Streams are designed for transformation pipelines. If your loop body does I/O, logging, updating multiple counters, or other side effects, a plain loop is more honest.

**Multi-collection coordination.** When you need to iterate two collections together, or use the index, plain for-loops or IntStream.range are often more readable than complex stream compositions.

**Debuggability.** Stack traces from streams bury your lambda's frame among internal pipeline frames (classes under java.util.stream), so the business logic is harder to spot. For deeply complex pipelines, this can hurt debugging. Use .peek() or extract sub-pipelines into named methods if it matters.
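Extracting steps into named methods might look like the sketch below, which assumes a hypothetical User record. Each step then shows up under its own method name in a stack trace and can take a breakpoint or a unit test by itself.

```java
import java.util.List;

class NamedStepsDemo {
    // Hypothetical domain type, for illustration only.
    record User(String name, int age, boolean active) {}

    // Each pipeline step is a named method rather than an inline lambda.
    static boolean isActiveAdult(User u) {
        return u.active() && u.age() >= 18;
    }

    static String displayName(User u) {
        return u.name().toUpperCase();
    }

    static List<String> activeAdultNames(List<User> users) {
        return users.stream()
            .filter(NamedStepsDemo::isActiveAdult)
            .map(NamedStepsDemo::displayName)
            .sorted()
            .toList();
    }

    public static void main(String[] args) {
        List<User> users = List.of(
            new User("zed", 40, true),
            new User("kim", 16, true),
            new User("bob", 30, false),
            new User("ann", 25, true));
        System.out.println(activeAdultNames(users));   // [ANN, ZED]
    }
}
```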

The rule of thumb: streams are great when you have a clear "from X transform to Y" shape. They get worse the more side effects or interactions creep in. Trust your instinct — if the stream version is harder to read out loud, write the loop.
