
[Java Stream-59] Java Stream Programming: Efficient, Elegant Data Processing

java 2025/5/4 6:52:29

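The Stream API introduced in Java 8 changed how we process collection data. Streams offer a declarative, functional way to process data, which makes code more concise, more readable, and easier to maintain. This article walks through the main parts of the Java Stream API to help you put it to work.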

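1. What Is a Stream?

A Stream is an abstraction introduced in Java 8 for processing data held in collections. It is not a data structure; it is an API for running aggregate operations (filtering, mapping, sorting, and so on) over a data source such as a collection or an array.

Core characteristics:

No storage: a Stream holds no data itself; the elements live in the underlying collection or are produced by a generator
Functional style: supports method chaining and lazy evaluation
Parallel processing: can use multi-core hardware with little change to the code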
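
Lazy evaluation is easy to observe directly. The following is a minimal sketch, not part of the original article: the class name LazyStreamDemo and its two methods are hypothetical, and the counter exists only to show when the filter actually runs.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class LazyStreamDemo {

    // Builds a pipeline but never calls a terminal operation.
    // Returns how many elements the filter inspected.
    static int inspectedWithoutTerminal(List<String> source) {
        AtomicInteger inspected = new AtomicInteger();
        Stream<String> pipeline = source.stream()
                .filter(s -> { inspected.incrementAndGet(); return s.length() > 1; });
        return inspected.get();
    }

    // Same pipeline, finished with collect(), which forces the traversal.
    static int inspectedWithTerminal(List<String> source) {
        AtomicInteger inspected = new AtomicInteger();
        List<String> kept = source.stream()
                .filter(s -> { inspected.incrementAndGet(); return s.length() > 1; })
                .collect(Collectors.toList());
        return inspected.get();
    }

    public static void main(String[] args) {
        List<String> data = Arrays.asList("a", "bb", "ccc");
        System.out.println(inspectedWithoutTerminal(data)); // 0
        System.out.println(inspectedWithTerminal(data));    // 3
    }
}
```

Without a terminal operation the filter never runs, so the first counter stays at zero.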

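2. Streams vs. Collections

Feature	Collection	Stream
Storage	Stores all elements	Stores no elements
Data structure	Is a data structure	Is not a data structure
Iteration	External iteration (for-each loop)	Internal iteration
Lazy evaluation	Not supported	Supported
Reusability	Can be traversed many times	Can be consumed only once
Parallel processing	Must be implemented by hand	Built in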
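
To make the iteration row of the table concrete, here is a small sketch that computes the same result both ways. The class and method names are illustrative assumptions, not taken from the article.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

final class IterationStyles {

    // External iteration: the caller drives the loop and manages the result list.
    static List<Integer> squaresOfEvensExternal(List<Integer> numbers) {
        List<Integer> result = new ArrayList<>();
        for (Integer n : numbers) {
            if (n % 2 == 0) {
                result.add(n * n);
            }
        }
        return result;
    }

    // Internal iteration: the pipeline describes what to do; the library drives the traversal.
    static List<Integer> squaresOfEvensInternal(List<Integer> numbers) {
        return numbers.stream()
                .filter(n -> n % 2 == 0)
                .map(n -> n * n)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> input = Arrays.asList(1, 2, 3, 4, 5);
        System.out.println(squaresOfEvensExternal(input)); // [4, 16]
        System.out.println(squaresOfEvensInternal(input)); // [4, 16]
    }
}
```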

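3. Categories of Stream Operations

Stream operations fall into two groups:

Intermediate operations: return a new Stream, so calls can be chained
- Filtering: filter()
- Mapping: map(), flatMap()
- Deduplication: distinct()
- Sorting: sorted()
- Truncation: limit(), skip()
- And others
Terminal operations: trigger the actual computation and return a non-Stream result
- Traversal: forEach()
- Matching: anyMatch(), allMatch(), noneMatch()
- Searching: findFirst(), findAny()
- Aggregation: reduce(), collect(), count(), max(), min()
- Collection: toArray(), collect()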
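
Because a terminal operation pulls elements through the intermediate operations one at a time, a short-circuiting terminal operation such as findFirst() can stop early. The sketch below counts how many elements a sequential pipeline visits; ShortCircuitDemo and its method are hypothetical names used only for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

final class ShortCircuitDemo {

    // Returns how many elements were visited before findFirst() found a match
    // (or the whole list size when nothing matches).
    static int visitedUntilFirstMatch(List<Integer> numbers, int threshold) {
        AtomicInteger visited = new AtomicInteger();
        numbers.stream()
                .peek(n -> visited.incrementAndGet()) // intermediate: observe each element
                .filter(n -> n > threshold)           // intermediate: keep matches
                .findFirst();                         // terminal: stops at the first match
        return visited.get();
    }

    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8, 9, 10);
        System.out.println(visitedUntilFirstMatch(numbers, 3)); // 4
    }
}
```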

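4. Common Stream Operations in Detail

4.1 Creating a Stream

// The snippets in this section assume:
// import java.util.*; import java.util.function.*; import java.util.stream.*;

// From a collection
List<String> list = Arrays.asList("a", "b", "c");
Stream<String> stream1 = list.stream();

// From an array
String[] array = {"a", "b", "c"};
Stream<String> stream2 = Arrays.stream(array);

// Directly from values
Stream<String> stream3 = Stream.of("a", "b", "c");

// Infinite streams
Stream<Integer> stream4 = Stream.iterate(0, n -> n + 2); // 0, 2, 4, ...
Stream<Double> stream5 = Stream.generate(Math::random);  // random values in [0, 1)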

4.2 Filtering

List<Integer> numbers = Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8, 9, 10);

// Keep the even numbers
List<Integer> evenNumbers = numbers.stream()
        .filter(n -> n % 2 == 0)
        .collect(Collectors.toList());
// Result: [2, 4, 6, 8, 10]

// Filter on several conditions
List<Integer> filteredNumbers = numbers.stream()
        .filter(n -> n > 5)
        .filter(n -> n < 9)
        .collect(Collectors.toList());
// Result: [6, 7, 8]

4.3 Mapping

List<String> words = Arrays.asList("Java", "Stream", "API");

// Convert to upper case
List<String> upperCaseWords = words.stream().map(String::toUpperCase).collect(Collectors.toList());
// Result: ["JAVA", "STREAM", "API"]

// Get the length of each string
List<Integer> wordLengths = words.stream().map(String::length).collect(Collectors.toList());
// Result: [4, 6, 3]

// Mapping over objects
class Person {
    private String name;
    private int age;

    Person(String name, int age) { this.name = name; this.age = age; }
    String getName() { return name; }
    int getAge() { return age; }
    // setters omitted
}

List<Person> people = Arrays.asList(
        new Person("Alice", 25),
        new Person("Bob", 30),
        new Person("Charlie", 35));

List<String> names = people.stream().map(Person::getName).collect(Collectors.toList());
// Result: ["Alice", "Bob", "Charlie"]

4.4 Flattening (flatMap)

List<List<String>> listOfLists = Arrays.asList(
        Arrays.asList("a", "b"),
        Arrays.asList("c", "d"),
        Arrays.asList("e", "f"));

List<String> flatList = listOfLists.stream()
        .flatMap(Collection::stream)
        .collect(Collectors.toList());
// Result: ["a", "b", "c", "d", "e", "f"]

4.5 Sorting

List<String> names = Arrays.asList("John", "Alice", "Bob", "Diana");

// Natural order
List<String> sortedNames = names.stream().sorted().collect(Collectors.toList());
// Result: ["Alice", "Bob", "Diana", "John"]

// Custom order: by length (the sort is stable, so "Alice" stays ahead of "Diana")
List<String> customSorted = names.stream()
        .sorted(Comparator.comparingInt(String::length))
        .collect(Collectors.toList());
// Result: ["Bob", "John", "Alice", "Diana"]
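
When a single sort key leaves ties, comparators can be chained. The following sketch sorts by length and then alphabetically; the class name SortWithTieBreak and the input data are assumptions for illustration.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

final class SortWithTieBreak {

    // Sorts by string length first, then alphabetically among strings of equal length.
    static List<String> byLengthThenAlphabetical(List<String> names) {
        Comparator<String> byLength = Comparator.comparingInt(String::length);
        Comparator<String> order = byLength.thenComparing(Comparator.naturalOrder());
        return names.stream().sorted(order).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("John", "Diana", "Bob", "Alice");
        System.out.println(byLengthThenAlphabetical(names)); // [Bob, John, Alice, Diana]
    }
}
```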

4.6 Removing Duplicates

List<Integer> numbers = Arrays.asList(1, 2, 2, 3, 3, 3, 4, 5, 5);

List<Integer> distinctNumbers = numbers.stream().distinct().collect(Collectors.toList());
// Result: [1, 2, 3, 4, 5]

4.7 Limiting and Skipping

Stream<Integer> numbers = Stream.iterate(1, n -> n + 1); // infinite stream: 1, 2, 3, 4, ...

// Take the first 10
List<Integer> firstTen = numbers.limit(10).collect(Collectors.toList());
// Result: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

// Skip the first 5, then take the next 5
List<Integer> skipFive = Stream.iterate(1, n -> n + 1)
        .skip(5)
        .limit(5)
        .collect(Collectors.toList());
// Result: [6, 7, 8, 9, 10]

4.8 Matching

List<Integer> numbers = Arrays.asList(1, 2, 3, 4, 5);

boolean allEven = numbers.stream().allMatch(n -> n % 2 == 0); // false
boolean anyEven = numbers.stream().anyMatch(n -> n % 2 == 0); // true
boolean noneNegative = numbers.stream().noneMatch(n -> n < 0); // true

4.9 Finding

List<String> names = Arrays.asList("Alice", "Bob", "Charlie");

Optional<String> first = names.stream().findFirst(); // Optional[Alice]
Optional<String> any = names.stream().findAny();     // may return any element; no ordering guarantee
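
Both find operations return an Optional because the stream may be empty. A minimal sketch of consuming that result without calling get() directly follows; the class name, the method, and the "<none>" fallback are illustrative assumptions.

```java
import java.util.Arrays;
import java.util.List;

final class OptionalHandling {

    // Returns the first matching name in upper case, or a fallback when nothing matches.
    static String firstStartingWith(List<String> names, String prefix) {
        return names.stream()
                .filter(n -> n.startsWith(prefix))
                .findFirst()               // Optional<String>, empty when no match
                .map(String::toUpperCase)  // applied only if a value is present
                .orElse("<none>");         // fallback for the empty case
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("Alice", "Bob", "Charlie");
        System.out.println(firstStartingWith(names, "B")); // BOB
        System.out.println(firstStartingWith(names, "Z")); // <none>
    }
}
```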

4.10 Reduction (reduce)

List<Integer> numbers = Arrays.asList(1, 2, 3, 4, 5);

// Sum
Optional<Integer> sum = numbers.stream().reduce(Integer::sum); // Optional[15]

// Product
Optional<Integer> product = numbers.stream().reduce((a, b) -> a * b); // Optional[120]

// reduce with an identity value
Integer sumWithIdentity = numbers.stream().reduce(0, Integer::sum); // 15

4.11 Collecting (collect)

List<String> names = Arrays.asList("Alice", "Bob", "Charlie", "David");

// Convert to a List
List<String> list = names.stream().collect(Collectors.toList());

// Convert to a Set
Set<String> set = names.stream().collect(Collectors.toSet());

// Convert to a Map
Map<String, Integer> nameLengthMap = names.stream()
        .collect(Collectors.toMap(
                Function.identity(), // key
                String::length));    // value
// Result (key order not guaranteed): {Alice=5, Bob=3, Charlie=7, David=5}

// Grouping
Map<Integer, List<String>> groupByLength = names.stream()
        .collect(Collectors.groupingBy(String::length));
// Result: {3=[Bob], 5=[Alice, David], 7=[Charlie]}

// Partitioning
Map<Boolean, List<String>> partition = names.stream()
        .collect(Collectors.partitioningBy(s -> s.length() > 4));
// Result: {false=[Bob], true=[Alice, Charlie, David]}

// Joining strings
String joined = names.stream().collect(Collectors.joining(", "));
// Result: "Alice, Bob, Charlie, David"
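
One detail worth knowing about Collectors.toMap: the two-argument form throws IllegalStateException when two elements produce the same key. The sketch below shows that failure and the overload that takes a merge function. The class name, the "|" separator, and the choice of TreeMap (used only to keep keys sorted) are assumptions for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

final class ToMapMergeDemo {

    // Maps length -> names of that length, joining colliding values with "|".
    static Map<Integer, String> namesByLength(List<String> names) {
        return names.stream().collect(Collectors.toMap(
                String::length,                      // key
                name -> name,                        // value
                (left, right) -> left + "|" + right, // merge values that share a key
                TreeMap::new));                      // sorted keys for predictable output
    }

    // Returns true if the two-argument toMap rejects the duplicate keys.
    static boolean failsWithoutMerge(List<String> names) {
        try {
            names.stream().collect(Collectors.toMap(String::length, name -> name));
            return false;
        } catch (IllegalStateException duplicateKey) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("Alice", "Bob", "Charlie", "David");
        System.out.println(namesByLength(names));     // {3=Bob, 5=Alice|David, 7=Charlie}
        System.out.println(failsWithoutMerge(names)); // true
    }
}
```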

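5. Parallel Streams

The Stream API has built-in support for parallel processing, which makes it easy to use multi-core processors:

List<Integer> numbers = Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8, 9, 10);

// Sequential stream
long sequentialCount = numbers.stream().filter(n -> n % 2 == 0).count();

// Parallel stream
long parallelCount = numbers.parallelStream().filter(n -> n % 2 == 0).count();

Notes:

A parallel stream is not always faster than a sequential one; on small data sets it can be slower
Make sure operations are stateless and non-interfering, and that reduction functions are associative
Avoid operations with side effects in a parallel stream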
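
The associativity requirement can be illustrated with reduce. In this sketch (class and method names are hypothetical), addition is associative, so the sequential and parallel results agree. Subtraction is not associative, so it is only evaluated sequentially here; in a parallel stream its result would depend on how the work is split.

```java
import java.util.stream.LongStream;

final class ParallelReduceDemo {

    // Addition is associative: safe for both sequential and parallel reduction.
    static long sumSequential(long n) {
        return LongStream.rangeClosed(1, n).reduce(0L, Long::sum);
    }

    static long sumParallel(long n) {
        return LongStream.rangeClosed(1, n).parallel().reduce(0L, Long::sum);
    }

    // Subtraction is NOT associative: (a - b) - c differs from a - (b - c).
    // Evaluated sequentially it is well defined; do not use it with parallel().
    static long subtractSequential(long n) {
        return LongStream.rangeClosed(1, n).reduce(0L, (a, b) -> a - b);
    }

    public static void main(String[] args) {
        System.out.println(sumSequential(1000));   // 500500
        System.out.println(sumParallel(1000));     // 500500
        System.out.println(subtractSequential(5)); // -15
    }
}
```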

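6. Stream Best Practices

Prefer method references: they make the code more concise

// Not recommended
.map(s -> s.toUpperCase())

// Recommended
.map(String::toUpperCase)

Avoid nested Streams: use flatMap instead

// Not recommended
.map(list -> list.stream()...)

// Recommended
.flatMap(List::stream)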

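Reuse intermediate results: avoid repeated computation

// Not recommended: the same filter runs twice (source is a placeholder collection)
source.stream().filter(...).count();
source.stream().filter(...).collect(...);

// Recommended: collect the filtered elements once, then reuse the collection
List<T> filtered = source.stream().filter(...).collect(Collectors.toList());
filtered.size();

// Wrong: a Stream held in a variable still cannot be consumed twice
Stream<T> stream = source.stream().filter(...);
stream.count();
stream.collect(...); // throws IllegalStateException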

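Be careful with infinite streams: make sure a limiting operation is present

// Dangerous: never terminates
Stream.iterate(0, i -> i + 1).forEach(System.out::println);

// Safe
Stream.iterate(0, i -> i + 1).limit(100).forEach(System.out::println);

Consider primitive streams: avoid boxing/unboxing overhead

// Boxed stream
Stream<Integer> boxed = ...;

// Primitive stream
IntStream intStream = boxed.mapToInt(Integer::intValue);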
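
Primitive streams also bring numeric terminal operations such as sum() and summaryStatistics(). A short sketch, with hypothetical class and method names:

```java
import java.util.Arrays;
import java.util.IntSummaryStatistics;
import java.util.List;

final class PrimitiveStreamDemo {

    // mapToInt yields an IntStream, so sum() works on ints without boxing.
    static int totalLength(List<String> words) {
        return words.stream().mapToInt(String::length).sum();
    }

    // summaryStatistics() gathers count, min, max, sum, and average in one pass.
    static IntSummaryStatistics lengthStats(List<String> words) {
        return words.stream().mapToInt(String::length).summaryStatistics();
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("Java", "Stream", "API");
        System.out.println(totalLength(words));          // 13
        System.out.println(lengthStats(words).getMax()); // 6
    }
}
```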

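7. Stream Performance Considerations

Small data sets: a sequential stream may be faster, because parallel execution carries overhead
Large data sets: a parallel stream is usually faster
Operation type:
- Lightweight, stateless operations such as filter and map are well suited to parallel execution
- Heavier, stateful operations such as sorted may gain little from parallelism
Order dependence: operations that depend on element order are a poor fit for parallel execution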

8. Common Problems and Solutions

Problem 1: a Stream can be used only once

Stream<String> stream = Stream.of("a", "b", "c");
stream.forEach(System.out::println);
stream.forEach(System.out::println); // throws IllegalStateException

Solution: create a new Stream each time one is needed
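
One common way to create a fresh Stream on demand is to wrap the pipeline in a Supplier. This is a sketch under that assumption; the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Supplier;
import java.util.stream.Stream;

final class StreamSupplierDemo {

    // Each call to get() on the returned Supplier builds a brand-new Stream.
    static Supplier<Stream<String>> upperCased(List<String> source) {
        return () -> source.stream().map(String::toUpperCase);
    }

    // Returns true if consuming the same Stream a second time is rejected.
    static boolean secondUseThrows(Stream<String> stream) {
        stream.count();
        try {
            stream.count();
            return false;
        } catch (IllegalStateException alreadyConsumed) {
            return true;
        }
    }

    public static void main(String[] args) {
        Supplier<Stream<String>> supplier = upperCased(Arrays.asList("a", "b", "c"));
        System.out.println(supplier.get().count());           // 3
        System.out.println(supplier.get().findFirst().get()); // A
    }
}
```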

Problem 2: thread safety in parallel streams

List<Integer> list = new ArrayList<>();
IntStream.range(0, 1000).parallel().forEach(list::add); // not thread-safe: ArrayList is unsynchronized

Solution: use a thread-safe collection or the collect method

List<Integer> safeList = IntStream.range(0, 1000).parallel().boxed().collect(Collectors.toList());

Problem 3: an infinite stream hangs the program

Stream.iterate(0, i -> i + 1).forEach(System.out::println); // runs forever

Solution: always bound the stream with limit or a similar operation

9. Practical Examples

9.1 Example 1: Counting Word Frequency

String text = "hello world hello java world stream";

Map<String, Long> wordCount = Arrays.stream(text.split(" "))
        .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
// Result (key order not guaranteed): {hello=2, world=2, java=1, stream=1}
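
A natural follow-up is to rank the words by frequency. The sketch below extends the same counting step; the class name, the method topWords, and the alphabetical tie-break are assumptions added for illustration.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

final class TopWordsDemo {

    // Returns up to 'limit' words, most frequent first; ties are broken alphabetically.
    static List<String> topWords(String text, int limit) {
        Map<String, Long> counts = Arrays.stream(text.split(" "))
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));

        Comparator<Map.Entry<String, Long>> order = (a, b) -> {
            int byCount = Long.compare(b.getValue(), a.getValue()); // descending count
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        };

        return counts.entrySet().stream()
                .sorted(order)
                .limit(limit)
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        String text = "hello world hello java world stream";
        System.out.println(topWords(text, 3)); // [hello, world, java]
    }
}
```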

9.2 Example 2: Processing a CSV File

try (Stream<String> lines = Files.lines(Paths.get("data.csv"))) {
    List<Person> people = lines
            .skip(1) // skip the header row
            .map(line -> {
                String[] parts = line.split(",");
                return new Person(parts[0], Integer.parseInt(parts[1]));
            })
            .collect(Collectors.toList());
} catch (IOException e) {
    e.printStackTrace();
}
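
The snippet above assumes every row is well formed; a short row or a non-numeric age would throw. One possible way to skip such rows is sketched below. It reads from an in-memory Stream of lines so that it runs without a file, and the Row type, the two-column layout, and the validation rule are all assumptions made for this example.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class CsvRowsDemo {

    // Hypothetical row type for this sketch.
    static final class Row {
        final String name;
        final int age;

        Row(String name, int age) {
            this.name = name;
            this.age = age;
        }
    }

    // Skips the header, drops rows that do not have exactly two fields
    // or whose second field is not a short run of digits, and parses the rest.
    static List<Row> parse(Stream<String> lines) {
        return lines
                .skip(1)
                .map(line -> line.split(","))
                .filter(parts -> parts.length == 2 && parts[1].trim().matches("\\d{1,9}"))
                .map(parts -> new Row(parts[0].trim(), Integer.parseInt(parts[1].trim())))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Row> rows = parse(Stream.of("name,age", "Alice,25", "Bob,unknown", "Charlie, 35"));
        rows.forEach(r -> System.out.println(r.name + " " + r.age));
    }
}
```

With a real file, the same method could be given the Stream returned by Files.lines inside a try-with-resources block.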

9.3 Example 3: Grouping and Statistics on Multiple Criteria

class Employee {
    private String name;
    private String department;
    private double salary;
    // constructor and getters/setters omitted
}

List<Employee> employees = ...;

// Group by department and compute each department's average salary
Map<String, Double> avgSalaryByDept = employees.stream()
        .collect(Collectors.groupingBy(
                Employee::getDepartment,
                Collectors.averagingDouble(Employee::getSalary)));

// Multi-level grouping: by department, then by salary band
Map<String, Map<String, List<Employee>>> multiGroup = employees.stream()
        .collect(Collectors.groupingBy(
                Employee::getDepartment,
                Collectors.groupingBy(e -> {
                    if (e.getSalary() < 5000) return "Low";
                    else if (e.getSalary() < 10000) return "Medium";
                    else return "High";
                })));
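
For readers who want to run the grouping example, here is a self-contained sketch. The three sample employees are invented purely for illustration, the Employee class is a minimal stand-in with only the getters the collectors need, and the inner grouping collects names instead of Employee objects to keep the output short.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

final class GroupingDemo {

    // Minimal stand-in for the Employee class; fields mirror the example above.
    static final class Employee {
        private final String name;
        private final String department;
        private final double salary;

        Employee(String name, String department, double salary) {
            this.name = name;
            this.department = department;
            this.salary = salary;
        }

        String getName() { return name; }
        String getDepartment() { return department; }
        double getSalary() { return salary; }
    }

    // Same salary bands as the example above.
    static String band(Employee e) {
        if (e.getSalary() < 5000) return "Low";
        else if (e.getSalary() < 10000) return "Medium";
        else return "High";
    }

    static Map<String, Double> averageSalaryByDepartment(List<Employee> employees) {
        return employees.stream().collect(Collectors.groupingBy(
                Employee::getDepartment,
                Collectors.averagingDouble(Employee::getSalary)));
    }

    // Department -> salary band -> employee names.
    static Map<String, Map<String, List<String>>> namesByDepartmentAndBand(List<Employee> employees) {
        return employees.stream().collect(Collectors.groupingBy(
                Employee::getDepartment,
                Collectors.groupingBy(
                        GroupingDemo::band,
                        Collectors.mapping(Employee::getName, Collectors.toList()))));
    }

    // Invented sample data, for illustration only.
    static List<Employee> sample() {
        return Arrays.asList(
                new Employee("Ann", "Eng", 4000),
                new Employee("Bo", "Eng", 12000),
                new Employee("Cy", "Ops", 7000));
    }

    public static void main(String[] args) {
        System.out.println(averageSalaryByDepartment(sample()).get("Eng"));            // 8000.0
        System.out.println(namesByDepartmentAndBand(sample()).get("Ops").get("Medium")); // [Cy]
    }
}
```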