Is this a bug in Files.lines(), or am I misunderstanding something about parallel streams?
Problem description
Environment: Ubuntu x86_64 (14.10), Oracle JDK 1.8u25
I am trying to use a parallel stream from Files.lines(), but I want to .skip() the first line (it's a CSV file with a header). Therefore I do this:
try (
    final Stream<String> stream = Files.lines(thePath, StandardCharsets.UTF_8)
        .skip(1L).parallel();
) {
    // etc
}
But then one column failed to parse to an int...
So I tried some simple code. The file in question is dead simple:
$ cat info.csv
startDate;treeDepth;nrMatchers;nrLines;nrChars;nrCodePoints;nrNodes
1422758875023;34;54;151;4375;4375;27486
$
And the code is equally simple:
public static void main(final String... args)
    throws IOException
{
    final Path path = Paths.get("/home/fge/tmp/dd/info.csv");
    Files.lines(path, StandardCharsets.UTF_8).skip(1L).parallel()
        .forEach(System.out::println);
}
And I systematically get the following result (OK, I have only run it something around 20 times):
startDate;treeDepth;nrMatchers;nrLines;nrChars;nrCodePoints;nrNodes
What am I missing here?
EDIT It seems like the problem, or misunderstanding, is much more rooted than that (the two examples below were cooked up by a fellow on FreeNode's ##java):
public static void main(final String... args)
{
    new BufferedReader(new StringReader("Hello\nWorld")).lines()
        .skip(1L).parallel()
        .forEach(System.out::println);

    final Iterator<String> iter
        = Arrays.asList("Hello", "World").iterator();
    final Spliterator<String> spliterator
        = Spliterators.spliteratorUnknownSize(iter, Spliterator.ORDERED);
    final Stream<String> s
        = StreamSupport.stream(spliterator, true);
    s.skip(1L).forEach(System.out::println);
}
This prints:
Hello
Hello
Uh.
@Holger suggested that this happens for any stream which is ORDERED and not SIZED, with this other sample:
Stream.of("Hello", "World")
    .filter(x -> true)
    .parallel()
    .skip(1L)
    .forEach(System.out::println);
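To see why the filter(x -> true) step matters in that sample, the spliterator characteristics can be inspected directly. This is only a sketch: the class name CharacteristicsDemo and the helper has() are made up for illustration and are not part of the original discussion.

```java
import java.util.Spliterator;
import java.util.stream.Stream;

// Hypothetical demo class; neither the class nor the helper comes from the original post.
class CharacteristicsDemo {
    // Returns true if the stream's spliterator reports the given characteristic.
    static boolean has(Stream<?> stream, int characteristic) {
        return stream.spliterator().hasCharacteristics(characteristic);
    }

    public static void main(String[] args) {
        // A stream built straight from an array knows its size.
        System.out.println("plain, SIZED:      "
            + has(Stream.of("Hello", "World"), Spliterator.SIZED));
        // filter() keeps the encounter order...
        System.out.println("filtered, ORDERED: "
            + has(Stream.of("Hello", "World").filter(x -> true), Spliterator.ORDERED));
        // ...but the size is no longer known, even though nothing is filtered out.
        System.out.println("filtered, SIZED:   "
            + has(Stream.of("Hello", "World").filter(x -> true), Spliterator.SIZED));
    }
}
```

So the no-op filter is just a cheap way to get an ORDERED stream that is not SIZED, which is the same situation as with Files.lines() and BufferedReader.lines().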
Also, it stems from all the discussion which already took place that the problem (if it is one?) is with .forEach() (as @SotiriosDelimanolis first pointed out).
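If the terminal operation is indeed the culprit, then switching to .forEachOrdered() should sidestep it, since that operation respects the encounter order. A minimal sketch of that idea, reusing @Holger's sample; the class and method names are invented for illustration:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.stream.Stream;

// Hypothetical demo class; the names are not from the original post.
class ForEachOrderedDemo {
    // Same pipeline as the sample above, but with an ordered terminal operation.
    static List<String> skipFirstOrdered() {
        // Synchronized list used defensively, since the pipeline is parallel.
        final List<String> seen = Collections.synchronizedList(new ArrayList<>());
        Stream.of("Hello", "World")
            .filter(x -> true)
            .parallel()
            .skip(1L)
            .forEachOrdered(seen::add);
        return seen;
    }

    public static void main(String[] args) {
        // skip(1L) now drops the first element in encounter order, so only "World" remains.
        System.out.println(skipFirstOrdered());
    }
}
```

The trade-off is that forEachOrdered() may give up some of the parallelism, which is presumably why one would want plain forEach() in the first place.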
Since the current state of the issue is quite the opposite of the earlier statements made here, it should be noted that there is now an explicit statement by Brian Goetz that the back-propagation of the unordered characteristic past a skip operation is considered a bug. He also states that the ordered-ness of a terminal operation is now considered not to back-propagate at all.
There is also a related bug report, JDK-8129120, whose status is "fixed in Java 9" and which has been backported to Java 8 update 60.
I did some tests with jdk1.8.0_60 and it seems that the implementation now indeed exhibits the more intuitive behavior.
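For anyone who wants to repeat that kind of check on their own JDK, here is a rough sketch based on the BufferedReader example from the question. The class and method names are made up; the test collects the elements instead of printing them so the result is easy to compare.

```java
import java.io.BufferedReader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical check class; the names are not from the original post.
class SkipAfterFixDemo {
    // Runs the question's pipeline with the unordered forEach() and records what it sees.
    static List<String> skipFirst() {
        // forEach() on a parallel stream may run on several threads, hence the synchronized list.
        final List<String> seen = Collections.synchronizedList(new ArrayList<>());
        new BufferedReader(new StringReader("Hello\nWorld")).lines()
            .skip(1L).parallel()
            .forEach(seen::add);
        return seen;
    }

    public static void main(String[] args) {
        System.out.println("java.version = " + System.getProperty("java.version"));
        // On 8u60 or later this is expected to contain only "World";
        // on earlier Java 8 builds the question's output suggests "Hello" instead.
        System.out.println(skipFirst());
    }
}
```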