Hadoop map-reduce operation is failing on writing output
Question
I am finally able to start a map-reduce job on Hadoop (running on a single Debian machine). However, the map-reduce job always fails with the following error:
hadoopmachine@debian:~$ ./hadoop-1.0.1/bin/hadoop jar hadooptest/main.jar nl.mydomain.hadoop.debian.test.Main /user/hadoopmachine/input /user/hadoopmachine/output
Warning: $HADOOP_HOME is deprecated.
12/04/03 07:29:35 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
****hdfs://localhost:9000/user/hadoopmachine/input
12/04/03 07:29:35 INFO input.FileInputFormat: Total input paths to process : 1
12/04/03 07:29:35 INFO mapred.JobClient: Running job: job_201204030722_0002
12/04/03 07:29:36 INFO mapred.JobClient: map 0% reduce 0%
12/04/03 07:29:41 INFO mapred.JobClient: Task Id : attempt_201204030722_0002_m_000002_0, Status : FAILED
Error initializing attempt_201204030722_0002_m_000002_0:
ENOENT: No such file or directory
at org.apache.hadoop.io.nativeio.NativeIO.chmod(Native Method)
at org.apache.hadoop.fs.FileUtil.execSetPermission(FileUtil.java:692)
at org.apache.hadoop.fs.FileUtil.setPermission(FileUtil.java:647)
at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:509)
at org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(RawLocalFileSystem.java:344)
at org.apache.hadoop.mapred.JobLocalizer.initializeJobLogDir(JobLocalizer.java:239)
at org.apache.hadoop.mapred.DefaultTaskController.initializeJob(DefaultTaskController.java:196)
at org.apache.hadoop.mapred.TaskTracker$4.run(TaskTracker.java:1226)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:416)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1093)
at org.apache.hadoop.mapred.TaskTracker.initializeJob(TaskTracker.java:1201)
at org.apache.hadoop.mapred.TaskTracker.localizeJob(TaskTracker.java:1116)
at org.apache.hadoop.mapred.TaskTracker$5.run(TaskTracker.java:2404)
at java.lang.Thread.run(Thread.java:636)
12/04/03 07:29:41 WARN mapred.JobClient: Error reading task outputhttp://localhost:50060/tasklog?plaintext=true&attemptid=attempt_201204030722_0002_m_000002_0&filter=stdout
12/04/03 07:29:41 WARN mapred.JobClient: Error reading task outputhttp://localhost:50060/tasklog?plaintext=true&attemptid=attempt_201204030722_0002_m_000002_0&filter=stderr
Unfortunately, it only says "ENOENT: No such file or directory" without naming the directory it actually tried to access. Pinging localhost works, the input directory does exist, and the jar location is also correct.
Can anybody give me a pointer on how to fix this error, or how to find out which file Hadoop is trying to access?
I found several similar problems on the Hadoop mailing list, but no responses to any of them...
Thanks!
P.S. The config for mapred.local.dir looks like this (in mapred-site.xml):
<property>
<name>mapred.local.dir</name>
<value>/home/hadoopmachine/hadoop_data/mapred</value>
<final>true</final>
</property>
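As a side check on this config, the directory in mapred.local.dir must also be writable by the user running the TaskTracker. A minimal sketch of that check, assuming the taskTracker/&lt;user&gt;/jobcache layout that Hadoop 1.x uses for job localization; a temp dir stands in here for the real /home/hadoopmachine/hadoop_data/mapred, so point MAPRED_LOCAL at that path on the actual node:

```shell
# Stand-in for the directory configured as mapred.local.dir.
MAPRED_LOCAL=$(mktemp -d)
# Hadoop 1.x localizes jobs into per-user subdirectories; if this mkdir
# fails on the real path, the TaskTracker user lacks write permission there.
mkdir -p "$MAPRED_LOCAL/taskTracker/hadoopmachine/jobcache" \
  && echo "mapred.local.dir is writable"
```

If the mkdir fails on the real directory, `chown -R` it to the user that runs the TaskTracker.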
As requested, the output of ps auxww | grep TaskTracker is:
1000 4249 2.2 0.8 1181992 30176 ? Sl 12:09 0:00
/usr/lib/jvm/java-6-openjdk/bin/java -Dproc_tasktracker -Xmx1000m -Dhadoop.log.dir=/home/hadoopmachine/hadoop-1.0.1/libexec/../logs
-Dhadoop.log.file=hadoop-hadoopmachine-tasktracker-debian.log -Dhadoop.home.dir=/home/hadoopmachine/hadoop-1.0.1/libexec/..
-Dhadoop.id.str=hadoopmachine -Dhadoop.root.logger=INFO,DRFA -Dhadoop.security.logger=INFO,NullAppender
-Djava.library.path=/home/hadoopmachine/hadoop-1.0.1/libexec/../lib/native/Linux-i386-32
-Dhadoop.policy.file=hadoop-policy.xml -classpath [omitted very long list of jars] org.apache.hadoop.mapred.TaskTracker
Accepted answer
From the job tracker, identify which Hadoop node the task executed on. SSH to that node and find the location of the hadoop.log.dir directory (check that node's mapred-site.xml). My guess is that the hadoop user does not have the permissions needed to create sub-directories in this folder.
The actual folder it's trying to create lies under the ${hadoop.log.dir}/userlogs folder - check that this folder has the correct permissions.
In your case, looking at the ps output, I'm guessing this is the folder whose permissions you need to examine:
/home/hadoopmachine/hadoop-1.0.1/libexec/../logs
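A minimal sketch of the check, using a temp dir as a stand-in for the real log directory (on the node itself you would substitute the path above and, if needed, `chown -R` it to the TaskTracker user):

```shell
# Stand-in for ${hadoop.log.dir}; on the real node use
# /home/hadoopmachine/hadoop-1.0.1/logs from the ps output instead.
LOG_DIR=$(mktemp -d)
mkdir -p "$LOG_DIR/userlogs"
chmod u+rwx "$LOG_DIR/userlogs"   # owner must be the TaskTracker user
# Job initialization does the equivalent of this mkdir; a failure here is
# what surfaces as the ENOENT in the task log above.
mkdir "$LOG_DIR/userlogs/job_201204030722_0002" && echo "writable"
```

On the real node, `ls -ld /home/hadoopmachine/hadoop-1.0.1/logs/userlogs` will show the owner and mode; if they don't match the user running the TaskTracker, fix the ownership rather than opening the mode to everyone.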