IOException during reduce phase when remotely executing Hadoop job
<p>I've got a small 10-node Hadoop cluster running 1.0.4, and I'm trying to set it up so that I can submit jobs from machines on the network other than the NameNode. I've got a simple example set up in which I execute the job using <a href="http://hadoop.apache.org/docs/stable/api/org/apache/hadoop/util/ToolRunner.html#run%28org.apache.hadoop.conf.Configuration,%20org.apache.hadoop.util.Tool,%20java.lang.String%5B%5D%29" rel="nofollow"><code>ToolRunner</code></a>, build the <a href="http://hadoop.apache.org/docs/stable/api/org/apache/hadoop/mapred/JobConf.html" rel="nofollow"><code>JobConf</code></a> manually, and submit with <a href="http://hadoop.apache.org/docs/stable/api/org/apache/hadoop/mapred/JobClient.html#submitJob%28org.apache.hadoop.mapred.JobConf%29" rel="nofollow"><code>JobClient.submitJob()</code></a>. Everything works as expected when I run this from the NameNode.</p>

<p>When I run it from any other node on the network, the job is submitted and all map tasks complete successfully, but every reduce task fails with the following exception:</p>

<pre><code>org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find output/map_0.out in any of the configured local directories
    at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathToRead(LocalDirAllocator.java:429)
    at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathToRead(LocalDirAllocator.java:160)
    at org.apache.hadoop.mapred.MapOutputFile.getInputFile(MapOutputFile.java:161)
    at org.apache.hadoop.mapred.ReduceTask.getMapFiles(ReduceTask.java:220)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:398)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
    at org.apache.hadoop.mapred.Child.main(Child.java:249)
</code></pre>

<p>I take this to mean that the reduce tasks can't find the output from the mappers. I'm fairly certain I'm just missing a config value somewhere, but I can't figure out which one (I've tried <code>mapred.local.dir</code> and <code>hadoop.tmp.dir</code> with no success). Does anyone know exactly what the above message means and how to fix it, or know a simple way to execute jobs from machines other than the NameNode?</p>

<p><strong>Edit</strong>: I think this may also have something to do with permissions. The <code>hadoop</code> user owns pretty much all files on HDFS, but when I'm logged in on a different machine it's under a different username. I've tried updating <code>mapred-site.xml</code> on all the nodes in the cluster similar to <a href="http://hadoop.apache.org/docs/stable/Secure_Impersonation.html" rel="nofollow">this</a>, and wrapping <code>JobClient.submitJob()</code> inside a <code>UserGroupInformation.doAs()</code>, but I still get an error similar to:</p>

<pre><code>SEVERE: PriviledgedActionException as:hadoop via oren cause:org.apache.hadoop.ipc.RemoteException: User: oren is not allowed to impersonate hadoop
</code></pre>
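<p>For reference, here is roughly what the driver described above looks like. This is a minimal sketch rather than my exact code: the <code>namenode:9000</code>/<code>namenode:9001</code> addresses are placeholders for my cluster, and the identity mapper/reducer are just there to keep the example self-contained.</p>

<pre><code>import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.RunningJob;
import org.apache.hadoop.mapred.lib.IdentityMapper;
import org.apache.hadoop.mapred.lib.IdentityReducer;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class RemoteSubmit extends Configured implements Tool {
    public int run(String[] args) throws Exception {
        JobConf conf = new JobConf(getConf(), RemoteSubmit.class);
        // Point the client at the remote cluster (placeholder host/ports).
        conf.set("fs.default.name", "hdfs://namenode:9000");
        conf.set("mapred.job.tracker", "namenode:9001");
        // I've also tried overriding mapred.local.dir and hadoop.tmp.dir
        // here, with no luck.
        conf.setJobName("remote-submit-test");
        // Identity mapper/reducer; TextInputFormat (the default) produces
        // LongWritable keys and Text values, so the output types match.
        conf.setMapperClass(IdentityMapper.class);
        conf.setReducerClass(IdentityReducer.class);
        conf.setOutputKeyClass(LongWritable.class);
        conf.setOutputValueClass(Text.class);
        FileInputFormat.setInputPaths(conf, new Path(args[0]));
        FileOutputFormat.setOutputPath(conf, new Path(args[1]));
        // Submit asynchronously, then block until the job finishes.
        RunningJob job = new JobClient(conf).submitJob(conf);
        job.waitForCompletion();
        return job.isSuccessful() ? 0 : 1;
    }

    public static void main(String[] args) throws Exception {
        System.exit(ToolRunner.run(new Configuration(), new RemoteSubmit(), args));
    }
}
</code></pre>

<p>Launched from the NameNode (e.g. <code>hadoop jar remote-submit.jar RemoteSubmit in out</code>, jar name hypothetical) this completes normally; launched from any other machine, the maps finish and the reducers die with the <code>DiskErrorException</code> above.</p>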
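<p>And here is the <code>doAs()</code> wrapper mentioned in the edit, again as a sketch of what I'm attempting rather than the exact code (<code>oren</code> is my local login, <code>hadoop</code> is the cluster superuser):</p>

<pre><code>import java.security.PrivilegedExceptionAction;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.RunningJob;
import org.apache.hadoop.security.UserGroupInformation;

public class ProxySubmit {
    public static RunningJob submitAsHadoop(final JobConf conf) throws Exception {
        // Act as the "hadoop" user, proxied through whoever is logged in
        // locally (in my case, "oren").
        UserGroupInformation ugi = UserGroupInformation.createProxyUser(
                "hadoop", UserGroupInformation.getLoginUser());
        return ugi.doAs(new PrivilegedExceptionAction&lt;RunningJob&gt;() {
            public RunningJob run() throws Exception {
                return new JobClient(conf).submitJob(conf);
            }
        });
    }
}
</code></pre>

<p>The config change referred to above is the <code>hadoop.proxyuser.*.hosts</code>/<code>hadoop.proxyuser.*.groups</code> pattern from the linked impersonation doc; even with that in place on every node, I still get the <code>RemoteException</code> above.</p>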
 
