
In Hadoop, where does the framework save the output of the Map task in a normal Map-Reduce application?
<p>I am trying to find out where the output of a Map task is saved to disk before it can be used by a Reduce task.</p>

<p><strong>Note:</strong> the version used is Hadoop 0.20.204 with the new API.</p>

<p>For example, when overriding the map method in the Map class:</p>

<pre><code>public void map(LongWritable key, Text value, Context context)
        throws IOException, InterruptedException {
    String line = value.toString();
    StringTokenizer tokenizer = new StringTokenizer(line);
    while (tokenizer.hasMoreTokens()) {
        word.set(tokenizer.nextToken());
        context.write(word, one);
    }
    // code that starts a new Job.
}
</code></pre>

<p>I am interested in finding out where context.write() ends up writing the data. So far I've run into:</p>

<pre><code>FileOutputFormat.getWorkOutputPath(context);
</code></pre>

<p>which gives me the following location on HDFS:</p>

<pre><code>hdfs://localhost:9000/tmp/outputs/1/_temporary/_attempt_201112221334_0001_m_000000_0
</code></pre>

<p>When I try to use it as input for another job, it gives me the following error:</p>

<pre><code>org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: hdfs://localhost:9000/tmp/outputs/1/_temporary/_attempt_201112221334_0001_m_000000_0
</code></pre>

<p><strong>Note:</strong> the job is started in the Mapper, so technically the temporary folder where the Mapper task is writing its output exists when the new job begins. Then again, it still says that the input path does not exist.</p>

<p>Any ideas as to where the temporary output is written? Or where I can find the output of a Map task during a job that has both a Map and a Reduce stage?</p>
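<p>For reference, the tokenization done inside the map method above can be exercised on its own, without the Hadoop types (Text, LongWritable, Context). The class and method names below are hypothetical, a minimal sketch of the same word-splitting logic:</p>

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Standalone sketch of the splitting performed in map(); hypothetical
// class/method names, no Hadoop dependencies.
public class MapTokenize {

    // Splits a line on whitespace, exactly as StringTokenizer does in the
    // mapper; in the real map method, each token would be emitted as
    // (word, 1) via context.write(word, one).
    static List<String> tokenize(String line) {
        List<String> words = new ArrayList<>();
        StringTokenizer tokenizer = new StringTokenizer(line);
        while (tokenizer.hasMoreTokens()) {
            words.add(tokenizer.nextToken());
        }
        return words;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("the quick brown fox"));
    }
}
```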