What's the best way to have multiple outputs for a job using the stable Hadoop version?
<p>I have a MapReduce job whose role is to split my input file into two files according to a given criterion. I am currently using Hadoop r0.20.203 because it is the current stable version.<br> This version offers two APIs:</p> <ul> <li>The old/deprecated one (<a href="http://hadoop.apache.org/common/docs/r0.20.203.0/api/org/apache/hadoop/mapred/package-summary.html" rel="nofollow">org.apache.hadoop.mapred</a>)</li> <li>The new one (<a href="http://hadoop.apache.org/common/docs/r0.20.203.0/api/org/apache/hadoop/mapreduce/package-summary.html" rel="nofollow">org.apache.hadoop.mapreduce</a>)</li> </ul> <p>As you can imagine, I am using the <em>new API</em>, and my problem is that Hadoop r0.20.203 does not offer any <code>MultipleOutput</code> formats in the new API.<br> Hadoop 0.20.203 still offers <code>MultipleTextOutputFormat</code> and <code>MultipleOutputs</code> (both of which are suitable for my case) in the <em>old API</em>. Moreover, the newer Hadoop <em>0.22</em> offers <code>MultipleOutputs</code> in the new API.</p> <p>I see four solutions to my problem:</p> <ul> <li>Switch to Hadoop 0.22. The problem with this solution is that this version may not be deployed on the clusters I'm using because of its beta status.</li> <li>Use the old API for this specific job and the new one for the others. I have seen that the old API has been undeprecated in Hadoop 1.0.0, so can it still be used? If I need to switch to a newer Hadoop version later, I would have only this job to rewrite.</li> <li>Use the old API for all my jobs to avoid compatibility/consistency problems. Do you think it could harm the evolution of my program, especially if I need to switch to a newer Hadoop version later?</li> <li>Forget about multiple outputs and find another solution.</li> </ul> <p>What would you do if you were me?</p>
 
