<p>There are several ways to use external JARs with your MapReduce code:</p>
<ol>
<li><p>Include the referenced JAR in the <code>lib</code> subdirectory of the submittable JAR. The job unpacks the JAR from this <code>lib</code> subdirectory into the jobcache on the respective TaskTracker nodes and points your tasks to this directory, which makes the JAR available to your code. If the JARs are small, change often, and are job-specific, this is the preferred method. This is what <a href="https://stackoverflow.com/a/16825894/1150329">@clement</a> suggested in his answer.</p></li>
<li><p>Install the JAR on the cluster nodes. The easiest way is to place the JAR into the <code>$HADOOP_HOME/lib</code> directory, as everything in this directory is included when a Hadoop daemon starts. Note that the daemons must be restarted for this to take effect.</p></li>
<li><p>The TaskTrackers are the processes that use the external JAR, so you can provide it by modifying the <code>HADOOP_TASKTRACKER_OPTS</code> option in the <code>hadoop-env.sh</code> configuration file and making it point to the JAR. The JAR needs to be present at the same path on every node where a TaskTracker runs.</p></li>
<li><p>Include the JAR in the <code>-libjars</code> command line option of the <code>hadoop jar …</code> command. The JAR is placed in the distributed cache and made available to all of the job's task attempts. Your MapReduce code must use <code>GenericOptionsParser</code>. For more details read <a href="http://grepalex.com/2013/02/25/hadoop-libjars/" rel="nofollow noreferrer">this blog post</a>.</p></li>
</ol>
<p>Comparison:</p>
<ul>
<li>#1 is a legacy method but discouraged because it has a large negative performance cost.</li>
<li>#2 and #3 are good for private clusters but pretty lame practice in general, as you cannot expect end users to do that.</li>
<li>#4 is the most recommended option.</li>
</ul>
<p>For background, read the <a href="http://blog.cloudera.com/blog/2011/01/how-to-include-third-party-libraries-in-your-map-reduce-job/" rel="nofollow noreferrer">main post</a> from Cloudera.</p>
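<p>To make option #4 concrete, here is a minimal sketch of how the <code>hadoop jar</code> command line is laid out when <code>-libjars</code> is used. It only assembles and prints the command; it does not call any Hadoop API. The class <code>LibJarsCommand</code>, the JAR names, the main class <code>com.example.MyJob</code>, and the input/output paths are all hypothetical placeholders. The points it illustrates are that <code>-libjars</code> takes a comma-separated list and goes after the main class but before the application's own arguments.</p>

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: assembles a "hadoop jar" command line with -libjars.
public class LibJarsCommand {

    // Layout: hadoop jar <jobJar> <mainClass> [-libjars a.jar,b.jar] <appArgs...>
    public static List<String> build(String jobJar, String mainClass,
                                     List<String> libJars, List<String> appArgs) {
        List<String> cmd = new ArrayList<>();
        cmd.add("hadoop");
        cmd.add("jar");
        cmd.add(jobJar);
        cmd.add(mainClass);
        if (!libJars.isEmpty()) {
            // Generic options come before the application arguments;
            // multiple JARs are joined with commas, not spaces.
            cmd.add("-libjars");
            cmd.add(String.join(",", libJars));
        }
        cmd.addAll(appArgs);
        return cmd;
    }

    public static void main(String[] args) {
        // All names below are placeholders, not taken from the answer.
        List<String> cmd = build(
                "myjob.jar",
                "com.example.MyJob",
                Arrays.asList("lib/parser.jar", "lib/codec.jar"),
                Arrays.asList("/input", "/output"));
        System.out.println(String.join(" ", cmd));
    }
}
```

<p>As noted above, the option is only recognized if the driver passes its arguments through <code>GenericOptionsParser</code>; otherwise <code>-libjars</code> and the JAR list would reach your code as ordinary application arguments.</p>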