If you have not set a dfs exclude file before, follow steps 1-3. Otherwise, start from step 4.

1. Shut down the NameNode.
2. Set `dfs.hosts.exclude` to point to an empty exclude file.
3. Restart the NameNode.
4. In the dfs exclude file, specify the nodes to remove using the full hostname, IP, or IP:port format (see the shell sketch at the end of this answer).
5. Do the same in `mapred.exclude`.
6. Execute `bin/hadoop dfsadmin -refreshNodes`. This forces the NameNode to reread the exclude file and start the decommissioning process.
7. Execute `bin/hadoop mradmin -refreshNodes`.
8. Monitor the NameNode and JobTracker web UIs and confirm that decommissioning is in progress; it can take a few seconds for them to update. Messages like `"Decommission complete for node XXXX.XXXX.X.XX:XXXXX"` will appear in the NameNode log files when decommissioning finishes, at which point you can remove the nodes from the cluster.
9. When the process has completed, the NameNode UI will list the DataNode as decommissioned and the JobTracker page will show the updated number of active nodes. Run `bin/hadoop dfsadmin -report` to verify. Stop the DataNode and TaskTracker processes on the excluded node(s).
10. If you do not plan to reintroduce the machine to the cluster, remove it from both the include and exclude files.

To add a node as a DataNode and TaskTracker, see the [Hadoop FAQ page](http://wiki.apache.org/hadoop/FAQ#I_have_a_new_node_I_want_to_add_to_a_running_Hadoop_cluster.3B_how_do_I_start_services_on_just_one_node.3F).

**EDIT: When a live node is to be removed from the cluster, what happens to the job?**

Jobs running on a node being decommissioned are affected: the tasks of those jobs scheduled on the node(s) are marked as KILLED_UNCLEAN (for map and reduce tasks) or KILLED (for job setup and cleanup tasks); see line 4633 in [JobTracker.java](http://svn.apache.org/viewvc/hadoop/common/branches/branch-1.1/src/mapred/org/apache/hadoop/mapred/JobTracker.java?view=markup) for details. The job is informed that the task failed, and most of the time the JobTracker will reschedule it elsewhere. After many repeated failures, however, it may instead decide to let the entire job fail or succeed; see line 2957 onwards in [JobInProgress.java](http://svn.apache.org/viewvc/hadoop/common/branches/branch-1.1/src/mapred/org/apache/hadoop/mapred/JobInProgress.java?view=markup).
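
For concreteness, here is a minimal shell sketch of steps 4-9 on a Hadoop 1.x cluster. The exclude-file paths and the hostname `datanode-07.example.com` are hypothetical placeholders; use the files your `dfs.hosts.exclude` and `mapred.hosts.exclude` properties actually point to.

```sh
# Hypothetical paths -- substitute the files named by dfs.hosts.exclude
# and mapred.hosts.exclude in your configuration.
DFS_EXCLUDE=/etc/hadoop/conf/dfs.exclude
MAPRED_EXCLUDE=/etc/hadoop/conf/mapred.exclude

# Steps 4-5: list the node(s) to decommission, one per line,
# as full hostname, IP, or IP:port.
echo "datanode-07.example.com" >> "$DFS_EXCLUDE"
echo "datanode-07.example.com" >> "$MAPRED_EXCLUDE"

# Steps 6-7: force the NameNode and JobTracker to reread their
# exclude files and begin decommissioning.
bin/hadoop dfsadmin -refreshNodes
bin/hadoop mradmin -refreshNodes

# Steps 8-9: check progress; the node's entry should eventually show
# "Decommission Status : Decommissioned" in the report.
bin/hadoop dfsadmin -report
```

Note that editing the exclude files alone changes nothing; the two `-refreshNodes` calls are what make the daemons pick up the change.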
 
