
**Hadoop: having more than one reducer in a pseudo-distributed environment?**

I am a newbie to Hadoop. I have successfully configured a Hadoop setup in pseudo-distributed mode. I want to run with multiple reducers using the option `-D mapred.reduce.tasks=2` (with hadoop-streaming). However, there is still only one reducer.

From what I have found via Google, I am fairly sure that `mapred.LocalJobRunner` limits the number of reducers to 1. Is there any workaround to get more reducers?

**My Hadoop configuration files:**

```
[admin@localhost string-count-hadoop]$ cat ~/hadoop-1.1.2/conf/core-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->
<configuration>
    <property>
        <name>fs.default.name</name>
        <value>hdfs://localhost:9000</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/home/admin/hadoop-data/tmp</value>
    </property>
</configuration>

[admin@localhost string-count-hadoop]$ cat ~/hadoop-1.1.2/conf/mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->
<configuration>
    <property>
        <name>mapred.job.tracker</name>
        <value>localhost:9001</value>
    </property>
</configuration>

[admin@localhost string-count-hadoop]$ cat ~/hadoop-1.1.2/conf/hdfs-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->
<configuration>
    <property>
        <name>dfs.name.dir</name>
        <value>/home/admin/hadoop-data/name</value>
    </property>
    <property>
        <name>dfs.data.dir</name>
        <value>/home/admin/hadoop-data/data</value>
    </property>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
</configuration>
```

**The way I start the job:**

```
[admin@localhost string-count-hadoop]$ cat hadoop-startjob.sh
#!/bin/sh

~/hadoop-1.1.2/bin/hadoop jar ~/hadoop-1.1.2/contrib/streaming/hadoop-streaming-1.1.2.jar \
    -D mapred.job.name=string-count \
    -D mapred.reduce.tasks=2 \
    -mapper  mapper \
    -file    mapper \
    -reducer reducer \
    -file    reducer \
    -input   $1 \
    -output  $2

[admin@localhost string-count-hadoop]$ ./hadoop-startjob.sh /z/programming/testdata/items_sequence /z/output
packageJobJar: [mapper, reducer] [] /tmp/streamjob837249979139287589.jar tmpDir=null
13/07/17 20:21:10 INFO util.NativeCodeLoader: Loaded the native-hadoop library
13/07/17 20:21:10 WARN snappy.LoadSnappy: Snappy native library not loaded
13/07/17 20:21:10 INFO mapred.FileInputFormat: Total input paths to process : 1
13/07/17 20:21:11 WARN mapred.LocalJobRunner: LocalJobRunner does not support symlinking into current working dir.
...
...
```
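For context on why `-D mapred.reduce.tasks=2` appears to be ignored: in Hadoop 1.x, a job handled by `LocalJobRunner` (local mode) has its reduce-task count clamped to at most one. Below is a minimal sketch of that clamping behavior, paraphrased from the Hadoop 1.x sources rather than quoted verbatim; the class `LocalJobRunnerSketch` and method `effectiveReduceTasks` are illustrative names, while `JobConf.getNumReduceTasks()` and `setNumReduceTasks()` are the real Hadoop 1.x mapred API.

```java
import org.apache.hadoop.mapred.JobConf;

// Paraphrased sketch of the reduce-task clamp inside Hadoop 1.x's
// org.apache.hadoop.mapred.LocalJobRunner (not the verbatim source).
public class LocalJobRunnerSketch {
    static int effectiveReduceTasks(JobConf job) {
        int numReduceTasks = job.getNumReduceTasks();
        // The local runner only supports 0 or 1 reducers: any other value
        // is silently forced back to 1, which is why a streaming job run
        // in local mode ignores -D mapred.reduce.tasks=2.
        if (numReduceTasks > 1 || numReduceTasks < 0) {
            numReduceTasks = 1;
            job.setNumReduceTasks(1);
        }
        return numReduceTasks;
    }
}
```

Consistent with this, the `WARN mapred.LocalJobRunner` line in the job output above shows that the job is being executed by the local runner rather than by the JobTracker configured at `localhost:9001`. So a plausible workaround is not to fight the local runner's one-reducer cap, but to make sure the submission actually reaches that JobTracker, for example by verifying that the `conf` directory containing this `mapred-site.xml` is the one the client loads.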
 
