
How to collect three arguments in mapper output. Is there any way?
I am new to the MapReduce and Hadoop concepts, so please help. I have about 100 files containing data in this format:

```
conf/iceis/GochenouerT01a:::John E. Gochenouer::Michael L. Tyler:::Voyeurism, Exhibitionism, and Privacy on the Internet.
```

which I am supposed to process via a MapReduce algorithm. In the output I want to display:

```
John E. Gochenoue   Voyeurism
John E. Gochenoue   Exhibitionism
John E. Gochenoue   and
John E. Gochenoue   privacy
John E. Gochenoue   on
John E. Gochenoue   the
John E. Gochenoue   internet
Michael L. Tyler    Voyeurism
Michael L. Tyler    Exhibitionism
Michael L. Tyler    and
Michael L. Tyler    privacy
Michael L. Tyler    on
Michael L. Tyler    the
Michael L. Tyler    internet
```

That is a single input line; there are 'n' lines like it, containing plenty of names and plenty of books.

So if I consider one document with 110 lines, can my mapper produce output like this?

```
John E. Gochenoue   Voyeurism       1
John E. Gochenoue   Exhibitionism   3
Michael L. Tyler    on              7
```

That is, the mapper should emit the name and the word followed by the number of occurrences of that word in the document, and finally, after the reduce, it should display the name, each word recorded against that name, and the combined frequency of the word across the 'n' documents.

I know about output.collect(), but it takes only two arguments:

```
output.collect(arg0, arg1)
```

Is there any method to collect three values, i.e. name, word, and occurrence count?

The following is my code:

```java
public static class Map extends MapReduceBase implements
        Mapper<LongWritable, Text, Text, IntWritable> {

    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();

    public void map(LongWritable key, Text value,
            OutputCollector<Text, IntWritable> output, Reporter reporter)
            throws IOException {
        String line = value.toString();
        /*
         * StringTokenizer tokenizer = new StringTokenizer(line);
         * while (tokenizer.hasMoreTokens()) {
         *     word.set(tokenizer.nextToken());
         *     output.collect(word, one);
         * }
         */
        String strToSplit[] = line.split(":::");
        String end = strToSplit[strToSplit.length - 1];
        String[] names = strToSplit[1].split("::");
        for (String name : names) {
            StringTokenizer tokens = new StringTokenizer(end, " ");
            while (tokens.hasMoreElements()) {
                output.collect(arg0, arg1); // <-- how do I collect name, word, and count here?
                System.out.println(tokens.nextElement());
            }
        }
    }
}

public static class Reduce extends MapReduceBase implements
        Reducer<Text, IntWritable, Text, IntWritable> {

    public void reduce(Text key, Iterator<IntWritable> values,
            OutputCollector<Text, IntWritable> output, Reporter reporter)
            throws IOException {
        int sum = 0;
        while (values.hasNext()) {
            sum += values.next().get();
        }
        output.collect(key, new IntWritable(sum));
    }
}

public static void main(String[] args) throws Exception {
    JobConf conf = new JobConf(example.class);
    conf.setJobName("wordcount");

    conf.setOutputKeyClass(Text.class);
    conf.setOutputValueClass(IntWritable.class);

    conf.setMapperClass(Map.class);
    conf.setCombinerClass(Reduce.class);
    conf.setReducerClass(Reduce.class);

    conf.setInputFormat(TextInputFormat.class);
    conf.setOutputFormat(TextOutputFormat.class);

    FileInputFormat.setInputPaths(conf, "/home/vishal/workspace/hw3data");
    FileOutputFormat.setOutputPath(conf, new Path("/home/vishal/nmnmnmnmnm"));

    JobClient.runJob(conf);
}
```
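A common pattern for this situation (a sketch I am adding, not part of the original question): since collect() takes exactly one key and one value, pack the name and the word into a single composite Text key, e.g. name + "\t" + word, and keep IntWritable(1) as the value. The reducer then sums per composite key unchanged, and each output line naturally reads name, word, count. The class and method names below (CompositeKeyDemo, compositeKeys) are hypothetical; the sketch is dependency-free so the key construction can be tested without a Hadoop cluster.

```java
import java.util.ArrayList;
import java.util.List;

public class CompositeKeyDemo {

    // Build the composite "name<TAB>word" keys a mapper would emit for one
    // input line of the form id:::NameA::NameB:::Title words here.
    static List<String> compositeKeys(String line) {
        String[] parts = line.split(":::");
        String[] names = parts[1].split("::");
        String title = parts[parts.length - 1];
        List<String> keys = new ArrayList<>();
        for (String name : names) {
            for (String word : title.split("\\s+")) {
                // In the real mapper this key would be wrapped in a Text and
                // emitted with an IntWritable(1) value.
                keys.add(name + "\t" + word);
            }
        }
        return keys;
    }

    public static void main(String[] args) {
        String line = "conf/iceis/GochenouerT01a:::John E. Gochenouer::"
                + "Michael L. Tyler:::Voyeurism, Exhibitionism, and Privacy on the Internet.";
        for (String key : compositeKeys(line)) {
            System.out.println(key);
        }
    }
}
```

In the mapper from the question this would amount to something like `output.collect(new Text(name + "\t" + tokens.nextToken()), one);` (hypothetical, untested against the rest of the job); since tab is the default key/value separator of TextOutputFormat, the final output already lines up as three columns.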
 
