Hadoop Map Reduce For Google web graph
We have been given as an assignment the task of creating MapReduce functions that, for each node n in the Google web graph, output the list of nodes you can reach from node n in 3 hops. (The actual data can be found here: http://snap.stanford.edu/data/web-Google.html) Here is an example of how the items in the list look:

```
1 2
1 3
2 4
3 4
3 5
4 1
4 5
4 6
5 6
```

From the above, an example graph is this: ![Graph example](https://i.stack.imgur.com/RmKWu.png)

In the simplified example above, the 3-hop paths from node 1 are [1 -> 2 -> 4 -> 1], [1 -> 2 -> 4 -> 5], [1 -> 2 -> 4 -> 6], [1 -> 3 -> 4 -> 1], [1 -> 3 -> 4 -> 5], [1 -> 3 -> 4 -> 6] and [1 -> 3 -> 5 -> 6], and thus MapReduce should output for node 1 the vertices 1, 5, 6 ((a) each vertex must be counted only once, and (b) we include the current vertex only when there is a circular path of length 3, as in [1 -> 2 -> 4 -> 1] and [1 -> 3 -> 4 -> 1]).

I am very lost, because I believe this requires knowledge of graph theory and algorithms, and we haven't been taught anything related to that.

I would greatly appreciate it if somebody could point me in the right direction to start. (I've looked into shortest-path theory and such, but I'm not sure it's going to be useful for this particular exercise.)

Thanks in advance, and have a good holiday season.

**EDIT**

I am trying to create the adjacency list, but while I expect the output to be `vertexID  node1 node2 node3 node4 ...`, I see that in the output file my reducer splits the list for each vertex ID into groups of three. For example, if I have a vertex A that connects to Z,X,C,V,B,N,M,G,H,J,K,L, it outputs this as:

A Z,X,C

A V,B,N

A M,G,H

A J,K,L

Below are my driver, mapper and reducer:

```java
public class AdjacentsListDriver extends Configured implements Tool {

    @Override
    public int run(String[] args) throws Exception {
        Configuration conf = getConf();
        Job job = Job.getInstance(conf);
        job.setJobName("Test driver");
        job.setJarByClass(AdjacentsListDriver.class);

        String[] arg0 = new GenericOptionsParser(conf, args).getRemainingArgs();
        if (arg0.length != 2) {
            System.err.println("Usage: hadoop jar <my_jar> <input_dir> <output_dir>");
            System.exit(1);
        }

        Path in = new Path(arg0[0]);
        Path out = new Path(arg0[1]);

        FileInputFormat.setInputPaths(job, in);
        FileOutputFormat.setOutputPath(job, out);

        job.setMapperClass(ListMapper.class);
        job.setReducerClass(ListReducer.class);

        job.setInputFormatClass(TextInputFormat.class);
        job.setOutputFormatClass(TextOutputFormat.class);

        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);

        job.waitForCompletion(true);
        return 0;
    }

    public static void main(String[] args) throws Exception {
        int res = ToolRunner.run(new Configuration(), new AdjacentsListDriver(), args);
        System.exit(res);
    }
}

/**
 * @author George
 * Theoretically this takes a key (vertexID) and maps all nodes that are
 * connected to it in one hop...
 */
public class ListMapper extends Mapper<LongWritable, Text, Text, Text> {

    private Text vertexID = new Text();
    //private LongWritable vertice = new LongWritable(0);
    private Text vertice = new Text();

    public void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        String line = value.toString();
        StringTokenizer itr = new StringTokenizer(line, "\n");
        StringTokenizer itrInside;
        //vertice = new LongWritable(Long.valueOf(value.toString()).longValue());

        while (itr.hasMoreTokens()) {
            if (itr.countTokens() > 2) {
                // ignore first line??
            } else {
                // note: this tokenizes the tokenizer's own String representation,
                // not the current line of input
                itrInside = new StringTokenizer(itr.toString());
                vertexID.set(itr.nextToken());
                while (itrInside.hasMoreTokens()) {
                    vertice.set(itrInside.nextToken());
                    // note: this emits the whole input line as the value, not vertice
                    context.write(vertexID, value);
                }
            }
        }
    }
}

public class ListReducer extends Reducer<Text, Text, Text, Text> {

    public void reduce(Text key, Iterable<Text> values, Context context)
            throws IOException, InterruptedException {
        String vertices = "";
        for (Text value : values) {
            if (vertices == "") // note: reference comparison; vertices.isEmpty() is the safe check
                vertices += value.toString();
            else
                vertices = vertices + "," + value.toString();
        }
        Text value = new Text();
        value.set(vertices);
        context.write(key, value);
    }
}
```
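As a side note on the input parsing step the mapper attempts: a minimal sketch of reading one edge line, assuming (as is the case for the SNAP `web-Google.txt` file) that header lines start with `#` and data lines are whitespace-separated `FromNodeId ToNodeId` pairs. The class and method names here are mine, not from my job:

```java
import java.util.Arrays;
import java.util.Optional;
import java.util.StringTokenizer;

public class EdgeLineParser {
    // Parses one "src dst" line; returns empty for header/comment lines.
    static Optional<int[]> parse(String line) {
        if (line.startsWith("#")) {
            return Optional.empty(); // SNAP files begin with '#'-prefixed comments
        }
        StringTokenizer t = new StringTokenizer(line); // splits on spaces and tabs
        return Optional.of(new int[] {
            Integer.parseInt(t.nextToken()),
            Integer.parseInt(t.nextToken())
        });
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(parse("1\t2").get())); // prints [1, 2]
        System.out.println(parse("# FromNodeId\tToNodeId").isPresent()); // prints false
    }
}
```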
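For anyone who wants to check the expected answer on the toy graph independently of Hadoop: the 3-hop set can be computed by expanding a frontier three times over the adjacency lists. This is just a plain-Java cross-check of the example above, not the MapReduce solution; all names are illustrative:

```java
import java.util.*;

public class ThreeHopSketch {
    // Returns the set of vertices reachable from `start` by a walk of exactly 3 hops.
    static Set<Integer> threeHops(Map<Integer, List<Integer>> adj, int start) {
        Set<Integer> frontier = new HashSet<>(Collections.singleton(start));
        for (int hop = 0; hop < 3; hop++) {
            Set<Integer> next = new HashSet<>();
            for (int u : frontier) {
                next.addAll(adj.getOrDefault(u, Collections.emptyList()));
            }
            frontier = next; // advance one hop
        }
        return frontier;
    }

    public static void main(String[] args) {
        // The toy edge list from the question.
        int[][] edges = {{1,2},{1,3},{2,4},{3,4},{3,5},{4,1},{4,5},{4,6},{5,6}};
        Map<Integer, List<Integer>> adj = new HashMap<>();
        for (int[] e : edges) {
            adj.computeIfAbsent(e[0], k -> new ArrayList<>()).add(e[1]);
        }
        System.out.println(threeHops(adj, 1)); // prints [1, 5, 6]
    }
}
```

This matches the expected output for node 1 (vertices 1, 5, 6, with 1 included because of the length-3 cycles). In MapReduce terms, each frontier expansion corresponds to one join of the current reachable set against the adjacency list, so the full job would be the adjacency-list pass followed by three such rounds.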
 
