
Parsing JSON input in Hadoop Java
<p>My input data is in HDFS. I am trying to do a simple word count, with one slight difference: the data is in JSON format, so each line of data looks like this:</p>
<pre><code>{"author":"foo", "text": "hello"}
{"author":"foo123", "text": "hello world"}
{"author":"foo234", "text": "hello this world"}
</code></pre>
<p>I only want to count the words in the "text" part.</p>
<p>How do I do this?</p>
<p>I have tried the following variant so far:</p>
<pre><code>public static class TokenCounterMapper
        extends Mapper&lt;Object, Text, Text, IntWritable&gt; {
    private static final Log log = LogFactory.getLog(TokenCounterMapper.class);
    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();

    public void map(Object key, Text value, Context context)
            throws IOException, InterruptedException {
        try {
            JSONObject jsn = new JSONObject(value.toString());

            //StringTokenizer itr = new StringTokenizer(value.toString());
            String text = (String) jsn.get("text");
            log.info("Logging data");
            log.info(text);
            StringTokenizer itr = new StringTokenizer(text);
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, one);
            }
        } catch (JSONException e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        }
    }
}
</code></pre>
<p>But I am getting this error:</p>
<pre><code>Error: java.lang.ClassNotFoundException: org.json.JSONException
    at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:247)
    at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:820)
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:865)
    at org.apache.hadoop.mapreduce.JobContext.getMapperClass(JobContext.java:199)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:719)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1093)
    at org.apache.hadoop.mapred.Child.main(Child.java:249)
</code></pre>
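The trace above fails while the task JVM is loading the mapper class, and the missing class is org.json.JSONException. That usually suggests the org.json jar was available at compile time but was not shipped to the task's classpath, rather than a problem in the map logic itself. One way to confirm this is a small classpath probe. The sketch below uses only the JDK; the class name ClasspathCheck and the method isLoadable are hypothetical helpers introduced for illustration, not part of Hadoop or org.json.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical diagnostic helper (not part of Hadoop or org.json):
// reports whether a class can be found on the current classpath.
final class ClasspathCheck {

    // Returns true if the named class can be located by this class's loader.
    // The class is looked up without running its static initializers.
    static boolean isLoadable(String className) {
        try {
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        } catch (LinkageError e) {
            // The class was found but one of its own dependencies is missing or broken.
            return false;
        }
    }

    public static void main(String[] args) {
        // The two org.json classes the mapper in the question refers to.
        List<String> required = Arrays.asList(
                "org.json.JSONObject",
                "org.json.JSONException");
        for (String name : required) {
            System.out.println(name + " loadable: " + isLoadable(name));
        }
    }
}
```

Running this with the same classpath as the job client shows whether the jar is visible there; calling isLoadable from the mapper's setup method and logging the result would show the same for the task JVM. If the classes are missing on the task side, the usual remedies are to bundle the dependency inside the job jar (for example under its lib/ directory) or to pass it with the -libjars generic option, which only takes effect when the driver parses generic options, for instance by running through ToolRunner.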