Note that there are some explanatory texts on larger screens.

plurals
  1. POcrawl websites out of java web application without using bin/nutch
    primarykey
    data
    text
    <p>i am trying to using nutch (1.1) without bin/nutch from my (java) mojarra 2.0.2 webapp... i am searching at google for examples, but there are no examples how i can realize this :/ ... i get an exception and the job fails :/ (i think of cause something with hadoop)... here is my code:</p> <pre> public void run() throws Exception { final String[] args = new String[] { String.format("%s%s%s%s", JSFUtils.getWebAppRoot(), "nutch", File.separator, DIRECTORY_URLS), "-dir", String.format("%s%s%s%s", JSFUtils.getWebAppRoot(), "nutch", File.separator, DIRECTORY_CRAWL), "-threads", this.preferences.get("threads"), "-depth", this.preferences.get("depth"), "-topN", this.preferences.get("topN"), "-solr", this.preferences.get("solr") }; Crawl.main(args); } </pre> <p>and a part of the logging:</p> <pre> 10/05/17 10:42:54 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId= 10/05/17 10:42:54 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same. 10/05/17 10:42:54 INFO mapred.FileInputFormat: Total input paths to process : 1 10/05/17 10:42:54 INFO mapred.JobClient: Running job: job_local_0001 10/05/17 10:42:54 INFO mapred.FileInputFormat: Total input paths to process : 1 10/05/17 10:42:55 INFO mapred.MapTask: numReduceTasks: 1 10/05/17 10:42:55 INFO mapred.MapTask: io.sort.mb = 100 java.io.IOException: Job failed! at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1232) at org.apache.nutch.crawl.Injector.inject(Injector.java:211) at org.apache.nutch.crawl.Crawl.main(Crawl.java:124) at lan.localhost.process.NutchCrawling.run(NutchCrawling.java:108) at lan.localhost.main.Index.indexing(Index.java:71) at lan.localhost.bean.FeedingBean.actionStart(FeedingBean.java:25) .... </pre> <p>can someone help me or tell me how i can crawling from a java application? i have increased the Xms to 256m and Xmx to 768m, but nothing changed...</p> <p>best regards marcel</p>
    singulars
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    plurals
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    1. This table or related slice is empty.
 

Querying!

 
Guidance

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload