  1. Running a standalone Hadoop application on multiple CPU cores
    <p>My team built a Java application using the Hadoop libraries to transform a bunch of input files into useful output. Given the current load, a single multicore server will do fine for the coming year or so. We do not (yet) need a multi-server Hadoop cluster, but we chose to start this project "being prepared".</p> <p>When I run this app on the command line (or in Eclipse or NetBeans), I have not yet been able to convince it to use more than one map and/or reduce thread at a time. Because the tool is very CPU-intensive, this "single-threadedness" is my current bottleneck.</p> <p>When running it in the NetBeans profiler, I do see that the app starts several threads for various purposes, but only a single map/reduce task is running at any given moment.</p> <p>The input data consists of several input files, so Hadoop should at least be able to run one thread per input file concurrently during the map phase.</p> <p>What do I need to do to have at least 2 or even 4 active threads running (which should be possible for most of the processing time of this application)?</p> <p>I'm expecting this to be something very silly that I've overlooked.</p> <hr> <p>I just found this: <a href="https://issues.apache.org/jira/browse/MAPREDUCE-1367" rel="nofollow noreferrer">https://issues.apache.org/jira/browse/MAPREDUCE-1367</a>. It implements the feature I was looking for in Hadoop 0.21 and introduces the flag mapreduce.local.map.tasks.maximum to control it.</p> <p>For now, I've also found the solution described <a href="https://stackoverflow.com/questions/3546025/is-it-possible-to-run-hadoop-in-pseudo-distributed-operation-without-hdfs">here in this question</a>.</p>
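
The flag named in the question could be set in the job's configuration. This is a minimal sketch, assuming Hadoop 0.21 or later running in local (standalone) mode. The file name `mapred-site.xml` and the value `4` are assumptions chosen for illustration and do not come from the question.

```xml
<?xml version="1.0"?>
<!-- Assumed location: mapred-site.xml on the application's classpath. -->
<configuration>
  <property>
    <!-- Flag named in the question (introduced by MAPREDUCE-1367). -->
    <name>mapreduce.local.map.tasks.maximum</name>
    <!-- Assumed value: at most 4 map tasks at once, e.g. one per core. -->
    <value>4</value>
  </property>
</configuration>
```

The value would normally be matched to the number of available cores. As described in the question, this flag controls map tasks only, so the reduce side may still run one task at a time.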