Poor performance in multi-threaded C++ program
<p>I have a C++ program running on Linux in which a new thread is created to do some computationally expensive work independently of the main thread. (The work finishes by writing its results to files, which end up being very large.) However, I'm getting relatively poor performance.</p>
<p>If I implement the program in a straightforward way (without introducing other threads), it completes the task in roughly 2 hours. The multi-threaded program takes around 12 hours to do the same task (this was tested with only one thread spawned).</p>
<p>I've tried a couple of things, including <a href="https://www.kernel.org/doc/man-pages/online/pages/man3/pthread_setaffinity_np.3.html" rel="nofollow">pthread_setaffinity_np</a> to pin the thread to a single CPU (out of the 24 available on the server I'm using), as well as <a href="https://www.kernel.org/doc/man-pages/online/pages/man3/pthread_setschedparam.3.html" rel="nofollow">pthread_setschedparam</a> to set the scheduling policy (I've only tried SCHED_BATCH). So far the effect of both has been negligible.</p>
<p>Are there any general causes for this kind of problem?</p>
<p>EDIT: I've added some example code that I'm using, which is hopefully the most relevant part. The function process_job() is what actually does the computational work, but it would be too much to include here. Basically, it reads in two files of data and uses them to run queries against an in-memory graph database; the results are written to two large files over a period of hours.</p>
<p>EDIT part 2: Just to clarify, the problem is not that I want to use threads to speed up an algorithm I have. Rather, I want to run many instances of my algorithm simultaneously. I therefore expect the algorithm to run at a similar speed inside a thread as it does when I don't use threads at all.</p>
<p>EDIT part 3: Thanks for the suggestions, all. I'm currently running some tests to see which parts are slowing down, as some have suggested. Because the program takes a while to load and execute, it is taking time to get any results from the tests, so I apologize for the late responses. The main point I wanted to clarify is what could cause threading to make a program run slowly. From what I gather from the comments, it simply shouldn't. I'll post when I find a reasonable resolution. Thanks again.</p>
<p>(FINAL) EDIT part 4: It turns out that the problem was not related to threading after all. Describing it would be too cumbersome at this point (it involved the use of compiler optimization levels), but the ideas posted here were very useful and appreciated.</p>
<pre><code>struct sched_param sched_param = { sched_get_priority_min(SCHED_BATCH) };

int set_thread_to_core(const long tid, const int &amp;core_id) {
    cpu_set_t mask;
    CPU_ZERO(&amp;mask);
    CPU_SET(core_id, &amp;mask);
    return pthread_setaffinity_np(tid, sizeof(mask), &amp;mask);
}

void *worker_thread(void *arg) {
    job_data *temp = (job_data *)arg; // get the information for the task passed in
    ...
    long tid = pthread_self();
    int set_thread = set_thread_to_core(tid, slot_id); // assume slot_id is 1 (it is in the test case I run)
    sched_get_priority_min(SCHED_BATCH);
    pthread_setschedparam(tid, SCHED_BATCH, &amp;sched_param);
    int success = process_job(...); // this is where all the work actually happens
    pthread_exit(NULL);
}

int main(int argc, char* argv[]) {
    ...
    pthread_t temp;
    pthread_create(&amp;temp, NULL, worker_thread, (void *) &amp;jobs[i]); // jobs is a vector of a class type containing information for the task
    ...
    return 0;
}
</code></pre>
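<p>One way to test the expectation from EDIT part 2 (that the same work should take about the same time inside a thread) is to time an identical workload on the calling thread and on a freshly spawned one. The following is only a minimal sketch under stated assumptions: busy_work() is a hypothetical stand-in for process_job(), and Timed, run_inline(), run_in_thread() and thread_slowdown() are illustrative names that do not appear in the original program. It uses std::thread rather than raw pthreads purely for brevity.</p>

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <thread>

// Hypothetical stand-in for process_job(): a fixed amount of CPU-bound work.
std::uint64_t busy_work(std::uint64_t iterations) {
    std::uint64_t acc = 0;
    for (std::uint64_t i = 0; i < iterations; ++i) {
        acc += (i * i) % 7;
    }
    return acc;
}

// Result of one timed run: the computed value and the wall-clock time taken.
struct Timed {
    std::uint64_t result;
    double seconds;
};

// Run the workload directly on the calling thread.
Timed run_inline(std::uint64_t iterations) {
    auto start = std::chrono::steady_clock::now();
    std::uint64_t r = busy_work(iterations);
    std::chrono::duration<double> elapsed = std::chrono::steady_clock::now() - start;
    return {r, elapsed.count()};
}

// Run the same workload on a newly created thread and wait for it to finish.
// The measured time includes thread creation and join.
Timed run_in_thread(std::uint64_t iterations) {
    std::uint64_t r = 0;
    auto start = std::chrono::steady_clock::now();
    std::thread worker([&r, iterations] { r = busy_work(iterations); });
    worker.join();
    std::chrono::duration<double> elapsed = std::chrono::steady_clock::now() - start;
    return {r, elapsed.count()};
}

// Ratio of threaded to inline wall-clock time for the same workload.
// The ratio is not meaningful if the inline run is too short to measure.
double thread_slowdown(std::uint64_t iterations) {
    Timed inline_run = run_inline(iterations);
    Timed threaded_run = run_in_thread(iterations);
    assert(inline_run.result == threaded_run.result); // both runs did identical work
    return threaded_run.seconds / inline_run.seconds;
}
```

<p>Calling thread_slowdown() from main() with a large iteration count, and building both variants with the same flags (for example the same -O level, plus -pthread), gives a rough ratio. If that ratio is close to 1 for a stand-in like this while the real program still shows a large gap, the cause is more likely to be found in how the two builds or code paths differ than in threading itself, which would be consistent with what EDIT part 4 reports.</p>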