OpenMP and cores/threads
<p>My CPU is a Core i3 330M with 2 cores and 4 threads. When I run <code>cat /proc/cpuinfo</code> in my terminal, it looks as if I have 4 CPUs. The OpenMP function <code>omp_get_num_procs()</code> also returns 4.</p>
<p>I have a simple C++ vector class: a fixed-size array of doubles that does not use expression templates. I have carefully parallelized all the methods of the class and I get the "expected" speedup.</p>
<p>The question is: can I predict the expected speedup in such a simple case? For instance, if I add two vectors without a parallelized for-loop, I measure some run time (using the shell <code>time</code> command). If I then use OpenMP, should that time be divided by 2 or by 4, according to the number of cores or threads? I emphasize that I am asking only about this particular simple problem, where there is no interdependence in the data and everything is linear (vector addition).</p>
<p>Here is some code:</p>
<pre><code>Vector Vector::operator+(const Vector&amp; rhs) const
{
    assert(m_size == rhs.m_size);
    Vector result(m_size);
    #pragma omp parallel for schedule(static)
    for (unsigned int i = 0; i &lt; m_size; i++)
        result.m_data[i] = m_data[i] + rhs.m_data[i];
    return result;
}
</code></pre>
<p>I have already read this post: <a href="https://stackoverflow.com/questions/4717251/openmp-thread-mapping-to-physical-cores">OpenMP thread mapping to physical cores</a>.</p>
<p>I hope that somebody can tell me more about how OpenMP gets the work done in this simple case. I should say that I am a beginner in parallel computing.</p>
<p>Thanks!</p>
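<p>For reference, here is a minimal sketch of how the serial and parallel timings could be compared inside the program instead of with the shell <code>time</code> command. It uses <code>std::vector&lt;double&gt;</code> as a stand-in for my <code>Vector</code> class, and the names <code>add_vectors</code> and <code>time_additions</code> are hypothetical helpers made up for this illustration. It measures wall-clock time, since the "user" time reported by <code>time</code> is summed over all threads. If the code is compiled without OpenMP support (for example without <code>-fopenmp</code> on GCC), the pragma is ignored and both variants run serially.</p>

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of the Vector class above): element-wise sum,
// parallelized with the same pragma as Vector::operator+.
// The if(parallel) clause switches the parallel region on or off at run time.
std::vector<double> add_vectors(const std::vector<double>& a,
                                const std::vector<double>& b,
                                bool parallel)
{
    assert(a.size() == b.size());
    std::vector<double> result(a.size());
    const long n = static_cast<long>(a.size());
    #pragma omp parallel for schedule(static) if(parallel)
    for (long i = 0; i < n; i++)
        result[i] = a[i] + b[i];
    return result;
}

// Hypothetical helper: wall-clock seconds spent on `repeats` additions.
double time_additions(const std::vector<double>& a,
                      const std::vector<double>& b,
                      bool parallel, int repeats)
{
    double checksum = 0.0;  // keeps the compiler from discarding the work
    const auto start = std::chrono::steady_clock::now();
    for (int r = 0; r < repeats; r++) {
        const std::vector<double> sum = add_vectors(a, b, parallel);
        if (!sum.empty())
            checksum += sum[0];
    }
    const auto stop = std::chrono::steady_clock::now();
    volatile double sink = checksum;
    (void)sink;
    return std::chrono::duration<double>(stop - start).count();
}
```

<p>The measured speedup would then be <code>time_additions(a, b, false, repeats) / time_additions(a, b, true, repeats)</code> for two large vectors <code>a</code> and <code>b</code>, which can be compared with the factor of 2 or 4 I am asking about.</p>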