Difference between two omp for loops
<p>I'm just starting to use OpenMP and am writing a function that divides an array into <code>numBlocks</code> blocks and computes one histogram per block by inspecting the <code>blockSize</code> elements of each block. In the code I'm providing, the histogram records the divisibility of the elements in the block by the integers <code>1</code> to <code>numBuckets</code>.</p>
<p>In the first <code>omp for</code> loop, each iteration handles one whole block, so the blocks are distributed across the threads:</p>
<pre><code>#pragma omp for schedule(static)
for(uint blockNum = 0; blockNum &lt; numBlocks; blockNum++){
    for(uint blockSubIdx = 0; blockSubIdx &lt; blockSize; blockSubIdx++){
        uint idx = blockNum * blockSize + blockSubIdx;
        // Compute histogram here by examining array[idx]
    }
}
</code></pre>
<p>In the second implementation, I ask for chunks of <code>blockSize</code> iterations per thread, having previously asserted that <code>array.size() == numBlocks * blockSize</code>:</p>
<pre><code>#pragma omp for schedule(static, blockSize)
for(uint idx = 0; idx &lt; array.size(); idx++){
    uint blockNum = idx / blockSize;
    // Compute histogram here by examining array[idx]
}
</code></pre>
<p>This second method does not work correctly if I increase the number of threads (using <code>export OMP_NUM_THREADS</code>) above some threshold (78 on my compute node): the resulting histogram values do not match those from a serial computation.</p>
<p>According to <a href="https://computing.llnl.gov/tutorials/openMP/" rel="nofollow">https://computing.llnl.gov/tutorials/openMP/</a>, it looks like the chunks of size <code>blockSize</code> in the second method are contiguous, so it's not clear to me why it fails. It seems like multiple threads are writing to the same index in the histogram, though. Is there a subtlety that I'm missing?</p>
<p>Here is a gist of the whole source code: <a href="https://gist.github.com/anonymous/5391777" rel="nofollow">https://gist.github.com/anonymous/5391777</a>.</p>
<p><strong>UPDATE</strong>: The thread threshold does show some dependence on <code>array.size()</code>. It's not the particular value of the threshold that I'm trying to understand, but rather the existence of the threshold.</p>
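<p>For concreteness, here is a minimal self-contained sketch of the first variant (one loop iteration per block). It is not the code from the gist: the function name <code>blockHistograms</code>, the <code>unsigned</code> element type, the nested-vector layout of the histograms, and the combined <code>parallel for</code> directive are all assumptions made for illustration.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the gist): returns one histogram per block.
// Bucket b-1 of a block's histogram counts the elements of that block
// that are divisible by b, for b = 1 .. numBuckets.
std::vector<std::vector<unsigned>> blockHistograms(
        const std::vector<unsigned>& array,
        std::size_t numBlocks,
        std::size_t blockSize,
        unsigned numBuckets) {
    // Same precondition as in the question.
    assert(array.size() == numBlocks * blockSize);

    std::vector<std::vector<unsigned>> hist(
        numBlocks, std::vector<unsigned>(numBuckets, 0));

    // Each iteration owns one whole block, so only one thread ever
    // writes to hist[blockNum]. Without OpenMP enabled the pragma is
    // ignored and the loop simply runs serially.
    #pragma omp parallel for schedule(static)
    for (std::size_t blockNum = 0; blockNum < numBlocks; blockNum++) {
        for (std::size_t blockSubIdx = 0; blockSubIdx < blockSize; blockSubIdx++) {
            std::size_t idx = blockNum * blockSize + blockSubIdx;
            for (unsigned b = 1; b <= numBuckets; b++) {
                if (array[idx] % b == 0) {
                    hist[blockNum][b - 1]++;
                }
            }
        }
    }
    return hist;
}
```

<p>Calling <code>blockHistograms(array, numBlocks, blockSize, numBuckets)</code> once with OpenMP enabled and once without gives a serial baseline to compare against.</p>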
 
