Note that there are some explanatory texts on larger screens.

plurals
  1. PO
    text
    copied!<p>(The answer below refers to the initial code in the question, which since then was improved with applying these suggestions)</p> <hr> <p>You need to read more about how to use OpenMP. The specification is available at <a href="http://www.openmp.org" rel="nofollow">http://www.openmp.org</a>; and there are links to tutorials and other resources.</p> <p>I will point out some issues in your code and give recommendations how to fix those.</p> <pre><code> float *d = malloc(3*sizeof(float)); float diff; </code></pre> <p><code>d</code> is used as a temporary variable, so should be marked as <code>private</code> in <code>#pragma omp parallel for</code> (see below) to avoid data races. Meanwhile, instead of dynamic allocation, I would just use 3 separate floats. <code>diff</code> also holds a temporary value, so should also be <code>private</code>.</p> <pre><code> #pragma omp parallel for(i=0;i&lt;ncft;i+=jump){ #pragma omp parallel for(j=0;j&lt;ntri2;j++){ </code></pre> <p>You created a parallel region where each thread executes the whole loop (because the region does not contain any work sharing constructs), and inside it you created a nested region with a new(!) set of threads, also each executing the whole inner loop. It adds lots of overhead and unnecessary computation to your program. What you need is <code>#pragma omp parallel for</code>, and only applied to the outer loop.</p> <pre><code> d[0] = vG1[i][0] - vG2[j][0]; d[1] = vG1[i][1] - vG2[j][1]; d[2] = vG1[i][2] - vG2[j][2]; diff = sqrt(pow(d[0],2) + pow(d[1],2) + pow(d[2],2)); </code></pre> <p>Not related to parallelism, but why calling <code>pow</code> just to compute squares? A good old multiplication would likely be both simpler to read and faster.</p> <pre><code> if(j==0) dist[k] = diff; else if(diff&lt;dist[k]) dist[k] = diff; </code></pre> <p>Since the action is the same (<code>dist[k]=diff;</code>), the code can be simplified by combining two conditions with <code>||</code> (logical or).</p> <pre><code> } avg += dist[k]; if(dist[k]&gt;max) max = dist[k]; </code></pre> <p>Here you compute aggregate values across the outer loop. In OpenMP, this is done with <code>reduction</code> clause of <code>#pragma omp for</code>.</p> <pre><code> k++; } </code></pre> <p>Currently, you increment <code>k</code> at each iteration, thus creating an unnecessary dependency between iterations that leads to a data race in parallel code. According to your code, <code>k</code> is just a convenient "alias" for <code>i/jump</code> - so just assign it as such at the beginning of the iteration, and make <code>private</code>.</p>
 

Querying!

 
Guidance

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload