Troubles introducing SIMD commands into the code
<p>I have a basic calculation function that I apply to each item in an array. This function does more than just sum two vectors.</p>
<p>I wanted to work on multiple items from my array in parallel using SIMD commands.</p>
<p>I found examples of this kind too simple for my case (they don't include function calls): <a href="http://www.doc.ic.ac.uk/~nloriant/files/scfpsc-pc.pdf" rel="nofollow">http://www.doc.ic.ac.uk/~nloriant/files/scfpsc-pc.pdf</a></p>
<p>I tried using array notation, as described here: <a href="http://software.intel.com/sites/products/documentation/hpc/composerxe/en-us/cpp/mac/optaps/common/optaps_elem_functions.htm" rel="nofollow">http://software.intel.com/sites/products/documentation/hpc/composerxe/en-us/cpp/mac/optaps/common/optaps_elem_functions.htm</a></p>
<p>But this did not accelerate my code. I don't understand what I am doing wrong, and if I need to move to the more assembly-like style of SIMD, how do I introduce function calls there?</p>
<p>If anyone can help me or refer me to a good source for my needs, I'll be very thankful.</p>
<p>Thank you!</p>
<hr>
<p>Code example:</p>
<p>This is the basic function applied to each item in the array:</p>
<pre><code>float VarFlow::gauss_seidel_step(IplImage* u, int i, float h, float J11, float J12, float J13, float vi){

    // Recover the pixel coordinates from the flat index
    int x = i%u-&gt;width;
    int y = i/u-&gt;width;

    int start_y, end_y, start_x, end_x;
    int N_num = 0;

    start_y = y - 1;
    end_y = y + 1;
    start_x = x - 1;
    end_x = x + 1;

    float temp_u = 0;

    // Sum top neighbor
    if(start_y &gt; -1){
        temp_u += *((float*)(u-&gt;imageData + start_y*u-&gt;widthStep) + x);
        N_num++;
    }

    // Sum bottom neighbor
    if(end_y &lt; u-&gt;height){
        temp_u += *((float*)(u-&gt;imageData + end_y*u-&gt;widthStep) + x);
        N_num++;
    }

    // Sum left neighbor
    if(start_x &gt; -1){
        temp_u += *((float*)(u-&gt;imageData + y*u-&gt;widthStep) + start_x);
        N_num++;
    }

    // Sum right neighbor
    if(end_x &lt; u-&gt;width){
        temp_u += *((float*)(u-&gt;imageData + y*u-&gt;widthStep) + end_x);
        N_num++;
    }

    temp_u = temp_u - (h*h/alpha)*(J12*vi + J13);
    temp_u = temp_u / (N_num + (h*h/alpha)*J11);

    return temp_u;
}
</code></pre>
<p>I'd like to declare it with <code>__declspec(vector)</code> and call it like so:</p>
<pre><code>u_ptr[0:max_i:1] = gauss_seidel_step(imgU, vect[0:max_i:1], h, fxfx_ptr[0:max_i:1], fxfy_ptr[0:max_i:1], fxft_ptr[0:max_i:1], v_ptr[0:max_i:1]);
v_ptr[0:max_i:1] = gauss_seidel_step(imgV, vect[0:max_i:1], h, fyfy_ptr[0:max_i:1], fxfy_ptr[0:max_i:1], fyft_ptr[0:max_i:1], u_ptr[0:max_i:1]);
</code></pre>
<p>instead of using a for loop.</p>
<p>I'd be happy to get a direction on this (maybe a link to a similar example), but <strong>not</strong> a full solution.</p>
<p>Thanks!</p>
 
