  1. POFill with/without intrinsics C++
    <p>I'm studying the impact of intrinsic functions on performance, and I'm a little confused: they seem to have no impact at all! I'm trying to fill an array of doubles with two different functions and I see no difference. I allocated the array with a call to _aligned_malloc with the alignment parameter set to 8. I use Visual Studio 2008 and I compiled in Release mode, both with and without optimizations (/O2 and /Od) and both with and without intrinsics (/Oi), covering all four combinations. The two versions follow:</p>
<pre><code>#ifdef _NO_INTRIN

void my_fill(double* vett, double value, int N)
{
    double* last = vett + N;
    while( vett != last)
    {
        *vett++ = value;
    }
}

#else

void my_fill(double* vett, double value, int N)
{
    double* last = vett + N;

    // set "classically" unaligned data, if any
    while( (0xF &amp; (uintptr_t)vett) &amp;&amp; vett != last )
        *vett++ = value;

    __m128d* vett_ = (__m128d*)vett;
    uintptr_t fff0 = ~0 &lt;&lt; 4;
    // round address to nearest aligned data setting to zero least significant 4 bits
    __m128d* last_ = (__m128d*)( fff0 &amp; (uintptr_t)last);

    // process until second-last element to manage odd values of N
    for( ; vett_ &lt; last_-1; vett_++ )
    {
        *vett_ = _mm_set1_pd(value);
    }

    vett = (double*)vett_;
    while(vett != last)
        *vett++ = value;
}

#endif
</code></pre>
    <p>One last detail: I aligned my data to 8 bytes rather than 16 because I plan to run this function from multiple threads on different portions of the array. Even with the data aligned to 16 bytes, I couldn't be sure that every portion of the array would be aligned (e.g. 303 elements, 3 threads, 101 elements per thread: the 1st portion is aligned to 16 bytes, the 2nd portion starts at vett+101*8 ==> unaligned). That's why I tried to implement an alignment-agnostic function. I tried filling an array of 1M elements on my Intel Atom CPU N570 @ 1.66 GHz and I always got the same execution time. So... what's wrong with my approach? Why do I see no difference? Thank you all in advance.</p>
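The alignment arithmetic in the 303-element example can be checked with a small sketch. The helpers `chunk_start` and `scalar_prefix` are hypothetical names introduced only for illustration; they are not part of the question's code. The sketch assumes `sizeof(double)` is 8 and works on plain integer addresses, so it needs no intrinsics.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not from the question): byte address at which
// thread `t` starts when every thread receives `per_thread` doubles.
inline std::uintptr_t chunk_start(std::uintptr_t base,
                                  std::size_t t,
                                  std::size_t per_thread)
{
    return base + t * per_thread * sizeof(double);
}

// Hypothetical helper: number of doubles that must be stored one at a
// time before `addr` reaches a 16-byte boundary. For data aligned to
// 8 bytes (assumed here) the answer is either 0 or 1.
inline std::size_t scalar_prefix(std::uintptr_t addr)
{
    assert(addr % sizeof(double) == 0 && "expects 8-byte-aligned input");
    return (addr & 0xF) ? 1 : 0;
}
```

With an assumed 16-byte-aligned base address such as 0x1000 and 101 elements per thread, the second chunk starts 808 bytes in (808 mod 16 = 8), so it needs one scalar store before the aligned loop, while the first and third chunks (offsets 0 and 1616) need none.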