What's the best way to demonstrate the effect of affinity setting?
<p>I once noticed that Windows doesn't keep a computation-intensive thread on a specific core; it keeps switching cores instead. So I speculated that the job would finish faster if the thread kept access to the same data caches. Indeed, I observed a stable ~1% speed improvement after setting the thread's affinity mask to a single core (in a ppmd (de)compression thread). But when I tried to build a simple demo of this effect, I more or less failed. It works as expected on my system (Q9450):</p>
<pre>
buflog=21 bufsize=2097152
(cache flush) first run = 6.938s
time with default affinity = 6.782s
time with first core only = 6.578s
speed gain is 3.01%
</pre>
<p>but the people I asked weren't able to reproduce the effect. Any suggestions?</p>
<pre><code>#include &lt;stdio.h&gt;
#include &lt;stdlib.h&gt;  // atoi
#include &lt;windows.h&gt;

int buflog=21, bufsize, bufmask;
char* a;
char* b;
volatile int r = 0;

// Performs 10^9 iterations of pseudo-random reads from the buffer
// and returns the elapsed time in milliseconds.
__declspec(noinline) int benchmark( char* a ) {
  int t0 = GetTickCount();
  int i,h=1,s=0;
  for( i=0; i&lt;1000000000; i++ ) {
    h = h*200002979 + 1;
    s += ((int&amp;)a[h&amp;bufmask]) + ((int&amp;)a[h&amp;(bufmask&gt;&gt;2)]) + ((int&amp;)a[h&amp;(bufmask&gt;&gt;4)]);
  }
  r = s; // store to a volatile so the loop result is actually used
  t0 = GetTickCount() - t0;
  return t0;
}

// Background load: pins itself to the second core (mask 2) and
// runs the same benchmark forever on its own buffer.
DWORD WINAPI loadcore( LPVOID ) {
  SetThreadAffinityMask( GetCurrentThread(), 2 );
  while(1) benchmark(b);
}

int main( int argc, char** argv ) {
  // Optional argument: log2 of the buffer size (only values above 16 are accepted).
  if( (argc&gt;1) &amp;&amp; (atoi(argv[1])&gt;16) ) buflog=atoi(argv[1]);
  bufsize=1&lt;&lt;buflog;
  bufmask=bufsize-1;
  // +4 bytes so that a 4-byte read at the last index stays inside the buffer.
  a = new char[bufsize+4];
  b = new char[bufsize+4];
  printf( "buflog=%i bufsize=%i\n", buflog, bufsize );

  CreateThread( 0, 0, &amp;loadcore, 0, 0, 0 );

  printf( "(cache flush) first run = %.3fs\n", float(benchmark(a))/1000 );

  float t1 = benchmark(a); t1/=1000;
  printf( "time with default affinity = %.3fs\n", t1 );

  // Pin the main thread to the first core (mask 1) and measure again.
  SetThreadAffinityMask( GetCurrentThread(), 1 );

  float t2 = benchmark(a); t2/=1000;
  printf( "time with first core only = %.3fs\n", t2 );

  printf( "speed gain is %4.2f%%\n", (t1-t2)*100/t1 );

  return 0;
}
</code></pre>
<p>P.S. I can post a link to a compiled version if anybody needs it.</p>