Can't change array size over 10000 in array additions and performance issue CPU vs GPU
<p>I'm new to OpenCL and have run into some problems with array addition. I'm using the code from the link below:</p>
<p><a href="http://code.google.com/p/opencl-book-samples/source/browse/#svn%2Ftrunk%2Fsrc%2FChapter_2%2FHelloWorld%253Fstate%253Dclosed" rel="nofollow">http://code.google.com/p/opencl-book-samples/source/browse/#svn%2Ftrunk%2Fsrc%2FChapter_2%2FHelloWorld%253Fstate%253Dclosed</a></p>
<p>I added the following to measure the GPU time:</p>
<pre><code>clFinish(commandQueue);

// Queue the kernel up for execution across the array
cl_ulong start, end;
cl_event k_events;
errNum = clEnqueueNDRangeKernel(commandQueue, kernel, 1, NULL,
                                globalWorkSize, localWorkSize,
                                0, NULL, &amp;k_events);

// Profiling values are only valid once the kernel has completed
// (the queue must be created with CL_QUEUE_PROFILING_ENABLE)
clWaitForEvents(1, &amp;k_events);
clGetEventProfilingInfo(k_events, CL_PROFILING_COMMAND_START, sizeof(cl_ulong), &amp;start, NULL);
clGetEventProfilingInfo(k_events, CL_PROFILING_COMMAND_END, sizeof(cl_ulong), &amp;end, NULL);

float GPUTime = (end - start);  // nanoseconds
</code></pre>
<p>And this to measure the CPU time:</p>
<pre><code>LARGE_INTEGER CPUstart, finish, freq;
QueryPerformanceFrequency(&amp;freq);
QueryPerformanceCounter(&amp;CPUstart);
for (int i = 0; i &lt; ARRAY_SIZE; i++) {
    result[i] = a[i] + b[i];
}
QueryPerformanceCounter(&amp;finish);

// Ticks / frequency gives seconds; scale by 1e9 for nanoseconds
double timeCPU = (finish.QuadPart - CPUstart.QuadPart) / (double)freq.QuadPart * 1000000000.0;
</code></pre>
<p>The first problem I encountered is <strong>the array size: it can't go beyond 10000. If it does, the program just crashes</strong>. How can I fix it?</p>
<p>The second problem is performance: <strong>the GPU/CPU ratio varies too widely, from 13% to 210% (roughly)</strong>. Why does this happen, and can you suggest a fix?</p>
<p>Edit: I figured out the second problem. The slowdown was caused by the power-saving mode, which set the core/memory clocks much lower than default. Locking the clocks with a utility keeps the performance stable at ~150-300% (GPU/CPU).</p>
<p>Good case:</p>
<pre><code>GPU time :632667 nanosecs.
CPU time : 990023 nanosecs.
GPU/CPU ratio : 156.484 percent.
</code></pre>
<p>And the bad one:</p>
<pre><code>GPU time :6.83267e+006 nanosecs.
CPU time : 1.00756e+006 nanosecs.
GPU/CPU ratio : 14.7462 percent.
</code></pre>
<p>Any ideas will be appreciated. Thank you :D</p>
<p>PS: The CPU is a Core i3-370M and the GPU is an HD5470. I use VS2008 on Windows 7.</p>
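<p>For readers checking the numbers: the figure printed as "GPU/CPU ratio" in the two runs above matches the CPU time divided by the GPU time, so values over 100 percent mean the GPU run was faster. The sketch below reproduces that arithmetic and the tick-to-nanosecond conversion. The helper names <code>speedup_percent</code> and <code>ticks_to_ns</code> are hypothetical and not part of the original program, and it assumes both times are already in nanoseconds.</p>

```c
/* Hypothetical helper (not in the original program): CPU time expressed
 * as a percentage of GPU time. Above 100 means the GPU run was faster. */
static double speedup_percent(double cpu_ns, double gpu_ns)
{
    return 100.0 * cpu_ns / gpu_ns;
}

/* Hypothetical helper: convert a QueryPerformanceCounter tick delta to
 * nanoseconds. ticks / ticks_per_second is seconds, so the result is
 * multiplied (not divided) by 1e9. */
static double ticks_to_ns(long long ticks, long long ticks_per_second)
{
    return (double)ticks / (double)ticks_per_second * 1e9;
}
```

<p>With the good-case values, <code>speedup_percent(990023.0, 632667.0)</code> comes out at roughly 156.5, in line with the reported 156.484 percent.</p>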