  1. CUDA doesn't calculate what it is expected to, it just silently ignores my code
<p>I'm encountering a very strange problem: my 9800GT doesn't seem to calculate at all. I've tried every hello-world program I've found on the internet; here's one of them:</p>
<p>This program creates a 100-element array on the host, sends it to the device, calculates the square of each value, returns it to the host, and prints the results.</p>
<pre><code>#include "stdafx.h"
#include &lt;stdio.h&gt;
#include &lt;stdlib.h&gt;
#include &lt;cuda.h&gt;

// Kernel: each thread squares one element of the array
__global__ void square_array(float *a, int N)
{
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx &lt; N) a[idx] = a[idx] * a[idx];
}

// main routine that executes on the host
int main(void)
{
    float *a_h, *a_d;                 // Pointers to host &amp; device arrays
    const int N = 100;                // Number of elements in arrays
    size_t size = N * sizeof(float);
    a_h = (float *)malloc(size);      // Allocate array on host
    cudaMalloc((void **) &amp;a_d, size); // Allocate array on device

    // Initialize host array and copy it to CUDA device
    for (int i = 0; i &lt; N; i++) a_h[i] = (float)i;
    cudaMemcpy(a_d, a_h, size, cudaMemcpyHostToDevice);

    // Do calculation on device:
    int block_size = 4;
    int n_blocks = N/block_size + (N%block_size == 0 ? 0:1);
    square_array &lt;&lt;&lt; n_blocks, block_size &gt;&gt;&gt; (a_d, N);

    // Retrieve result from device and store it in host array
    cudaMemcpy(a_h, a_d, sizeof(float)*N, cudaMemcpyDeviceToHost);

    // Print results
    for (int i = 0; i &lt; N; i++) printf("%d %f\n", i, a_h[i]);

    // Cleanup
    free(a_h);
    cudaFree(a_d);
}
</code></pre>
<p>So the output is expected to be:</p>
<blockquote> <p>1 1.000</p> <p>2 4.000</p> <p>3 9.000</p> <p>4 16.000</p> </blockquote>
<p>... I swear back in 2009 it worked perfectly (Vista 32, deviceemu).</p>
<p>Now I get this output:</p>
<blockquote> <p>1 1.000</p> <p>2 2.000</p> <p>3 3.000</p> <p>4 4.000</p> </blockquote>
<p>So my card doesn't do anything. What could be the problem? Configuration: Win7 x64, Visual Studio 2010 32-bit, CUDA Toolkit 3.2 64-bit.</p>
<p>Compilation settings: CUDA 3.2 toolkit, 32-bit target platform. Whether deviceemu is on or off doesn't matter; the results are the same.</p>
<p>I also tried it on my VMware XP (32-bit) machine with Visual Studio 2008. The result is the same.</p>
<p>Please help me. I barely got the program to compile; now I need it to work.</p>
<p>You can also view my project with everything it needs in <a href="http://forums.nvidia.com/index.php?act=attach&amp;type=post&amp;id=24060" rel="nofollow">my post at nvidia forums</a> (2.7 KB).</p>
<p>Thanks, Ilya</p>
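The program above never inspects the status returned by any CUDA call, so a failed allocation, copy, or kernel launch would leave the host array untouched and produce exactly the "unsquared" output shown. The following is a minimal sketch of the same program with error checking added. It does not identify the cause of the problem; it only makes a failure visible. The CUDA_CHECK macro is a hypothetical helper introduced here for illustration and is not part of the original post or of the CUDA toolkit. cudaDeviceSynchronize() assumes CUDA 4.0 or later; on an older toolkit such as 3.2, cudaThreadSynchronize() plays the same role.

```cuda
#include <stdio.h>
#include <stdlib.h>
#include <cuda_runtime.h>

// Hypothetical helper (not in the original post): print the failing call
// and the runtime's error string, then exit, if a call does not succeed.
#define CUDA_CHECK(call)                                                  \
    do {                                                                  \
        cudaError_t err_ = (call);                                        \
        if (err_ != cudaSuccess) {                                        \
            fprintf(stderr, "%s:%d: %s failed: %s\n", __FILE__, __LINE__, \
                    #call, cudaGetErrorString(err_));                     \
            exit(EXIT_FAILURE);                                           \
        }                                                                 \
    } while (0)

// Same kernel as in the question: each thread squares one element.
__global__ void square_array(float *a, int N)
{
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx < N) a[idx] = a[idx] * a[idx];
}

int main(void)
{
    const int N = 100;
    size_t size = N * sizeof(float);
    float *a_d = NULL;
    float *a_h = (float *)malloc(size);
    if (a_h == NULL) return EXIT_FAILURE;
    for (int i = 0; i < N; i++) a_h[i] = (float)i;

    CUDA_CHECK(cudaMalloc((void **)&a_d, size));
    CUDA_CHECK(cudaMemcpy(a_d, a_h, size, cudaMemcpyHostToDevice));

    int block_size = 4;
    int n_blocks = (N + block_size - 1) / block_size;  // round up
    square_array<<<n_blocks, block_size>>>(a_d, N);

    // A kernel launch returns no status of its own; launch failures
    // are reported through cudaGetLastError().
    CUDA_CHECK(cudaGetLastError());
    // Wait for the kernel to finish; errors raised during execution
    // are returned here. (Assumes CUDA 4.0+; see the note above.)
    CUDA_CHECK(cudaDeviceSynchronize());

    CUDA_CHECK(cudaMemcpy(a_h, a_d, size, cudaMemcpyDeviceToHost));
    for (int i = 0; i < N; i++) printf("%d %f\n", i, a_h[i]);

    free(a_h);
    CUDA_CHECK(cudaFree(a_d));
    return 0;
}
```

If any call fails, the message names the call and the error string reported by the runtime, which is usually a more useful starting point than comparing output values by eye.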
 
