<p>Start by doing proper <a href="https://stackoverflow.com/questions/14038589/what-is-the-canonical-way-to-check-for-errors-using-the-cuda-runtime-api">CUDA error checking</a> on all your CUDA API calls (e.g. <code>cudaMemcpy</code>) and on your kernel launches.</p>
<p>When you do that, you'll discover that your kernels are not running successfully. These kinds of things won't work:</p>
<pre><code>uchar4 *devPtr;   // you've just declared a pointer with no allocation behind it
size_t img1_size = IMAGESIZE_MAX;
kernel&lt;&lt;&lt;grids,threads&gt;&gt;&gt;(devPtr);    // this kernel will fail
uchar4 *devPtr2;  // you've just declared a pointer with no allocation behind it
size_t img2_size = IMAGESIZE_MAX;
kernel2&lt;&lt;&lt;grids,threads&gt;&gt;&gt;(devPtr2);  // this kernel will fail
</code></pre>
<p><code>devPtr</code> and <code>devPtr2</code> in the above code are unallocated pointers: you haven't allocated any storage associated with them. Furthermore, since you are passing them to device kernels, they need to be allocated with <code>cudaMalloc</code> or a similar API function in order to be usable in device code.</p>
<p>Since they are not allocated with <code>cudaMalloc</code>, as soon as you try to dereference those pointers in device code, you'll create a kernel fault. This will be evident if you do error checking, as you will get an "unspecified launch failure" or similar report from those kernels.</p>
<p>I think there are probably a number of other problems in your code, but first you should do proper CUDA error checking and at least get your code to the point where everything you've written is, in fact, running.</p>
<p>Also, the code you've posted doesn't actually compile.</p>
<p>After fixing the compile errors, I discovered that you have another infinite loop:</p>
<pre><code>cudaMalloc ( (uchar4 **)&amp;pBufferCurrent, sizeTotal + sizeof(size) + size);
cudaMalloc ( (uchar4 **)&amp;pBuffer, sizeTotal + sizeof(size) + size);
do {
    if (!pBufferCurrent) {
        break;
    }
    pBuffer = pBufferCurrent;
    pBufferCurrent += sizeTotal;
    imageget ( pBufferCurrent + sizeof(size), size, devPtr);
    sizeTotal += (sizeof(size) + size);
} while (a==1);
</code></pre>
<p>Since <code>a</code> is initialized to 1 and nothing in the loop modifies it, the loop will never exit based on the <code>while</code> condition. And since <code>pBufferCurrent</code> is never zero once it has been properly set up by <code>cudaMalloc</code>, the <code>break</code> will never be taken either.</p>
<p>If you <code>malloc</code> or <code>cudaMalloc</code> a pointer called <code>pBufferCurrent</code>, it's hard for me to imagine under what circumstances this would ever make sense, because it moves the pointer away from the start of the allocation it refers to:</p>
<pre><code>pBufferCurrent += sizeTotal;
</code></pre>
<p>And although this is legal, I don't see how it makes sense:</p>
<pre><code>pBuffer = pBufferCurrent;
</code></pre>
<p>You just created an allocation for <code>pBuffer</code> using <code>cudaMalloc</code>, and the first thing you do is throw it away (which also leaks that allocation)?</p>
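<p>As a rough illustration of the two points above (check every call, and allocate device pointers with <code>cudaMalloc</code> before passing them to a kernel), here is a minimal sketch. The names <code>CUDA_CHECK</code>, <code>invert</code> and <code>run_invert</code> are made up for this example and are not from your code; the kernel body is just a placeholder for whatever your kernels actually do.</p>

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper macro (not from the original post): wraps a CUDA runtime
// call and exits with file/line information if it did not return cudaSuccess.
#define CUDA_CHECK(call)                                                   \
    do {                                                                   \
        cudaError_t err_ = (call);                                         \
        if (err_ != cudaSuccess) {                                         \
            std::fprintf(stderr, "CUDA error: %s at %s:%d\n",              \
                         cudaGetErrorString(err_), __FILE__, __LINE__);    \
            std::exit(EXIT_FAILURE);                                       \
        }                                                                  \
    } while (0)

// Placeholder kernel: inverts the colour channels of each pixel.
__global__ void invert(uchar4 *img, size_t n)
{
    size_t i = blockIdx.x * (size_t)blockDim.x + threadIdx.x;
    if (i < n) {
        uchar4 p = img[i];
        img[i] = make_uchar4(255 - p.x, 255 - p.y, 255 - p.z, p.w);
    }
}

// Allocates device storage, runs the kernel, and copies one pixel back.
// Returns the red channel of the first pixel so the caller can verify it.
// Assumes n > 0.
unsigned char run_invert(size_t n)
{
    uchar4 *devPtr = nullptr;

    // The pointer only becomes usable in device code after cudaMalloc.
    CUDA_CHECK(cudaMalloc(&devPtr, n * sizeof(uchar4)));
    CUDA_CHECK(cudaMemset(devPtr, 0, n * sizeof(uchar4)));

    const unsigned threads = 256;
    const unsigned blocks = (unsigned)((n + threads - 1) / threads);
    invert<<<blocks, threads>>>(devPtr, n);
    CUDA_CHECK(cudaGetLastError());       // catches launch/configuration errors
    CUDA_CHECK(cudaDeviceSynchronize());  // catches faults during execution

    uchar4 first;
    CUDA_CHECK(cudaMemcpy(&first, devPtr, sizeof(uchar4),
                          cudaMemcpyDeviceToHost));
    CUDA_CHECK(cudaFree(devPtr));
    return first.x;
}
```

<p>If you drop the <code>cudaMalloc</code> line from this sketch, I would expect one of the checks after the kernel launch to report the fault instead of the program silently producing garbage, which is the whole point of checking every call.</p>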