<p>First off, and if I understand you correctly, <code>clCreateSubBuffer</code> is probably not what you want, as it creates a sub-buffer from an existing OpenCL <em>buffer object</em>, not from host memory. The documentation you linked also tells us that:</p>
<blockquote> <p>The CL_MEM_USE_HOST_PTR, CL_MEM_ALLOC_HOST_PTR and CL_MEM_COPY_HOST_PTR values cannot be specified in flags but are inherited from the corresponding memory access qualifiers associated with buffer.</p> </blockquote>
<p>You said you have a vector on the host and want to send half of it to the device. For this, I would use a regular buffer of half the vector's size (in bytes) on the device.</p>
<p>Then, with a regular buffer, the performance you see is expected.</p>
<ol>
<li><code>CL_MEM_ALLOC_HOST_PTR</code> only allocates host-accessible memory, which does not incur any transfer at all: it is like calling <code>malloc</code> and not filling the memory.</li>
<li><code>CL_MEM_COPY_HOST_PTR</code> will allocate a buffer on the device, most probably in the GPU's on-board RAM, and then copy the host data you passed in over to device memory.</li>
<li><code>CL_MEM_USE_HOST_PTR</code> does not allocate new memory: it tells the implementation to use the memory you pass as <code>host_ptr</code> as the buffer's storage. On GPUs, the driver most likely treats it as so-called <em>page-locked</em> or <em>pinned</em> memory. This kind of memory is the fastest for host-to-GPU transfers, and this is the recommended way to do the copy.</li>
</ol>
<p>To read how to correctly use pinned memory on NVIDIA devices, refer to chapter 3.1.1 of <a href="http://www.nvidia.com/content/cudazone/CUDABrowser/downloads/papers/NVIDIA_OpenCL_BestPracticesGuide.pdf" rel="nofollow noreferrer">NVIDIA's OpenCL best practices guide</a>. Note that if you use too much pinned memory, performance may drop below that of a buffer copied from ordinary host memory.</p>
<p>The <em>reason</em> why pinned memory is faster than copied device memory is well explained in <a href="https://stackoverflow.com/questions/5736968/why-is-cuda-pinned-memory-so-fast">this SO question</a> as well as <a href="http://forums.nvidia.com/index.php?showtopic=164661" rel="nofollow noreferrer">this forum thread</a> it points to.</p>
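<p>As a rough sketch of the "regular buffer of half the vector's size" suggestion, assuming a host vector of <code>float</code> and an OpenCL context and command queue that you have already created: the names <code>half_vector_bytes</code>, <code>upload_first_half</code> and the <code>USE_OPENCL</code> build switch are made up for this illustration, and only the <code>cl*</code> calls belong to the OpenCL API. This is one possible way to do it, not the only one.</p>

```c
#include <stddef.h>

/* Hypothetical helper (not part of OpenCL): size in bytes of the first
 * n/2 elements of a host vector holding n elements of elem_size bytes.
 * For odd n the half is rounded down. */
size_t half_vector_bytes(size_t n, size_t elem_size)
{
    return (n / 2) * elem_size;
}

#ifdef USE_OPENCL /* hypothetical switch: define it when OpenCL headers are installed */
#include <CL/cl.h> /* on macOS the header is <OpenCL/opencl.h> instead */

/* Create a plain device buffer sized for half of host_vec and copy the
 * first half into it. Returns NULL on failure, with the OpenCL error
 * code stored in *err. The caller releases the buffer. */
cl_mem upload_first_half(cl_context ctx, cl_command_queue queue,
                         const float *host_vec, size_t n, cl_int *err)
{
    size_t bytes = half_vector_bytes(n, sizeof(float));
    if (bytes == 0) {
        *err = CL_INVALID_BUFFER_SIZE; /* clCreateBuffer rejects size 0 */
        return NULL;
    }

    /* No *_HOST_PTR flag here: a regular buffer, so host_ptr is NULL. */
    cl_mem buf = clCreateBuffer(ctx, CL_MEM_READ_ONLY, bytes, NULL, err);
    if (*err != CL_SUCCESS)
        return NULL;

    /* Blocking write of `bytes` bytes from host_vec to offset 0 of buf. */
    *err = clEnqueueWriteBuffer(queue, buf, CL_TRUE, 0, bytes,
                                host_vec, 0, NULL, NULL);
    if (*err != CL_SUCCESS) {
        clReleaseMemObject(buf);
        return NULL;
    }
    return buf;
}
#endif /* USE_OPENCL */
```

<p>To send the second half instead, the same call should work with <code>host_vec + n / 2</code> as the source pointer. Release the returned buffer with <code>clReleaseMemObject</code> once the kernel no longer needs it.</p>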
 
