
<p>Instead of using clock() to measure time, you should use CUDA events.</p>
<p>Using events, you would have something like this:</p>
<pre><code>cudaEvent_t start, stop;   // the two events that bracket the timed region
float time;                // will hold the elapsed time, in milliseconds

cudaEventCreate(&amp;start);   // create event 1
cudaEventCreate(&amp;stop);    // create event 2

cudaEventRecord(start, 0); // start measuring the time

// What you want to measure
cudaMalloc((void**)&amp;dev_B, Nel*Nface*sizeof(int));
cudaMemcpy(dev_B, B, Nel*Nface*sizeof(int), cudaMemcpyHostToDevice);

cudaEventRecord(stop, 0);  // stop measuring the time
cudaEventSynchronize(stop); // wait until the completion of all device
                            // work preceding the most recent call to cudaEventRecord()

cudaEventElapsedTime(&amp;time, start, stop); // save the measured time

cudaEventDestroy(start);   // release the events once they are no longer needed
cudaEventDestroy(stop);
</code></pre>
<p><strong>EDIT</strong>: Additional information:</p>
<p>"The kernel launch returns control to the CPU thread before it is finished. Therefore your timing construct is measuring both the kernel execution time as well as the 2nd memcpy. When timing the copy after the kernel, your timer code is being executed immediately, but the cudaMemcpy is waiting for the kernel to complete before it starts. This also explains why your timing measurement for the data return seems to vary based on kernel loop iterations. It also explains why the time spent on your kernel function is 'negligible'." Credit to <a href="https://stackoverflow.com/users/1695960/robert-crovella">Robert Crovella</a>.</p>
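<p>To make the point about asynchronous kernel launches concrete, here is a minimal sketch that times a kernel and the copy that follows it as two separate intervals. The kernel <code>scale</code>, the array size and the launch configuration are hypothetical and are not taken from the question. Error checking is omitted for brevity.</p>
<pre><code>#include &lt;cstdio&gt;
#include &lt;vector&gt;
#include &lt;cuda_runtime.h&gt;

// Hypothetical kernel, present only so that there is something to time.
__global__ void scale(int *data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i &lt; n) data[i] *= 2;
}

int main()
{
    const int n = 1 &lt;&lt; 20;                  // assumed problem size
    const size_t bytes = n * sizeof(int);
    std::vector&lt;int&gt; host(n, 1);

    int *dev = nullptr;
    cudaMalloc((void**)&amp;dev, bytes);
    cudaMemcpy(dev, host.data(), bytes, cudaMemcpyHostToDevice);

    cudaEvent_t kernelStart, kernelStop, copyStop;
    cudaEventCreate(&amp;kernelStart);
    cudaEventCreate(&amp;kernelStop);
    cudaEventCreate(&amp;copyStop);

    cudaEventRecord(kernelStart, 0);
    scale&lt;&lt;&lt;(n + 255) / 256, 256&gt;&gt;&gt;(dev, n);  // returns to the CPU immediately
    cudaEventRecord(kernelStop, 0);          // queued in the stream after the kernel

    cudaMemcpy(host.data(), dev, bytes, cudaMemcpyDeviceToHost);
    cudaEventRecord(copyStop, 0);
    cudaEventSynchronize(copyStop);          // all three events have completed here

    float kernelMs = 0.0f, copyMs = 0.0f;
    cudaEventElapsedTime(&amp;kernelMs, kernelStart, kernelStop); // kernel only
    cudaEventElapsedTime(&amp;copyMs, kernelStop, copyStop);      // copy only
    printf("kernel: %f ms, copy back: %f ms\n", kernelMs, copyMs);

    cudaEventDestroy(kernelStart);
    cudaEventDestroy(kernelStop);
    cudaEventDestroy(copyStop);
    cudaFree(dev);
    return 0;
}
</code></pre>
<p>Because the events are recorded in the same stream as the work, each one is timestamped on the device when the preceding work finishes, so the kernel time is not folded into the copy time as it would be with a host-side timer.</p>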