Know the Block ID in CUDA from a given 2D offset
<p>I've been trying to calculate blockIdx.x and blockIdx.y from a given offset in CUDA, but I'm completely stuck. The idea is to read data from shared memory when possible and from global memory otherwise.</p>
<p>For example, if I have a 1D array of 64 elements and I configure a kernel with 16x1 threads per block (4 blocks in total), each thread can access a position using:</p>
<pre><code>int idx = blockDim.x * blockIdx.x + threadIdx.x;
</code></pre>
<p>and I can easily get the blockIdx.x of a given index value as:</p>
<pre><code>int blockNumber = idx / blockDim.x;
</code></pre>
<p>But in a 2D scenario with 8x8 elements and a kernel configuration of 4x4 threads per block (2x2 blocks in total), each thread accesses a position using:</p>
<pre><code>int x = threadIdx.x + blockIdx.x * blockDim.x;
int y = threadIdx.y + blockIdx.y * blockDim.y;
int pitch = blockDim.x * gridDim.x;
int idx = x + y * pitch;

// Row-major index inside the block's shared tile
int sharedMemIndex = threadIdx.x + threadIdx.y * BLOCK_DIM_X;
__shared_block[sharedMemIndex] = fromGlobalMemory[idx];
__syncthreads();

// ... some operations

int unknow_index = __shared_block[sharedMemIndex];

if ( unknow_index within this block? )
    // ... read from shared memory
else
    // ... read from global memory
</code></pre>
<p>How can I find the block's x and y IDs for a given idx? For example, indices 34 and 35 are in block (0, 1) and index 36 is in block (1, 1). So if a thread in block (0, 1) needs the value at index 35, it should know that the value is within its own block and read it from shared memory. The value at index 35 will be stored at position 3 of the shared memory of block (0, 1).</p>
<p>Thanks in advance!</p>
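One possible way to approach this, offered as a sketch rather than a definitive answer: recover the 2D coordinates from the linear index with `idx % pitch` and `idx / pitch`, then divide each coordinate by the block dimension to get the block, and take the remainder to get the position inside the block. The names `BlockLocation`, `locate`, `inBlock` and the `HD` macro are hypothetical helpers introduced here for illustration; the sketch assumes a row-major layout with `pitch = blockDim.x * gridDim.x` and a row-major shared tile of `blockDim.x` columns.

```cpp
#include <cassert>

// Lets the same helpers compile with nvcc (host + device) or a plain C++ compiler.
#ifdef __CUDACC__
#define HD __host__ __device__
#else
#define HD
#endif

// Hypothetical helper type: where a global linear index lands in the grid.
struct BlockLocation {
    int blockX;      // would match blockIdx.x of the owning block
    int blockY;      // would match blockIdx.y of the owning block
    int localIndex;  // row-major position inside that block's shared tile
};

// Map a global linear index to its owning block and local position.
// Assumes idx = x + y * pitch, with pitch = blockDim.x * gridDim.x.
HD inline BlockLocation locate(int idx, int pitch, int blockDimX, int blockDimY) {
    int x = idx % pitch;  // global column
    int y = idx / pitch;  // global row
    BlockLocation loc;
    loc.blockX = x / blockDimX;
    loc.blockY = y / blockDimY;
    loc.localIndex = (x % blockDimX) + (y % blockDimY) * blockDimX;
    return loc;
}

// True when the located element belongs to block (bx, by).
HD inline bool inBlock(const BlockLocation& loc, int bx, int by) {
    return loc.blockX == bx && loc.blockY == by;
}
```

Inside the kernel, a thread could call `locate(unknow_index, pitch, blockDim.x, blockDim.y)`, then read `__shared_block[loc.localIndex]` when `inBlock(loc, blockIdx.x, blockIdx.y)` holds and `fromGlobalMemory[unknow_index]` otherwise. For the 8x8 example with 4x4 blocks, index 35 maps to global coordinates (3, 4), so it falls in block (0, 1) at local position 3 under this convention.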