Note that there are some explanatory texts on larger screens.

plurals
  1. PODijkstra's algorithm in CUDA
    primarykey
    data
    text
    <p>I am having troubles with this piece of CUDA code I have written. This is supposed to be the CUDA implementation of the <a href="http://en.wikipedia.org/wiki/Dijkstra%27s_algorithm" rel="nofollow">Dijkstra's algorithm</a>. The code is as follows: </p> <pre><code> __global__ void cuda_dijkstra_kernel_1(float* Va, int* Ea, int* Sa, float* Ca, float* Ua, char* Ma, unsigned int* lock){ int tid = blockIdx.x; if(Ma[tid]=='1'){ Ma[tid] = '0'; int ind_Ea = Sa[tid * 2]; int num_edges = Sa[(tid * 2) + 1]; int v; float wt = 0; unsigned int leaveloop; leaveloop = 0u; while(leaveloop==0u){ if(atomicExch(lock, 1u) == 0u){ for(v = 0; v &lt; num_edges; v++){ wt = (Va[tid * 3] - Va[Ea[ind_Ea + v] * 3]) * (Va[tid * 3] - Va[Ea[ind_Ea + v] * 3]) + (Va[(tid * 3) + 1] - Va[(Ea[ind_Ea + v] * 3) + 1]) * (Va[(tid * 3) + 1] - Va[(Ea[ind_Ea + v] * 3) + 1]) + (Va[(tid * 3) + 2] - Va[(Ea[ind_Ea + v] * 3) + 2]) * (Va[(tid * 3) + 2] - Va[(Ea[ind_Ea + v] * 3) + 2]) ; wt = sqrt(wt); if(Ca[Ea[ind_Ea + v]] &gt; (Ca[tid] + wt)){ Ca[Ea[ind_Ea + v]] = Ca[tid] + wt; Ma[Ea[ind_Ea + v]] = '1'; } __threadfence(); leaveloop = 1u; atomicExch(lock, 0u); } } } } } </code></pre> <p>The problem is in the <em>relaxation phase</em> of the Dijkstra's algorithm. I have implemented such a phase as a <em>critical section</em>. If there is a vertex (lets say <code>a</code>) which is a neighbor of more than one vertex (i.e., connecting to other vertices with edges), then all of the threads for those vertices will try to write to the location of vertex <code>a</code> in the Cost Array <code>Ca</code>. Now my goal is to have the smaller value written in that location. To do that, I am trying to serialize the process and applying <code>__threadfence()</code> as well so that value written by one thread is visible to others and then eventually the smaller value is retained in the location of vertex <code>a</code>. But the problem is, that this logic is not working. The location of vertex <code>a</code> does not get the smallest value of all the threads trying to write to that location and I don't understand why. Any help will be highly appreciated.</p>
    singulars
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    plurals
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    1. This table or related slice is empty.
 

Querying!

 
Guidance

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload