StackOverflow2013


plurals

PO
primarykey
Id
6522268
data
AcceptedAnswerId
0
AnswerCount
0
ClosedDate
CommentCount
9
CommunityOwnedDate
CreationDate
2011-06-29T14:23:31.313
FavoriteCount
0
LastActivityDate
2011-06-29T14:23:31.313
LastEditDate
LastEditorUserId
0
OwnerUserId
681865
ParentId
5793600
PostTypeId
2
Score
2
ViewCount
0
LastEditorDisplayName
text
Body
I can't reproduce this with CUDA 3.2 and Qt4 on a 64-bit Ubuntu 10.04 LTS system. I took this main:
<pre><code>#include <QtCore/QCoreApplication>

extern float cudamain();

int main(int argc, char *argv[])
{
    QCoreApplication a(argc, argv);
    float gflops = cudamain();
    return 0;
}
</code></pre>
and a <code>cudamain()</code> containing this:
<pre><code>#include <assert.h>

#define blocksize 16
#define HM (4096)
#define WM (4096)
#define WN (4096)
#define HN WM
#define WP WN
#define HP HM
#define PTH WM
#define PTW HM

__global__ void nonsquare(float*M, float*N, float*P, int uWM,int uWN)
{
    __shared__ float MS[blocksize][blocksize];
    __shared__ float NS[blocksize][blocksize];

    int tx=threadIdx.x, ty=threadIdx.y, bx=blockIdx.x, by=blockIdx.y;
    int rowM=ty+by*blocksize;
    int colN=tx+bx*blocksize;
    float Pvalue=0;

    for(int m=0; m<uWM; m+=blocksize){
        MS[ty][tx]=M[rowM*uWM+(m+tx)] ;
        NS[ty][tx]=M[colN + uWN*(m+ty)];
        __syncthreads();
        for(int k=0;k<blocksize;k++)
            Pvalue+=MS[ty][k]*NS[k][tx];
        __syncthreads();
    }
    P[rowM*WP+colN]=Pvalue;
}

inline void gpuerrorchk(cudaError_t state)
{
    assert(state == cudaSuccess);
}

float cudamain(){
    cudaEvent_t evstart, evstop;
    cudaEventCreate(&evstart);
    cudaEventCreate(&evstop);

    float*M=(float*)malloc(sizeof(float)*HM*WM);
    float*N=(float*)malloc(sizeof(float)*HN*WN);
    for(int i=0;i<WM*HM;i++) M[i]=(float)i;
    for(int i=0;i<WN*HN;i++) N[i]=(float)i;
    float*P=(float*)malloc(sizeof(float)*HP*WP);

    float *Md,*Nd,*Pd;
    gpuerrorchk( cudaMalloc((void**)&Md,HM*WM*sizeof(float)) );
    gpuerrorchk( cudaMalloc((void**)&Nd,HN*WN*sizeof(float)) );
    gpuerrorchk( cudaMalloc((void**)&Pd,HP*WP*sizeof(float)) );

    gpuerrorchk( cudaMemcpy(Md,M,HM*WM*sizeof(float),cudaMemcpyHostToDevice) );
    gpuerrorchk( cudaMemcpy(Nd,N,HN*WN*sizeof(float),cudaMemcpyHostToDevice) );

    dim3 dimBlock(blocksize,blocksize);//(tile_width , tile_width);
    dim3 dimGrid(WN/dimBlock.x,HM/dimBlock.y);//(width/tile_width , width/tile_witdh);

    gpuerrorchk( cudaEventRecord(evstart,0) );
    nonsquare<<<dimGrid,dimBlock>>>(Md,Nd,Pd,WM, WN);
    gpuerrorchk( cudaPeekAtLastError() );
    gpuerrorchk( cudaEventRecord(evstop,0) );
    gpuerrorchk( cudaEventSynchronize(evstop) );
    float time;
    cudaEventElapsedTime(&time,evstart,evstop);

    gpuerrorchk( cudaMemcpy(P,Pd,WP*HP*sizeof(float),cudaMemcpyDeviceToHost) );

    cudaFree(Md);
    cudaFree(Nd);
    cudaFree(Pd);

    float gflops=(2.e-6*WM*WM*WM)/(time);
    cudaThreadExit();
    return gflops;
}
</code></pre>
(Pay no attention to the actual code beyond the fact that it performs memory transfers and runs a kernel; it is nonsense otherwise.) Compiling the code like this:
<pre><code>cuda:~$ nvcc -arch=sm_20 -c -o cudamain.o cudamain.cu
cuda:~$ g++ -o qtprob -I/usr/include/qt4 qtprob.cc cudamain.o -L $CUDA_INSTALL_PATH/lib64 -lQtCore -lcuda -lcudart
cuda:~$ ldd qtprob
    linux-vdso.so.1 => (0x00007fff242c8000)
    libQtCore.so.4 => /opt/cuda-3.2/computeprof/bin/libQtCore.so.4 (0x00007fbe62344000)
    libcuda.so.1 => /usr/lib/libcuda.so.1 (0x00007fbe61a3d000)
    libcudart.so.3 => /opt/cuda-3.2/lib64/libcudart.so.3 (0x00007fbe617ef000)
    libstdc++.so.6 => /usr/lib/libstdc++.so.6 (0x00007fbe614db000)
    libm.so.6 => /lib/libm.so.6 (0x00007fbe61258000)
    libgcc_s.so.1 => /lib/libgcc_s.so.1 (0x00007fbe61040000)
    libc.so.6 => /lib/libc.so.6 (0x00007fbe60cbd000)
    libz.so.1 => /lib/libz.so.1 (0x00007fbe60aa6000)
    libgthread-2.0.so.0 => /usr/lib/libgthread-2.0.so.0 (0x00007fbe608a0000)
    libglib-2.0.so.0 => /lib/libglib-2.0.so.0 (0x00007fbe605c2000)
    librt.so.1 => /lib/librt.so.1 (0x00007fbe603ba000)
    libpthread.so.0 => /lib/libpthread.so.0 (0x00007fbe6019c000)
    libdl.so.2 => /lib/libdl.so.2 (0x00007fbe5ff98000)
    /lib64/ld-linux-x86-64.so.2 (0x00007fbe626c0000)
    libpcre.so.3 => /lib/libpcre.so.3 (0x00007fbe5fd69000)
</code></pre>
produces an executable which profiles without error, as many times as I care to run it, with the CUDA 3.2 release profiler. All I can suggest is to try my repro case and see whether it works or not.
If it fails, then perhaps you have either a broken CUDA or a broken Qt installation. If it doesn't fail (and I suspect it won't), then the problem lies either in the way you are building the Qt project or in the actual CUDA code you are running.
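As a side note on the <code>gflops</code> expression in <code>cudamain()</code>: <code>cudaEventElapsedTime</code> reports milliseconds, so <code>2.e-6*WM*WM*WM/time</code> can be read as a GFLOP/s figure if one assumes the usual count of 2*n^3 floating point operations for an n x n matrix product. The sketch below spells out that arithmetic on the host only. The helper names <code>gflops_from_ms</code> and <code>matches_original_expression</code> are hypothetical and do not appear in the code above, and since the kernel itself is a throwaway the resulting number is illustrative at best.

```cpp
#include <cmath>

// Hypothetical helper, not part of the original repro case.
// Assumes an n x n x n matrix multiply costs 2*n^3 floating point operations
// and that elapsed_ms is a time in milliseconds (the unit used by
// cudaEventElapsedTime).
double gflops_from_ms(int n, double elapsed_ms)
{
    // 2.0 * n promotes to double first, so n*n*n never overflows an int.
    const double flops = 2.0 * n * n * n;
    const double seconds = elapsed_ms * 1.0e-3;
    return flops / seconds * 1.0e-9;  // operations per second, scaled to giga
}

// Hypothetical check that the compact form used in cudamain(),
// 2.e-6 * n^3 / time_ms, agrees with the step-by-step version above.
bool matches_original_expression(int n, double elapsed_ms)
{
    const double compact = (2.e-6 * n * n * n) / elapsed_ms;
    const double expanded = gflops_from_ms(n, elapsed_ms);
    return std::fabs(compact - expanded) <= 1.0e-9 * std::fabs(expanded);
}
```

With n = 4096 and an elapsed time of 1000 ms, both forms give roughly 137.4 GFLOP/s, because 2 * 4096^3 is about 1.374e11 operations completed in one second.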
Tags
Title
singulars
PostAcceptedAnswerId
1. This table or related slice is empty.
PostParentId
1. POQt and CUDA VIsual Profiler error in memory transfer size
 singulars
 PostTypePostTypeId
 PTQuestion
PostTypePostTypeId
1. PTAnswer
UserLastEditorUserId
1. This table or related slice is empty.
UserOwnerUserId
1. UStalonmies
plurals
PostLinksPostIdRelatedPostId
1. This table or related slice is empty.
PostLinksRelatedPostIdPostId
1. This table or related slice is empty.
PostsAcceptedAnswerId
1. POQt and CUDA VIsual Profiler error in memory transfer size
 singulars
 PostTypePostTypeId
 PTQuestion
PostsParentIdCreationDate
1. This table or related slice is empty.
VotesPostIdCreationDate
1. VO
 singulars
 PostPostId
 PO
 UserUserId
 This table or related slice is empty.
 VoteTypeVoteTypeId
 VTUpMod
2. VO
 singulars
 PostPostId
 PO
 UserUserId
 This table or related slice is empty.
 VoteTypeVoteTypeId
 VTBountyClose
3. VO
 singulars
 PostPostId
 PO
 UserUserId
 This table or related slice is empty.
 VoteTypeVoteTypeId
 VTAcceptedByOriginator
CommentsPostId
