  1. OpenMP not starting threads on one machine but works OK on another running the same OS
    <p>Recently I successfully parallelized a fairly large program written in Fortran that uses some libraries written in C (most notably UMFPACK). Everything was compiled with Intel's C and Fortran compilers (icc and ifort) 14.0. We run Ubuntu 12.04.3.</p>
    <p>I made all routines thread-safe and used the code below to perform the parallelization with OpenMP:</p>
<pre><code>!$omp parallel do default(shared) private(gs,ibk,ij) schedule(dynamic)
do ibk=1,numcell
    call CellGaussPoints(ibk,numcell,nquado,numq,numgauss, &amp;
                         xc,noCell,gauss,gs)
    do ij=1,numgauss
        gs_3D(ibk,1,ij)=gs(1,ij)
        gs_3D(ibk,2,ij)=gs(2,ij)
        gs_3D(ibk,3,ij)=gs(3,ij)
        gs_3D(ibk,4,ij)=gs(4,ij)
        call SearchMaterial(tree3,my_array0,node,gs_3D(ibk,1,ij),gs_3D(ibk,2,ij),numnode,mat_2D(ibk,ij),nf,numd,elements)
    end do
end do
!$omp end parallel do
</code></pre>
    <p>It works well when compiled with -openmp, but not on every PC. gs_3D is a three-dimensional array used to store SearchMaterial's results.</p>
    <p>I have a Core i5-2400 and tested both in a VMware virtual machine running Linux (Windows host) and on my native Linux install. It worked fine on both. But on another PC (a Core i7-3860X), also running Ubuntu 12.04.3 with the same compiler and libraries installed, it only runs with one thread. Compile options are all the same. I even tried running the binary I compiled on my PC on the other one.</p>
    <p>Not only that, but OpenBLAS' OpenMP implementation worked fine on my native Linux installation but not on my virtual machine or on the i7-3860X.</p>
    <p>After some research, which produced nothing, I decided to ask for help.</p>
    <p>(OMP_NUM_THREADS was properly set in all these cases.)</p>
    <p>ulimit -a returns the following:</p>
<pre><code>core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 63687
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 1024
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 63687
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited
</code></pre>
    <p>I usually run ulimit -s unlimited before running the program, since I get a segmentation fault otherwise.</p>
    <p>OMP_THREAD_LIMIT was not set on the machine where my code doesn't work.</p>
    <p>EDIT: As for the BLAS problem, I discovered that compiling it without processor affinity makes it use all cores. My program, on the other hand, still doesn't work on the i7.</p>
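Since the EDIT points at processor affinity, one way to compare the two machines is to print what a process started from the same shell is actually allowed to use. The following is only a minimal diagnostic sketch, assuming a Linux shell with GNU coreutils (`nproc`, `printenv`). The list of variables inspected is an assumption about what may be relevant, not something established in the post.

```shell
#!/bin/sh
# Diagnostic sketch (assumes Linux with GNU coreutils: nproc, printenv).

# nproc honours OMP_NUM_THREADS and OMP_THREAD_LIMIT, so unset them inside
# the command substitution (a subshell) to see only the affinity mask.
allowed=$(unset OMP_NUM_THREADS OMP_THREAD_LIMIT; nproc)

# Processors installed in the machine, regardless of the affinity mask.
total=$(nproc --all)

echo "CPUs this process may run on: $allowed"
echo "CPUs installed:               $total"

if [ "$allowed" -lt "$total" ]; then
    echo "The affinity mask restricts this shell to fewer CPUs than installed."
fi

# Exported variables that commonly influence OpenMP thread counts and binding.
# (The selection is an assumption; extend it as needed.)
for var in OMP_NUM_THREADS OMP_THREAD_LIMIT OMP_DYNAMIC OMP_PROC_BIND \
           KMP_AFFINITY GOMP_CPU_AFFINITY; do
    if value=$(printenv "$var"); then
        echo "$var=$value"
    else
        echo "$var is not set"
    fi
done
```

Running it in the same shell (and after the same `ulimit -s unlimited`) used to launch the program on each machine, then comparing the two outputs, would show whether the i7 session inherits a narrower affinity mask or a different environment.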