OpenMPI debugging with Valgrind and suppressions in OS X
    <p>I am writing a parallel code in C++ on my OS X (Snow Leopard) laptop, and I am trying to debug it with memchecker. I have successfully built OpenMPI with Valgrind support using <code>configure --prefix=/opt/openmpi-1.4.3/ --enable-debug --enable-memchecker --with-valgrind=/opt/valgrind-3.6.0/ FFLAGS=-m64 F90FLAGS=-m64</code> (ignore the Fortran flags; they are only there because my Fortran compiler comes from GCC).</p>
    <p>When I run my application with</p>
    <blockquote> <p>mpirun -np 2 valgrind --suppressions=/opt/openmpi-1.4.3/share/openmpi/openmpi-valgrind.supp --leak-check=yes --dsymutil=yes ./program</p> </blockquote>
    <p>I get a large number of warnings from Valgrind, most of them in the heap summary at the end. A small snippet of the warnings is included below. As far as I can tell, Valgrind is detecting memory leaks and uninitialised values inside the MPI library, which I am not interested in; I want warnings about the code I write. I already run Valgrind with the suppression file provided by OpenMPI, but evidently that is not enough. How can I easily ignore all the other warnings that originate in the OpenMPI distribution?
    Is it possible to find a suppression file for OpenMPI debugging with Valgrind on OS X, or do you know any cunning trick?</p>
    <p>The first warning is</p>
<pre><code>==1531== Syscall param writev(vector[...]) points to uninitialised byte(s)
==1531==    at 0x1014E16E2: writev (in /usr/lib/libSystem.B.dylib)
==1531==    by 0x101AEA4C5: mca_oob_tcp_peer_send (in /opt/openmpi-1.4.3/lib/openmpi/mca_oob_tcp.so)
==1531==    by 0x101AF0B88: mca_oob_tcp_send_nb (in /opt/openmpi-1.4.3/lib/openmpi/mca_oob_tcp.so)
==1531==    by 0x101AC7F48: orte_rml_oob_send (in /opt/openmpi-1.4.3/lib/openmpi/mca_rml_oob.so)
==1531==    by 0x101AC8AA1: orte_rml_oob_send_buffer (in /opt/openmpi-1.4.3/lib/openmpi/mca_rml_oob.so)
==1531==    by 0x101B3489E: allgather (in /opt/openmpi-1.4.3/lib/openmpi/mca_grpcomm_bad.so)
==1531==    by 0x101B3525D: modex (in /opt/openmpi-1.4.3/lib/openmpi/mca_grpcomm_bad.so)
==1531==    by 0x1000A48E6: ompi_mpi_init (in /opt/openmpi-1.4.3/lib/libmpi.0.dylib)
==1531==    by 0x1000F7806: MPI_Init (in /opt/openmpi-1.4.3/lib/libmpi.0.dylib)
==1531==    by 0x100001AF2: main (main.cpp:34)
==1531==  Address 0x101a8911b is 107 bytes inside a block of size 256 alloc'd
==1531==    at 0x10002DB2D: realloc (vg_replace_malloc.c:525)
==1531==    by 0x1012240B6: opal_dss_buffer_extend (in /opt/openmpi-1.4.3/lib/libopen-pal.0.dylib)
==1531==    by 0x101225CF7: opal_dss_copy_payload (in /opt/openmpi-1.4.3/lib/libopen-pal.0.dylib)
==1531==    by 0x101B347CA: allgather (in /opt/openmpi-1.4.3/lib/openmpi/mca_grpcomm_bad.so)
==1531==    by 0x101B3525D: modex (in /opt/openmpi-1.4.3/lib/openmpi/mca_grpcomm_bad.so)
==1531==    by 0x1000A48E6: ompi_mpi_init (in /opt/openmpi-1.4.3/lib/libmpi.0.dylib)
==1531==    by 0x1000F7806: MPI_Init (in /opt/openmpi-1.4.3/lib/libmpi.0.dylib)
==1531==    by 0x100001AF2: main (main.cpp:34)
</code></pre>
    <p>After execution a small snippet of the heap summary looks like this</p>
<pre><code>==1531== 88 bytes in 1 blocks are definitely lost in loss record 1,950 of 2,194
==1531==    at 0x10002D915: malloc (vg_replace_malloc.c:236)
==1531==    by 0x100073888: opal_obj_new (in /opt/openmpi-1.4.3/lib/libmpi.0.dylib)
==1531==    by 0x100073808: opal_obj_new_debug (in /opt/openmpi-1.4.3/lib/libmpi.0.dylib)
==1531==    by 0x100073C17: ompi_attr_create_keyval_impl (in /opt/openmpi-1.4.3/lib/libmpi.0.dylib)
==1531==    by 0x100073FCF: ompi_attr_create_keyval (in /opt/openmpi-1.4.3/lib/libmpi.0.dylib)
==1531==    by 0x100077C96: create_comm (in /opt/openmpi-1.4.3/lib/libmpi.0.dylib)
==1531==    by 0x10007798A: ompi_attr_create_predefined (in /opt/openmpi-1.4.3/lib/libmpi.0.dylib)
==1531==    by 0x1000737CF: ompi_attr_init (in /opt/openmpi-1.4.3/lib/libmpi.0.dylib)
==1531==    by 0x1000A4840: ompi_mpi_init (in /opt/openmpi-1.4.3/lib/libmpi.0.dylib)
==1531==    by 0x1000F7806: MPI_Init (in /opt/openmpi-1.4.3/lib/libmpi.0.dylib)
==1531==    by 0x100001AF2: main (main.cpp:34)
</code></pre>
    <blockquote> <p>...</p> </blockquote>
<pre><code>==1531== 88 bytes in 1 blocks are definitely lost in loss record 1,952 of 2,194
==1531==    at 0x10002D915: malloc (vg_replace_malloc.c:236)
==1531==    by 0x100073888: opal_obj_new (in /opt/openmpi-1.4.3/lib/libmpi.0.dylib)
==1531==    by 0x100073808: opal_obj_new_debug (in /opt/openmpi-1.4.3/lib/libmpi.0.dylib)
==1531==    by 0x100073C17: ompi_attr_create_keyval_impl (in /opt/openmpi-1.4.3/lib/libmpi.0.dylib)
==1531==    by 0x100073FCF: ompi_attr_create_keyval (in /opt/openmpi-1.4.3/lib/libmpi.0.dylib)
==1531==    by 0x10014CEC5: PMPI_Keyval_create (in /opt/openmpi-1.4.3/lib/libmpi.0.dylib)
==1531==    by 0x1065ACFE6: ???
==1531==    by 0x10658867B: ???
==1531==    by 0x10017A591: module_init (in /opt/openmpi-1.4.3/lib/libmpi.0.dylib)
==1531==    by 0x100179985: mca_io_base_file_select (in /opt/openmpi-1.4.3/lib/libmpi.0.dylib)
==1531==    by 0x100089D55: ompi_file_open (in /opt/openmpi-1.4.3/lib/libmpi.0.dylib)
==1531==    by 0x1000E1ED1: MPI_File_open (in /opt/openmpi-1.4.3/lib/libmpi.0.dylib)
==1531==
==1531== 88 bytes in 1 blocks are definitely lost in loss record 1,953 of 2,194
==1531==    at 0x10002D915: malloc (vg_replace_malloc.c:236)
==1531==    by 0x100073888: opal_obj_new (in /opt/openmpi-1.4.3/lib/libmpi.0.dylib)
==1531==    by 0x100073808: opal_obj_new_debug (in /opt/openmpi-1.4.3/lib/libmpi.0.dylib)
==1531==    by 0x100073C17: ompi_attr_create_keyval_impl (in /opt/openmpi-1.4.3/lib/libmpi.0.dylib)
==1531==    by 0x100073FCF: ompi_attr_create_keyval (in /opt/openmpi-1.4.3/lib/libmpi.0.dylib)
==1531==    by 0x10014CEC5: PMPI_Keyval_create (in /opt/openmpi-1.4.3/lib/libmpi.0.dylib)
==1531==    by 0x1065A6210: ???
==1531==    by 0x106597149: ???
==1531==    by 0x106596AAB: ???
==1531==    by 0x1065AD14C: ???
==1531==    by 0x10658867B: ???
==1531==    by 0x10017A591: module_init (in /opt/openmpi-1.4.3/lib/libmpi.0.dylib)
</code></pre>
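One possible direction, sketched here under assumptions rather than offered as a verified fix: Valgrind suppression files accept `...` as a wildcard for any number of stack frames, so a few extra entries keyed on the function names in the traces above might cover these reports. The file name `extra-openmpi.supp` and the entry names are hypothetical labels; only the function names and the `writev(vector[...])` parameter string are taken from the output shown. Whether these entries match depends on the frames Valgrind actually records on a given system.

```
# extra-openmpi.supp (hypothetical file name; entry names are arbitrary labels)

# Uninitialised bytes handed to writev() somewhere below ompi_mpi_init.
{
   ompi_init_writev_param
   Memcheck:Param
   writev(vector[...])
   ...
   fun:ompi_mpi_init
}

# Blocks allocated during MPI_Init and reported as lost.
{
   ompi_init_leak
   Memcheck:Leak
   fun:malloc
   ...
   fun:ompi_mpi_init
}

# Attribute keyvals, e.g. those created on behalf of MPI_File_open.
{
   ompi_keyval_leak
   Memcheck:Leak
   fun:malloc
   fun:opal_obj_new
   ...
   fun:ompi_attr_create_keyval
}
```

Such a file would be passed with a second `--suppressions=` option next to the one shipped with OpenMPI. Running once with `--gen-suppressions=all` makes Valgrind print a ready-made entry for each report, which can serve as a starting point for trimming. Broad entries like these may also hide genuine errors in user buffers that only surface inside MPI calls, so they are best kept as narrow as the noise allows.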