[Q-e-developers] Using Valgrind and GDB with QE

Ye Luo xw111luoye at gmail.com
Thu May 5 21:31:10 CEST 2016


Hi Thomas,

I just checked my local QE build with gcc 4.8.4 and gdb 7.7.1, and I can
see the line info.
Here is the relevant configuration in my make.sys; only -g is needed.

CFLAGS         = -O3 $(DFLAGS) $(IFLAGS)
F90FLAGS       = $(FFLAGS) -x f95-cpp-input $(FDFLAGS) $(IFLAGS) $(MODFLAGS)
FFLAGS         = -O3 -g

I'd like you to check two things:
1) How did you modify the makefile to add your mbdvdw module source file? Are
all the flags passed to it when it is compiled?
You can check this by removing the .o and .mod files associated with the
module and rerunning make, then checking whether -g appears in the compile
command.
2) Try another computer. I have no better ideas.
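
The check in point 1 could be sketched roughly as below. This is only a sketch under assumptions: the helper name has_debug_flag and the log file make.log are made up for illustration, and the file names mbdvdw.o and mbdvdw_module.mod are guessed from the source file and module names that appear in this thread.

```shell
# has_debug_flag CMD: succeed if the compile command CMD contains -g as its
# own token (-g1..-g3 and -ggdb are accepted too). -Og alone does not count,
# and neither does -g0, which switches debug info off.
has_debug_flag() {
    for tok in $1; do            # unquoted on purpose: split CMD into words
        case "$tok" in
            -g|-g[1-3]|-ggdb*) return 0 ;;
        esac
    done
    return 1
}

# Typical use, run by hand in the directory that holds the module source
# (file names are assumptions, see above):
#   rm -f mbdvdw.o mbdvdw_module.mod
#   make 2>&1 | tee make.log
#   has_debug_flag "$(grep 'mbdvdw\.f90' make.log)" && echo "-g was passed"
```

If the helper fails for the mbdvdw.f90 compile line but succeeds for other files, the flags are probably not reaching the rule that builds the new module.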

Ye

My call tree:
#0  simpson (mesh=<optimized out>, func=..., rab=..., asum=0) at simpsn.f90:39
#1  0x000000000057f345 in atomic_rho (rhoa=..., nspina=1) at atomic_rho.f90:95
#2  0x00000000005052a4 in potinit () at potinit.f90:123
#3  0x00000000004b64e3 in init_run () at init_run.f90:99
#4  0x0000000000407219 in run_pwscf (exit_status=0) at run_pwscf.f90:72
#5  0x00000000004070a5 in pwscf () at pwscf.f90:30
#6  main (argc=argc@entry=3, argv=0x7fffffffdf98) at pwscf.f90:14
#7  0x00007ffff69b6ec5 in __libc_start_main (main=0x407050 <main>, argc=3, argv=0x7fffffffdbd8, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fffffffdbc8) at libc-start.c:287
#8  0x00000000004070ef in _start ()
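
To compare a backtrace like the one above with one that lacks line info, a small filter along these lines may help. It is a sketch only: the function name frames_without_lines is hypothetical, and it assumes the input is plain gdb "where" output in which every frame starts with "#N" and a frame with line info ends in "at file:line".

```shell
# frames_without_lines: read a gdb backtrace on stdin and print the number
# of every frame that does not end in "at <file>:<line>". Wrapped lines are
# joined onto the frame they belong to; a blank line ends the backtrace.
frames_without_lines() {
    awk '
        function report() {
            sub(/[ \t\r]+$/, "", frame)          # drop trailing whitespace
            if (frame != "" && frame !~ / at .+:[0-9]+$/) print num
        }
        /^[ \t]*$/  { report(); frame = ""; next }
        /^#[0-9]+ / { report(); num = $1; frame = $0; next }
        frame != "" { frame = frame " " $0 }     # continuation of a frame
        END         { report() }
    '
}

# Typical use (bt.txt is an assumed file holding saved gdb output):
#   frames_without_lines < bt.txt
```

For the call tree above this should list only the _start frame, whereas a backtrace made up of bare addresses and mangled module symbols would list every frame.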


===================
Ye Luo, Ph.D.
Leadership Computing Facility
Argonne National Laboratory

2016-05-05 13:39 GMT-05:00 Thomas Markovich <thomasmarkovich at gmail.com>:

> Hi Ye,
>
> Thank you for your response! I just ran GDB to see if it was an issue with
> valgrind alone, but I'm getting similar behaviour:
>
> (gdb) where
> #0  0x0000000100267d24 in __mbdvdw_module_MOD_mbdvdw_compute_tdip ()
> #1  0x000000010026a392 in __mbdvdw_module_MOD_mbdvdw_compute_tlr ()
> #2  0x000000010026ac7f in __mbdvdw_module_MOD_mbdvdw_compute_tlr_complex ()
> #3  0x000000010026c425 in __mbdvdw_module_MOD_mbdvdw_tgg_complex ()
> #4  0x0000000100288e0a in __mbdvdw_module_MOD_mbdvdw_check_quantity_dh ()
> #5  0x0000000000000000 in ?? ()
>
> (gdb) info symbol 0x0000000100288e0a
> __mbdvdw_module_MOD_mbdvdw_check_quantity_dh + 4475 in section .text of /Users/tmarkovich/Dropbox (Aspuru-Guzik Lab)/Projects/MBD/mbd_espresso/PW/src/pw.x
> (gdb) info symbol 0x0000000100267d24
> __mbdvdw_module_MOD_mbdvdw_compute_tdip + 62 in section .text of /Users/tmarkovich/Dropbox (Aspuru-Guzik Lab)/Projects/MBD/mbd_espresso/PW/src/pw.x
>
> Note, this was done with GDB 7.11. I have pasted my make.sys file below.
>
> Best,
> Thomas
>
> # make.sys.  Generated from make.sys.in by configure.
>
> # compilation rules
>
> .SUFFIXES :
> .SUFFIXES : .o .c .f .f90
>
> # most fortran compilers can directly preprocess c-like directives: use
> # $(MPIF90) $(F90FLAGS) -c $<
> # if explicit preprocessing by the C preprocessor is needed, use:
> # $(CPP) $(CPPFLAGS) $< -o $*.F90
> # $(MPIF90) $(F90FLAGS) -c $*.F90 -o $*.o
> # remember the tabulator in the first column !!!
>
> .f90.o:
> $(MPIF90) $(F90FLAGS) -c $<
>
> # .f.o and .c.o: do not modify
>
> .f.o:
> $(F77) $(FFLAGS) -c $<
>
> .c.o:
> $(CC) $(CFLAGS)  -c $<
>
>
>
> # topdir for linking espresso libs with plugins
> TOPDIR = /Users/tmarkovich/Dropbox (Aspuru-Guzik Lab)/Projects/MBD/mbd_espresso
>
> # DFLAGS  = precompilation options (possible arguments to -D and -U)
> #           used by the C compiler and preprocessor
> # FDFLAGS = as DFLAGS, for the f90 compiler
> # See include/defs.h.README for a list of options and their meaning
> # With the exception of IBM xlf, FDFLAGS = $(DFLAGS)
> # For IBM xlf, FDFLAGS is the same as DFLAGS with separating commas
>
> # MANUAL_DFLAGS  = additional precompilation option(s), if desired
> #                  You may use this instead of tweaking DFLAGS and FDFLAGS
> #                  BEWARE: will not work for IBM xlf! Manually edit FDFLAGS
> MANUAL_DFLAGS  =
> DFLAGS         =  -D__GFORTRAN -D__STD_F95 -D__FFTW -D__MPI -D__PARA -D__SCALAPACK -D__OPENMP $(MANUAL_DFLAGS)
> FDFLAGS        = $(DFLAGS) $(MANUAL_DFLAGS)
>
> # IFLAGS = how to locate directories where files to be included are
> # In most cases, IFLAGS = -I../include
>
> IFLAGS         = -I../include
>
> # MOD_FLAGS = flag used by f90 compiler to locate modules
> # Each Makefile defines the list of needed modules in MODFLAGS
>
> MOD_FLAG      = -I
>
> # Compilers: fortran-90, fortran-77, C
> # If a parallel compilation is desired, MPIF90 should be a fortran-90
> # compiler that produces executables for parallel execution using MPI
> # (such as for instance mpif90, mpf90, mpxlf90,...);
> # otherwise, an ordinary fortran-90 compiler (f90, g95, xlf90, ifort,...)
> # If you have a parallel machine but no suitable candidate for MPIF90,
> # try to specify the directory containing "mpif.h" in IFLAGS
> # and to specify the location of MPI libraries in MPI_LIBS
>
> MPIF90         = mpif90
> #F90           = gfortran
> CC             = gcc-5
> F77            = gfortran
>
> # C preprocessor and preprocessing flags - for explicit preprocessing,
> # if needed (see the compilation rules above)
> # preprocessing flags must include DFLAGS and IFLAGS
>
> CPP            = cpp-5 -ansi -C
> CPPFLAGS       =  $(DFLAGS) $(IFLAGS)
>
> # compiler flags: C, F90, F77
> # C flags must include DFLAGS and IFLAGS
> # F90 flags must include MODFLAGS, IFLAGS, and FDFLAGS with appropriate syntax
>
> CFLAGS         = -Og -g $(DFLAGS) $(IFLAGS)
> F90FLAGS       = $(FFLAGS) -x f95-cpp-input -fopenmp $(FDFLAGS) $(IFLAGS) $(MODFLAGS)
> FFLAGS         = -Og -g -fopenmp -Wall -Wextra -Warray-temporaries -Wconversion -fbacktrace -ffree-line-length-0 -finit-real=nan -ffpe-trap=zero,invalid,zero,overflow
>
> # compiler flags without optimization for fortran-77
> # the latter is NEEDED to properly compile dlamch.f, used by lapack
>
> FFLAGS_NOOPT   = -O0 -g
>
> # compiler flag needed by some compilers when the main is not fortran
> # Currently used for Yambo
>
> FFLAGS_NOMAIN   =
>
> # Linker, linker-specific flags (if any)
> # Typically LD coincides with F90 or MPIF90, LD_LIBS is empty
>
> LD             = mpif90
> LDFLAGS        = -g -pthread -fopenmp
> LD_LIBS        =
>
> # External Libraries (if any) : blas, lapack, fft, MPI
>
> # If you have nothing better, use the local copy :
> # BLAS_LIBS = /your/path/to/espresso/BLAS/blas.a
> # BLAS_LIBS_SWITCH = internal
>
> BLAS_LIBS      =  -lblas
> BLAS_LIBS_SWITCH = external
>
> # If you have nothing better, use the local copy :
> # LAPACK_LIBS = /your/path/to/espresso/lapack-3.2/lapack.a
> # LAPACK_LIBS_SWITCH = internal
> # For IBM machines with essl (-D__ESSL): load essl BEFORE lapack !
> # remember that LAPACK_LIBS precedes BLAS_LIBS in loading order
>
> LAPACK_LIBS    =  -llapack  -lblas
> LAPACK_LIBS_SWITCH = external
>
> ELPA_LIBS_SWITCH = disabled
> SCALAPACK_LIBS = -lscalapack
>
> # nothing needed here if the internal copy of FFTW is compiled
> # (needs -D__FFTW in DFLAGS)
>
> FFT_LIBS       =
>
> # For parallel execution, the correct path to MPI libraries must
> # be specified in MPI_LIBS (except for IBM if you use mpxlf)
>
> MPI_LIBS       =
>
> # IBM-specific: MASS libraries, if available and if -D__MASS is defined in FDFLAGS
>
> MASS_LIBS      =
>
> # ar command and flags - for most architectures: AR = ar, ARFLAGS = ruv
>
> AR             = ar
> ARFLAGS        = ruv
>
> # ranlib command. If ranlib is not needed (it isn't in most cases) use
> # RANLIB = echo
>
> RANLIB         = ranlib -c
>
> # all internal and external libraries - do not modify
>
> FLIB_TARGETS   = all
>
> LIBOBJS        = ../flib/ptools.a ../flib/flib.a ../clib/clib.a ../iotk/src/libiotk.a
> LIBS           = $(SCALAPACK_LIBS) $(LAPACK_LIBS) $(FFT_LIBS) $(BLAS_LIBS) $(MPI_LIBS) $(MASS_LIBS) $(LD_LIBS)
>
> # wget or curl - useful to download from network
> WGET = wget -O
>
>
>
>
> On Thu, May 5, 2016 at 2:29 PM, Ye Luo <xw111luoye at gmail.com> wrote:
>
>> Hi Thomas,
>>
>> Sorry, I didn't catch the exact point you were asking about.
>> I just want to add a general comment.
>> In my experience, -g should be sufficient for your purpose.
>> All the profiling and debugging tools I use, including DDT, gdb,
>> HPCToolkit, and VTune, only require the code to be compiled with -g and can
>> then provide call-stack and line-specific information.
>>
>> Ye
>>
>> ===================
>> Ye Luo, Ph.D.
>> Leadership Computing Facility
>> Argonne National Laboratory
>>
>> 2016-05-05 12:40 GMT-05:00 Thomas Markovich <thomasmarkovich at gmail.com>:
>>
>>> Hi Ye,
>>>
>>> Thank you for your comment!  __mbdvdw_module_MOD_mbdvdw_tgg_complex is
>>> a subroutine, not an array. It appears that the invalid write is happening
>>> somewhere inside of tgg_complex, presumably the variable is stored at 0x103131248.
>>> I was hoping that when compiled with debug symbols, valgrind would be able
>>> to tell me the line number of tgg_complex.
>>>
>>> Best,
>>> Thomas
>>>
>>> On Thu, May 5, 2016 at 1:31 PM, Ye Luo <xw111luoye at gmail.com> wrote:
>>>
>>>> It doesn't seem to be a libc bug.
>>>> The call stack shows that in the subroutine check_quantity_dh of your
>>>> module mbdvdw,
>>>> the code failed while writing something into tgg_complex, which seems to be
>>>> a variable that belongs to your module.
>>>> Is this variable dynamically allocatable and not initialized?
>>>>
>>>> Ye
>>>>
>>>> ===================
>>>> Ye Luo, Ph.D.
>>>> Leadership Computing Facility
>>>> Argonne National Laboratory
>>>>
>>>> 2016-05-05 12:14 GMT-05:00 Hsin-Yu Ko <hsinyu at princeton.edu>:
>>>>
>>>>> Thomas,
>>>>>
>>>>> That is interesting. What you are seeing seems to be a libc bug [1]. I
>>>>> encountered something similar last month. I fixed the problem on my
>>>>> machine by recompiling glibc with debug features enabled (I am not sure
>>>>> how useful gentoo documentation is but I put the reference here just in
>>>>> case [2]). I think removing -pg may be a quick fix according to [1].
>>>>>
>>>>> Best,
>>>>> Hsin-Yu
>>>>>
>>>>> [1] http://valgrind-users.narkive.com/MPnV7HOw/gcc-pg-valgrind-errors
>>>>> [2] https://wiki.gentoo.org/wiki/Debugging
>>>>>
>>>>> On 05/05/2016 12:34 PM, Thomas Markovich wrote:
>>>>> > Hsin-Yu,
>>>>> >
>>>>> > Thank you for the suggestion!
>>>>> >
>>>>> > I had -g in LDFLAGS:
>>>>> > LDFLAGS        = -g -pthread -fopenmp
>>>>> >
>>>>> > but nothing equivalent in CFLAGS, which was defined as:
>>>>> > CFLAGS         = -O3 $(DFLAGS) $(IFLAGS)
>>>>> >
>>>>> > I have since gone ahead and changed CFLAGS to
>>>>> > CFLAGS         = -Og -g $(DFLAGS) $(IFLAGS)
>>>>> >
>>>>> > The resulting fortran compile statements look something like:
>>>>> > mpif90 -Og -g -pg -fopenmp -Wall -Wextra -Warray-temporaries
>>>>> > -Wconversion -fbacktrace -ffree-line-length-0 -finit-real=nan
>>>>> > -ffpe-trap=zero,invalid,zero,overflow -x f95-cpp-input -fopenmp
>>>>> > -D__GFORTRAN -D__STD_F95 -D__FFTW -D__MPI -D__PARA -D__SCALAPACK
>>>>> > -D__OPENMP   -I../include -I../iotk/src -I../ELPA/src -I. -c mbdvdw.f90
>>>>> >
>>>>> > and reran valgrind. It gave the following output:
>>>>> >
>>>>> > ==30486== Invalid write of size 8
>>>>> > ==30486==    at 0x1002735ED: __mbdvdw_module_MOD_mbdvdw_tgg_complex (in /Users/tmarkovich/bin/pw.x)
>>>>> > ==30486==    by 0x100290F58: __mbdvdw_module_MOD_mbdvdw_check_quantity_dh (in /Users/tmarkovich/bin/pw.x)
>>>>> > ==30486==  Address 0x1037989f0 is 0 bytes after a block of size 1,728 alloc'd
>>>>> > ==30486==    at 0x10092B4AB: malloc (in /usr/local/Cellar/valgrind/HEAD/lib/valgrind/vgpreload_memcheck-amd64-darwin.so)
>>>>> > ==30486==    by 0x10028FEE2: __mbdvdw_module_MOD_mbdvdw_check_quantity_dh (in /Users/tmarkovich/bin/pw.x)
>>>>> > ==30486==    by 0x1001D4567: v_of_rho_ (in /Users/tmarkovich/bin/pw.x)
>>>>> > ==30486==    by 0x10007C0BE: electrons_scf_ (in /Users/tmarkovich/bin/pw.x)
>>>>> > ==30486==    by 0x10007D385: electrons_ (in /Users/tmarkovich/bin/pw.x)
>>>>> > ==30486==    by 0x10018B30B: run_pwscf_ (in /Users/tmarkovich/bin/pw.x)
>>>>> > ==30486==    by 0x100001157: MAIN__ (pwscf.f90:30)
>>>>> > ==30486==    by 0x1004EC496: main (pwscf.f90:14)
>>>>> >
>>>>> > This appears to not have changed much.
>>>>> >
>>>>> > Best,
>>>>> > Thomas
>>>>> >
>>>>> > On Thu, May 5, 2016 at 10:15 AM, Hsin-Yu Ko <hsinyu at princeton.edu> wrote:
>>>>> >
>>>>> >     Hi Thomas,
>>>>> >
>>>>> >     Did you put -g in CFLAGS and LDFLAGS? Valgrind seems to recognize
>>>>> >     some lines inside MAIN__ while failing to find the linked ones.
>>>>> >
>>>>> >     Best,
>>>>> >     Hsin-Yu
>>>>> >
>>>>> >     On 05/05/2016 09:48 AM, Thomas Markovich wrote:
>>>>> >     > Hi,
>>>>> >     >
>>>>> >     > I'm preparing to push my module that implements the Many Body
>>>>> >     > Dispersion van der Waals correction, and all associated forces. As a
>>>>> >     > last thing, I ran my code through valgrind, and it seems to have
>>>>> >     > popped up a couple of remaining things that I would like to fix
>>>>> >     > before release[1]. Unfortunately, the valgrind output below is less
>>>>> >     > than clear on where exactly the error is, and it doesn't give any
>>>>> >     > important line numbers. Beyond this, addr2line gives thoroughly
>>>>> >     > unhelpful output:
>>>>> >     > ▶ gaddr2line -e pw.x 0x100520518
>>>>> >     > ??:0.
>>>>> >     >
>>>>> >     > I have compiled QE given the following flags with gfortran 4.9:
>>>>> >     > FFLAGS         = -Og -g -pg -fopenmp -fbacktrace -fcheck=all -finit-real=nan -ffpe-trap=zero,invalid,zero,overflow
>>>>> >     >
>>>>> >     > Is there any way to compile QE such that it generates all the
>>>>> >     > debugging symbols, so that I can get more readable and informative
>>>>> >     > output from valgrind? I thought all I needed was the -g flag, but it
>>>>> >     > appears that I might need more?
>>>>> >     >
>>>>> >     > Best,
>>>>> >     > Thomas Markovich
>>>>> >     >
>>>>> >     > [1]
>>>>> >     > ==10233== Invalid write of size 8
>>>>> >     > ==10233==    at 0x100520518: __mbdvdw_module_MOD_mbdvdw_tgg_complex (in /Users/tmarkovich/bin/pw.x)
>>>>> >     > ==10233==    by 0x100555A88: __mbdvdw_module_MOD_mbdvdw_check_quantity_dh (in /Users/tmarkovich/bin/pw.x)
>>>>> >     > ==10233==  Address 0x103131248 is 8 bytes after a block of size 1,728 alloc'd
>>>>> >     > ==10233==    at 0x1011814AB: malloc (in /usr/local/Cellar/valgrind/HEAD/lib/valgrind/vgpreload_memcheck-amd64-darwin.so)
>>>>> >     > ==10233==    by 0x1005547B0: __mbdvdw_module_MOD_mbdvdw_check_quantity_dh (in /Users/tmarkovich/bin/pw.x)
>>>>> >     > ==10233==    by 0x1003D56E4: v_of_rho_ (in /Users/tmarkovich/bin/pw.x)
>>>>> >     > ==10233==    by 0x1000F540B: electrons_scf_ (in /Users/tmarkovich/bin/pw.x)
>>>>> >     > ==10233==    by 0x1000F6E18: electrons_ (in /Users/tmarkovich/bin/pw.x)
>>>>> >     > ==10233==    by 0x10032AA28: run_pwscf_ (in /Users/tmarkovich/bin/pw.x)
>>>>> >     > ==10233==    by 0x1000010BB: MAIN__ (pwscf.f90:30)
>>>>> >     > ==10233==    by 0x100B67C1F: main (pwscf.f90:14)
>>>>> >     _______________________________________________
>>>>> >     Q-e-developers mailing list
>>>>> >     Q-e-developers at qe-forge.org <mailto:Q-e-developers at qe-forge.org>
>>>>> >     http://qe-forge.org/mailman/listinfo/q-e-developers