[Q-e-developers] Using Valgrind and GDB with QE

Thu May 5 20:39:28 CEST 2016

Hi Ye,

Thank you for your response! I just ran GDB to see if it was an issue with
valgrind alone, but I'm getting similar behaviour:

(gdb) where
#0  0x0000000100267d24 in __mbdvdw_module_MOD_mbdvdw_compute_tdip ()
#1  0x000000010026a392 in __mbdvdw_module_MOD_mbdvdw_compute_tlr ()
#2  0x000000010026ac7f in __mbdvdw_module_MOD_mbdvdw_compute_tlr_complex ()
#3  0x000000010026c425 in __mbdvdw_module_MOD_mbdvdw_tgg_complex ()
#4  0x0000000100288e0a in __mbdvdw_module_MOD_mbdvdw_check_quantity_dh ()
#5  0x0000000000000000 in ?? ()

(gdb) info symbol 0x0000000100288e0a
__mbdvdw_module_MOD_mbdvdw_check_quantity_dh + 4475 in section .text of /Users/tmarkovich/Dropbox (Aspuru-Guzik Lab)/Projects/MBD/mbd_espresso/PW/src/pw.x
(gdb) info symbol 0x0000000100267d24
__mbdvdw_module_MOD_mbdvdw_compute_tdip + 62 in section .text of /Users/tmarkovich/Dropbox (Aspuru-Guzik Lab)/Projects/MBD/mbd_espresso/PW/src/pw.x
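
For readers following the addresses above: `info symbol` reports each address as "symbol + offset", so subtracting the offset gives the address where the routine starts. The snippet below is a minimal sketch of that arithmetic only. The helper name `start_of` is made up for illustration and is not part of GDB or of the original message.

```shell
#!/bin/sh
# start_of ADDRESS OFFSET
# Print ADDRESS - OFFSET in hex, i.e. the start of the routine that
# `info symbol` reported as "symbol + OFFSET". (Hypothetical helper.)
start_of() {
    printf '0x%x\n' $(( $1 - $2 ))
}

# Frame #4: __mbdvdw_module_MOD_mbdvdw_check_quantity_dh + 4475
start_of 0x0000000100288e0a 4475

# Frame #0: __mbdvdw_module_MOD_mbdvdw_compute_tdip + 62
start_of 0x0000000100267d24 62
```

The resulting start addresses could be compared against the symbol table of pw.x. GDB's `info line *ADDRESS` can also be tried on the frame addresses directly, though whether it returns a source line depends on line information being available for that code, which is exactly what is in question here.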

Note, this was done with GDB 7.11. I have pasted my make.sys file below.

Best,
Thomas

# make.sys.  Generated from make.sys.in by configure.

# compilation rules

.SUFFIXES :
.SUFFIXES : .o .c .f .f90

# most fortran compilers can directly preprocess c-like directives: use
# $(MPIF90) $(F90FLAGS) -c $<
# if explicit preprocessing by the C preprocessor is needed, use:
# $(CPP) $(CPPFLAGS) $< -o $*.F90
# $(MPIF90) $(F90FLAGS) -c $*.F90 -o $*.o
# remember the tabulator in the first column !!!

.f90.o:
	$(MPIF90) $(F90FLAGS) -c $<

# .f.o and .c.o: do not modify

.f.o:
	$(F77) $(FFLAGS) -c $<

.c.o:
	$(CC) $(CFLAGS)  -c $<

# topdir for linking espresso libs with plugins
TOPDIR = /Users/tmarkovich/Dropbox (Aspuru-Guzik Lab)/Projects/MBD/mbd_espresso

# DFLAGS  = precompilation options (possible arguments to -D and -U)
#           used by the C compiler and preprocessor
# FDFLAGS = as DFLAGS, for the f90 compiler
# See include/defs.h.README for a list of options and their meaning
# With the exception of IBM xlf, FDFLAGS = $(DFLAGS)
# For IBM xlf, FDFLAGS is the same as DFLAGS with separating commas

# MANUAL_DFLAGS  = additional precompilation option(s), if desired
#                  You may use this instead of tweaking DFLAGS and FDFLAGS
#                  BEWARE: will not work for IBM xlf! Manually edit FDFLAGS
MANUAL_DFLAGS  =
DFLAGS         =  -D__GFORTRAN -D__STD_F95 -D__FFTW -D__MPI -D__PARA -D__SCALAPACK -D__OPENMP $(MANUAL_DFLAGS)
FDFLAGS        = $(DFLAGS) $(MANUAL_DFLAGS)

# IFLAGS = how to locate directories where files to be included are
# In most cases, IFLAGS = -I../include

IFLAGS         = -I../include

# MOD_FLAGS = flag used by f90 compiler to locate modules
# Each Makefile defines the list of needed modules in MODFLAGS

MOD_FLAG      = -I

# Compilers: fortran-90, fortran-77, C
# If a parallel compilation is desired, MPIF90 should be a fortran-90
# compiler that produces executables for parallel execution using MPI
# (such as for instance mpif90, mpf90, mpxlf90,...);
# otherwise, an ordinary fortran-90 compiler (f90, g95, xlf90, ifort,...)
# If you have a parallel machine but no suitable candidate for MPIF90,
# try to specify the directory containing "mpif.h" in IFLAGS
# and to specify the location of MPI libraries in MPI_LIBS

MPIF90         = mpif90
#F90           = gfortran
CC             = gcc-5
F77            = gfortran

# C preprocessor and preprocessing flags - for explicit preprocessing,
# if needed (see the compilation rules above)
# preprocessing flags must include DFLAGS and IFLAGS

CPP            = cpp-5 -ansi -C
CPPFLAGS       =  $(DFLAGS) $(IFLAGS)

# compiler flags: C, F90, F77
# C flags must include DFLAGS and IFLAGS
# F90 flags must include MODFLAGS, IFLAGS, and FDFLAGS with appropriate syntax

CFLAGS         = -Og -g $(DFLAGS) $(IFLAGS)
F90FLAGS       = $(FFLAGS) -x f95-cpp-input -fopenmp $(FDFLAGS) $(IFLAGS) $(MODFLAGS)
FFLAGS         = -Og -g -fopenmp -Wall -Wextra -Warray-temporaries -Wconversion -fbacktrace -ffree-line-length-0 -finit-real=nan -ffpe-trap=zero,invalid,zero,overflow

# compiler flags without optimization for fortran-77
# the latter is NEEDED to properly compile dlamch.f, used by lapack

FFLAGS_NOOPT   = -O0 -g

# compiler flag needed by some compilers when the main is not fortran
# Currently used for Yambo

FFLAGS_NOMAIN   =

# Linker, linker-specific flags (if any)
# Typically LD coincides with F90 or MPIF90, LD_LIBS is empty

LD             = mpif90
LDFLAGS        = -g -pthread -fopenmp
LD_LIBS        =

# External Libraries (if any) : blas, lapack, fft, MPI

# If you have nothing better, use the local copy :
# BLAS_LIBS = /your/path/to/espresso/BLAS/blas.a
# BLAS_LIBS_SWITCH = internal

BLAS_LIBS      =  -lblas
BLAS_LIBS_SWITCH = external

# If you have nothing better, use the local copy :
# LAPACK_LIBS = /your/path/to/espresso/lapack-3.2/lapack.a
# LAPACK_LIBS_SWITCH = internal
# For IBM machines with essl (-D__ESSL): load essl BEFORE lapack !
# remember that LAPACK_LIBS precedes BLAS_LIBS in loading order

LAPACK_LIBS    =  -llapack  -lblas
LAPACK_LIBS_SWITCH = external

ELPA_LIBS_SWITCH = disabled
SCALAPACK_LIBS = -lscalapack

# nothing needed here if the internal copy of FFTW is compiled
# (needs -D__FFTW in DFLAGS)

FFT_LIBS       =

# For parallel execution, the correct path to MPI libraries must
# be specified in MPI_LIBS (except for IBM if you use mpxlf)

MPI_LIBS       =

# IBM-specific: MASS libraries, if available and if -D__MASS is defined in FDFLAGS

MASS_LIBS      =

# ar command and flags - for most architectures: AR = ar, ARFLAGS = ruv

AR             = ar
ARFLAGS        = ruv

# ranlib command. If ranlib is not needed (it isn't in most cases) use
# RANLIB = echo

RANLIB         = ranlib -c

# all internal and external libraries - do not modify

FLIB_TARGETS   = all

LIBOBJS        = ../flib/ptools.a ../flib/flib.a ../clib/clib.a ../iotk/src/libiotk.a
LIBS           = $(SCALAPACK_LIBS) $(LAPACK_LIBS) $(FFT_LIBS) $(BLAS_LIBS) $(MPI_LIBS) $(MASS_LIBS) $(LD_LIBS)

# wget or curl - useful to download from network
WGET = wget -O

On Thu, May 5, 2016 at 2:29 PM, Ye Luo <xw111luoye at gmail.com> wrote:

> Hi Thomas,
>
> Sorry that I didn't catch the exact point you were asking about.
> I just want to add a general comment.
> In my experience, -g should be sufficient for your purpose.
> All the profiling and debugging tools I use, including DDT, gdb,
> HPCToolkit, and VTune, only require the code to be compiled with -g, and
> they can provide information about the call stack and line-specific details.
>
> Ye
>
> ===================
> Ye Luo, Ph.D.
> Leadership Computing Facility
> Argonne National Laboratory
>
> 2016-05-05 12:40 GMT-05:00 Thomas Markovich <thomasmarkovich at gmail.com>:
>
>> Hi Ye,
>>
>> Thank you for your comment!  __mbdvdw_module_MOD_mbdvdw_tgg_complex is a
>> subroutine, not an array. It appears that the invalid write is happening
>> somewhere inside tgg_complex; presumably the variable is stored at 0x103131248.
>> I was hoping that, when compiled with debug symbols, valgrind would be able
>> to tell me the line number within tgg_complex.
>>
>> Best,
>> Thomas
>>
>> On Thu, May 5, 2016 at 1:31 PM, Ye Luo <xw111luoye at gmail.com> wrote:
>>
>>> It doesn't seem to be a libc bug.
>>> The call stack shows that, in the subroutine check_quantity_dh of your
>>> module mbdvdw,
>>> the code failed while writing something into tgg_complex, which seems to be
>>> a variable that belongs to your module.
>>> Is this variable dynamically allocatable and not initialized?
>>>
>>> Ye
>>>
>>> ===================
>>> Ye Luo, Ph.D.
>>> Leadership Computing Facility
>>> Argonne National Laboratory
>>>
>>> 2016-05-05 12:14 GMT-05:00 Hsin-Yu Ko <hsinyu at princeton.edu>:
>>>
>>>> Thomas,
>>>>
>>>> That is interesting. What you are seeing seems to be a libc bug [1]. I
>>>> encountered something similar last month. I fixed the problem on my
>>>> machine by recompiling glibc with debug features enabled (I am not sure
>>>> how useful the Gentoo documentation is, but I put the reference here just in
>>>> case [2]). I think removing -pg may be a quick fix according to [1].
>>>>
>>>> Best,
>>>> Hsin-Yu
>>>>
>>>> [1] http://valgrind-users.narkive.com/MPnV7HOw/gcc-pg-valgrind-errors
>>>> [2] https://wiki.gentoo.org/wiki/Debugging
>>>>
>>>> On 05/05/2016 12:34 PM, Thomas Markovich wrote:
>>>> > Hsin-Yu,
>>>> >
>>>> > Thank you for the suggestion!
>>>> >
>>>> > I had -g in LDFLAGS:
>>>> > LDFLAGS        = -g -pthread -fopenmp
>>>> >
>>>> > but nothing equivalent in CFLAGS, which was defined as:
>>>> > CFLAGS         = -O3 $(DFLAGS) $(IFLAGS)
>>>> >
>>>> > I have since gone ahead and changed CFLAGS to
>>>> > CFLAGS         = -Og -g $(DFLAGS) $(IFLAGS)
>>>> >
>>>> > The resulting fortran compile statements look something like:
>>>> > mpif90 -Og -g -pg -fopenmp -Wall -Wextra -Warray-temporaries
>>>> > -Wconversion -fbacktrace -ffree-line-length-0 -finit-real=nan
>>>> > -ffpe-trap=zero,invalid,zero,overflow -x f95-cpp-input -fopenmp
>>>> > -D__GFORTRAN -D__STD_F95 -D__FFTW -D__MPI -D__PARA -D__SCALAPACK
>>>> > -D__OPENMP   -I../include -I../iotk/src -I../ELPA/src -I. -c mbdvdw.f90
>>>> >
>>>> > and reran valgrind. It gave the following output:
>>>> >
>>>> > ==30486== Invalid write of size 8
>>>> > ==30486==    at 0x1002735ED: __mbdvdw_module_MOD_mbdvdw_tgg_complex (in /Users/tmarkovich/bin/pw.x)
>>>> > ==30486==    by 0x100290F58: __mbdvdw_module_MOD_mbdvdw_check_quantity_dh (in /Users/tmarkovich/bin/pw.x)
>>>> > ==30486==  Address 0x1037989f0 is 0 bytes after a block of size 1,728 alloc'd
>>>> > ==30486==    at 0x10092B4AB: malloc (in /usr/local/Cellar/valgrind/HEAD/lib/valgrind/vgpreload_memcheck-amd64-darwin.so)
>>>> > ==30486==    by 0x10028FEE2: __mbdvdw_module_MOD_mbdvdw_check_quantity_dh (in /Users/tmarkovich/bin/pw.x)
>>>> > ==30486==    by 0x1001D4567: v_of_rho_ (in /Users/tmarkovich/bin/pw.x)
>>>> > ==30486==    by 0x10007C0BE: electrons_scf_ (in /Users/tmarkovich/bin/pw.x)
>>>> > ==30486==    by 0x10007D385: electrons_ (in /Users/tmarkovich/bin/pw.x)
>>>> > ==30486==    by 0x10018B30B: run_pwscf_ (in /Users/tmarkovich/bin/pw.x)
>>>> > ==30486==    by 0x100001157: MAIN__ (pwscf.f90:30)
>>>> > ==30486==    by 0x1004EC496: main (pwscf.f90:14)
>>>> >
>>>> > This appears to not have changed much.
>>>> >
>>>> > Best,
>>>> > Thomas
>>>> >
>>>> > On Thu, May 5, 2016 at 10:15 AM, Hsin-Yu Ko <hsinyu at princeton.edu> wrote:
>>>> >
>>>> >     Hi Thomas,
>>>> >
>>>> >     Did you put -g in CFLAGS and LDFLAGS? Valgrind seems to recognize some
>>>> >     lines inside MAIN__ while failing to find the linked ones.
>>>> >
>>>> >     Best,
>>>> >     Hsin-Yu
>>>> >
>>>> >     On 05/05/2016 09:48 AM, Thomas Markovich wrote:
>>>> >     > Hi,
>>>> >     >
>>>> >     > I'm preparing to push my module that implements the Many Body
>>>> >     > Dispersion van der Waals correction, and all associated forces. As a
>>>> >     > last thing, I ran my code through valgrind, and it seems to have
>>>> >     > popped up a couple of remaining things that I would like to fix
>>>> >     > before release[1]. Unfortunately, the valgrind output below is less
>>>> >     > than clear on where exactly the error is, and it doesn't give any
>>>> >     > important line numbers. Beyond this, addr2line gives thoroughly
>>>> >     > unhelpful output:
>>>> >     > ▶ gaddr2line -e pw.x 0x100520518
>>>> >     > ??:0.
>>>> >     >
>>>> >     > I have compiled QE given the following flags with gfortran 4.9:
>>>> >     > FFLAGS         = -Og -g -pg -fopenmp -fbacktrace -fcheck=all
>>>> >     > -finit-real=nan -ffpe-trap=zero,invalid,zero,overflow
>>>> >     >
>>>> >     > Is there any way to compile QE such that it generates all the
>>>> >     > debugging symbols, so that I can get more readable and informative
>>>> >     > output from valgrind? I thought all I needed was the -g flag, but it
>>>> >     > appears that I might need more?
>>>> >     >
>>>> >     > Best,
>>>> >     > Thomas Markovich
>>>> >     >
>>>> >     > [1]
>>>> >     > ==10233== Invalid write of size 8
>>>> >     > ==10233==    at 0x100520518: __mbdvdw_module_MOD_mbdvdw_tgg_complex (in /Users/tmarkovich/bin/pw.x)
>>>> >     > ==10233==    by 0x100555A88: __mbdvdw_module_MOD_mbdvdw_check_quantity_dh (in /Users/tmarkovich/bin/pw.x)
>>>> >     > ==10233==  Address 0x103131248 is 8 bytes after a block of size 1,728 alloc'd
>>>> >     > ==10233==    at 0x1011814AB: malloc (in /usr/local/Cellar/valgrind/HEAD/lib/valgrind/vgpreload_memcheck-amd64-darwin.so)
>>>> >     > ==10233==    by 0x1005547B0: __mbdvdw_module_MOD_mbdvdw_check_quantity_dh (in /Users/tmarkovich/bin/pw.x)
>>>> >     > ==10233==    by 0x1003D56E4: v_of_rho_ (in /Users/tmarkovich/bin/pw.x)
>>>> >     > ==10233==    by 0x1000F540B: electrons_scf_ (in /Users/tmarkovich/bin/pw.x)
>>>> >     > ==10233==    by 0x1000F6E18: electrons_ (in /Users/tmarkovich/bin/pw.x)
>>>> >     > ==10233==    by 0x10032AA28: run_pwscf_ (in /Users/tmarkovich/bin/pw.x)
>>>> >     > ==10233==    by 0x1000010BB: MAIN__ (pwscf.f90:30)
>>>> >     > ==10233==    by 0x100B67C1F: main (pwscf.f90:14)
>>>> >     _______________________________________________
>>>> >     Q-e-developers mailing list
>>>> >     Q-e-developers at qe-forge.org <mailto:Q-e-developers at qe-forge.org>
>>>> >     http://qe-forge.org/mailman/listinfo/q-e-developers
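
A note on the two valgrind reports quoted above: both place the invalid 8-byte write just past a heap block of 1,728 bytes (8 bytes after the block in the first report, 0 bytes after it in the second). The snippet below is only a sketch of how those figures can be read as array indices. The element sizes of 8 and 16 bytes are assumptions (for example a double-precision real versus a double-precision complex array); the thread never states the array's type. The helper names `elems` and `index_past_end` are made up for illustration.

```shell
#!/bin/sh
# Figure copied from both valgrind reports: "block of size 1,728".
block_size=1728

# elems SIZE: number of elements the block holds if each element
# occupies SIZE bytes. (SIZE is an assumption, not given in the thread.)
elems() {
    echo $(( block_size / $1 ))
}

# index_past_end BYTES_AFTER SIZE: 1-based index of the element that
# contains the byte lying BYTES_AFTER bytes past the end of the block.
index_past_end() {
    echo $(( (block_size + $1) / $2 + 1 ))
}

echo "elements if 8 bytes each:  $(elems 8)"
echo "elements if 16 bytes each: $(elems 16)"
echo "first report  (8 bytes after), 8-byte elements:  index $(index_past_end 8 8)"
echo "second report (0 bytes after), 8-byte elements:  index $(index_past_end 0 8)"
echo "first report  (8 bytes after), 16-byte elements: index $(index_past_end 8 16)"
echo "second report (0 bytes after), 16-byte elements: index $(index_past_end 0 16)"
```

Under either assumption the write lands one or two elements beyond the end of the allocation. That would be consistent with an off-by-one bound in a loop or in an allocate statement, though nothing in the thread confirms the cause, and the two reports come from builds with different flags.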