[Pw_forum] qe-gipaw 6.0 Segmentation fault

Davide Ceresoli davide.ceresoli at cnr.it
Mon Jan 30 12:03:52 CET 2017


Dear Yasser,
     I'm glad it works! Interesting: even though you link the gfortran
interface layer of MKL (mkl_gf_lp64), zdotc still needs a workaround.
It also seems to me that you are already using the FFT from MKL (-D__DFTI).
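
For the record, here is a minimal sketch of what a hand-written replacement
for zdotc could look like. It is not the actual zdotc_wrapper shipped with
QE; the names dotc_sketch and my_zdotc are made up for illustration. Since
the function is compiled by gfortran itself, its result does not depend on
how the BLAS library returns complex values.

```fortran
! Illustrative sketch only: dotc_sketch and my_zdotc are hypothetical names,
! not part of QE or of any BLAS library.
module dotc_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
contains
  ! Conjugated dot product sum(conjg(x(i)) * y(i)), with the same
  ! argument list as the BLAS function zdotc.
  function my_zdotc(n, x, incx, y, incy) result(res)
    integer, intent(in) :: n, incx, incy
    complex(dp), intent(in) :: x(*), y(*)
    complex(dp) :: res
    integer :: i, ix, iy
    res = (0.0_dp, 0.0_dp)
    if (n <= 0) return
    ! For negative strides start from the far end, as reference BLAS does.
    ix = 1
    iy = 1
    if (incx < 0) ix = 1 - (n - 1) * incx
    if (incy < 0) iy = 1 - (n - 1) * incy
    do i = 1, n
      res = res + conjg(x(ix)) * y(iy)
      ix = ix + incx
      iy = iy + incy
    end do
  end function my_zdotc
end module dotc_sketch

program demo
  use dotc_sketch
  implicit none
  complex(dp) :: a(3), b(3)
  a = [(1.0_dp, 2.0_dp), (0.0_dp, -1.0_dp), (3.0_dp, 0.0_dp)]
  b = [(2.0_dp, 0.0_dp), (1.0_dp, 1.0_dp), (0.0_dp, 4.0_dp)]
  ! Both lines print the same value, 1 + 9i.
  print *, my_zdotc(3, a, 1, b, 1)
  print *, sum(conjg(a) * b)
end program demo
```

The second print statement uses the same array expression as the one-line
change suggested for greenfunction.f90, so the two forms can be compared
directly.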

Best,
     Davide



On 01/29/2017 04:21 PM, Yasser Fowad AlWahedi wrote:
> Dear Paolo and Davide,
>
> Thank you so much for your kind help. I have tested solutions 1, 2 and 4, since solution 3 was not applicable. Two of the suggested solutions worked. For the record, I give below the details of my trials and their outcomes, to help anyone who might face this problem in the future.
>
>
> My GNU compiler version is: GNU Fortran (Ubuntu 5.4.0-6ubuntu1~16.04.4) 5.4.0 20160609
>
> The successful make.inc file is attached.
>
>
> Solution 1 (failed): I modified greenfunction.f90 in the src directory of gipaw to:
>
> eprec (ibnd) = 1.35d0 * sum(conjg(evq(1:npw,ibnd))*work(1:npw)) !zdotc (npw, evq (1, ibnd), 1, work, 1)
>
> I recompiled gipaw.x and tested again. This time I got the following error:
>
> Program received signal SIGSEGV: Segmentation fault - invalid memory reference.
>
> Backtrace for this error:
> #0  0x7F6A0A0A6E08
> #1  0x7F6A0A0A5F90
> #2  0x7F6A095BB4AF
> #3  0x7F6A0CF5B2CD
> #4  0x412E66 in cgsolve_all_ at cgsolve_all.f90:148 (discriminator 2)
> #5  0x411E6A in greenfunction_ at greenfunction.f90:257
> #6  0x43070A in paramagnetic_correction_aug_ at nmr_routines.f90:375
> #7  0x41E9C9 in suscept_crystal_inner_qzero at suscept_crystal.f90:469
> #8  0x4053F2 in MAIN__ at gipaw_main.f90:155
>
> [The output of the 10 MPI processes was interleaved. Each process printed the same backtrace, differing only in the addresses of frames #0 to #3; one copy is shown.]
>
> Solution 2: I used the bundled Netlib math libraries and compiled everything as is with the GNU compilers, without any change to greenfunction.f90.
>
> This worked fine and I managed to get the magres file.
>
> Of course I would prefer to use Intel MKL, since in the benchmark cases I ran it was 3 to 5 times faster than Netlib.
>
> Solution 4: Add -Dzdotc=zdotc_wrapper to DFLAGS in make.inc, then compile with the GNU compilers, Intel MKL and FFTW3.
>
> This solution worked perfectly without any problems and produced the same result as solution 2. Thanks again for your help.
>
> One small question: to use the FFT library of Intel MKL, what libraries should I add to the make.inc file?
>
> Yasser
>
> ________________________________________
> From: pw_forum-bounces at pwscf.org [pw_forum-bounces at pwscf.org] on behalf of Paolo Giannozzi [p.giannozzi at gmail.com]
> Sent: Sunday, January 29, 2017 1:19 PM
> To: PWSCF Forum
> Subject: Re: [Pw_forum] qe-gipaw 6.0 Segmentation fault
>
> 3) re-link with -lmkl_gf_lp64 instead of -lmkl_intel_lp64, or
> 4) re-compile with preprocessing option -Dzdotc=zdotc_wrapper added to DFLAGS
>
> Paolo
>
> On Sun, Jan 29, 2017 at 10:07 AM, Davide Ceresoli
> <davide.ceresoli at cnr.it> wrote:
>> Dear Yasser,
>>      can you tell us the compiler version, and send the generated
>> make.inc? The crash is on the very first call to 'zdotc' which
>> returns a complex (128 bit) and I suspect an incompatibility
>> between GNU compilers and Intel MKL.
>>
>> While I'm checking this, could you try two things?
>> 1) modify greenfunction.f90:209 from:
>> eprec (ibnd) = 1.35d0 * zdotc (npw, evq (1, ibnd), 1, work, 1)
>> to:
>> eprec (ibnd) = 1.35d0 * sum( conjg(evq(1:npw,ibnd)) * work(1:npw) )
>>
>> 2) recompile without MKL: configure with --with-netlib, or with any
>> other BLAS/LAPACK implementation.
>>
>> Best,
>>      Davide
>>
>>
>>
>>
>> On 01/27/2017 03:23 PM, Yasser Fowad AlWahedi wrote:
>>> Dear all,
>>>
>>> I have compiled QE 6.0 with the GNU compilers and Intel MKL for all libraries
>>> except FFT, for which I used FFTW3. I used MPICH for the MPI libraries.
>>>
>>> I also compiled successfully gipaw version 6.0 using the same libraries. I
>>> started running the quartz example. SCF went successfully without any problems.
>>>
>>> GIPAW failed and gave the following error:
>>>
>>> Program received signal SIGSEGV: Segmentation fault - invalid memory reference.
>>>
>>> Backtrace for this error:
>>> #0  0x7F050C81AE08
>>> #1  0x7F050C819F90
>>> #2  0x7F050BD2F4AF
>>> #3  0x7F050F6CF2CD
>>> #4  0x411A17 in greenfunction_ at greenfunction.f90:209
>>> #5  0x43069A in paramagnetic_correction_aug_ at nmr_routines.f90:375
>>> #6  0x41E959 in suscept_crystal_inner_qzero at suscept_crystal.f90:469
>>> #7  0x4053F2 in MAIN__ at gipaw_main.f90:155
>>> Segmentation fault (core dumped)
>>>
>>> I ran using 1 core and 10 cores, both gave similar errors.
>>>
>>> I checked other posts and it seems a similar problem existed before in version
>>> 5.4. I assumed it was fixed in version 6.0. Can you please advise on what I can
>>> do to address this problem?
>>>
>>> Yasser Al Wahedi
>>> Assistant Professor
>>> Petroleum Institute
>>>
>> _______________________________________________
>> Pw_forum mailing list
>> Pw_forum at pwscf.org
>> http://pwscf.org/mailman/listinfo/pw_forum
>
>
>
> --
> Paolo Giannozzi, Dip. Scienze Matematiche Informatiche e Fisiche,
> Univ. Udine, via delle Scienze 208, 33100 Udine, Italy
> Phone +39-0432-558216, fax +39-0432-558222
>

-- 
+--------------------------------------------------------------+
   Davide Ceresoli
   CNR Institute of Molecular Science and Technology (CNR-ISTM)
   c/o University of Milan, via Golgi 19, 20133 Milan, Italy
   Email: davide.ceresoli at cnr.it
   Phone: +39-02-50314276, +39-347-1001570 (mobile)
   Skype: dceresoli
   Website: http://sites.google.com/site/dceresoli/
+--------------------------------------------------------------+
