[QE-users] segfault with HSE06

Fabrizio Ferrari ferrariruffino.fz at gmail.com
Tue Jul 9 13:40:10 CEST 2019


Yes!

On Tue, 9 Jul 2019 at 13:39, Michal Krompiec <michal.krompiec at gmail.com> wrote:

> Hi Fabrizio,
> Thanks. Do you mean replacing this: (lines 466-475)
>    IF(DoLoc) then
> !$omp parallel do collapse(3) default(shared) &
> !$omp& firstprivate(npol,nrxxs,nkqs,ibnd_buff_start,ibnd_buff_end) &
> !$omp& private(ir,ibnd,ikq,ipol)
>       DO ikq=1,SIZE(locbuff,3)
>          DO ibnd=1, x_nbnd_occ
>             DO ir=1,nrxxs*npol
>                locbuff(ir,ibnd,ikq)=0.0_DP
>             ENDDO
>          ENDDO
>       ENDDO
>     ELSE
>
> with the following:
>
>     IF(DoLoc) then
>       IF(gamma_only) then
> !$omp parallel do collapse(3) default(shared) &
> !$omp& firstprivate(npol,nrxxs,nkqs,ibnd_buff_start,ibnd_buff_end) &
> !$omp& private(ir,ibnd,ikq,ipol)
>         DO ikq=1,SIZE(locbuff,3)
>            DO ibnd=1, x_nbnd_occ
>               DO ir=1,nrxxs*npol
>                  locbuff(ir,ibnd,ikq)=0.0_DP
>               ENDDO
>            ENDDO
>         ENDDO
>       END IF
>     ELSE
>
> ?
>
> On Tue, 9 Jul 2019 at 11:37, Fabrizio Ferrari <ferrariruffino.fz at gmail.com>
> wrote:
>
>> Hello,
>> if you put the loop at line 467 of PW/src/exx.f90 inside an
>> 'IF(gamma_only)', the segfault should disappear. I'm just checking if that
>> is the only fix needed.
>>
>> Fabrizio
>>
>> On Tue, Jul 9, 2019 at 12:27 PM Michal Krompiec <michal.krompiec at gmail.com> wrote:
>>
>>> I got a similar segfault using a fresh installation of QE 6.4.1, on a
>>> different HPC, this time with Intel 2018 compilers and ELPA, with the same
>>> input file and pseudos (SG15) as previously.
>>> I noticed that my molecule is a bit too high in the simulation cell (vs.
>>> the potential added by assume_isolated), but increasing the size of the
>>> cell in the Z direction changed nothing.
>>> Switching off assume_isolated also didn't help.
>>>
>>> from stdout:
>>>
>>> forrtl: severe (174): SIGSEGV, segmentation fault occurred
>>> Image              PC                Routine            Line        Source
>>> pw.x               00000000056A6D8D  Unknown            Unknown     Unknown
>>> libpthread-2.17.s  00002AC70727E5E0  Unknown            Unknown     Unknown
>>> pw.x               00000000005B3618  Unknown            Unknown     Unknown
>>> pw.x               00000000005AE060  Unknown            Unknown     Unknown
>>> pw.x               000000000040ADFD  Unknown            Unknown     Unknown
>>> pw.x               000000000059D83D  Unknown            Unknown     Unknown
>>> pw.x               00000000004086C9  Unknown            Unknown     Unknown
>>> pw.x               000000000040851E  Unknown            Unknown     Unknown
>>> libc-2.17.so       00002AC7077AEC05  __libc_start_main  Unknown     Unknown
>>> pw.x               0000000000408429  Unknown            Unknown     Unknown
>>> forrtl: severe (174): SIGSEGV, segmentation fault occurred
>>>
>>> Last few lines of the log file:
>>>
>>>      convergence has been achieved in   9 iterations
>>>
>>>      Using localization algorithm with threshold:   0.50D-02
>>>
>>>      Using ACE for calculation of exact exchange
>>>
>>>      EXX grid:  1427071 G-vectors     FFT dimensions: (  90,  90, 375)
>>>
>>>  NBands =           15  nks =            1  nkqs =            9
>>>      Canonical Orbitals
>>>
>>> Any suggestions?
>>>
>>> Thanks,
>>> Michal
>>>
>>>
>>> On Mon, 8 Jul 2019 at 13:27, Michal Krompiec <michal.krompiec at gmail.com>
>>> wrote:
>>>
>>>> Hello,
>>>> I'm getting a segmentation fault when trying to run an HSE06 SCF
>>>> calculation in QE 6.4rc (built with gcc and OpenMPI). I got the same result
>>>> regardless of the number of OMP threads (1-2) or MPI processes, and it is not
>>>> because I'm running out of memory. Increasing OMP_STACK_SIZE didn't help.
>>>> I'm using SG15 norm-conserving pseudopotentials.
>>>> This is the error message:
>>>>
>>>> Program received signal SIGSEGV: Segmentation fault - invalid memory
>>>> reference.
>>>>
>>>> Backtrace for this error:
>>>>
>>>> Backtrace for this error:
>>>> #0  0x2ba41caf6607 in ???
>>>> #1  0x2ba41caf586d in ???
>>>> #2  0x2ba41d9d5fdf in ???
>>>> #1  0x2ba41caf586d in ???
>>>> #2  0x2ba41d9d5fdf in ???
>>>> #3  0x481031 in __exx_MOD_exxinit._omp_fn.41
>>>>         at /home/hpcadmin/Cluster-packages/Apps/QE/q-e-qe-6.4-rc/PW/src/exx.f90:471
>>>>
>>>>
>>>> And this is the input file:
>>>> &CONTROL
>>>>    nstep            = 150
>>>>    prefix           = 'pz'
>>>>    calculation = 'scf'
>>>> /
>>>> &SYSTEM
>>>>    ecutwfc          = 60
>>>>    ecutrho          = 240
>>>>    occupations      = 'smearing'
>>>>    degauss          = 0.03
>>>>    smearing         = 'marzari-vanderbilt'
>>>>    assume_isolated  = '2D'
>>>>    ntyp             = 3
>>>>    nat              = 10
>>>>    ibrav            = 0
>>>>    vdw_corr='dft-d3'
>>>>    nosym = .true.
>>>>    input_dft = 'hse'
>>>>    localization_thr = 0.005
>>>>
>>>> /
>>>> &ELECTRONS
>>>> electron_maxstep = 1000
>>>> mixing_mode  = 'plain'
>>>> mixing_beta = 0.3
>>>> mixing_ndim = 10
>>>> /
>>>> &IONS
>>>> ion_dynamics = 'bfgs'
>>>> /
>>>>
>>>> ATOMIC_SPECIES
>>>> H 1 H_ONCV_PBE-1.0.upf
>>>> C 12 C_ONCV_PBE-1.0.upf
>>>> N 14 N_ONCV_PBE-1.0.upf
>>>>
>>>> K_POINTS automatic
>>>> 3 3 1  0 0 0
>>>>
>>>> CELL_PARAMETERS angstrom
>>>> 9.45000000000000 0.00000000000000 0.0
>>>> 0.00000000000000 8.90954544295050 0.0
>>>> 0.00000000000000 0.00000000000000 40.0
>>>>
>>>> ATOMIC_POSITIONS angstrom
>>>> N        3.220157348   4.070243213   7.161850862
>>>> C        2.057591707   4.250681378   7.817575738
>>>> C        4.333273229   4.171146830   7.913403134
>>>> C        2.009345068   4.525337360   9.193044239
>>>> C        4.285082283   4.446195607   9.288766365
>>>> N        3.122476937   4.626559339   9.944547282
>>>> H        1.139008253   4.174089393   7.227702259
>>>> H        5.290517910   4.028527369   7.402554504
>>>> H        1.052120461   4.668162403   9.703968061
>>>> H        5.203598139   4.523590493   9.878655120
>>>>
>>>>
>>>> I would be grateful for any suggestions. In the meantime, we are
>>>> upgrading to 6.4.1 to see if this helps.
>>>>
>>>> Best regards,
>>>>
>>>> Michal Krompiec
>>>>
>>>> Merck KGaA, Darmstadt, Germany & University of Southampton
>>>>
>>> _______________________________________________
>>> Quantum ESPRESSO is supported by MaX (www.max-centre.eu/quantum-espresso)
>>> users mailing list users at lists.quantum-espresso.org
>>> https://lists.quantum-espresso.org/mailman/listinfo/users