[Pw_forum] Crash on wf_collect in multi-pool spin-polarized calcs

Paolo Giannozzi paolo.giannozzi at uniud.it
Fri Feb 6 17:17:27 CET 2015


Thanks for reporting this. Please replace PW/src/pw_restart.f90
with the newest version that you find here (thanks to Andrea Dal
Corso who spotted the bug):
http://www.qe-forge.org/gf/project/q-e/scmsvn/?action=browse&path=%2F%2Acheckout%2A%2Ftrunk%2Fespresso%2FPW%2Fsrc%2Fpw_restart.f90&revision=11365

Paolo

On Wed, 2015-02-04 at 15:05 -0600, Peter Scherpelz wrote:

> No, I've observed no similar crashes in the spin-unpolarized case, and I 
> just re-checked two spin-unpolarized toy models and saw no errors (2 
> k-points / 2 pools, also 7 k-points / 7 pools).
> 
> Peter
> 
> On 02/04/2015 02:46 PM, Paolo Giannozzi wrote:
> > It looks like yet another case of the usual mess with wavefunctions
> > a. kept in memory, b. stored to a memory buffer, c. written to file.
> > With one k-point per pool we are in case a., but apparently the code
> > thinks it is in case b., which falls back to case c. Does this also
> > happen in the spin-unpolarized case with # of k-points = # of pools?
> >
> > Paolo
> >
> > On Wed, 2015-02-04 at 12:59 -0600, Peter Scherpelz wrote:
> >> Hello,
> >>
> >> I'm hitting a crash that I've traced to a fairly particular set of
> >> circumstances, and want to check if this is a known and/or reproducible
> >> bug beyond what I've found.
> >>
> >> In detail: I've been running parallelized, spin-polarized pw.x
> >> calculations (scf and relax) with Quantum ESPRESSO v5.1, using the MPI
> >> version on either a single node or a cluster. I find that Quantum
> >> ESPRESSO crashes with a davcio error during the wf_collect stage of the
> >> computation, but only if the number of pools I'm using is equal to the
> >> total number of k-points after spin polarization is considered (e.g.,
> >> gamma only with 2 pools, or 2 distinct k-point locations with 4 pools).
> >>
> >> If I run on half that many pools, I do not get a crash. If I run on an
> >> equal number of pools but double the number of k-points, I also do not
> >> get a crash. If I set wf_collect to false, I also do not get a crash.
> >>
> >> I've attached a toy model (Si crystal) that exhibits this behavior, and
> >> I can include the successful runs with the alternate configurations if
> >> that helps.
> >>
> >> Thanks in advance for your help!  And thanks overall to the developers
> >> for the program - I'm a fairly new user and it's been working great
> >> otherwise.
> >>
> >> Best,
> >> Peter Scherpelz
> >> _______________________________________________
> >> Pw_forum mailing list
> >> Pw_forum at pwscf.org
> >> http://pwscf.org/mailman/listinfo/pw_forum
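
The trigger condition described in the report above can be summarized in a few lines. This is only a sketch of the reported symptom, not code from pw.x: the function names are hypothetical, and it assumes that a spin-polarized run doubles the k-point list (one copy per spin), which is how the report counts "k-points after spin polarization". It applies to v5.1 before the pw_restart.f90 fix mentioned in the reply.

```python
def kpoints_after_spin(n_kpoints: int, spin_polarized: bool) -> int:
    """Number of k-points distributed over pools.

    Assumption: a spin-polarized run treats each k-point twice
    (once per spin), as counted in the original report.
    """
    return 2 * n_kpoints if spin_polarized else n_kpoints


def matches_reported_crash(n_kpoints: int, n_pools: int,
                           spin_polarized: bool,
                           wf_collect: bool = True) -> bool:
    """True if a run matches the configuration reported to crash.

    Reported conditions: spin-polarized, wf_collect enabled, and
    exactly one k-point per pool after spin doubling.
    """
    if not (spin_polarized and wf_collect):
        return False
    return n_pools == kpoints_after_spin(n_kpoints, spin_polarized)


# Cases taken from the thread.
print(matches_reported_crash(1, 2, spin_polarized=True))   # gamma only, 2 pools
print(matches_reported_crash(2, 4, spin_polarized=True))   # 2 k-points, 4 pools
print(matches_reported_crash(2, 2, spin_polarized=True))   # half as many pools
print(matches_reported_crash(2, 2, spin_polarized=False))  # unpolarized toy model
```

The first two calls correspond to the crashing configurations in the report; the last two correspond to runs reported to finish without error.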




