[Pw_forum] Crash on wf_collect in multi-pool spin-polarized calcs

Peter Scherpelz pscherpelz at uchicago.edu
Thu Feb 12 00:17:42 CET 2015


A belated thank-you for this fix -- I finally got around to compiling 
the newest version, and it seems good.

Peter

On 02/06/2015 10:17 AM, Paolo Giannozzi wrote:
> Thanks for reporting this. Please replace PW/src/pw_restart.f90
> with the newest version that you find here (thanks to Andrea Dal
> Corso who spotted the bug):
> http://www.qe-forge.org/gf/project/q-e/scmsvn/?action=browse&path=%2F%2Acheckout%2A%2Ftrunk%2Fespresso%2FPW%2Fsrc%2Fpw_restart.f90&revision=11365
>
> Paolo
> On Wed, 2015-02-04 at 15:05 -0600, Peter Scherpelz wrote:
>
>> No, I've observed no similar crashes in the spin-unpolarized case, and I
>> just re-checked two spin-unpolarized toy models and saw no errors (2
>> k-points / 2 pools, also 7 k-points / 7 pools).
>>
>> Peter
>>
>> On 02/04/2015 02:46 PM, Paolo Giannozzi wrote:
>>> It looks like yet another case of the usual mess with wavefunctions
>>> a. kept in memory, b. stored to a memory buffer, c. written to file.
>>> With one k-point per pool we are in case a., but apparently the code
>>> thinks it is in case b., which falls back to case c. Does this happen
>>> also in the spin-unpolarized case with # of k-points = # of pools?
>>>
>>> Paolo
>>>
>>> On Wed, 2015-02-04 at 12:59 -0600, Peter Scherpelz wrote:
>>>> Hello,
>>>>
>>>> I'm hitting a crash that I've traced to a fairly particular set of
>>>> circumstances, and want to check if this is a known and/or reproducible
>>>> bug beyond what I've found.
>>>>
>>>> In detail: I've been running parallelized, spin-polarized pw.x
>>>> calculations (scf and relax) with Quantum ESPRESSO v5.1, using the MPI
>>>> version on either a single node or a cluster. I find that Quantum ESPRESSO
>>>> crashes with a davcio error during the wf_collect stage of the
>>>> computation, but only if the number of pools I'm using is equal to the
>>>> total number of k-points after spin polarization is considered (e.g.,
>>>> gamma only with 2 pools, or 2 distinct k-point locations with 4 pools).
>>>>
>>>> If I run on half that many pools, I do not get a crash. If I run on an
>>>> equal number of pools but double the number of k-points, I also do not
>>>> get a crash. If I set wf_collect to false, I also do not get a crash.
>>>>
>>>> I've attached a toy model (Si crystal) that exhibits this behavior; and
>>>> can include the successful runs with the alternate configurations if
>>>> that helps.
>>>>
>>>> Thanks in advance for your help!  And thanks overall to the developers
>>>> for the program - I'm a fairly new user and it's been working great
>>>> otherwise.
>>>>
>>>> Best,
>>>> Peter Scherpelz
>>>> _______________________________________________
>>>> Pw_forum mailing list
>>>> Pw_forum at pwscf.org
>>>> http://pwscf.org/mailman/listinfo/pw_forum
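For anyone checking whether a run falls into the configuration described in the original report, the condition can be sketched in a few lines of Python. This only restates the report: the crash was seen when wf_collect is on, the calculation is spin-polarized, and the number of pools equals the number of k-points after doubling for spin. The function names are hypothetical, and none of this reflects the actual logic inside pw_restart.f90.

```python
def effective_kpoints(n_kpoints: int, spin_polarized: bool) -> int:
    """K-points distributed over pools, assuming spin polarization doubles them."""
    return 2 * n_kpoints if spin_polarized else n_kpoints


def matches_reported_crash(n_kpoints: int, n_pools: int,
                           spin_polarized: bool, wf_collect: bool) -> bool:
    """True if a run matches the configuration reported to crash in this thread.

    This is a restatement of the bug report (hypothetical helper), not a
    description of how the code decides where wavefunctions are stored.
    """
    return (
        wf_collect
        and spin_polarized
        and n_pools == effective_kpoints(n_kpoints, spin_polarized)
    )


# Cases taken from the thread.
print(matches_reported_crash(1, 2, True, True))   # gamma only, 2 pools
print(matches_reported_crash(2, 4, True, True))   # 2 k-points, 4 pools
print(matches_reported_crash(2, 2, True, True))   # half as many pools
print(matches_reported_crash(7, 7, False, True))  # spin-unpolarized, 7 / 7
```

The first two calls correspond to the crashing runs; the last two correspond to configurations reported as working.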



