<div dir="ltr"><div>I was about to write the same, before noticing that the crash occurs randomly (one run completes, a subsequent one doesn't). Unless some regularity is found (that is: under conditions xyz, the code always crashes), it will be impossible to locate the origin of the problem. Note that the origin of the problem might well be in the mathematical libraries, or in MPI. I am 100% sure that in at least some cases diagonalization failures were due to misbehavior of the mathematical libraries (but this was many years ago, on machines that no longer exist). Also: a frequent source of random crashes in parallel execution is explained in Sec. 7.3 of the developer manual, <a href="http://www.quantum-espresso.org/Doc/developer_man/developer_man.html#SECTION00080000000000000000">http://www.quantum-espresso.org/Doc/developer_man/developer_man.html#SECTION00080000000000000000</a></div><div><br></div><div>Paolo<br></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Wed, Jun 12, 2019 at 2:03 PM Davide Ceresoli &lt;<a href="mailto:davide.ceresoli@cnr.it">davide.ceresoli@cnr.it</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Dear Lorenzo,<br>
is your QE compiled with a decent compiler and with decent libraries?<br>
Your inputs work perfectly for me, with no crashes.<br>
<br>
HTH.<br>
D.<br>
<br>
<br>
<br>
On 6/12/19 12:29 PM, Lorenzo Monacelli wrote:<br>
> Dear QE developers,<br>
> <br>
> I think I found a bad bug in the non-self-consistent calculation of pw.x.<br>
> <br>
> While the self-consistent calculation ends properly, running a <br>
> non-self-consistent calculation results in a crash with the error:<br>
> <br>
> %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%<br>
> task # 0<br>
> from cdiaghg : error # 40<br>
> S matrix not positive definite<br>
> %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%<br>
> <br>
> I checked the cdiaghg subroutine: the S matrix should be the overlap matrix for <br>
> the generalized eigenvalue problem Hv = eSv.<br>
> <br>
> For a local norm-conserving pseudopotential of hydrogen (my calculation), I <br>
> suppose S should be the identity. However, if I force it to be the identity <br>
> at the beginning of cdiaghg, the code reports that it cannot converge the <br>
> scf calculation either.<br>
> <br>
> I attach the input of the scf calculation (that converges) and the one of the <br>
> non-self-consistent calculation (that produces this output).<br>
> <br>
> I also tried switching the diagonalization method to cg, as suggested as a fix, but <br>
> nothing changes.<br>
> <br>
> I also modified the cdiaghg subroutine to print the S matrix, which you find <br>
> attached (random numbers; it seems to be uninitialized).<br>
> <br>
> With both diagonalization methods, if I force S to be the identity matrix, the <br>
> code crashes, reporting that it was not able to converge.<br>
> <br>
> The problem seems to arise especially if I request more bands with the nbnd <br>
> flag in the system namelist (but sometimes it occurs even if no extra bands are requested).<br>
> <br>
> The QE version I used is the current version in the develop branch of gitlab, <br>
> but I noticed the same error occurring also with 6.3 and 6.2 in other cases.<br>
> <br>
> If I request an scf calculation (instead of nscf) with exactly the same input file, <br>
> everything goes fine (same k-points, same diagonalization, same number of <br>
> extra bands), but this is not what I would like to do...<br>
> <br>
> If I run the nscf calculation after an scf calculation with exactly the same input <br>
> (that works), the nscf calculation fails (this means that the crash is not <br>
> caused by a bad starting point for the density).<br>
> <br>
> All this makes me think of a bug in the nscf calculation, rather than a <br>
> wrong input.<br>
> <br>
> Best regards,<br>
> <br>
> Lorenzo Monacelli<br>
> <br>
> <br>
> P.S.<br>
> <br>
> In the attached file the pw_* are the nscf input and output, the scf* are the <br>
> scf input and output. I run<br>
> <br>
> <br>
> <br>
> _______________________________________________<br>
> developers mailing list<br>
> <a href="mailto:developers@lists.quantum-espresso.org" target="_blank">developers@lists.quantum-espresso.org</a><br>
> <a href="https://lists.quantum-espresso.org/mailman/listinfo/developers" rel="noreferrer" target="_blank">https://lists.quantum-espresso.org/mailman/listinfo/developers</a><br>
> <br>
<br>
-- <br>
+--------------------------------------------------------------+<br>
Davide Ceresoli<br>
CNR Institute of Molecular Science and Technology (CNR-ISTM)<br>
c/o University of Milan, via Golgi 19, 20133 Milan, Italy<br>
Email: <a href="mailto:davide.ceresoli@cnr.it" target="_blank">davide.ceresoli@cnr.it</a><br>
Phone: +39-02-50314276, +39-347-1001570 (mobile)<br>
Skype: dceresoli<br>
Website: <a href="http://sites.google.com/site/dceresoli/" rel="noreferrer" target="_blank">http://sites.google.com/site/dceresoli/</a><br>
+--------------------------------------------------------------+<br>
</blockquote></div><br clear="all"><br>-- <br><div dir="ltr" class="gmail_signature"><div dir="ltr"><div><div dir="ltr"><div>Paolo Giannozzi, Dip. Scienze Matematiche Informatiche e Fisiche,<br>Univ. Udine, via delle Scienze 208, 33100 Udine, Italy<br>Phone +39-0432-558216, fax +39-0432-558222<br><br></div></div></div></div></div>