<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<p>Dear all,</p>
<p>Thank you for your replies.<br>
</p>
<p>I have two different builds on two different machines. The one
that produced the results I sent you was compiled with gfortran
and the standard LAPACK/BLAS provided by the Ubuntu 16.04 software
repository. The other was compiled with the Intel compiler and MKL
and runs on a cluster. Both of them hit the issue quite randomly. I
attach the make.inc of both builds.</p>
<p>The same issue, albeit with a different input file, also
occurred with the pw.x (v 6.2) pre-installed on the Spanish
MARENOSTRUM cluster, which I assume was compiled correctly.<br>
</p>
<p>I noticed that one way to reproduce the error is to request many
bands in the nscf calculation for a system with many atoms (and
few symmetries) in the cell (with 96 atoms it is almost impossible
for me to run an nscf calculation).<br>
</p>
<p>Could the different behavior on different machines suggest that
the bug lies in some variable that is not properly initialized
(so that its initial value is left to the compiler)?</p>
<p>Another question: how does cdiaghg work? I assumed that the S
matrix should be the identity for local norm-conserving
pseudopotentials and GGA xc functionals, but if I force it to be
the identity at the beginning of the subroutine, the code can no
longer converge any calculation (not even the scf, which
otherwise works). I am a bit skeptical that this is just an error
in LAPACK or MPI: why does the scf with the same input (which
should solve the same problem as the nscf, only many times) work
very well (even with many atoms and even if I ask for many
bands)?<br>
</p>
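<p>To make the question concrete, here is a minimal sketch of how a
generalized Hermitian eigenproblem Hv = eSv is commonly solved and
where a "not positive definite" error can come from. It is a toy
NumPy example and not the actual cdiaghg code: the function name
solve_generalized and the random test matrices are made up for
illustration, and I am only assuming a Cholesky-based reduction of
the kind used by the LAPACK generalized eigensolvers.</p>

```python
import numpy as np


def solve_generalized(h, s):
    """Solve H v = e S v for Hermitian H and Hermitian positive-definite S.

    Hypothetical helper for illustration only (not part of QE).
    """
    # Cholesky factorization S = L L^H. NumPy raises LinAlgError when the
    # factorization breaks down, i.e. when S is not positive definite.
    try:
        low = np.linalg.cholesky(s)
    except np.linalg.LinAlgError as exc:
        raise ValueError("S matrix not positive definite") from exc
    # Reduce to a standard problem (L^-1 H L^-H) y = e y, with v = L^-H y.
    low_inv = np.linalg.inv(low)
    h_std = low_inv @ h @ low_inv.conj().T
    e, y = np.linalg.eigh(h_std)
    v = low_inv.conj().T @ y
    return e, v


# Toy data: random Hermitian H and a Hermitian positive-definite S.
rng = np.random.default_rng(0)
n = 4
a = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
h = (a + a.conj().T) / 2
b = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
s_good = b @ b.conj().T + n * np.eye(n)

# A single negative diagonal entry is enough to break positive definiteness,
# which is what garbage (e.g. uninitialized) entries in S could also do.
s_bad = s_good.copy()
s_bad[0, 0] = -1.0

e, v = solve_generalized(h, s_good)
# Residual of H v - S v diag(e); it should be at round-off level.
residual = np.linalg.norm(h @ v - s_good @ v @ np.diag(e))
print("residual with a valid S:", residual)

try:
    solve_generalized(h, s_bad)
except ValueError as err:
    print("with a corrupted S:", err)
```

<p>In this sketch, passing the identity as S simply reproduces the
ordinary eigenvalues of H, so the failure appears only when the
entries of S themselves are wrong.</p>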
<p>Best,</p>
<p>Lorenzo</p>
<p><br>
</p>
<div class="moz-cite-prefix">On 12/06/19 14:37, Paolo Giannozzi
wrote:<br>
</div>
<blockquote type="cite"
cite="mid:CAPMgbCuhjji2Lw4Z91ENZXGxntEmeEMsQ-m=FqSoX2XBibvO=Q@mail.gmail.com">
<meta http-equiv="content-type" content="text/html; charset=UTF-8">
<div dir="ltr">
<div>I was about to write the same, before noticing that the
crash occurs randomly (one run completes, a subsequent one
doesn't). Unless some regularity is found (that is: under
conditions xyz, the code always crashes) it will be impossible
to locate the origin of the problem. Note that the origin of
the problem might well be in mathematical libraries, or in
MPI. I am 100% sure that in at least some cases
diagonalization failures were due to some misbehavior of
mathematical libraries (but this was many years ago, on
machines that do not exist any longer). Also: a frequent
source of random crashes in parallel execution is explained in
sec.7.3 of the developer manual, <a
href="http://www.quantum-espresso.org/Doc/developer_man/developer_man.html#SECTION00080000000000000000"
moz-do-not-send="true">http://www.quantum-espresso.org/Doc/developer_man/developer_man.html#SECTION00080000000000000000</a></div>
<div><br>
</div>
<div>Paolo<br>
</div>
</div>
<br>
<div class="gmail_quote">
<div dir="ltr" class="gmail_attr">On Wed, Jun 12, 2019 at 2:03
PM Davide Ceresoli &lt;<a href="mailto:davide.ceresoli@cnr.it"
moz-do-not-send="true">davide.ceresoli@cnr.it</a>&gt; wrote:<br>
</div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Dear
Lorenzo,<br>
is your QE compiled with a decent compiler and with
decent libraries?<br>
Your inputs work perfectly for me, with no crashes.<br>
<br>
HTH.<br>
D.<br>
<br>
<br>
<br>
On 6/12/19 12:29 PM, Lorenzo Monacelli wrote:<br>
> Dear QE developers,<br>
> <br>
> I think I found a bad bug in the non self-consistent
calculation of pw.x<br>
> <br>
> While the self consistent calculation ends properly, when
running a non <br>
> self-consistent calculation results in a crash with the
error:<br>
> <br>
>
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%<br>
> task # 0<br>
> from cdiaghg : error # 40<br>
> S matrix not positive definite<br>
>
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%<br>
> <br>
> I checked the cdiaghg subroutine, the S matrix should be
the overlap matrix for <br>
> the eigenvalue problem Hv = eSv<br>
> <br>
> That, in case of local Norm Conserving pseudo of Hydrogen
(my calculation) I <br>
> suppose it should be the identity, however, if I enforce
it to be the identity <br>
> at the beginning of cdiaghg the code says that it is not
able to converge the <br>
> scf calculation either.<br>
> <br>
> I attach the input of the scf calculation (that
converges) and the one of the <br>
> non-self-consistent calculation (that produces this
output).<br>
> <br>
> I also tried to switch the diagonalization method to cg
as suggested as fix, but <br>
> nothing changes.<br>
> <br>
> I modified also the cdiaghg subroutine, to print the S
matrix, that you find <br>
> attached (random numbers, seems to be uninitialized).<br>
> <br>
> In both the diagonalization methods if I enforce S to be
the identity matrix the <br>
> code crashes by saying that it was not able to converge.<br>
> <br>
> The problem seems to arise especially if I request for
more bands with the nbnd <br>
> flag in system (but sometimes it occurs even if no extra
band is required).<br>
> <br>
> The QE version I used is the current version in the
develop branch of gitlab, <br>
> but I noticed the same error occurring also with 6.3 and
6.2 in other cases.<br>
> <br>
> If I ask for exactly the same input file a scf
calculation (instead of a nscf) <br>
> everything goes fine (same K points, same
diagonalization, same number of <br>
> extrabands), but indeed, this is not what I would like to
do...<br>
> <br>
> If I run the nscf calculation after a scf calculation with
exactly the same input <br>
> (that works), the nscf calculation fails (this means that
the crash is not <br>
> caused by a bad starting point for the density).<br>
> <br>
> All these make me really think of a bug in the nscf
calculation, rather than a <br>
> wrong input.<br>
> <br>
> Best regards,<br>
> <br>
> Lorenzo Monacelli<br>
> <br>
> <br>
> P.S.<br>
> <br>
> In the attached file the pw_* are the nscf input and
output, the scf* are the <br>
> scf input and output. I run<br>
> <br>
> <br>
> <br>
> _______________________________________________<br>
> developers mailing list<br>
> <a href="mailto:developers@lists.quantum-espresso.org"
target="_blank" moz-do-not-send="true">developers@lists.quantum-espresso.org</a><br>
> <a
href="https://lists.quantum-espresso.org/mailman/listinfo/developers"
rel="noreferrer" target="_blank" moz-do-not-send="true">https://lists.quantum-espresso.org/mailman/listinfo/developers</a><br>
> <br>
<br>
-- <br>
+--------------------------------------------------------------+<br>
Davide Ceresoli<br>
CNR Institute of Molecular Science and Technology
(CNR-ISTM)<br>
c/o University of Milan, via Golgi 19, 20133 Milan, Italy<br>
Email: <a href="mailto:davide.ceresoli@cnr.it"
target="_blank" moz-do-not-send="true">davide.ceresoli@cnr.it</a><br>
Phone: +39-02-50314276, +39-347-1001570 (mobile)<br>
Skype: dceresoli<br>
Website: <a href="http://sites.google.com/site/dceresoli/"
rel="noreferrer" target="_blank" moz-do-not-send="true">http://sites.google.com/site/dceresoli/</a><br>
+--------------------------------------------------------------+<br>
</blockquote>
</div>
<br clear="all">
<br>
-- <br>
<div dir="ltr" class="gmail_signature">
<div dir="ltr">
<div>
<div dir="ltr">
<div>Paolo Giannozzi, Dip. Scienze Matematiche
Informatiche e Fisiche,<br>
Univ. Udine, via delle Scienze 208, 33100 Udine, Italy<br>
Phone +39-0432-558216, fax +39-0432-558222<br>
<br>
</div>
</div>
</div>
</div>
</div>
<br>
</blockquote>
</body>
</html>