<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<p>Thank you Paolo for the explanation. I misunderstood what that
function was doing.</p>
<p>Still, what worries me is my workaround. For band-structure
calculations with many bands and many atoms, where the nscf
calculation crashes with the S matrix error, I run the same input
file with calculation = "scf". It takes about 10 times longer (it
needs almost 10 iterations to converge the self-consistency), but it
never hits the S matrix issue. This looks strange: as far as I
understand, this numerical issue should affect the diagonalization
in the same way in both the self-consistent and the
non-self-consistent calculation.</p>
<p>Bests,</p>
<p>Lorenzo<br>
</p>
<p><br>
</p>
<div class="moz-cite-prefix">On 12/06/19 22:20, Paolo Giannozzi
wrote:<br>
</div>
<blockquote type="cite"
cite="mid:CAPMgbCtiw_eJj53Pow1dB+cj0crXzMMGwf_VjKJENC_+NVw_og@mail.gmail.com">
<meta http-equiv="content-type" content="text/html; charset=UTF-8">
<div dir="ltr">- The overlap matrix is not the identity matrix
because correction vectors are not orthogonal to trial vectors,
nor to each other.<br>
<div>- An iterative algorithm is not the best way to solve for many
eigenvectors: it is designed to compute a number of eigenvalues
that is a small fraction of the matrix dimension. The more
eigenvalues, the less convenient and the more unstable it
becomes.<br>
</div>
<div>- I am quite sure that there is no "true" bug
(uninitialized variables and the like) and that the algorithm
is "analytically" correct, so to speak. Under some
circumstances, the overlap matrix has a very small negative
eigenvalue. An "analytically computed" overlap matrix can't
have a negative eigenvalue, by construction. One can find a
workaround, but you need one for each of the many cases: real,
hermitian, with serial or parallel subspace diagonalization,
... I would prefer to understand what triggers the appearance
of a negative eigenvalue, but it is not that simple.<br>
</div>
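<div>A minimal sketch of the first and third points, in plain Python.
The helper names <code>overlap</code> and <code>cholesky</code> are
illustrative and are not taken from the QE sources. The sketch assumes
that positive definiteness is tested by a Cholesky-type factorization,
as LAPACK's generalized Hermitian eigensolvers do; whether cdiaghg
fails at exactly this step is an assumption. It shows that
non-orthogonal vectors give an overlap matrix different from the
identity, and that an overlap matrix which is positive definite on
paper can lose that property in double precision when the vectors are
nearly linearly dependent.</div>

```python
import math

def overlap(vectors):
    """Overlap (Gram) matrix S[i][j] = <v_i|v_j> for a list of real vectors."""
    return [[sum(a * b for a, b in zip(vi, vj)) for vj in vectors]
            for vi in vectors]

def cholesky(s):
    """Lower-triangular L with S = L L^T; raises if S is not positive definite."""
    n = len(s)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            acc = s[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                if acc <= 0.0:
                    raise ValueError("S matrix not positive definite")
                low[i][i] = math.sqrt(acc)
            else:
                low[i][j] = acc / low[j][j]
    return low

# A normalized trial vector and a normalized correction vector that are
# not orthogonal: S has off-diagonal elements, so it is not the identity.
s_ok = overlap([[1.0, 0.0, 0.0], [0.6, 0.8, 0.0]])
print(s_ok)
cholesky(s_ok)  # factorizes without error

# Two nearly parallel vectors: analytically det(S) = 1e-18 > 0, but in
# double precision 1 + 1e-18 rounds to 1, so the last pivot is lost.
s_bad = overlap([[1.0, 0.0], [1.0, 1e-9]])
try:
    cholesky(s_bad)
except ValueError as err:
    print("factorization failed:", err)
```

<div>In this toy case the lost pivot is exactly zero. With many vectors,
accumulated rounding could just as well leave it slightly negative,
which would be consistent with the very small negative eigenvalue
described above.</div>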
<div><br>
</div>
<div>Paolo<br>
</div>
<div class="gmail_quote">
<div dir="ltr" class="gmail_attr">On Wed, Jun 12, 2019 at 3:55
PM Lorenzo Monacelli &lt;<a
href="mailto:mesonepigreco@gmail.com"
moz-do-not-send="true">mesonepigreco@gmail.com</a>&gt;
wrote:<br>
</div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex">
<div bgcolor="#FFFFFF">
<p>Dear all,</p>
<p>Thank you for your replies.<br>
</p>
<p>I have two different versions on two different
machines. The one whose results I sent you was compiled
with gfortran and the standard LAPACK/BLAS provided by
the Ubuntu software library (16.04). The other is
compiled with the Intel compiler and MKL and runs on a
cluster. Both of them hit the issue quite
randomly. I attach the make.inc of both compilations.</p>
<p>The same issue, at least on a different input file, was
also hit by the pw.x (v 6.2) pre-installed on
the Spanish MARENOSTRUM cluster, which I assume was
correctly compiled.<br>
</p>
<p>I noticed that one way to reproduce the error is to ask
for many bands in the nscf calculation for a system with
many atoms (and few symmetries) in the cell (with 96
atoms it is almost impossible for me to run an nscf
calculation). <br>
</p>
<p>Could the different behavior on different
machines suggest that the bug lies in
some improperly initialized variable (one whose
initialization is perhaps left to the
compiler)?</p>
<p>Another question: how does cdiaghg work? I assumed that
the S matrix should be the identity for local norm-conserving
pseudopotentials and GGA xc functionals, but if I
force it to be the identity at the beginning of the
subroutine, the code can no longer converge any
calculation (not even the scf, which otherwise works). I am
a bit skeptical that this is just an error in
LAPACK or MPI: why does the scf with the same input (which
should solve the same problem as the nscf, but many
times) work very well (even with many atoms and even if
I ask for many bands)?<br>
</p>
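<p>A small worked example of why forcing S to the identity cannot work
when the basis vectors are not orthogonal. This is a sketch in plain
Python for a 2x2 problem, not the algorithm used in cdiaghg; the names
<code>project</code> and <code>generalized_eigenvalues_2x2</code> are
hypothetical. An operator with known eigenvalues 1 and 3 is projected
onto two normalized, non-orthogonal vectors. Solving Hv = eSv with the
true overlap matrix recovers the exact eigenvalues, while replacing S
with the identity does not.</p>

```python
import math

def project(a, vectors):
    """Matrix M[i][j] = v_i^T A v_j of a real symmetric A in a given basis."""
    def apply(mat, v):
        return [sum(m * x for m, x in zip(row, v)) for row in mat]
    return [[sum(x * y for x, y in zip(vi, apply(a, vj))) for vj in vectors]
            for vi in vectors]

def generalized_eigenvalues_2x2(h, s):
    """Eigenvalues e of H v = e S v for real symmetric 2x2 H and S.

    They are the roots of det(H - e S) = 0, i.e. of a e^2 + b e + c = 0.
    """
    a = s[0][0] * s[1][1] - s[0][1] ** 2  # det(S)
    b = -(h[0][0] * s[1][1] + h[1][1] * s[0][0] - 2.0 * h[0][1] * s[0][1])
    c = h[0][0] * h[1][1] - h[0][1] ** 2  # det(H)
    disc = math.sqrt(b * b - 4.0 * a * c)
    return sorted([(-b - disc) / (2.0 * a), (-b + disc) / (2.0 * a)])

identity = [[1.0, 0.0], [0.0, 1.0]]
operator = [[1.0, 0.0], [0.0, 3.0]]  # exact eigenvalues: 1 and 3
basis = [[1.0, 0.0], [0.6, 0.8]]     # normalized but not orthogonal

h = project(operator, basis)
s = project(identity, basis)         # overlap matrix of the basis

right = generalized_eigenvalues_2x2(h, s)         # uses the true overlap
wrong = generalized_eigenvalues_2x2(h, identity)  # S forced to identity
print(right)  # close to [1.0, 3.0]
print(wrong)  # neither value is an eigenvalue of the operator
```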
<p>Bests,</p>
<p>Lorenzo</p>
<p><br>
</p>
<div class="gmail-m_4542761726295617225moz-cite-prefix">On
12/06/19 14:37, Paolo Giannozzi wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">
<div>I was about to write the same, before noticing
that the crash occurs randomly (one run completes, a
subsequent one doesn't). Unless some regularity is
found (that is: under conditions xyz, the code
always crashes) it will be impossible to locate the
origin of the problem. Note that the origin of the
problem might well be in mathematical libraries, or
in MPI. I am 100% sure that in at least some cases
diagonalization failures were due to some
misbehavior of mathematical libraries (but this was
many years ago, on machines that do not exist any
longer). Also: a frequent source of random crashes
in parallel execution is explained in sec.7.3 of the
developer manual, <a
href="http://www.quantum-espresso.org/Doc/developer_man/developer_man.html#SECTION00080000000000000000"
target="_blank" moz-do-not-send="true">http://www.quantum-espresso.org/Doc/developer_man/developer_man.html#SECTION00080000000000000000</a></div>
<div><br>
</div>
<div>Paolo<br>
</div>
</div>
<br>
<div class="gmail_quote">
<div dir="ltr" class="gmail_attr">On Wed, Jun 12, 2019
at 2:03 PM Davide Ceresoli &lt;<a
href="mailto:davide.ceresoli@cnr.it"
target="_blank" moz-do-not-send="true">davide.ceresoli@cnr.it</a>&gt;
wrote:<br>
</div>
<blockquote class="gmail_quote" style="margin:0px 0px
0px 0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex">Dear Lorenzo,<br>
is your QE compiled with a decent compiler and
with decent libraries?<br>
Your inputs work perfectly for me, with no crashes.<br>
<br>
HTH.<br>
D.<br>
<br>
<br>
<br>
On 6/12/19 12:29 PM, Lorenzo Monacelli wrote:<br>
> Dear QE developers,<br>
> <br>
> I think I found a bad bug in the non
self-consistent calculation of pw.x<br>
> <br>
> While the self-consistent calculation ends
properly, running a non <br>
> self-consistent calculation results in a crash
with the error:<br>
> <br>
>
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%<br>
> task # 0<br>
> from cdiaghg : error # 40<br>
> S matrix not positive definite<br>
>
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%<br>
> <br>
> I checked the cdiaghg subroutine, the S matrix
should be the overlap matrix for <br>
> the eigenvalue problem Hv = eSv<br>
> <br>
> For the local norm-conserving pseudopotential
of hydrogen (my calculation), I <br>
> suppose S should be the identity. However, if
I force it to be the identity <br>
> at the beginning of cdiaghg, the code says that
it is not able to converge the <br>
> scf calculation either.<br>
> <br>
> I attach the input of the scf calculation (that
converges) and the one of the <br>
> non-self-consistent calculation (that produces
this output).<br>
> <br>
> I also tried to switch the diagonalization
method to cg, as suggested as a fix, but <br>
> nothing changes.<br>
> <br>
> I modified also the cdiaghg subroutine, to
print the S matrix, that you find <br>
> attached (random numbers, seems to be
uninitialized).<br>
> <br>
> With both diagonalization methods, if I
enforce S to be the identity matrix the <br>
> code crashes, saying that it was not able to
converge.<br>
> <br>
> The problem seems to arise especially if I
request more bands with the nbnd <br>
> flag in system (but sometimes it occurs even if
no extra band is required).<br>
> <br>
> The QE version I used is the current version in
the develop branch of gitlab, <br>
> but I noticed the same error occurring also
with 6.3 and 6.2 in other cases.<br>
> <br>
> If I run exactly the same input file as an scf
calculation (instead of an nscf) <br>
> everything goes fine (same k-points, same
diagonalization, same number of <br>
> extra bands), but of course this is not what I
would like to do...<br>
> <br>
> If I run the nscf calculation after a scf
calculation with exactly the same input <br>
> (that works), the nscf calculation fails (this
means that the crash is not <br>
> caused by a bad starting point for the
density).<br>
> <br>
> All this makes me really think of a bug in the
nscf calculation, rather than a <br>
> wrong input.<br>
> <br>
> Best regards,<br>
> <br>
> Lorenzo Monacelli<br>
> <br>
> <br>
> P.S.<br>
> <br>
> In the attached file the pw_* are the nscf
input and output, the scf* are the <br>
> scf input and output. I run<br>
> <br>
> <br>
> <br>
> _______________________________________________<br>
> developers mailing list<br>
> <a
href="mailto:developers@lists.quantum-espresso.org"
target="_blank" moz-do-not-send="true">developers@lists.quantum-espresso.org</a><br>
> <a
href="https://lists.quantum-espresso.org/mailman/listinfo/developers"
rel="noreferrer" target="_blank"
moz-do-not-send="true">https://lists.quantum-espresso.org/mailman/listinfo/developers</a><br>
> <br>
<br>
-- <br>
+--------------------------------------------------------------+<br>
Davide Ceresoli<br>
CNR Institute of Molecular Science and Technology
(CNR-ISTM)<br>
c/o University of Milan, via Golgi 19, 20133
Milan, Italy<br>
Email: <a href="mailto:davide.ceresoli@cnr.it"
target="_blank" moz-do-not-send="true">davide.ceresoli@cnr.it</a><br>
Phone: +39-02-50314276, +39-347-1001570 (mobile)<br>
Skype: dceresoli<br>
Website: <a
href="http://sites.google.com/site/dceresoli/"
rel="noreferrer" target="_blank"
moz-do-not-send="true">http://sites.google.com/site/dceresoli/</a><br>
+--------------------------------------------------------------+<br>
</blockquote>
</div>
<br clear="all">
<br>
-- <br>
<div dir="ltr"
class="gmail-m_4542761726295617225gmail_signature">
<div dir="ltr">
<div>
<div dir="ltr">
<div>Paolo Giannozzi, Dip. Scienze Matematiche
Informatiche e Fisiche,<br>
Univ. Udine, via delle Scienze 208, 33100
Udine, Italy<br>
Phone +39-0432-558216, fax +39-0432-558222<br>
<br>
</div>
</div>
</div>
</div>
</div>
<br>
</blockquote>
</div>
</blockquote>
</div>
<br clear="all">
<br>
</div>
<br>
</blockquote>
</body>
</html>