<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
  </head>
  <body text="#000000" bgcolor="#FFFFFF">
    <p>Thank you, Paolo, for the explanation. I misunderstood what that
      function was doing.</p>
    <p>Still, what worries me is my workaround. A band-structure
      calculation with many bands and many atoms crashes with the S
      matrix error in the nscf calculation, but it runs if I use the
      same input file with calculation = "scf". It takes about 10 times
      longer (it needs almost 10 iterations to converge the
      self-consistency), but it never hits the S matrix issue. This
      looks strange: as far as I understand, this numerical issue
      should affect the diagonalization in the same way in both the
      self-consistent and the non-self-consistent calculation.</p>
    <p>Best,</p>
    <p>Lorenzo<br>
    </p>
    <p><br>
    </p>
    <div class="moz-cite-prefix">On 12/06/19 22:20, Paolo Giannozzi
      wrote:<br>
    </div>
    <blockquote type="cite"
cite="mid:CAPMgbCtiw_eJj53Pow1dB+cj0crXzMMGwf_VjKJENC_+NVw_og@mail.gmail.com">
      <meta http-equiv="content-type" content="text/html; charset=UTF-8">
      <div dir="ltr">- The overlap matrix is not the identity matrix
        because the correction vectors are orthogonal neither to the
        trial vectors nor to one another.<br>
        <div>- An iterative algorithm is not the best way to solve for
          many eigenvectors: it is designed to find a number of
          eigenvalues that is a small fraction of the matrix dimension.
          The more eigenvalues, the less convenient and the more
          unstable it becomes.<br>
        </div>
        <div>- I am quite sure that there is no "true" bug
          (uninitialized variables and the like) and that the algorithm
          is "analytically" correct, so to speak. Under some
          circumstances, the overlap matrix has a very small negative
          eigenvalue. An "analytically computed" overlap matrix can't
          have a negative eigenvalue, by construction. One can find a
          workaround, but you need one for each of the many cases: real,
          hermitian, with serial or parallel subspace diagonalization,
          ... I would prefer to understand what triggers the appearance
          of a negative eigenvalue, but it is not that simple.<br>
        </div>
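        <div>A toy sketch of the first and third points, in plain Python
          rather than the Fortran/LAPACK code path used by pw.x; the
          helper names overlap and cholesky are hypothetical and serve
          only as illustration. It assumes, as LAPACK's generalized
          Hermitian eigensolvers do, that S is Cholesky-factorized
          before the diagonalization. A non-orthogonal basis gives an S
          that is not the identity but still factorizes, while a
          linearly dependent basis gives a non-positive pivot and the
          factorization fails. In real runs the failure would come from
          round-off in a nearly dependent basis, which this exact
          example does not reproduce.<br>
        </div>
<pre>
```python
import math


def overlap(vectors):
    """Return the Gram (overlap) matrix S[i][j] = v_i . v_j for real vectors."""
    return [[sum(a * b for a, b in zip(u, v)) for v in vectors] for u in vectors]


def cholesky(s):
    """Return lower-triangular L with S = L L^T; raise if a pivot is not positive."""
    n = len(s)
    low = [[0.0] * n for _ in range(n)]
    for j in range(n):
        pivot = s[j][j] - sum(low[j][k] ** 2 for k in range(j))
        if not pivot > 0.0:
            raise ValueError(f"S matrix not positive definite (pivot {j})")
        low[j][j] = math.sqrt(pivot)
        for i in range(j + 1, n):
            off = s[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = off / low[j][j]
    return low


# Two non-orthogonal vectors: S is not the identity, yet it is
# positive definite and the factorization succeeds.
good = overlap([[1.0, 0.0], [1.0, 1.0]])
factor = cholesky(good)

# A basis that repeats a vector is linearly dependent: S is singular,
# the second pivot is exactly zero, and the factorization fails.
bad = overlap([[1.0, 0.0], [1.0, 0.0]])
try:
    cholesky(bad)
    failed = False
except ValueError:
    failed = True
```
</pre>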
        <div><br>
        </div>
        <div>Paolo<br>
        </div>
        <div class="gmail_quote">
          <div dir="ltr" class="gmail_attr">On Wed, Jun 12, 2019 at 3:55
            PM Lorenzo Monacelli <<a
              href="mailto:mesonepigreco@gmail.com"
              moz-do-not-send="true">mesonepigreco@gmail.com</a>>
            wrote:<br>
          </div>
          <blockquote class="gmail_quote" style="margin:0px 0px 0px
            0.8ex;border-left:1px solid
            rgb(204,204,204);padding-left:1ex">
            <div bgcolor="#FFFFFF">
              <p>Dear all,</p>
              <p>Thank you for your replies.<br>
              </p>
              <p>I have two different versions on two different
                machines. The one whose results I sent you was compiled
                with gfortran and the standard lapack/blas provided by
                the ubuntu-software library (16.04). The other is
                compiled with the intel compiler and MKL and runs on a
                cluster. Both of them hit the issue quite randomly. I
                attach the make.inc of both builds.</p>
              <p>The same issue, at least with a different input file,
                also occurred with the pw.x (v 6.2) pre-installed on
                the Spanish MARENOSTRUM cluster, which I assume was
                correctly compiled.<br>
              </p>
              <p>I noticed that one way to reproduce the error is to ask
                for many bands in the nscf calculation for a system with
                many atoms (and few symmetries) in the cell (with 96
                atoms it is almost impossible for me to run an nscf
                calculation). <br>
              </p>
              <p>Could the different behavior on different machines
                suggest that the bug lies in some uninitialized
                variable (whose initialization is perhaps left to the
                compiler)?</p>
              <p>Another question: how does cdiaghg work? I assumed that
                the S matrix should be the identity for local
                norm-conserving pseudos and GGA xc functionals, but if I
                force it to be the identity at the beginning of the
                subroutine, the code is no longer able to converge any
                calculation (even the scf, which otherwise works). I am
                a bit skeptical that this is just an error in LAPACK or
                MPI: why does the SCF with the same input (which should
                solve the same problem as the nscf, but many times)
                work very well (even with many atoms and even if I ask
                for many bands)?<br>
              </p>
              <p>Best,</p>
              <p>Lorenzo</p>
              <p><br>
              </p>
              <div class="gmail-m_4542761726295617225moz-cite-prefix">On
                12/06/19 14:37, Paolo Giannozzi wrote:<br>
              </div>
              <blockquote type="cite">
                <div dir="ltr">
                  <div>I was about to write the same, before noticing
                    that the crash occurs randomly (one run completes, a
                    subsequent one doesn't). Unless some regularity is
                    found (that is: under conditions xyz, the code
                    always crashes) it will be impossible to locate the
                    origin of the problem. Note that the origin of the
                    problem might well be in mathematical libraries, or
                    in MPI. I am 100% sure that in at least some cases
                    diagonalization failures were due to some
                    misbehavior of mathematical libraries (but this was
                    many years ago, on machines that do not exist any
                    longer). Also: a frequent source of random crashes
                    in parallel execution is explained in sec.7.3 of the
                    developer manual, <a
href="http://www.quantum-espresso.org/Doc/developer_man/developer_man.html#SECTION00080000000000000000"
                      target="_blank" moz-do-not-send="true">http://www.quantum-espresso.org/Doc/developer_man/developer_man.html#SECTION00080000000000000000</a></div>
                  <div><br>
                  </div>
                  <div>Paolo<br>
                  </div>
                </div>
                <br>
                <div class="gmail_quote">
                  <div dir="ltr" class="gmail_attr">On Wed, Jun 12, 2019
                    at 2:03 PM Davide Ceresoli <<a
                      href="mailto:davide.ceresoli@cnr.it"
                      target="_blank" moz-do-not-send="true">davide.ceresoli@cnr.it</a>>
                    wrote:<br>
                  </div>
                  <blockquote class="gmail_quote" style="margin:0px 0px
                    0px 0.8ex;border-left:1px solid
                    rgb(204,204,204);padding-left:1ex">Dear Lorenzo,<br>
                         is your QE compiled with a decent compiler and
                    with decent libraries?<br>
                    Your inputs work perfectly for me, with no crashes.<br>
                    <br>
                    HTH.<br>
                    D.<br>
                    <br>
                    <br>
                    <br>
                    On 6/12/19 12:29 PM, Lorenzo Monacelli wrote:<br>
                    > Dear QE developers,<br>
                    > <br>
                    > I think I found a bad bug in the
                    non-self-consistent calculation of pw.x<br>
                    > <br>
                    > While the self-consistent calculation ends
                    properly, running a <br>
                    > non-self-consistent calculation results in a crash
                    with the error:<br>
                    > <br>
                    > 
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%<br>
                    >       task #         0<br>
                    >       from cdiaghg : error #        40<br>
                    >       S matrix not positive definite<br>
                    > 
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%<br>
                    > <br>
                    > I checked the cdiaghg subroutine: the S matrix
                    should be the overlap matrix for <br>
                    > the eigenvalue problem Hv = eSv<br>
                    > <br>
                    > In the case of a local norm-conserving pseudo
                    for hydrogen (my calculation), I <br>
                    > suppose it should be the identity. However, if
                    I force it to be the identity <br>
                    > at the beginning of cdiaghg, the code says that
                    it is not able to converge the <br>
                    > scf calculation either.<br>
                    > <br>
                    > I attach the input of the scf calculation (that
                    converges) and the one of the <br>
                    > non-self-consistent calculation (that produces
                    this output).<br>
                    > <br>
                    > I also tried switching the diagonalization
                    method to cg, as suggested as a fix, but <br>
                    > nothing changes.<br>
                    > <br>
                    > I also modified the cdiaghg subroutine to
                    print the S matrix, which you will find <br>
                    > attached (random numbers; it seems to be
                    uninitialized).<br>
                    > <br>
                    > With both diagonalization methods, if I
                    force S to be the identity matrix, the <br>
                    > code crashes, saying that it was not able to
                    converge.<br>
                    > <br>
                    > The problem seems to arise especially if I
                    request more bands with the nbnd <br>
                    > flag in system (but sometimes it occurs even if
                    no extra bands are requested).<br>
                    > <br>
                    > The QE version I used is the current version in
                    the develop branch of gitlab, <br>
                    > but I noticed the same error occurring also
                    with 6.3 and 6.2 in other cases.<br>
                    > <br>
                    > If I run exactly the same input file as an scf
                    calculation (instead of an nscf) <br>
                    > everything goes fine (same k-points, same
                    diagonalization, same number of <br>
                    > extra bands), but of course this is not what I
                    would like to do...<br>
                    > <br>
                    > If I run the nscf calculation after an scf
                    calculation with exactly the same input <br>
                    > (that works), the nscf calculation fails (this
                    means that the crash is not <br>
                    > caused by a bad starting point for the
                    density).<br>
                    > <br>
                    > All this really makes me think of a bug in the
                    nscf calculation, rather than a <br>
                    > wrong input.<br>
                    > <br>
                    > Best regards,<br>
                    > <br>
                    > Lorenzo Monacelli<br>
                    > <br>
                    > <br>
                    > P.S.<br>
                    > <br>
                    > In the attached file the pw_* are the nscf
                    input and output, the scf* are the <br>
                    > scf input and output. I run<br>
                    > <br>
                    > <br>
                    > <br>
                    > _______________________________________________<br>
                    > developers mailing list<br>
                    > <a
                      href="mailto:developers@lists.quantum-espresso.org"
                      target="_blank" moz-do-not-send="true">developers@lists.quantum-espresso.org</a><br>
                    > <a
                      href="https://lists.quantum-espresso.org/mailman/listinfo/developers"
                      rel="noreferrer" target="_blank"
                      moz-do-not-send="true">https://lists.quantum-espresso.org/mailman/listinfo/developers</a><br>
                    > <br>
                    <br>
                    -- <br>
+--------------------------------------------------------------+<br>
                       Davide Ceresoli<br>
                       CNR Institute of Molecular Science and Technology
                    (CNR-ISTM)<br>
                       c/o University of Milan, via Golgi 19, 20133
                    Milan, Italy<br>
                       Email: <a href="mailto:davide.ceresoli@cnr.it"
                      target="_blank" moz-do-not-send="true">davide.ceresoli@cnr.it</a><br>
                       Phone: +39-02-50314276, +39-347-1001570 (mobile)<br>
                       Skype: dceresoli<br>
                       Website: <a
                      href="http://sites.google.com/site/dceresoli/"
                      rel="noreferrer" target="_blank"
                      moz-do-not-send="true">http://sites.google.com/site/dceresoli/</a><br>
+--------------------------------------------------------------+<br>
                  </blockquote>
                </div>
                <br clear="all">
                <br>
                -- <br>
                <div dir="ltr"
                  class="gmail-m_4542761726295617225gmail_signature">
                  <div dir="ltr">
                    <div>
                      <div dir="ltr">
                        <div>Paolo Giannozzi, Dip. Scienze Matematiche
                          Informatiche e Fisiche,<br>
                          Univ. Udine, via delle Scienze 208, 33100
                          Udine, Italy<br>
                          Phone +39-0432-558216, fax +39-0432-558222<br>
                          <br>
                        </div>
                      </div>
                    </div>
                  </div>
                </div>
                <br>
              </blockquote>
            </div>
          </blockquote>
        </div>
        <br clear="all">
        <br>
        -- <br>
        <div dir="ltr" class="gmail_signature">
          <div dir="ltr">
            <div>
              <div dir="ltr">
                <div>Paolo Giannozzi, Dip. Scienze Matematiche
                  Informatiche e Fisiche,<br>
                  Univ. Udine, via delle Scienze 208, 33100 Udine, Italy<br>
                  Phone +39-0432-558216, fax +39-0432-558222<br>
                  <br>
                </div>
              </div>
            </div>
          </div>
        </div>
      </div>
      <br>
    </blockquote>
  </body>
</html>