<html>
  <head>
    <meta content="text/html; charset=windows-1252"
      http-equiv="Content-Type">
  </head>
  <body bgcolor="#FFFFFF" text="#000000">
    <p>In the meantime, I discussed a similar problem with Lorenzo
      Paulatto.<br>
    </p>
    <p>I think it may be a rather specific problem. As long as I
      parallelize only over<br>
      q points using start_q and last_q, there is no problem, and
      restarting works as well.</p>
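    <p>As a rough sketch of the q-point parallelization mentioned above:
      the script below writes one ph.x input per q point, each limited to
      a single q point via start_q and last_q. The count of 8 q points
      (inferred from the quoted message below for the 4x4x4 grid), the
      file names al.ph.q*.in, and the title line are assumptions for
      illustration; the remaining variables are copied from the quoted
      inputs. How the separate jobs share or copy outdir is not covered
      here.</p>
    <pre>
```shell
#!/bin/sh
# Sketch: write one ph.x input per q point, each restricted with
# start_q/last_q. NQ=8 and the names al.ph.q<N>.in are assumptions.
NQ=8

write_input() {
    # $1 = index of the q point handled by this input file
    printf '%s\n' \
        "Phonons for Al, q point $1 only" \
        " &inputph" \
        "    tr2_ph=1.0d-10," \
        "    prefix='aluminum'," \
        "    fildvscf='aldv'," \
        "    amass(1)=26.98," \
        "    outdir='./tempdir/'," \
        "    fildyn='al.dyn'," \
        "    ldisp=.true.," \
        "    nq1=4, nq2=4, nq3=4," \
        "    start_q=$1, last_q=$1" \
        " /" > "al.ph.q$1.in"
}

q=1
while [ "$q" -le "$NQ" ]; do
    # One input file per q point; nothing is executed here.
    write_input "$q"
    q=$((q + 1))
done
```
    </pre>
    <p>Each generated file can then be submitted as an independent job
      without images, e.g. mpirun -np 2 ph.x < al.ph.q3.in.</p>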
    <p>With images I can, in principle, still build the full dvscf
      files without<br>
      rerunning the calculation without images, by using split and cat
      on the dvscf files<br>
      in the different temp folders. It is tedious, but it works. Still,
      in the future I will use only<br>
      the parallelization over q points to calculate the
      dvscf.</p>
    <p>In summary, the parallelization of PH is not straightforward, and
      I think it<br>
      might help to store, e.g., the dvscf files for different
      representations separately.<br>
      However, Lorenzo mentioned that system administrators complain if
      the number of<br>
      written files is large. It would be helpful to have a summary
      of<br>
      what can and cannot be done using images. For example, dvscf (and
      el-ph) does not<br>
      work with image parallelization, especially if the different
      representations<br>
      of one q point are split across different images. For el-ph the
      code does not<br>
      start at all, but maybe a similar check could be added for the
      dvscf files?</p>
    <p>Well, or maybe not, I don't know :)<br>
    </p>
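    <p>Independently of whether such a check is added to the code, a
      user-side check could look like the sketch below. It lists empty
      charge_density.dat files below an image directory, which is the
      symptom described in the quoted message. The path ./tempdir/_ph0
      and the helper name list_empty_density are assumptions for
      illustration; this is not part of ph.x.</p>
    <pre>
```shell
#!/bin/sh
# Sketch: list empty charge-density files below an image directory.
# The file name charge_density.dat is taken from the quoted message;
# the location ./tempdir/_ph0 is an assumption.

list_empty_density() {
    # $1 = image directory to scan
    # -size 0 matches only files that contain no data at all.
    find "$1" -type f -name 'charge_density.dat' -size 0
}

dir=./tempdir/_ph0
if [ -d "$dir" ]; then
    empty=$(list_empty_density "$dir")
    if [ -n "$empty" ]; then
        echo "Empty charge-density files found:"
        echo "$empty"
    else
        echo "No empty charge-density files found in $dir"
    fi
else
    echo "Directory $dir not found"
fi
```
    </pre>
    <p>If files are reported, the corresponding q-point subfolders would
      first have to be copied from the other image directories, as done
      by hand in the quoted message.</p>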
    <br>
    <div class="moz-cite-prefix">On 09/23/2016 04:24 PM, Paolo Giannozzi
      wrote:<br>
    </div>
    <blockquote
cite="mid:CAPMgbCsV0BYAcyU87tVjuKOFGowBHmBzR2yQRuNJ5GbD-HO5kQ@mail.gmail.com"
      type="cite">
      <div dir="ltr">has anybody any idea? P.<br>
      </div>
      <div class="gmail_extra"><br>
        <div class="gmail_quote">On Wed, Sep 14, 2016 at 1:30 PM, Thomas
          Brumme <span dir="ltr"><<a moz-do-not-send="true"
              href="mailto:thomas.brumme@mpsd.mpg.de" target="_blank">thomas.brumme@mpsd.mpg.de</a>></span>
          wrote:<br>
          <blockquote class="gmail_quote" style="margin:0 0 0
            .8ex;border-left:1px #ccc solid;padding-left:1ex">Dear all,<br>
            <br>
            I think I found a bug in the image parallelization of PH,
            or I'm doing something wrong.<br>
            I used version 5.4, but the problem is also there with
            the 6.0 beta.<br>
            Maybe someone remembers my email to the regular mailing list
            a few days ago concerning the parallelization using the GRID
            technique; the problem I encounter here is essentially the
            same.<br>
            As an example, I use a modified run_example_1 from the
            Recover_example directory of PH.<br>
            <br>
            Description of the problem:<br>
            <br>
            0. (Following the example) I did an scf calculation using 2
            CPUs with:<br>
            <br>
              &control<br>
                 calculation='scf'<br>
                 restart_mode='from_scratch',<br>
                 prefix='aluminum',<br>
                 pseudo_dir = './',<br>
                 outdir='./tempdir/'<br>
              /<br>
              &system<br>
                 ibrav=  2, celldm(1) =7.5, nat= 1, ntyp= 1,<br>
                 ecutwfc =15.0,<br>
                 occupations='smearing', smearing='methfessel-paxton',
            degauss=0.05,<br>
                 la2F = .true.,<br>
              /<br>
              &electrons<br>
                 conv_thr =  1.0d-8<br>
                 mixing_beta = 0.7<br>
              /<br>
            ATOMIC_SPECIES<br>
              Al  26.98 Al.pz-vbc.UPF<br>
            ATOMIC_POSITIONS<br>
              Al 0.00 0.00 0.00<br>
            K_POINTS {automatic}<br>
              16 16 16  0 0 0<br>
            <br>
            <br>
            1. Then I did a second scf calculation using 2 CPUs with:<br>
            <br>
              &control<br>
                 calculation='scf'<br>
                 restart_mode='from_scratch',<br>
                 prefix='aluminum',<br>
                 pseudo_dir = './',<br>
                 outdir='./tempdir/'<br>
              /<br>
              &system<br>
                 ibrav=  2, celldm(1) =7.5, nat= 1, ntyp= 1,<br>
                 ecutwfc =15.0,<br>
                 occupations='smearing', smearing='methfessel-paxton',
            degauss=0.05<br>
              /<br>
              &electrons<br>
                 conv_thr =  1.0d-8<br>
                 mixing_beta = 0.7<br>
              /<br>
            ATOMIC_SPECIES<br>
              Al  26.98 Al.pz-vbc.UPF<br>
            ATOMIC_POSITIONS<br>
              Al 0.00 0.00 0.00<br>
            K_POINTS {automatic}<br>
              8 8 8  0 0 0<br>
            <br>
            <br>
            2. I did a phonon calculation using images and storing the
            dvscf files.<br>
            More specifically, I used:<br>
            <br>
            mpirun -np 4 ph.x -ni 2 < al.elph.in<br>
            <br>
            with al.elph.in given by:<br>
            <br>
            Electron-phonon coefficients for Al<br>
              &inputph<br>
               tr2_ph=1.0d-10,<br>
               prefix='aluminum',<br>
               fildvscf='aldv',<br>
               amass(1)=26.98,<br>
               outdir='./tempdir/',<br>
               fildyn='al.dyn',<br>
            !  electron_phonon='interpolated'<wbr>,<br>
            !  el_ph_sigma=0.005,<br>
            !  el_ph_nsigma=10,<br>
            !  recover=.true.<br>
            !  trans=.false.,<br>
               ldisp=.true.<br>
               max_seconds=6,<br>
               nq1=4, nq2=4, nq3=4<br>
              /<br>
            <br>
            I used max_seconds to simulate the limited run time
            we have on our HPC system.<br>
            Restarting with recover=.true. works fine, i.e., I used:<br>
            <br>
            Electron-phonon coefficients for Al<br>
              &inputph<br>
               tr2_ph=1.0d-10,<br>
               prefix='aluminum',<br>
               fildvscf='aldv',<br>
               amass(1)=26.98,<br>
               outdir='./tempdir/',<br>
               fildyn='al.dyn',<br>
            !  electron_phonon='interpolated'<wbr>,<br>
            !  el_ph_sigma=0.005,<br>
            !  el_ph_nsigma=10,<br>
               recover=.true.<br>
            !  trans=.false.,<br>
               ldisp=.true.<br>
               max_seconds=6,<br>
               nq1=4, nq2=4, nq3=4<br>
              /<br>
            <br>
            <br>
            3. Now I want to collect all data without images:<br>
            <br>
            mpirun -np 2 ph.x < al.elph.in<br>
            <br>
            with the same input file as given in 2.<br>
            <br>
            I get the error "Possibly too few bands at point ..."
            as soon as the code tries to recalculate the wave functions
            for the q points that were calculated only on the second
            image, i.e., for q points 6, 7, and 8.<br>
            <br>
            When I check the charge_density.dat files in the q-point
            subfolders of the _ph0 directory, I find that they are empty.
            I therefore copied the q-point subfolders of the second image
            by hand into the folder of the first image using:<br>
            <br>
            cp -r _ph1/aluminum.q_* _ph0/<br>
            <br>
            If I now restart without images, using the input of 2.,
            it works and everything is fine.<br>
            <br>
            <br>
            4. Now I can also calculate the el-ph parameters using the
            input:<br>
            <br>
            Electron-phonon coefficients for Al<br>
              &inputph<br>
               tr2_ph=1.0d-10,<br>
               prefix='aluminum',<br>
               fildvscf='aldv',<br>
               amass(1)=26.98,<br>
               outdir='./tempdir/',<br>
               fildyn='al.dyn',<br>
               electron_phonon='interpolated'<wbr>,<br>
               el_ph_sigma=0.005,<br>
               el_ph_nsigma=10,<br>
            !  recover=.true.<br>
               trans=.false.,<br>
               ldisp=.true.<br>
            !  max_seconds=6,<br>
               nq1=4, nq2=4, nq3=4<br>
              /<br>
            <br>
            <br>
            5. Another problem I encounter is the following. Suppose
            the run time is not sufficient to finish the el-ph
            calculations, i.e., instead of the input in 4. I use:<br>
            <br>
            Electron-phonon coefficients for Al<br>
              &inputph<br>
               tr2_ph=1.0d-10,<br>
               prefix='aluminum',<br>
               fildvscf='aldv',<br>
               amass(1)=26.98,<br>
               outdir='./tempdir/',<br>
               fildyn='al.dyn',<br>
               electron_phonon='interpolated'<wbr>,<br>
               el_ph_sigma=0.005,<br>
               el_ph_nsigma=10,<br>
            !  recover=.true.<br>
               trans=.false.,<br>
               ldisp=.true.<br>
               max_seconds=6,<br>
               nq1=4, nq2=4, nq3=4<br>
              /<br>
            <br>
            The code will stop at a certain point (in my case at the 4th
            q point). If I now restart the calculation using:<br>
            <br>
            Electron-phonon coefficients for Al<br>
              &inputph<br>
               tr2_ph=1.0d-10,<br>
               prefix='aluminum',<br>
               fildvscf='aldv',<br>
               amass(1)=26.98,<br>
               outdir='./tempdir/',<br>
               fildyn='al.dyn',<br>
               electron_phonon='interpolated'<wbr>,<br>
               el_ph_sigma=0.005,<br>
               el_ph_nsigma=10,<br>
               recover=.true.<br>
               trans=.false.,<br>
               ldisp=.true.<br>
            !  max_seconds=6,<br>
               nq1=4, nq2=4, nq3=4<br>
              /<br>
            <br>
            I again get the error message "Possibly too few bands at
            point ..." as soon as the code tries to calculate the wave
            functions for the 4th q point (the one at which it stopped
            before).<br>
            All other points are fine.<br>
            <br>
            <br>
            I think that the whole problem is related to how the wave
            functions and the charge density are stored.<br>
            Maybe I'm doing something really wrong, but I don't see any
            obvious error in the input. I also don't see any ph input
            variable that influences the saving of wave functions.<br>
            <br>
            Regards<br>
            <br>
            Thomas<br>
            <br>
            --<br>
            Dr. rer. nat. Thomas Brumme<br>
            Max Planck Institute for the Structure and Dynamics of
            Matter<br>
            Luruper Chaussee 149<br>
            22761 Hamburg<br>
            <br>
            Tel:  +49 (0)40 8998 6557<br>
            <br>
            email: <a moz-do-not-send="true"
              href="mailto:Thomas.Brumme@mpsd.mpg.de">Thomas.Brumme@mpsd.mpg.de</a><br>
            <br>
            ______________________________<wbr>_________________<br>
            Q-e-developers mailing list<br>
            <a moz-do-not-send="true"
              href="mailto:Q-e-developers@qe-forge.org">Q-e-developers@qe-forge.org</a><br>
            <a moz-do-not-send="true"
              href="http://qe-forge.org/mailman/listinfo/q-e-developers"
              rel="noreferrer" target="_blank">http://qe-forge.org/mailman/<wbr>listinfo/q-e-developers</a><br>
          </blockquote>
        </div>
        <br>
        <br clear="all">
        <br>
        -- <br>
        <div class="gmail_signature" data-smartmail="gmail_signature">
          <div dir="ltr">
            <div>
              <div dir="ltr">
                <div>Paolo Giannozzi, Dip. Scienze Matematiche
                  Informatiche e Fisiche,<br>
                  Univ. Udine, via delle Scienze 208, 33100 Udine, Italy<br>
                  Phone +39-0432-558216, fax +39-0432-558222<br>
                  <br>
                </div>
              </div>
            </div>
          </div>
        </div>
      </div>
    </blockquote>
    <br>
    <pre class="moz-signature" cols="72">-- 
Dr. rer. nat. Thomas Brumme
Max Planck Institute for the Structure and Dynamics of Matter
Luruper Chaussee 149
22761 Hamburg

Tel:  +49 (0)40 8998 6557

email: <a class="moz-txt-link-abbreviated" href="mailto:Thomas.Brumme@mpsd.mpg.de">Thomas.Brumme@mpsd.mpg.de</a>
</pre>
  </body>
</html>