<html>

  <head>

    <meta content="text/html; charset=windows-1252"

      http-equiv="Content-Type">

  </head>

  <body text="#000000" bgcolor="#FFFFFF">

    <div class="moz-cite-prefix">The only parallelization that I see in
      bands is the basic one over R & G. If it differs from the
      parallelization used in the preceding pw.x run, you should use wf_collect.<br>
      The code computes the overlap between the orbitals at k and k+dk in
      order to decide how to connect them. It is an nbnd^2 operation done
      band by band: evidently not very efficient, but it should not take
      hours.<br>
      You can set wf_collect=.true. and increase the number of
      processors.<br>
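      To illustrate the overlap step described above, here is a minimal
      sketch in Python. It is not the bands.x source: the function names
      (inner, connect_bands), the greedy matching rule and the toy
      coefficient vectors are assumptions made only for this example.<br>

```python
def inner(u, v):
    """Inner product of two coefficient vectors, conjugating the first."""
    return sum(a.conjugate() * b for a, b in zip(u, v))


def connect_bands(psi_k, psi_kdk):
    """Match each band at k to one band at k+dk by largest overlap.

    psi_k and psi_kdk are lists of nbnd coefficient vectors.
    The double loop evaluates up to nbnd^2 overlaps, band by band.
    Greedy matching is an assumption of this sketch.
    """
    nbnd = len(psi_k)
    taken = set()      # bands at k+dk that are already assigned
    mapping = []       # mapping[i] = index of the band at k+dk joined to band i
    for i in range(nbnd):
        best_j, best_ov = None, -1.0
        for j in range(nbnd):
            if j in taken:
                continue
            ov = abs(inner(psi_k[i], psi_kdk[j]))
            if ov > best_ov:
                best_j, best_ov = j, ov
        taken.add(best_j)
        mapping.append(best_j)
    return mapping


# Toy data (hypothetical): two bands that exchange character between k and k+dk.
psi_k = [[1 + 0j, 0j], [0j, 1 + 0j]]
psi_kdk = [[0.1 + 0j, 0.995 + 0j], [0.995 + 0j, -0.1 + 0j]]
print(connect_bands(psi_k, psi_kdk))  # [1, 0]
```

      With the 129 bands and 61 k-points mentioned in this thread, such a
      scheme would evaluate about 129 x 129 x 60, i.e. roughly one million,
      overlaps in total.<br>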

       <br>

      stefano<br>

      <br>

      <br>

      On 05/12/2015 12:57, Maxim Skripnik wrote:<br>

    </div>

    <blockquote

      cite="mid:53cdc7c4e767a41e.5662d119@limbe.rz.uni-konstanz.de"

      type="cite">Thank you for the information. Yes, at the beginning

      of the pw.x output it says:<br>

           Parallel version (MPI), running on    64 processors<br>

           R & G space division:  proc/nbgrp/npool/nimage =      64<br>

      <br>

      Is bands.x parallelized at all? If so, where can I find

      information on that? There's nothing mentioned in the

      documentation:<br>

<a class="moz-txt-link-freetext" href="http://www.quantum-espresso.org/wp-content/uploads/Doc/pp_user_guide.pdf">http://www.quantum-espresso.org/wp-content/uploads/Doc/pp_user_guide.pdf</a><br>

<a class="moz-txt-link-freetext" href="http://www.quantum-espresso.org/wp-content/uploads/Doc/INPUT_BANDS.html">http://www.quantum-espresso.org/wp-content/uploads/Doc/INPUT_BANDS.html</a><br>

      <br>

      What could be the reason for bands.x taking many hours to

      calculate the bands? The foregoing pw.x calculation has already

      determined the energy for each k-point along a path (Gamma -> K

      -> M -> Gamma). There are 61 k-points and 129 bands. So what

      is bands.x actually doing besides reformatting that data? The input

      file job.bands looks like this:<br>

       &bands<br>

          prefix   = 'st1'<br>

          outdir   = './tmp'<br>

      /<br>

      The calculation is initiated by<br>

      mpirun -np 64 bands.x < job.bands<br>

      <br>

      Maxim Skripnik<br>

      Department of Physics<br>

      University of Konstanz<br>

      <br>

      On Saturday, 05 December 2015 02:37 CET, stefano de gironcoli
      <a class="moz-txt-link-rfc2396E" href="mailto:degironc@sissa.it">&lt;degironc@sissa.it&gt;</a> wrote:<br>

       


      <div class="moz-cite-prefix">On 04/12/2015 22:53, Maxim Skripnik

        wrote:</div>

      <blockquote

        cite="mid:bbc1c3d2d4083244.56620b59@limbe.rz.uni-konstanz.de"

        type="cite">Hello,<br>

        <br>

        I'm a bit confused by the parallelization scheme of QE. First of

        all, I run calculations on a cluster with usually 1 to 8 nodes,

        each of which has 16 cores. There is a very good scaling of pw.x

        e.g. for structural relaxation jobs. I do not specify any

        particular parallelization scheme as mentioned in the

        documentation, i.e. I start the calculations with<br>

        mpirun -np 128 pw.x < job.pw<br>

        on 8 nodes, 16 cores each. According to the documentation ni=1,

        nk=1 and nt=1. So in which respect are the calculations

        parallelized by default? Why do the calculations scale so well

        without specifying ni, nk, nt, nd?</blockquote>

      R and G parallelization is performed.<br>
      The wavefunctions' plane waves, the density plane waves and slices of
      real-space objects are distributed across the 128 processors. A report of
      how this is done is given at the beginning of the output.<br>
      Did you have a look at it?<br>
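      As a rough illustration of the default behaviour, the sketch below
      computes how many processes share the R and G distribution once
      images, pools and band groups are divided out, and how a number of
      items would be split among them. The function names, the even
      block split and the item count of 1000 are assumptions for this
      example, not what pw.x does internally.<br>

```python
def rg_group_size(nproc, nimage=1, npool=1, nbgrp=1):
    """Processes left for R and G division: nproc/nbgrp/npool/nimage."""
    denom = nimage * npool * nbgrp
    if nproc % denom != 0:
        raise ValueError("nproc must be divisible by nimage*npool*nbgrp")
    return nproc // denom


def split_evenly(n_items, n_groups):
    """Near-even block split of n_items (e.g. plane waves) over n_groups."""
    base, extra = divmod(n_items, n_groups)
    # The first `extra` groups receive one additional item.
    return [base + (1 if extra > r else 0) for r in range(n_groups)]


# Defaults (no images, pools or band groups): all 64 processes do R and G.
print(rg_group_size(64))            # 64
# Hypothetical run with 4 pools on 128 processes.
print(rg_group_size(128, npool=4))  # 32
# Hypothetical count of 1000 items shared by 64 processes.
shares = split_evenly(1000, 64)
print(min(shares), max(shares))     # 15 16
```

      With the default settings every process belongs to the same R and G
      group, which matches the value 64 reported at the top of the output
      quoted in this thread.<br>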

       

      <blockquote

        cite="mid:bbc1c3d2d4083244.56620b59@limbe.rz.uni-konstanz.de"

        type="cite">Second question is, whether one can speed up bands.x

        calculations. Up to now I start these this way:<br>

        mpirun -np 64 bands.x < job.bands<br>

        on 4 nodes, 16 cores each. Does it make sense to define nb for

        bands.x? If yes, what would be reasonable values?</blockquote>

      Expect no gain: band parallelization is not implemented in bands.x.<br>

      <br>

      stefano<br>


       

      <blockquote

        cite="mid:bbc1c3d2d4083244.56620b59@limbe.rz.uni-konstanz.de"

        type="cite">The systems of interest consist of typically ~50

        atoms with periodic boundaries.<br>

        <br>

        Maxim Skripnik<br>

        Department of Physics<br>

        University of Konstanz


         

        <pre wrap="">_______________________________________________

Pw_forum mailing list

<a moz-do-not-send="true" class="moz-txt-link-abbreviated" href="mailto:Pw_forum@pwscf.org">Pw_forum@pwscf.org</a>

<a moz-do-not-send="true" class="moz-txt-link-freetext" href="http://pwscf.org/mailman/listinfo/pw_forum">http://pwscf.org/mailman/listinfo/pw_forum</a></pre>

      </blockquote>


    </blockquote>

    <br>

  </body>

</html>