<html><head></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space; "><br><div><div>On 23 Aug 2010, at 16:32, Nicki Frank Hinsche wrote:</div><br class="Apple-interchange-newline"><blockquote type="cite"><div>Hi there,<br><br>I am currently doing calculations of iso-energy surfaces on doped  <br>semiconductors. Therefore I generate with an external program a quite  <br>big k-point mesh for which I want to determine the eigenvalues and  <br>later on construct the iso-energy surface with a tetrahedron method.  <br>My problem is the running time of the bandstructure calculation.<br><br>The size of the unit (super)-cells is in the order of 30-50 atoms,  <br>containing 1 or 2 different atomic species. For k-points in the order  <br>of 4000-6000 the eigenvalues have to be calculated (most often around  <br>50-100 ev's for each k-point).<br><br><br>After the scf-calculation is done quite fast, I am running the nscf  <br>bandstructure calc. with the command<br><br><br>mpirun -np 32 pw.x -npool 4 -diag 16<br></div></blockquote><div><br></div><div><br></div>The correct form of this command line would be </div><div><br></div><div>mpirun -np 32 pw.x -npool 4 -ndiag 16</div><div><br></div><div>but this would not work either, since ndiag must be less than or equal to nproc/npool (which is 32/4 = 8 in your case). 
So you should either use</div><div><br></div><div><div>mpirun -np 32 pw.x -npool 4 -ndiag 4</div><div><br></div><div>or</div><div><br></div><div><div>mpirun -np 64 pw.x -npool 4 -ndiag 16</div><div><br></div><div><br></div></div></div><div>I suggest using this in combination with the ScaLAPACK libraries, since that can improve the speed considerably when the number of bands is very large (as in your case).</div><div><br></div><div>If this isn't fast enough for your needs, you will probably have to switch to another method, such as Shirley's interpolation (see <a href="http://arxiv.org/abs/0908.3876">http://arxiv.org/abs/0908.3876</a>).</div><div><br></div><div><br></div><div>HTH</div><div><br></div><div>GS</div><div><br></div><div><blockquote type="cite"><div><br><br>but the calculation isn't done parallel, as the output says:<br><br>      Parallel version (MPI), running on    32 processors<br>      K-points division:     npool     =    4<br>      R &amp; G space division:  proc/pool =    8<br><br>      Subspace diagonalization in iterative solution of the eigenvalue  <br>problem:<br>      a serial algorithm will be used<br></div></blockquote></div><br><div>

<span class="Apple-style-span" style="border-collapse: separate; color: rgb(0, 0, 0); font-family: Helvetica; font-size: medium; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-align: auto; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-border-horizontal-spacing: 0px; -webkit-border-vertical-spacing: 0px; -webkit-text-decorations-in-effect: none; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; "><div><span class="Apple-style-span" style="color: rgb(126, 126, 126); font-size: 16px; font-style: italic; "><br class="Apple-interchange-newline">§ Gabriele Sclauzero, EPFL SB ITP CSEA</span></div><div><font class="Apple-style-span" color="#7E7E7E"><i>   PH H2 462, Station 3, CH-1015 Lausanne</i></font></div></span>

</div>
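<div><br></div><div>P.S. A minimal sketch of how one might pick an ndiag value that satisfies the constraint above (ndiag no larger than nproc/npool). The helper name max_ndiag is hypothetical and not part of pw.x. The sketch also assumes that ndiag should be a perfect square, because the diagonalization group is arranged as a square processor grid; please check this against the documentation of your version.</div>
<pre>
```shell
#!/bin/sh
# Hypothetical helper, not part of Quantum ESPRESSO.
# Prints the largest perfect square that does not exceed nproc/npool.
# Assumption: ndiag must be a perfect square (square processor grid).
max_ndiag() {
    nproc=$1
    npool=$2
    per_pool=$((nproc / npool))   # processors available in each pool
    n=1
    # Grow the grid side n while (n+1)^2 still fits in one pool.
    while [ $(( (n + 1) * (n + 1) )) -le "$per_pool" ]; do
        n=$((n + 1))
    done
    echo $((n * n))
}

max_ndiag 32 4   # prints 4
max_ndiag 64 4   # prints 16
```
</pre>
<div>The two calls reproduce the values used in the command lines above: 4 for 32 processors in 4 pools, and 16 for 64 processors in 4 pools.</div>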

<br></body></html>