<div dir="ltr"><div><div>Parallelization option "-ndiag" is effective only on dense-matrix diagonalization and related matrix-matrix operations, involving M x M matrices, with M = number of bands x a small factor (1 to 4 at most).<br></div>For M smaller than a few hundreds, the cost of such operations is small and the advantage of linear-algebra diagonalization negligible. When M is in the thousands, though, you need it, not only to speed up the calculation but also to distribute those large matrices. <br><br>You can find how much time you spend in parallelizable linear-algebra operation in items "rdiaghg" or "cdiaghg", in the final time report.<br><br></div><div>Paolo<br></div></div><div class="gmail_extra"><br><div class="gmail_quote">On Wed, Nov 22, 2017 at 12:20 PM, balabi <span dir="ltr"><<a href="mailto:balabi@qq.com" target="_blank">balabi@qq.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Paolo

On Wed, Nov 22, 2017 at 12:20 PM, balabi <balabi@qq.com> wrote:

> Dear developers,
>
> I tried to use the -ndiag option in the scf process, because I read in the
> tutorials that "most CPU time is spent in linear-algebra operations,
> implemented in BLAS and LAPACK libraries, and in FFT", and that linear-algebra
> parallelization "distributes and parallelizes matrix diagonalization and
> matrix-matrix multiplications needed in iterative diagonalization (SCF)".
> So I thought adding -ndiag should give an obvious speedup. To test it, I
> artificially constructed a 2x2 supercell of copper containing 8 atoms and ran
>
> mpiexec.hydra -n 12 pw.x -ndiag 9 -in cu_supercell.scf.in > cu_supercell.scf.out
>
> But I did not see any improvement over the run without -ndiag: the timing is
> almost the same. I checked the output file, and it does show that
>
> Subspace diagonalization in iterative solution of the eigenvalue problem:
>   one sub-group per band group will be used
>   scalapack distributed-memory algorithm (size of sub-group: 3* 3 procs)
>
> I also tested a more practical case with a material of 23 atoms in the
> primitive cell, and saw no improvement. So I am wondering what is wrong?
> Is -ndiag not effective for scf?
>
> best regards
<a href="http://pwscf.org/mailman/listinfo/pw_forum" rel="noreferrer" target="_blank">http://pwscf.org/mailman/<wbr>listinfo/pw_forum</a><br></blockquote></div><br><br clear="all"><br>-- <br><div class="gmail_signature" data-smartmail="gmail_signature"><div dir="ltr"><div><div dir="ltr"><div>Paolo Giannozzi, Dip. Scienze Matematiche Informatiche e Fisiche,<br>Univ. Udine, via delle Scienze 208, 33100 Udine, Italy<br>Phone +39-0432-558216, fax +39-0432-558222<br><br></div></div></div></div></div>
</div>