<div dir="ltr"><div dir="ltr"><div>Parallelization over k-points does very little communication but it is not as effective as plane-wave parallelization in distributing memory. I also noticed that on a typical multi-core processor the performances of k-point parallelization are often less good than those of plane-wave parallelization and sometimes much less good, for reasons that are not completely clear to me.</div><div><br></div><div>A factor to be considered is how your machine distributes the pools across the nodes: each of the 4 pools of 32 processors should stay on one of the nodes, but I wouldn't be too sure that this is what is really happening.<br></div><div><br></div><div>In your test, there is an anomaly, though: most of the time of "c_bands" (computing the band structure) should be spent in "cegterg" (iterative diagonalization). With 4*8 processors:</div><div> c_bands : 14153.20s CPU 14557.65s WALL ( 461 calls)</div><div> Called by c_bands:<br> init_us_2 : 102.63s CPU 105.55s WALL ( 1952 calls)<br> cegterg : 12700.70s CPU 13083.44s WALL ( 943 calls)</div><div>only 10% of the time is spent somewhere else, while with 4*32 processors:<br></div><div> c_bands : 18068.08s CPU 18219.06s WALL ( 454 calls)<br> Called by c_bands:<br> init_us_2 : 26.53s CPU 27.06s WALL ( 1924 calls)<br> cegterg : 2422.03s CPU 2451.72s WALL</div><div>75% of the time is not accounted for.</div><div><br></div><div>Paolo<br></div></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Fri, Feb 12, 2021 at 5:01 AM Christoph Wolf <wolf.christoph@qns.science> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr">Dear all,<div><br></div><div>I tested k-point parallelization and I wonder if the following results can be normal or if my cluster has some serious problems...</div><div><br></div><div>the system has 74 atoms and a 2x2x1 k-point grid resulting in 4 k-points</div><div><br></div><div> number of k points= 4 Fermi-Dirac smearing, width (Ry)= 0.0050<br> cart. coord. in units 2pi/alat<br> k( 1) = ( 0.0000000 0.0000000 0.0000000), wk = 0.2500000<br> k( 2) = ( 0.3535534 -0.3535534 0.0000000), wk = 0.2500000<br> k( 3) = ( 0.0000000 -0.7071068 0.0000000), wk = 0.2500000<br> k( 4) = ( -0.3535534 -0.3535534 0.0000000), wk = 0.2500000<br></div><div><br></div><div><br></div><div>1) run on 1 node x 32 CPUs with -nk 4</div><div> Parallel version (MPI), running on 32 processors<br><br> MPI processes distributed on 1 nodes<br> K-points division: npool = 4<br> R & G space division: proc/nbgrp/npool/nimage = 8<br> Fft bands division: nmany = 1<br></div><div><br></div><div> PWSCF : 5h42m CPU 6h 3m WALL</div><div><br></div><div><br></div><div>2) run on 4 nodes x 32 CPUs with -nk 4</div><div> Parallel version (MPI), running on 128 processors<br><br> MPI processes distributed on 4 nodes<br> K-points division: npool = 4<br> R & G space division: proc/nbgrp/npool/nimage = 32<br> Fft bands division: nmany = 1<br></div><div><br></div><div><div> PWSCF : 6h32m CPU 6h36m WALL<br></div></div><div><br></div><div>I compiled my pwscf with intel 19 MKL, MPI and OpenMP. If I understood correctly, -nk parallelization should work well as there is not much communication between nodes but this does not seem to work for me at all... detailed timing logs are attached!</div><div><br></div><div>TIA!</div><div>Chris<br clear="all"><div><br></div>-- <div dir="ltr"><div dir="ltr">IBS Center for Quantum Nanoscience<br>Seoul, South Korea<blockquote type="cite" style="font-size:12.8px"><div dir="ltr"><div><div dir="ltr"></div></div></div></blockquote></div></div></div></div>
_______________________________________________<br>
Quantum ESPRESSO is supported by MaX (<a href="http://www.max-centre.eu" rel="noreferrer" target="_blank">www.max-centre.eu</a>)<br>
users mailing list <a href="mailto:users@lists.quantum-espresso.org" target="_blank">users@lists.quantum-espresso.org</a><br>
<a href="https://lists.quantum-espresso.org/mailman/listinfo/users" rel="noreferrer" target="_blank">https://lists.quantum-espresso.org/mailman/listinfo/users</a></blockquote></div><br clear="all"><br>-- <br><div dir="ltr" class="gmail_signature"><div dir="ltr"><div><div dir="ltr"><div>Paolo Giannozzi, Dip. Scienze Matematiche Informatiche e Fisiche,<br>Univ. Udine, via delle Scienze 206, 33100 Udine, Italy<br>Phone +39-0432-558216, fax +39-0432-558222<br><br></div></div></div></div></div>