<div dir="ltr"><div>I think the statement "npools must be a divisor of the total number of k-points" in the slides is inaccurate.</div><div>npool only needs to be less than or equal to the total number of k-points, because every pool must have at least one k-point to work on.</div><div>npool not being a divisor does not cause a correctness issue.<br><div>A non-divisor increases load imbalance and reduces efficiency, because some pools are assigned fewer k-points and sit idle at pool synchronization points once their assigned k-points are completed.</div></div><div>In practice, the workload differs from k-point to k-point, so even when npool is a divisor there is additional imbalance in the calculation.</div><div>So choosing <span class="gmail-LI gmail-ng gmail-Vt gmail-Vs">npool as a divisor is a recommendation for better performance, not a requirement.<br></span></div><div><br></div><div>Ye<br></div><div></div><div><div><div dir="ltr" class="gmail_signature" data-smartmail="gmail_signature"><div dir="ltr"><div><div dir="ltr">===================<br>
Ye Luo, Ph.D.<br>Computational Science Division & Leadership Computing Facility<br>
Argonne National Laboratory</div></div></div></div></div><br></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Tue, Jan 5, 2021 at 5:16 PM Andrew Xu &lt;<a href="mailto:andrewaccount@gmail.com">andrewaccount@gmail.com</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div>Hi users,</div><div><br></div><div>Does npool need to divide the number of k-points after symmetry operations are performed? A tutorial I saw online (<a href="https://materials.prace-ri.eu/497/7/QE__main_strategies_of_parallelization_and_levels_of_parallelisms.pdf" target="_blank">https://materials.prace-ri.eu/497/7/QE__main_strategies_of_parallelization_and_levels_of_parallelisms.pdf</a>) states: "By definition, npools must be a divisor of the total number of k-points."</div><div><br></div><div>In a calculation I ran (relevant output below), I accidentally set npool = 16, which does not divide the total number of k-points after symmetry operations (35), but I got no errors. 
Am I misunderstanding something here?<br></div><div><br></div><div>Best,</div><div>Andrew</div><div><br></div><div>------------------------------<br><br> Parallel version (MPI & OpenMP), running on 80 processor cores<br> Number of MPI processes: 80<br> Threads/MPI process: 1<br> K-points division: npool = 16<br> R & G space division: proc/nbgrp/npool/nimage = 5<br> Reading input from <a href="http://pw_oncv_pbe0.in" target="_blank">pw_oncv_pbe0.in</a><br><br> Current dimensions of program PWSCF are:<br> Max number of different atomic species (ntypx) = 10<br> Max number of k-points (npk) = 40000<br> Max angular momentum in pseudopotentials (lmaxx) = 3<br><br> IMPORTANT: XC functional enforced from input :<br> Exchange-correlation = PBE0 ( 6 4 8 4 0 0)<br> EXX-fraction = 0.25<br> Any further DFT definition will be discarded<br> Please, verify this is what you really want<br><br><br> Subspace diagonalization in iterative solution of the eigenvalue problem:<br> a serial algorithm will be used<br><br> EXX: setup a grid of 512 q-points centered on each k-point<br> (set verbosity='high' to see the list)<br> <br> Parallelization info<br> --------------------<br> sticks: dense smooth PW G-vecs: dense smooth PW<br> Min 194 194 58 4548 4548 756<br> Max 196 196 59 4550 4550 759<br> Sum 973 973 293 22743 22743 3791<br> <br><br><br> bravais-lattice index = 1<br> lattice parameter (alat) = 7.1240 a.u.<br> unit-cell volume = 361.5528 (a.u.)^3<br> number of atoms/cell = 4<br> number of atomic types = 2<br> number of electrons = 46.00<br> number of Kohn-Sham states= 26<br> kinetic-energy cutoff = 60.0000 Ry<br> charge density cutoff = 240.0000 Ry<br> cutoff for Fock operator = 240.0000 Ry<br> convergence threshold = 1.0E-08<br> mixing beta = 0.7000<br> number of iterations used = 8 plain mixing<br> Exchange-correlation = PBE0 ( 6 4 8 4 0 0)<br> EXX-fraction = 0.25<br></div><div>...<br><br> atomic species valence mass pseudopotential<br> O 6.00 15.99940 O ( 1.00)<br> W 28.00 
183.85000 W ( 1.00)<br><br> 48 Sym. Ops., with inversion, found<br><br><br><br> Cartesian axes<br><br> site n. atom positions (alat units)<br> 1 W tau( 1) = ( 0.0000000 0.0000000 0.0000000 )<br> 2 O tau( 2) = ( 0.0000000 0.5000000 0.0000000 )<br> 3 O tau( 3) = ( 0.5000000 0.0000000 0.0000000 )<br> 4 O tau( 4) = ( 0.0000000 0.0000000 0.5000000 )<br><br> number of k points= 35<br> cart. coord. in units 2pi/alat<br> k( 1) = ( 0.0000000 0.0000000 0.0000000), wk = 0.0039062<br> k( 2) = ( 0.0000000 0.0000000 0.1250000), wk = 0.0234375<br> k( 3) = ( 0.0000000 0.0000000 0.2500000), wk = 0.0234375<br> k( 4) = ( 0.0000000 0.0000000 0.3750000), wk = 0.0234375<br> k( 5) = ( 0.0000000 0.0000000 -0.5000000), wk = 0.0117188<br> k( 6) = ( 0.0000000 0.1250000 0.1250000), wk = 0.0468750<br> k( 7) = ( 0.0000000 0.1250000 0.2500000), wk = 0.0937500<br> k( 8) = ( 0.0000000 0.1250000 0.3750000), wk = 0.0937500<br> k( 9) = ( 0.0000000 0.1250000 -0.5000000), wk = 0.0468750<br> k( 10) = ( 0.0000000 0.2500000 0.2500000), wk = 0.0468750<br> k( 11) = ( 0.0000000 0.2500000 0.3750000), wk = 0.0937500<br> k( 12) = ( 0.0000000 0.2500000 -0.5000000), wk = 0.0468750<br> k( 13) = ( 0.0000000 0.3750000 0.3750000), wk = 0.0468750<br> k( 14) = ( 0.0000000 0.3750000 -0.5000000), wk = 0.0468750<br> k( 15) = ( 0.0000000 -0.5000000 -0.5000000), wk = 0.0117188<br> k( 16) = ( 0.1250000 0.1250000 0.1250000), wk = 0.0312500<br> k( 17) = ( 0.1250000 0.1250000 0.2500000), wk = 0.0937500<br> k( 18) = ( 0.1250000 0.1250000 0.3750000), wk = 0.0937500<br> k( 19) = ( 0.1250000 0.1250000 -0.5000000), wk = 0.0468750<br> k( 20) = ( 0.1250000 0.2500000 0.2500000), wk = 0.0937500<br> k( 21) = ( 0.1250000 0.2500000 0.3750000), wk = 0.1875000<br> k( 22) = ( 0.1250000 0.2500000 -0.5000000), wk = 0.0937500<br> k( 23) = ( 0.1250000 0.3750000 0.3750000), wk = 0.0937500<br> k( 24) = ( 0.1250000 0.3750000 -0.5000000), wk = 0.0937500<br> k( 25) = ( 0.1250000 -0.5000000 -0.5000000), wk = 0.0234375<br> k( 26) = ( 0.2500000 0.2500000 
0.2500000), wk = 0.0312500<br> k( 27) = ( 0.2500000 0.2500000 0.3750000), wk = 0.0937500<br> k( 28) = ( 0.2500000 0.2500000 -0.5000000), wk = 0.0468750<br> k( 29) = ( 0.2500000 0.3750000 0.3750000), wk = 0.0937500<br> k( 30) = ( 0.2500000 0.3750000 -0.5000000), wk = 0.0937500<br> k( 31) = ( 0.2500000 -0.5000000 -0.5000000), wk = 0.0234375<br> k( 32) = ( 0.3750000 0.3750000 0.3750000), wk = 0.0312500<br> k( 33) = ( 0.3750000 0.3750000 -0.5000000), wk = 0.0468750<br> k( 34) = ( 0.3750000 -0.5000000 -0.5000000), wk = 0.0234375<br> k( 35) = ( -0.5000000 -0.5000000 -0.5000000), wk = 0.0039062<br><br> Dense grid: 22743 G-vectors FFT dimensions: ( 36, 36, 36)<br><br> Estimated max dynamical RAM per process > 14760.85MB<br><br> Estimated total allocated dynamical RAM > 1180867.90MB</div><div>....<br></div></div>
_______________________________________________<br>
Quantum ESPRESSO is supported by MaX (<a href="http://www.max-centre.eu" rel="noreferrer" target="_blank">www.max-centre.eu</a>)<br>
users mailing list <a href="mailto:users@lists.quantum-espresso.org" target="_blank">users@lists.quantum-espresso.org</a><br>
<a href="https://lists.quantum-espresso.org/mailman/listinfo/users" rel="noreferrer" target="_blank">https://lists.quantum-espresso.org/mailman/listinfo/users</a></blockquote></div>
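<div>As a rough illustration of the imbalance discussed in this thread, the sketch below estimates how 35 k-points might be spread over 16 pools. The function name <code>pool_balance</code> is hypothetical and not part of Quantum ESPRESSO. It assumes that k-points are split as evenly as possible, with the first few pools taking one extra, and that every k-point costs the same, which the reply above notes is not true in practice.</div>

```python
def pool_balance(nk, npool):
    """Estimate k-points per pool and an idealized parallel efficiency.

    Assumptions (illustrative, not taken from the QE source):
    - k-points are split as evenly as possible across pools;
    - every k-point takes the same time to compute.
    """
    if npool < 1 or npool > nk:
        # Each pool needs at least one k-point to work on.
        raise ValueError("npool must satisfy 1 <= npool <= nk")

    base, extra = divmod(nk, npool)
    # The first `extra` pools carry one more k-point than the rest.
    counts = [base + 1 if i < extra else base for i in range(npool)]

    # All pools wait for the busiest one at synchronization points,
    # so efficiency is useful work divided by (pools * longest workload).
    efficiency = nk / (npool * max(counts))
    return counts, efficiency


if __name__ == "__main__":
    for npool in (5, 7, 16):
        counts, eff = pool_balance(35, npool)
        print(f"npool={npool:2d}  k-points per pool={counts}  efficiency={eff:.3f}")
```

<div>Under these assumptions, npool = 16 leaves 3 pools with 3 k-points and 13 pools with 2, so the run still completes but the 13 lighter pools idle for part of each step. Divisors of 35 such as 5 or 7 give every pool the same count.</div>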