[QE-users] efficient parallelization on a system without Infiniband

Paolo Giannozzi p.giannozzi at gmail.com
Wed May 27 17:58:31 CEST 2020


On Wed, May 27, 2020 at 4:27 PM Michal Krompiec <michal.krompiec at gmail.com>
wrote:

> Hello,
> How can I minimize inter-node MPI communication in a pw.x run? My
> system doesn't have Infiniband and inter-node MPI can easily become
> the bottleneck.
> Let's say I'm running a calculation with 4 k-points, on 4 nodes, with
> 56 MPI tasks per node. I would then use -npool 4 to create 4 pools for
> the k-point parallelization. However, it seems that the
> diagonalization is parallelized suboptimally by default (or is it?):
>      Subspace diagonalization in iterative solution of the eigenvalue problem:
>      one sub-group per band group will be used
>      scalapack distributed-memory algorithm (size of sub-group:  7*  7 procs)
> So far, the speedup on 4 nodes vs 1 node is 3.26x. Is that normal, or
> does it look like it can be improved?
>
> Best regards,
>
> Michal Krompiec
> Merck KGaA
> Southampton, UK
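
For reference, a run line matching the setup quoted above could look like
the sketch below. The input and output file names are placeholders, and
-ndiag 49 only spells out the 7x7 ScaLAPACK sub-group that pw.x reports
choosing by default when each of the 4 pools holds 56 tasks (49 being the
largest square not exceeding 56):

  mpirun -np 224 pw.x -npool 4 -ndiag 49 -inp pw.in > pw.out

With 4 pools on 4 nodes, each pool presumably maps onto a single node, so
the heavier within-pool communication (FFTs, dense linear algebra) stays
intra-node and only the comparatively light inter-pool traffic crosses the
slower interconnect.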


-- 
Paolo Giannozzi, Dip. Scienze Matematiche Informatiche e Fisiche,
Univ. Udine, via delle Scienze 208, 33100 Udine, Italy
Phone +39-0432-558216, fax +39-0432-558222