[QE-developers] Parallelizing linear response calculations over optimal number of pools

Max Amsler amsler.max at gmail.com
Mon Jul 20 16:10:05 CEST 2020


Dear QE developers,
I have a question concerning pool parallelization for linear response calculations, in particular in ph.x and hp.x.

Due to symmetry, every q-point potentially has a different number of k-points (and k+q points), and hence a different maximal number of pools one could parallelize over for the associated nscf calculation. Ideally, one would choose -npools equal to the number of k-points of each individual q. Currently, however, -npools is limited by the number of k-points of the preceding SCF calculation, which usually has the lowest number of k-points.

The bottleneck arises when the wavefunctions are read from the SCF calculation through "PW/src/read_file_new.f90". There, "divide_et_impera" is called, which fails if -npools is larger than nkstot of the SCF calculation. Later, in the nscf calculation, nkstot increases and "divide_et_impera" happily parallelizes over a larger -npools.
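To illustrate, here is a toy sketch (not the actual divide_et_impera logic, just an even block distribution with made-up numbers) of why a pool count larger than nkstot cannot work: at least one pool ends up without any k-points.

    program pool_distribution_sketch
      ! Toy illustration, NOT the real divide_et_impera: distribute nkstot
      ! k-points over npool pools as evenly as possible and check that every
      ! pool receives at least one k-point.
      implicit none
      integer :: nkstot, npool, ipool, nks

      nkstot = 4    ! k-points of the preceding SCF run (made-up number)
      npool  = 16   ! intended pool count for the phonon run (made-up number)

      do ipool = 1, npool
         nks = nkstot / npool
         if (ipool <= mod(nkstot, npool)) nks = nks + 1
         write(*,'("pool ",i3," gets ",i2," k-points")') ipool, nks
         if (nks == 0) write(*,*) '   -> empty pool: this is where the read fails'
      end do
    end program pool_distribution_sketch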

My question is: can we circumvent this limitation on the number of pools when reading the wavefunctions? Or could that specific part be called with a lower number of pools (i.e., min(npools, nkstot)), switching back to the intended pool size for every q-point afterwards?
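To make the idea concrete, here is a minimal self-contained sketch of the flow I have in mind (all variable names are hypothetical, not actual QE internals): clamp the pool count to nkstot of the SCF for the read step, then switch to the per-q maximal pool count for each nscf/linear-response step.

    program pool_clamp_sketch
      ! Sketch of the proposed flow (hypothetical names, not actual QE code):
      ! clamp -npools to the SCF k-point count for the wavefunction read,
      ! then use the per-q maximal pool count for each subsequent step.
      implicit none
      integer, parameter :: nqs = 3
      integer :: npool_requested, nkstot_scf, npool_read, npool_q, iq
      integer :: nkstot_of_q(nqs)

      npool_requested = 16               ! value passed with -npools
      nkstot_scf      = 4                ! SCF k-points (made-up number)
      nkstot_of_q     = (/ 12, 36, 8 /)  ! k+q points per q (made-up numbers)

      ! step 1: read the SCF wavefunctions with a clamped pool count
      npool_read = min(npool_requested, nkstot_scf)
      write(*,'("read SCF wfc with ",i3," pools")') npool_read

      ! step 2: for each q, switch back to the largest usable pool count
      do iq = 1, nqs
         npool_q = min(npool_requested, nkstot_of_q(iq))
         write(*,'("q-point ",i2,": run nscf with ",i3," pools")') iq, npool_q
      end do
    end program pool_clamp_sketch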

I believe implementing this feature would make linear response calculations run much more efficiently with a very large number of MPI tasks.

Thank you very much

Max Amsler