[Pw_forum] Parallelization

Paolo Giannozzi paolo.giannozzi at uniud.it
Wed Jun 12 18:30:23 CEST 2013

On Tue, 2013-06-11 at 19:57 +0000, vijaya subramanian wrote:

> Hi Paolo

You know, there are 1605 subscribed users on the pw_forum mailing list.
Even if some of those subscriptions are actually disabled, that is a lot
of people. Why do you address me?

Your unit cells are quite large, your cutoff is not small, and you
use spin-orbit, a feature that increases the memory footprint and
is less optimized than "plain-vanilla" calculations. In order to
run such large jobs, one needs to know quite a bit about the
inner workings of parallelization: which arrays are distributed,
which are not ... The following arrays, for instance:

>         Each <psi_i|beta_j> matrix    350.63 Mb     (   5440, 2, 2112)

are not distributed. This is the kind of array that causes bottlenecks.
If you have N MPI processes per node, you have N such arrays filling
the same physical memory. Reducing the number of MPI processes per node
and using OpenMP instead might be a good strategy.
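
To make the arithmetic concrete, here is a rough sketch of the estimate.
It assumes each element is a double-precision complex number (16 bytes),
which is consistent with the 350.63 Mb reported for dimensions
(5440, 2, 2112). The function names and the process counts per node are
hypothetical, chosen only for illustration; they are not taken from the
code or from your job.

```python
from math import prod

BYTES_PER_COMPLEX_DOUBLE = 16  # assumption: double-precision complex elements
BYTES_PER_MB = 1024 ** 2


def replicated_array_mb(dims, bytes_per_element=BYTES_PER_COMPLEX_DOUBLE):
    """Size in Mb of one copy of an array with the given dimensions."""
    return prod(dims) * bytes_per_element / BYTES_PER_MB


def per_node_mb(dims, mpi_procs_per_node):
    """Memory taken on one node when every MPI process holds its own copy."""
    return replicated_array_mb(dims) * mpi_procs_per_node


if __name__ == "__main__":
    dims = (5440, 2, 2112)  # dimensions printed in the output quoted above
    print(f"one copy: {replicated_array_mb(dims):.3f} Mb")
    # Hypothetical process counts per node, for comparison only.
    for n in (16, 4, 1):
        print(f"{n:2d} MPI processes per node: {per_node_mb(dims, n):.1f} Mb")
```

The per-node total grows linearly with the number of MPI processes on
the node, while OpenMP threads within one process share a single copy.
Note that this counts only one such matrix; the output says "Each", so
there may be several.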

 Paolo Giannozzi, Dept. Chemistry&Physics&Environment, 
 Univ. Udine, via delle Scienze 208, 33100 Udine, Italy
 Phone +39-0432-558216, fax +39-0432-558222 
