[Pw_forum] k-points parallelization in pwscf 4.2.1

Davide Sangalli davide.sangalli at gmail.com
Mon Feb 14 10:59:05 CET 2011


Dear forum,
I ran some checks on the pwscf parallelization (version 4.2.1).

Here are the results for the same calculation, run in three ways:
1 - serial
2 - np=6, npool=1 (i.e. fft grid parallelization)
3 - np=6, npool=6 (i.e. kpts parallelization)
(all processors belong to the same node)
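
As a rough illustration of what the two parallel runs divide, the sketch below computes how many processors each pool holds and how many k-points each pool handles. The function name `pool_layout` and the even block split of k-points are assumptions made for illustration, not the actual distribution routine of pw.x.

```python
def pool_layout(nproc, npool, nkpts):
    """Return (processors per pool, list of k-point counts per pool).

    Assumption: k-points are split as evenly as possible across pools;
    this is an illustrative model, not code taken from pw.x.
    """
    if nproc % npool != 0:
        raise ValueError("nproc must be a multiple of npool")
    procs_per_pool = nproc // npool
    base, extra = divmod(nkpts, npool)
    # The first `extra` pools get one additional k-point.
    kpts_per_pool = [base + (1 if i < extra else 0) for i in range(npool)]
    return procs_per_pool, kpts_per_pool


# The three runs from this post, all with 6 k-points.
for label, nproc, npool in [("serial", 1, 1), ("fft", 6, 1), ("kpts", 6, 6)]:
    print(label, pool_layout(nproc, npool, nkpts=6))
```

With np=6 and npool=6 each pool has a single processor and a single k-point, so every process works on the full FFT grid, which matches the Proc/Pool table of run 3 in the details.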

The fft parallelization is quite efficient:
it takes 1m56.75s vs 7m19.59s in the serial run.
The kpts parallelization, on the other hand, does not help:
it takes longer than the serial run, 11m14.40s vs 7m19.59s.
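
To put numbers on this, here is a minimal sketch that turns the WALL times quoted above into speedup and parallel efficiency. The helper names are my own; the formulas are the usual ones (speedup = serial time / parallel time, efficiency = speedup / number of processors).

```python
def to_seconds(minutes, seconds):
    """Convert a time given as minutes and seconds to seconds."""
    return 60 * minutes + seconds


def speedup(t_serial, t_parallel):
    """Speedup relative to the serial run (values < 1 mean a slowdown)."""
    return t_serial / t_parallel


def efficiency(t_serial, t_parallel, nproc):
    """Parallel efficiency: speedup divided by the number of processors."""
    return speedup(t_serial, t_parallel) / nproc


# WALL times reported in this post.
t_serial = to_seconds(7, 19.59)
t_fft = to_seconds(1, 56.75)
t_kpts = to_seconds(11, 14.40)
nproc = 6

for label, t in [("fft", t_fft), ("kpts", t_kpts)]:
    print(f"{label}: speedup {speedup(t_serial, t):.2f}, "
          f"efficiency {efficiency(t_serial, t, nproc):.2f}")
```

The fft run comes out at a speedup of roughly 3.8 on 6 processors, while the kpts run has a "speedup" of about 0.65, i.e. it is slower than the serial run.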

What could my problem be?

Best regards,
Davide


Here you can find some more details:
**************************************
1 - serial:
      Parallel version (MPI), running on     1 processors
      ...
      Proc/  planes cols     G    planes cols    G      columns  G
      Pool       (dense grid)       (smooth grid)      (wavefct grid)
         1    80   4469   230753   48   1633    51175    457     7513
      ...
      number of k points=     6  gaussian broad. (Ry)=  0.0200     ngauss =   0
                        cart. coord. in units 2pi/a_0
         k(    1) = (   0.1250000   0.1250000   0.1213003), wk =   0.2500000
         k(    2) = (   0.1250000   0.1250000   0.3639010), wk =   0.2500000
         k(    3) = (   0.1250000   0.3750000   0.1213003), wk =   0.5000000
         k(    4) = (   0.1250000   0.3750000   0.3639010), wk =   0.5000000
         k(    5) = (   0.3750000   0.3750000   0.1213003), wk =   0.2500000
         k(    6) = (   0.3750000   0.3750000   0.3639010), wk =   0.2500000
      ...
      PWSCF        :  7m16.35s CPU time,     7m19.59s WALL time

*******************************************************************************
2 - fft parallelization:
      Parallel version (MPI), running on     6 processors
      R & G space division:  proc/pool =    6
      ...
      Proc/  planes cols     G    planes cols    G      columns  G
      Pool       (dense grid)       (smooth grid)      (wavefct grid)
         1    14    744    38458    8    272     8538     77     1253
         2    14    745    38459    8    272     8544     76     1252
         3    13    745    38459    8    273     8525     76     1252
         4    13    745    38459    8    272     8516     76     1252
         5    13    745    38459    8    272     8520     76     1252
         6    13    745    38459    8    272     8532     76     1252
      tot     80   4469   230753   48   1633    51175    457     7513
      ...
      number of k points=     6  gaussian broad. (Ry)=  0.0200     ngauss =   0
                        cart. coord. in units 2pi/a_0
         k(    1) = (   0.1250000   0.1250000   0.1213003), wk =   0.2500000
         k(    2) = (   0.1250000   0.1250000   0.3639010), wk =   0.2500000
         k(    3) = (   0.1250000   0.3750000   0.1213003), wk =   0.5000000
         k(    4) = (   0.1250000   0.3750000   0.3639010), wk =   0.5000000
         k(    5) = (   0.3750000   0.3750000   0.1213003), wk =   0.2500000
         k(    6) = (   0.3750000   0.3750000   0.3639010), wk =   0.2500000
      ...
      PWSCF        :  1m49.61s CPU time,     1m56.75s WALL time

*******************************************************************************
3 - kpts parallelization:
      Parallel version (MPI), running on     6 processors
      K-points division:     npool     =    6
      ...
      Proc/  planes cols     G    planes cols    G      columns  G
      Pool       (dense grid)       (smooth grid)      (wavefct grid)
         1    80   4469   230753   48   1633    51175    457     7513
      ...
      number of k points=     6  gaussian broad. (Ry)=  0.0200     ngauss =   0
                        cart. coord. in units 2pi/a_0
         k(    1) = (   0.1250000   0.1250000   0.1213003), wk =   0.2500000
         k(    2) = (   0.1250000   0.1250000   0.3639010), wk =   0.2500000
         k(    3) = (   0.1250000   0.3750000   0.1213003), wk =   0.5000000
         k(    4) = (   0.1250000   0.3750000   0.3639010), wk =   0.5000000
         k(    5) = (   0.3750000   0.3750000   0.1213003), wk =   0.2500000
         k(    6) = (   0.3750000   0.3750000   0.3639010), wk =   0.2500000
      ...
      PWSCF        : 10m57.44s CPU time,    11m14.40s WALL time
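
To compare runs without reading the logs by hand, the WALL time can be pulled from the final `PWSCF` line. The sketch below is an assumption-laden helper of my own (`wall_seconds` is not part of any pwscf tool) and only handles the `XmY.YYs` form that appears in these three outputs, not times with hours or days.

```python
import re

# Matches e.g. "7m19.59s WALL time"; only the minutes+seconds form is handled.
WALL_RE = re.compile(r"(\d+)m\s*(\d+(?:\.\d+)?)s\s+WALL")


def wall_seconds(line):
    """Return the WALL time in seconds from a 'PWSCF : ...' line, or None."""
    match = WALL_RE.search(line)
    if match is None:
        return None
    minutes, seconds = match.groups()
    return 60 * int(minutes) + float(seconds)


# The three timing lines copied from the outputs in this post.
lines = [
    "PWSCF        :  7m16.35s CPU time,     7m19.59s WALL time",
    "PWSCF        :  1m49.61s CPU time,     1m56.75s WALL time",
    "PWSCF        : 10m57.44s CPU time,    11m14.40s WALL time",
]
for line in lines:
    print(wall_seconds(line))
```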



