[Pw_forum] Parallel bandstructure calculations

hqzhou hqzhou at nju.edu.cn
Tue Aug 24 04:02:37 CEST 2010


You have 32 CPU cores divided into 4 pools, so each pool has only
8 CPU cores. 8 is not the square of any integer, which is why the
serial algorithm was used.

You need to run

mpirun -np 32 pw.x -npool 2 -ndiag 4
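
As a quick way to check this arithmetic before submitting a job, here is
a minimal Python sketch. It only reproduces the reasoning in this message
(cores per pool, and whether that number is a perfect square). The
function names are made up for illustration, and the rules pw.x really
applies may differ between versions.

```python
import math

def procs_per_pool(nproc, npool):
    """Cores available to each k-point pool (nproc must divide evenly)."""
    if nproc % npool != 0:
        raise ValueError("nproc must be a multiple of npool")
    return nproc // npool

def is_perfect_square(n):
    """True if n == m*m for some integer m."""
    root = math.isqrt(n)
    return root * root == n

def square_sizes_up_to(n):
    """All perfect squares <= n, i.e. group sizes that form a square grid."""
    return [m * m for m in range(1, math.isqrt(n) + 1)]

# Compare the two pool layouts discussed here for 32 cores in total.
for npool in (4, 2):
    per_pool = procs_per_pool(32, npool)
    print(npool, per_pool, is_perfect_square(per_pool), square_sizes_up_to(per_pool))
```

With 4 pools each pool gets 8 cores, which is not a square; with 2 pools
each pool gets 16 cores, and the value 4 passed to -ndiag above is one of
the square sizes that fit.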

However, you should run tests to determine the best combination.
In my case, a 56-atom system with spin polarization, the best
setup I found (parallel efficiency 79%) was 16 CPU cores (Xeon 5550)
per pool with the serial algorithm, 64 CPU cores in total, that is,

mpirun -np 64 pw.x -npool 4
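
To compare such tests, one common definition of parallel efficiency is
the speedup relative to a reference run divided by the increase in core
count. The sketch below assumes that definition (this message does not
say how the 79% was computed), and every timing in it is a placeholder,
not a measurement.

```python
def parallel_efficiency(t_ref, n_ref, t_run, n_run):
    """Speedup relative to the reference run, divided by the core-count ratio."""
    speedup = t_ref / t_run
    return speedup / (n_run / n_ref)

# Placeholder wall times in seconds (NOT measurements): label -> (cores, time)
runs = {
    "-np 32 pw.x -npool 2": (32, 300.0),
    "-np 64 pw.x -npool 4": (64, 160.0),
    "-np 64 pw.x -npool 8": (64, 200.0),
}
t_ref, n_ref = 1000.0, 8  # hypothetical reference run on 8 cores

efficiencies = {
    label: parallel_efficiency(t_ref, n_ref, t, n)
    for label, (n, t) in runs.items()
}

# List the tested combinations from most to least efficient.
for label, eff in sorted(efficiencies.items(), key=lambda kv: kv[1], reverse=True):
    print(f"mpirun {label}: {eff:.0%}")
```

Efficiency is only one criterion; the total wall time of each run should
be looked at as well when choosing a combination.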

huiqun zhou
@earth sciences, nanjing university, china

----- Original Message -----
From: "Nicki Frank Hinsche" <nicvok at freenet.de>
To: "pw forum" <pw_forum at pwscf.org>
Sent: Monday, August 23, 2010 10:32:13 PM
Subject: [Pw_forum] Parallel bandstructure calculations

Hi there,

I am currently calculating iso-energy surfaces of doped
semiconductors. To do so, I generate a fairly large k-point mesh with
an external program, determine the eigenvalues on it, and later
construct the iso-energy surface with a tetrahedron method.
My problem is the running time of the band structure calculation.

The unit cells (supercells) contain on the order of 30-50 atoms of
1 or 2 different atomic species. The eigenvalues have to be calculated
for on the order of 4000-6000 k-points (most often around 50-100
eigenvalues for each k-point).

The scf calculation finishes quite fast. I then run the nscf
band structure calculation with the command

mpirun -np 32 pw.x -npool 4 -diag 16

but the calculation is not run in parallel, as the output says:

      Parallel version (MPI), running on    32 processors
      K-points division:     npool     =    4
      R & G space division:  proc/pool =    8

      Subspace diagonalization in iterative solution of the eigenvalue  
      a serial algorithm will be used

Because of this, the calculation runs much longer than 72 hours, which
is too long for me and for our cluster system.

So is there a way to parallelize the band structure calculation
efficiently and to reduce the calculation time?

thanks in advance,


Nicki Frank Hinsche, Dipl. Phys.
Institute of physics - Theoretical physics,
Martin-Luther-University Halle-Wittenberg,
Von-Seckendorff-Platz 1, Room 1.07
D-06120 Halle/Saale, Germany
Tel.: ++49 345 5525462
Fellow of the International Max Planck
Research School-MPI for Microstructure Physics
