[Pw_forum] gipaw parallel run is too slow

Lorenzo Paulatto Lorenzo.Paulatto at impmc.upmc.fr
Fri Nov 18 10:54:56 CET 2011


On Fri, 18 Nov 2011 06:44:47 +0100, Ren PJ <renpj at dicp.ac.cn> wrote:
> mpirun -np 160 gipaw.x -npool 5 <input>output
> It had not finished after 20 hours.
> ...
> mpirun -np 8 gipaw.x <input>output
> Each k-point calculation only needs 4h30min, and the run takes 24h in total.

Dear Pengju Ren,
when you increase the number of processors you increase the available  
computing power, but also the amount of communication on the network  
connecting those nodes. As the network has a finite speed, there will be a  
point where the code loses more time exchanging information between the  
CPUs than doing any calculation.
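
To make the trade-off concrete, here is a toy model in Python. It is only a
sketch: the functional form (compute time shrinking as 1/N, communication
time growing linearly with N) and the two constants are assumptions chosen
for illustration, not measurements of gipaw.x or of your cluster.

```python
def wall_time(nprocs, work=100.0, comm_per_proc=0.05):
    """Toy wall-time estimate in arbitrary units.

    work          -- hypothetical serial compute cost
    comm_per_proc -- hypothetical communication cost added per process
    """
    compute = work / nprocs                 # shrinks as processes are added
    communication = comm_per_proc * nprocs  # grows as processes are added
    return compute + communication


if __name__ == "__main__":
    # Scan the range discussed in this thread and report the fastest setting.
    best = min(range(1, 161), key=wall_time)
    for n in (8, best, 160):
        print(f"{n:4d} processes: {wall_time(n):6.2f} (arbitrary units)")
```

With these made-up constants the minimum lies well inside the range, and both
8 and 160 processes are slower than the best intermediate value. The real
optimum for your system can only be found by timing a few short runs.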

Possible solutions:
1. use fewer CPUs (there is plenty of room between 8 and 160)
2. try with a different number of pools (you could try  
nkpoints*ncores_per_node)
3. buy some faster network hardware
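
For points 1 and 2, a small Python helper can list the combinations worth
timing. This is a sketch under two assumptions: that the number of pools must
divide the number of MPI processes evenly, and that using more pools than
k-points would leave some pools idle. The k-point count of 5 below is a
placeholder, so replace it with the value for your system.

```python
def pool_layouts(nprocs, nkpoints):
    """Return (npool, procs_per_pool) pairs for a given MPI process count.

    Assumes npool must divide nprocs evenly and that more pools than
    k-points would leave some pools without work.
    """
    layouts = []
    for npool in range(1, min(nprocs, nkpoints) + 1):
        if nprocs % npool == 0:
            layouts.append((npool, nprocs // npool))
    return layouts


if __name__ == "__main__":
    nkpoints = 5  # placeholder: use the number of k-points in your run
    for nprocs in (8, 20, 40, 80, 160):
        for npool, per_pool in pool_layouts(nprocs, nkpoints):
            print(f"mpirun -np {nprocs} gipaw.x -npool {npool}"
                  f"   # {per_pool} processes per pool")
```

Running each candidate for a short time and comparing the time per k-point
should show where adding processors stops paying off.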

best regards

-- 
Lorenzo Paulatto IdR @ IMPMC/CNRS & Université Paris 6
phone: +33 (0)1 44275 084 / skype: paulatz
www:   http://www-int.impmc.upmc.fr/~paulatto/
mail:  23-24/4é16 Boîte courrier 115, 4 place Jussieu 75252 Paris Cédex 05


