[Pw_forum] gipaw parallel run is too slow
Lorenzo Paulatto
Lorenzo.Paulatto at impmc.upmc.fr
Fri Nov 18 10:54:56 CET 2011
On Fri, 18 Nov 2011 06:44:47 +0100, Ren PJ <renpj at dicp.ac.cn> wrote:
> mpirun -np 160 gipaw.x -npool 5 <input>output
> It can't finished after 20 hours.
> ...
> mpirun -np 8 gipaw.x <input>output
> Each kpoint calculation only need 4h30min and takes 24h in total.
Dear Pengju Ren,
when you increase the number of processors you increase the available
computing power, but also the amount of communication over the network
connecting those nodes. As the network has a finite speed, there will be a
point where the code loses more time exchanging information between the
CPUs than doing any calculation.
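To make the trade-off concrete, here is a toy model in Python. It is only a sketch under assumed numbers: the cost function, the `work` and `comm` constants, and the list of candidate processor counts are all hypothetical and were not measured on gipaw.x. Compute time is taken to shrink as work/p, while communication time is taken to grow linearly with p.

```python
def runtime(p, work=1000.0, comm=0.5):
    """Toy cost model (assumption): compute time work/p plus
    communication time comm*p, in arbitrary units."""
    return work / p + comm * p


def best_proc_count(candidates, work=1000.0, comm=0.5):
    """Return the candidate processor count with the lowest modelled runtime."""
    return min(candidates, key=lambda p: runtime(p, work, comm))


# Hypothetical processor counts between the two runs quoted above.
candidates = [8, 16, 32, 40, 80, 160]

for p in candidates:
    print(f"{p:4d} procs -> modelled runtime {runtime(p):7.2f}")

print("best in this toy model:", best_proc_count(candidates))
```

With these made-up constants the modelled runtime falls at first and then rises again as communication dominates, so the largest processor count is not the fastest. On a real machine the only reliable way to find the turning point is to time short runs at a few processor counts.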
Possible solutions:
1. use fewer CPUs (there is plenty of room between 8 and 160)
2. try a different number of pools (you could try
nkpoints*ncores_per_node)
3. buy faster network hardware
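For points 1 and 2, a small helper can list the pool layouts worth testing. This is a sketch resting on two assumptions that should be checked against your version of the code: that the total number of processors must be divisible by the value given to -npool, and that using more pools than k-points is not useful. The function name `pool_layouts` and the k-point count of 5 in the example are hypothetical.

```python
def pool_layouts(nproc, nkpoints):
    """List (npool, procs_per_pool) pairs for a given processor count.

    Assumptions: npool must divide nproc evenly, and npool should not
    exceed the number of k-points.
    """
    upper = min(nproc, nkpoints)
    return [(n, nproc // n) for n in range(1, upper + 1) if nproc % n == 0]


# Example: 40 processors and a hypothetical 5 k-points.
for npool, per_pool in pool_layouts(40, 5):
    print(f"mpirun -np 40 gipaw.x -npool {npool}   # {per_pool} procs per pool")
```

Each printed line is a candidate command to time on a short test; pools need little communication between them, so layouts with more pools and fewer processors per pool are usually the first ones to try.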
Best regards,
--
Lorenzo Paulatto IdR @ IMPMC/CNRS & Université Paris 6
phone: +33 (0)1 44275 084 / skype: paulatz
www: http://www-int.impmc.upmc.fr/~paulatto/
mail: 23-24/4é16 Boîte courrier 115, 4 place Jussieu 75252 Paris Cédex 05