[Pw_forum] Large difference between the CPU and wall time in the c_bands and sum_bands routines

Paolo Giannozzi p.giannozzi at gmail.com
Thu Jun 29 11:52:16 CEST 2017


Not sure it is an MPI problem: "fft_scatter" is where most of the
communication takes place, but its wall and CPU times are not so
different. I think it is a problem of "swapping": the code requires
(much) more memory than is available, so it spends most of its time
reading from disk the arrays it needs and writing to disk those it
no longer needs. If disk_io='high', it might also be a problem
of I/O.
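
As a quick check (a sketch only: every value except disk_io is a
placeholder, not taken from the original input), the amount of
wavefunction I/O is controlled by the disk_io keyword in the &CONTROL
namelist of the pw.x input:

  &CONTROL
    calculation = 'scf'
    prefix      = 'graphene'   ! placeholder
    disk_io     = 'low'        ! write wavefunctions only at the end; 'none' skips even that
  /

Note that the lower settings keep more data in RAM, so they will not
help if the job is already swapping.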

Paolo

On Thu, Jun 29, 2017 at 10:48 AM, Lorenzo Paulatto
<lorenzo.paulatto at impmc.upmc.fr> wrote:
> [re-sending to mailing list as I answered privately by mistake]
>
> Hello,
>
> On 29/06/17 09:57, Harsha Vardhan wrote:
>> I have observed that the c_bands and sum_bands routines are taking up
>> a huge amount of wall time, as compared to the CPU time. I am
>> attaching the time report for the completed calculation below:
>>
>
> the high wall times indicate a lot of MPI communication, which means
> that your simulation will probably run faster with fewer CPUs. Are you
> using as many pools as possible? Pool parallelism requires less
> communication. Here is an example of the syntax:
>
> mpirun -np 16 pw.x -npool 16 -in input
>
> The number of pools cannot be larger than the number of CPUs or than
> the number of k-points.
>
> Also, having npool = n_kpoints - small_number is not a good idea: most
> pools will have one k-point while only small_number of them will have
> two, slowing everyone down (it would be more efficient to use fewer
> CPUs, i.e. npool=ncpus=n_kpoints/2).
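>
> For a concrete illustration (the k-point count here is assumed, not
> taken from your calculation): with 18 k-points,
>
> mpirun -np 16 pw.x -npool 16 -in input
>
> gives 14 pools one k-point each and 2 pools two each, so the 14
> single-k-point pools sit idle while the other 2 finish their second
> k-point, whereas
>
> mpirun -np 9 pw.x -npool 9 -in input
>
> gives every pool exactly two k-points and should take about the same
> wall time with 7 fewer CPUs.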
>
> If you are already at the maximum number of pools, you can try to
> reduce the number of MPI processes and use OpenMP instead. Make sure
> that the code is compiled with the --enable-openmp option and set the
> variable OMP_NUM_THREADS to the ratio ncpus/n_mpi_processes, e.g. with
> 16 CPUs:
>
> export OMP_NUM_THREADS=4
>
> mpirun -x OMP_NUM_THREADS -np 4 pw.x -npool 4 -in input
>
>
> Finally, are you sure that you need a 9x9x1 grid of k-points for an 8x8x1
> supercell of graphene? This would be equivalent to using a 72x72x1 grid
> in the unit cell, which is quite enormous.
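>
> (The equivalence is just arithmetic: the Brillouin zone of an 8x8x1
> supercell is 8 times smaller in each in-plane direction, so a 9x9x1
> grid on it samples reciprocal space as densely as an (8*9)x(8*9)x1 =
> 72x72x1 grid on the primitive cell; a much coarser grid on the
> supercell, e.g. 2x2x1 or 3x3x1, is usually more than enough.)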
>
>
> hth
>
> --
> Dr. Lorenzo Paulatto
> IdR @ IMPMC -- CNRS & Université Paris 6
> phone: +33 (0)1 442 79822 / skype: paulatz
> www:   http://www-int.impmc.upmc.fr/~paulatto/
> mail:  23-24/423 Boîte courrier 115, 4 place Jussieu 75252 Paris Cédex 05
>



-- 
Paolo Giannozzi, Dip. Scienze Matematiche Informatiche e Fisiche,
Univ. Udine, via delle Scienze 208, 33100 Udine, Italy
Phone +39-0432-558216, fax +39-0432-558222



