<div dir="ltr">Dear Prof. Paolo and Prof. Lorenzo,<div><br></div><div>Thank you for your thoughtful replies. I have carefully examined the program and also ran it using processor pools. It turns out that memory swapping was the culprit: only 32 GB of memory was available to me, but the program was using over 90 GB of virtual memory, which was the bottleneck.</div><div><br></div><div>Thankfully, I was able to reduce the program's memory usage by setting disk_io='high' and changing mixing_ndim to 4, and the program now seems to run fine.</div><div><br></div><div>I would like to express my gratitude to you for helping me in this matter.</div><div><br></div><div>Yours Sincerely,</div><div>M Harshavardhan</div><div>Fourth Year Undergraduate</div><div>Engineering Physics</div><div>IIT Madras</div></div><div class="gmail_extra"><br><div class="gmail_quote">On Thu, Jun 29, 2017 at 3:22 PM, Paolo Giannozzi <span dir="ltr"><<a href="mailto:p.giannozzi@gmail.com" target="_blank">p.giannozzi@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Not sure it is an MPI problem: "fft_scatter" is where most of the<br>

communications take place, but its wall and cpu time are not so<br>

different. I think it is a problem of "swapping": the code requires<br>

(much) more memory than is available, spending most of the time<br>

reading from disk the arrays it needs and writing to disk those it<br>

doesn't need any longer. If disk_io='high', it might also be a problem<br>

of I/O.<br>

<br>

Paolo<br>

<div class="HOEnZb"><div class="h5"><br>

On Thu, Jun 29, 2017 at 10:48 AM, Lorenzo Paulatto<br>

<<a href="mailto:lorenzo.paulatto@impmc.upmc.fr">lorenzo.paulatto@impmc.upmc.<wbr>fr</a>> wrote:<br>

> [re-sending to mailing list as I answered privately by mistake]<br>

><br>

> Hello,<br>

><br>

> On 29/06/17 09:57, Harsha Vardhan wrote:<br>

>> I have observed that the c_bands and sum_bands routines are taking up<br>

>> a huge amount of wall time, as compared to the CPU time. I am<br>

>> attaching the time report for the completed calculation below:<br>

>><br>

><br>

> the high wall times indicate a lot of MPI communication, which means<br>

> that your simulation will probably run faster with fewer CPUs. Are you<br>

> using as many pools as possible? Pool parallelism requires less<br>

> communication. Here is an example syntax:<br>

><br>

> mpirun -np 16 pw.x -npool 16 -in input<br>

><br>

> The number of pools must not exceed the number of CPUs or the<br>

> number of k-points.<br>

><br>

> Also, having npool = n_kpoints - small_number is not a good idea, as<br>

> most CPUs will have one k-point, while only small_number will have two,<br>

> slowing everyone down (it would be more efficient to use fewer CPUs, i.e.<br>

> npool=ncpus=n_kpoints/2)<br>

><br>

> If you are already at the maximum number of pools, you can try to reduce the<br>

> number of MPI processes and use OpenMP instead. Be sure that the code is<br>

> compiled with the --enable-openmp option and set the variable<br>

> OMP_NUM_THREADS to the ratio ncpus/n_mpi_processes, e.g. with 16 CPUs:<br>

><br>

> export OMP_NUM_THREADS=4<br>

><br>

> mpirun -x OMP_NUM_THREADS -np 4 pw.x -npool 4 -in input<br>

><br>

><br>

> Finally, are you sure that you need a 9x9x1 grid of k-points for an 8x8x1<br>

> supercell of graphene? This would be equivalent to using a 72x72x1 grid<br>

> in the unit cell, which is quite enormous.<br>

><br>

><br>

> hth<br>

><br>

> --<br>

> Dr. Lorenzo Paulatto<br>

> IdR @ IMPMC -- CNRS & Université Paris 6<br>

> phone: +33 (0)1 442 79822 / skype: paulatz<br>

> www:   <a href="http://www-int.impmc.upmc.fr/~paulatto/" rel="noreferrer" target="_blank">http://www-int.impmc.upmc.fr/~<wbr>paulatto/</a><br>

> mail:  23-24/423 Boîte courrier 115, 4 place Jussieu 75252 Paris Cédex 05<br>

><br>

> ______________________________<wbr>_________________<br>

> Pw_forum mailing list<br>

> <a href="mailto:Pw_forum@pwscf.org">Pw_forum@pwscf.org</a><br>

> <a href="http://pwscf.org/mailman/listinfo/pw_forum" rel="noreferrer" target="_blank">http://pwscf.org/mailman/<wbr>listinfo/pw_forum</a><br>

<br>

<br>

<br>

</div></div><span class="HOEnZb"><font color="#888888">--<br>

Paolo Giannozzi, Dip. Scienze Matematiche Informatiche e Fisiche,<br>

Univ. Udine, via delle Scienze 208, 33100 Udine, Italy<br>

Phone +39-0432-558216, fax +39-0432-558222<br>

</font></span><div class="HOEnZb"><div class="h5"><br>

</div></div></blockquote></div><br></div>