[Pw_forum] How to speed up the phonon calculation for a system of 320 atoms

Tue Jun 21 10:23:59 CEST 2016

Dear Quantum Espresso Developers and Users,

I need to run a phonon calculation for a system of 320 atoms. However, I can
use only 4 cores (mpirun -np 4) on 2 nodes (total memory: 63*2 = 126 GB);
with more processes, ph.x aborts with "insufficient virtual memory". The
following output is for one representation on 2 nodes.

> Alpha used in Ewald sum =   2.0000
> PHONON       :    14h23m CPU       14h41m WALL
>
>
> Electric Fields Calculation
>
>  iter #   1 total cpu time : 68861.7 secs   av.it.:   7.0
>  thresh= 1.000E-02 alpha_mix =  0.700 |ddv_scf|^2 =  1.313E-09
>
>  iter #   2 total cpu time : 75961.1 secs   av.it.:  18.3
>  thresh= 3.623E-06 alpha_mix =  0.700 |ddv_scf|^2 =  1.249E-09
>
>  iter #   3 total cpu time : 82678.0 secs   av.it.:  17.3
>  thresh= 3.534E-06 alpha_mix =  0.700 |ddv_scf|^2 =  4.125E-11
>
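
To put the cost in perspective, the time per iteration can be read off from
the differences between successive "total cpu time" values. Below is a
minimal Python sketch, assuming the log format shown above; the regular
expression and the function name `iteration_costs` are my own and not part
of Quantum ESPRESSO.

```python
import re

# Iteration lines copied from the ph.x output quoted above.
LOG = """\
iter #   1 total cpu time : 68861.7 secs   av.it.:   7.0
iter #   2 total cpu time : 75961.1 secs   av.it.:  18.3
iter #   3 total cpu time : 82678.0 secs   av.it.:  17.3
"""

# Assumed line format: "iter # <n> total cpu time : <seconds> secs"
ITER_RE = re.compile(r"iter #\s*(\d+)\s+total cpu time :\s*([\d.]+) secs")


def iteration_costs(log):
    """Return [(iteration, CPU seconds elapsed since the previous iteration)]."""
    times = [(int(n), float(t)) for n, t in ITER_RE.findall(log)]
    return [
        (n, round(t - t_prev, 1))
        for (_, t_prev), (n, t) in zip(times, times[1:])
    ]


if __name__ == "__main__":
    for n, dt in iteration_costs(LOG):
        print(f"iter {n}: {dt:.1f} s ({dt / 3600:.2f} h)")
```

For the three iterations shown, this gives a little under two hours of CPU
time per iteration. The first iteration cannot be timed this way because
the log excerpt does not show when it started.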

Is it possible to reduce the memory requirement so that I can use more cores
to speed up the calculation?

According to *Calculation of Phonon Dispersions on the GRID using Quantum
ESPRESSO*, R. di Meo, A. Dal Corso, P. Giannozzi, and S. Cozzini, in
Chemistry and Material Science Applications on Grid Infrastructures,
editors: S. Cozzini, A. Laganà, ICTP Lecture Notes Series, Vol. 24,
pp. 165-183 (2009), we can use the shared memory approach.

> We finally note that, being the multicore/SMP architecture widespread as
> computational resources in Grid infrastructure it would be desirable to be
> able to run each process on SMP node, enabling thus parallel computation of
> ph.x using the shared memory approach (the same way we performed the
> client/server experiment on HPC platform).
>

However, when I search the source code with `grep -r MPI_Comm_alloc_mem ./`,
nothing is found. I also notice that launching more processes increases the
memory requirement, because ph.x then throws the "insufficient virtual
memory" error. Does this mean the shared memory approach hasn't been
implemented yet? I know the GRID concept from the above paper has been
implemented, because I've successfully used `ph -nimage` to take advantage
of thousands of cores to speed up the phonon calculation for a smaller
system, thanks to the guidance of Dr. Ye Luo from Argonne National
Laboratory.

-- 
*Best regards,*

*Coiby*

*School of Earth and Space Sciences, USTC*