[Pw_forum] How to parallelize phonons

Wed Mar 28 21:47:27 CEST 2012

Dear Alejandro,

The images in phonon calculations are groupings of the irreducible representations of the phonon modes.  Hence, the maximum number of images is the number of representations.  

There is a speedup as the number of images increases, but it is irregular because different representations take different amounts of time to calculate.  In my limited experience, the time taken by a single representation can vary by as much as a factor of two, possibly more.
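To make the irregular speedup concrete, here is a rough sketch with made-up timings (the numbers are hypothetical, not measurements).  It assigns six representations of unequal cost to the images round-robin and takes the wall time as the load of the slowest image.  The round-robin assignment is only an assumption for illustration; ph.x may distribute the representations differently.

```shell
#!/bin/sh
# Hypothetical per-representation times (arbitrary units), NOT measured data.
times="10 14 20 11 18 12"

# Estimate the wall time for a given number of images, assuming the
# representations are dealt out round-robin (an assumption, see above).
# The wall time is the total load of the most heavily loaded image.
wall_time() {
    echo "$times" | awk -v n="$1" '{
        for (i = 1; i <= NF; i++) load[(i - 1) % n] += $i
        max = 0
        for (j = 0; j < n; j++) if (load[j] > max) max = load[j]
        print max
    }'
}

for n in 1 2 3 6; do
    echo "images=$n wall_time=$(wall_time "$n")"
done
```

With timings like these, doubling the number of images does not halve the wall time, because the image that happens to hold the most expensive representations sets the pace.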

The second step (the ph.x run over images) does generate all of the right dynamical matrix entries, but I have not yet been able to get the recover step to complete successfully.  I am still working on this (if anyone else on the list has any comments, please speak up).

Running in parallel with one image but parallelizing over k-points and the FFT grid works properly.  I don't have any experience with the GRID method, so perhaps someone else can say something about that.

I'm not aware of any more documentation beyond what you quoted.

A simple example I run my tests on is two-atom Si with a 2x1x1 q-mesh (two q-points, six representations), parallelizing over two images.
With only one CPU per k-point pool, the commands look like:

mpiexec -n 1 pw.x -npool 1 -inp si.scf.cg.in >& si.scf.cg.out
mpiexec -n 2 ph.x -nimage 2 -npool 1 -inp si.ph.in >& si.ph.out
mpiexec -n 1 ph.x -nimage 1 -npool 1 -inp si.ph_recover.in >& si.ph_recover.out

where "diff si.ph.in si.ph_recover.in" returns "recover = .true."
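If you would rather generate the recover input than edit it by hand, something like the following sketch should work.  The helper name make_recover_input is my own invention, and it assumes that the &inputph namelist header sits on its own line and that recover is not already set in the original input.

```shell
#!/bin/sh
# Sketch: copy a ph.x input file and add "recover = .true." right after
# the &inputph namelist header.  make_recover_input is a hypothetical
# helper, not part of Quantum ESPRESSO.
# Assumes: the header is on its own line; recover is not already present.
make_recover_input() {
    awk '
        { print }
        tolower($1) == "&inputph" && !added {
            print "    recover = .true."
            added = 1
        }
    ' "$1" > "$2"
}

# Usage (file names as in the commands above):
#   make_recover_input si.ph.in si.ph_recover.in
#   diff si.ph.in si.ph_recover.in
```

The diff should then show only the added recover line, though the whitespace may differ from a hand-edited file.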

--William

On Mar 28, 2012, at 1:34 PM, Alejandro Rébola wrote:

> Dear all,
> 
> I'm trying to run phonon calculations using ph.x on a cluster (NERSC), and
> since I have a big number of atoms I wanted to parallelize it as much and
> efficiently as possible. I've been reading the documentation, and I've
> seen the GRID example, I was going to use this but then I saw the
> following at the end of the INPUT_PH.html file:
> 
> On parallel machines the q point and the irreps calculations can be split
> automatically. The procedure is the following:
> 
> " [...]
> 
> 1) run pw.x with nproc processors and npools pools.
> 2) run ph.x with nproc*nimage processors, npools pools and nimage images.
> 3) run ph.x with the same input and recover=.true. on nproc processors
>   and npools pools and only one image.
> 
> During the first ph.x run the phonon code split the total amount of
> work into nimage copies. Each image runs with different q and/or
> representations.  The second run of ph.x is the final run that
> collects all the data calculated by the images and writes the files
> with the dynamical matrices."
> 
> Since I'm not very familiar with the terminology used here (I'm completely
> lost) I would like to ask some questions:
> 1) If I'm running on a cluster like NERSC, would I get any advantage from
> using the GRID method or should I just use the method outlined above?
> 2) In that case, what does it mean by images for a phonon calculation (I'm
> just familiar with images for NEB). Where could I find more documentation
> about this or some example?
> Thank you in advance,
> 
> Alejandro Rébola
> 
> 

*********************************************************
  William D. Parker                 phone: (630) 252-4834
  Computational Postdoctoral Fellow   fax: (630) 252-4798
  MSD-212, Rm. C-215
  Argonne National Laboratory
  9700 S. Cass Ave.
  Argonne, IL 60439
*********************************************************