[Pw_forum] parallel scaling in PWSCF

Axel Kohlmeyer akohlmey at vitae.cmm.upenn.edu
Sat May 13 00:07:44 CEST 2006


On Thu, 11 May 2006, Nichols A. Romero wrote:

NR> Hi,

hi nichols,

NR> As I was digging through PWSCF, I noticed that in para.f90 the
NR> maximum number of processors is hard-coded to 128, but that there is
NR> a switch that can be used to raise the maximum to 2048.
NR> 
NR> -D__QK_USER__

this flag is for the cray xt3 (and red storm) machines.
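
for reference, the limit is just a hard-coded parameter in para.f90,
guarded by that preprocessor flag. a rough sketch of what it looks like
(the exact module and variable names may differ between versions):

  #if defined (__QK_USER__)
        INTEGER, PARAMETER :: maxproc = 2048   ! large cray xt3 / red storm runs
  #else
        INTEGER, PARAMETER :: maxproc = 128    ! default upper limit on processors
  #endif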

NR> This is going to sound like a dumb question, but is this safe? Will
NR> the diagonalization routines in a gamma-point calculation have issues
NR> if one goes up to 256 or 512 processors? If you had many k-points and
NR> spin-polarization, that wouldn't be so much of an issue, I think.

exactly. however, the xt3 is a special beast: since you have no
swap and no local disk, things are very different. the diagonalization
(at least when i cranked up the maximum node number) didn't work well
with too many nodes per k-point, but due to the machine design you
need to go across many nodes anyway, so you have the additional
memory to buffer the wavefunction file access (and get decent
performance at all).

for most machines, however, it makes no sense to go up to so many
nodes, hence the define. nevertheless, the beauty of open source
software lies in the fact that you have access to the source code
and can tune it to your own needs. feel free to crank up the
maximum allowed number of processors and tell us about your
experiences.
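
in practice that just means bumping the parameter shown above to
whatever your machine needs, e.g. something like (a sketch, check the
actual declaration in your para.f90):

        INTEGER, PARAMETER :: maxproc = 512    ! raised from the default 128

and then rebuilding pw.x (make clean; make pw).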

best regards,
    axel.
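
p.s.: regarding carlo's suggestion quoted below to fall back on the
conjugate-gradient diagonalization: in the pw.x input this corresponds
to setting diagonalization='cg' in the &electrons namelist, roughly like
this (a minimal fragment; conv_thr is just a placeholder value and the
rest of the input is up to your setup):

  &electrons
    diagonalization = 'cg'
    conv_thr = 1.0d-8
  /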


NR> 
NR> Any comments?


NR> 
NR> On 4/26/06, carlo sbraccia <sbraccia at sissa.it> wrote:
NR> > Hi,
NR> >
NR> > beware of using the CVS version of PWSCF: the parallel Davidson
NR> > diagonalization has not yet been fully tested and seems to be buggy.
NR> > For this reason we are going to disable it until we are able to
NR> > solve the problem. To avoid the serial bottleneck in the
NR> > diagonalization, you can use the conjugate-gradient algorithm instead.
NR> >
NR> > carlo
NR> >
NR> > Andrea Ferretti wrote:
NR> >
NR> > >Hi everybody,
NR> > >
NR> > >I am currently running a copper surface with 140 Cu atoms + a molecule...
NR> > >the system has 1642 electrons and (due to the metallicity) the calculation
NR> > >is performed for 985 bands (with only a few k-points, like 4)...
NR> > >because of the 11 valence electrons per Cu atom, I have a huge number of
NR> > >bands in a (relatively) small cell, and so a (relatively) low number of
NR> > >PWs with respect to nbnd.
NR> > >Looking at the size of the wavefunctions, there is no problem with memory
NR> > >in principle, even if, due to the unusual dimensions of the system, the
NR> > >non-scalable memory is quite large, around 1 GB.
NR> > >
NR> > >on an IBM SP5 machine I observed a severe limit in the scaling when going
NR> > >from 32 to 64 processors, using both espresso 2.1.x and espresso 3.0...
NR> > >(anyway, I did succeed in performing a "relax" calculation for the
NR> > >system!)
NR> > >
NR> > >as far as I know, this problem might be connected to a serial part of the
NR> > >diagonalization which has been parallelized in the current CVS version
NR> > >(as already pointed out by Axel)...
NR> > >At the moment I am testing the CVS version on my system; I will let
NR> > >you know the results as soon as possible...
NR> > >
NR> > >cheers
NR> > >andrea
NR> > >
NR> > >
NR> > >
NR> > >
NR> >
NR> > _______________________________________________
NR> > Pw_forum mailing list
NR> > Pw_forum at pwscf.org
NR> > http://www.democritos.it/mailman/listinfo/pw_forum
NR> >
NR> 
NR> 
NR> 

-- 
=======================================================================
Axel Kohlmeyer   akohlmey at cmm.chem.upenn.edu   http://www.cmm.upenn.edu
   Center for Molecular Modeling   --   University of Pennsylvania
Department of Chemistry, 231 S.34th Street, Philadelphia, PA 19104-6323
tel: 1-215-898-1582,  fax: 1-215-573-6233,  office-tel: 1-215-898-5425
=======================================================================
If you make something idiot-proof, the universe creates a better idiot.




