[Pw_forum] Parallelization

vijaya subramanian vijaya65 at hotmail.com
Wed Jun 12 16:24:32 CEST 2013


Hi
Thanks for your response. I am willing to accept help from anyone :)
Paolo had helped me earlier regarding parallelization.
I'll try what you said.
Vijaya
UNM
> From: ttduyle at gmail.com
> Date: Wed, 12 Jun 2013 10:14:11 -0400
> To: pw_forum at pwscf.org
> Subject: Re: [Pw_forum] Parallelization
> 
> Just to be clear, I am not Paolo!!!
> 
> If you need more memory, you should not increase the number of cores to a
> huge number. Instead, you can ask for more nodes but use a smaller number
> of cores per node: each MPI task then gets a bigger share of the node's
> 16 Gbytes (6 tasks per node means ~2.7 Gbytes per task instead of 1.33).
> 
> For instance, you can ask for 16 nodes and use 6 cores per node. Check
> your environment, but it is highly likely that you need to use something like
> size = 192
> aprun -n 96 -N 6 pw.x ...
> 
> If you hate to waste half of each node, check the OpenMP options
> (sketched below).
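> 
> As a rough sketch (untested; the walltime and file names are
> illustrative, and the -d / OMP_NUM_THREADS part assumes your pw.x was
> built with OpenMP support), a PBS script for that layout could look like:
> 
>   #PBS -l size=192            # 16 nodes x 12 cores; the XT5 allocates whole nodes
>   #PBS -l walltime=02:00:00   # illustrative
>   cd $PBS_O_WORKDIR
>   export OMP_NUM_THREADS=2    # let 2 threads per task use the otherwise idle cores
>   aprun -n 96 -N 6 -d 2 pw.x -npool 2 < au_slab.in > au_slab.out
> 
> Here -n is the total number of MPI tasks, -N the number of tasks per
> node, and -d the number of cores reserved for each task.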
> 
> 
> ----------------------------------------------------
> Duy Le
> Postdoctoral Associate
> Department of Physics
> University of Central Florida.
> Website: http://www.physics.ucf.edu/~dle
> 
> 
> On Tue, Jun 11, 2013 at 3:57 PM, vijaya subramanian
> <vijaya65 at hotmail.com> wrote:
> > Hi Paolo
> > I am running an scf calculation on gold slabs. I have somewhat limited
> > resources on a supercomputer
> > and would like to optimize my runs.  (Cray XT5 with 9,408 compute nodes
> > interconnected with the SeaStar router through HyperTransport. The SeaStars
> > are all interconnected in a 3-D torus topology. It is a massively parallel
> > processing (MPP) machine. Each compute node has two six-core 2.6 GHz AMD
> > Opterons for a total of 112,896 cores. All nodes have 16 Gbytes of DDR2
> > memory: 1.33 Gbytes of memory per core.)
> > A 54-gold-atom slab scf calculation worked best with 120
> > processors/npool=2/ndiag=49/ntg=6 (a command line of this form is shown
> > below). With 240 processors I still get very good speed; with 64
> > processors I get an out-of-memory error.
> > When I use a larger unit cell I run into problems.
> > I have attached two files with different configurations of gold atoms in a
> > slab calculation with larger unit cells.
> > The unit cells are different: one has six layers of gold atoms (unit
> > cell 16.12 x 48.36 x 60.8 Bohr) and the other two layers of gold atoms
> > (unit cell 54 x 43 x 54 Bohr).
> > For some reason I cannot get the 160-atom problem to work (more than
> > 2000 processors still doesn't work). The 6-layer, 162-atom problem works
> > with nproc=720; if I use a smaller number of processors I get an
> > out-of-memory problem.
> > Do you have any suggestions for what the problem may be?
> >
> > I have given partial output for the two calcs below:
> > 160 atoms, 1200 processors: the run failed before the diagonalization began.
> >      Parallelization info
> >      --------------------
> >      sticks:   dense  smooth     PW     G-vecs:    dense   smooth
> >      Min         105      31      8                24383     3975
> >      Max         106      32      9                24398     4042
> >      Sum       75823   22755   5881             17559633  2885465  37
> >
> >
> >      bravais-lattice index     =            0
> >      lattice parameter (alat)  =      54.5658  a.u.
> >      unit-cell volume          =  129972.7994 (a.u.)^3
> >      number of atoms/cell      =          160
> >      number of atomic types    =            1
> >      number of electrons       =      1760.00
> >      number of Kohn-Sham states=         2112
> >      kinetic-energy cutoff     =      30.0000  Ry
> >      charge density cutoff     =     400.0000  Ry
> >      convergence threshold     =      1.0E-06
> >      mixing beta               =       0.7000
> >      number of iterations used =            8  plain     mixing
> >      Exchange-correlation      =  SLA  PW   PBX  PBC ( 1 4 3 4 0)
> >      EXX-fraction              =        0.00
> > ........
> >      Dense  grid: 17559633 G-vectors     FFT dimensions: ( 360, 288, 360)
> >
> >      Smooth grid:  2885465 G-vectors     FFT dimensions: ( 192, 160, 192)
> >
> >      Largest allocated arrays     est. size (Mb)     dimensions
> >         Kohn-Sham Wavefunctions        32.87 Mb     (   1020, 2112)
> >         NL pseudopotentials            42.33 Mb     (    510, 5440)
> >         Each V/rho on FFT grid          1.58 Mb     ( 103680)
> >         Each G-vector array             0.19 Mb     (  24385)
> >         G-vector shells                 0.09 Mb     (  11350)
> >      Largest temporary arrays     est. size (Mb)     dimensions
> >         Auxiliary wavefunctions       131.48 Mb     (   1020, 8448)
> >         Each subspace H/S matrix        3.36 Mb     ( 469, 469)
> >         Each <psi_i|beta_j> matrix    350.63 Mb     (   5440,   2, 2112)
> >         Arrays for rho mixing          12.66 Mb     ( 103680,   8)
> >
> >      Initial potential from superposition of free atoms
> >      Check: negative starting charge=   -0.028620
> >
> >      starting charge 1759.98221, renormalised to 1760.00000
> >
> >      negative rho (up, down):  0.286E-01 0.000E+00
> >      Starting wfc are 2880 randomized atomic wfcs
> > Application 5992317 exit signals: Killed
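> > (Presumably an out-of-memory kill: the array estimates above are per
> > MPI task, e.g. the <psi_i|beta_j> matrix is 5440 x 2 x 2112
> > double-precision complex elements, 5440*2*2112*16 bytes = 350.63 Mb,
> > on top of the 131.48 Mb of auxiliary wavefunctions.)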
> >
> > 162 atom run:
> >      Parallelization info
> >      --------------------
> >      sticks:   dense  smooth     PW     G-vecs:    dense   smooth      PW
> >      Min          34      10      2                 8950     1450     178
> >      Max          35      11      3                 8981     1509     229
> >      Sum       24841    7453   2003              6454371  1060521  148169
> >
> >
> >      bravais-lattice index     =            0
> >      lattice parameter (alat)  =      16.1227  a.u.
> >      unit-cell volume          =   47776.5825 (a.u.)^3
> >      number of atoms/cell      =          162
> >      number of atomic types    =            1
> >      number of electrons       =      1782.00
> >      number of Kohn-Sham states=         2138
> >      kinetic-energy cutoff     =      30.0000  Ry
> >      charge density cutoff     =     400.0000  Ry
> >      convergence threshold     =      1.0E-06
> >      mixing beta               =       0.7000
> >      number of iterations used =            8  plain     mixing
> >      Exchange-correlation      =  SLA  PW   PBX  PBC ( 1 4 3 4 0)
> >      EXX-fraction              =        0.00
> >      Non magnetic calculation with spin-orbit
> >
> >
> _______________________________________________
> Pw_forum mailing list
> Pw_forum at pwscf.org
> http://pwscf.org/mailman/listinfo/pw_forum