[Pw_forum] Parallelization
vijaya subramanian
vijaya65 at hotmail.com
Wed Jun 12 16:24:32 CEST 2013
Hi
Thanks for your response. I am willing to accept help from anyone. :)
Paolo had helped me earlier regarding parallelization.
I'll try what you said.
Vijaya
UNM
> From: ttduyle at gmail.com
> Date: Wed, 12 Jun 2013 10:14:11 -0400
> To: pw_forum at pwscf.org
> Subject: Re: [Pw_forum] Parallelization
>
> Just want to be clear, I am not Paolo !!!
>
> If you need more memory, you should not simply increase the number of
> cores to a huge number. Instead, you can ask for more nodes but use
> fewer cores per node.
>
> For instance, you can ask for 16 nodes and use 6 cores per node. Check
> your environment, but it is highly likely that you need to use something like:
> size = 192
> aprun -n 96 -N 6 pw.x ...
>
> If you don't want to waste half of each node, check the OpenMP options.
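> 
> As a minimal sketch of such a job script (assuming a PBS-managed Cray XT5
> where cores are requested via "size" and jobs are launched with aprun; the
> walltime, input/output file names and the OpenMP setting are placeholders):
> 
>   #!/bin/bash
>   #PBS -l size=192,walltime=02:00:00   # reserve 16 nodes x 12 cores = 192 cores
>   cd $PBS_O_WORKDIR
>   export OMP_NUM_THREADS=1             # pure MPI; raise only if pw.x was built with OpenMP
>   # run only 6 MPI ranks on each 12-core node, so each rank sees ~2.7 GB
>   # of the node's 16 GB instead of ~1.3 GB
>   aprun -n 96 -N 6 pw.x < scf.in > scf.out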
>
>
> ----------------------------------------------------
> Duy Le
> Postdoctoral Associate
> Department of Physics
> University of Central Florida.
> Website: http://www.physics.ucf.edu/~dle
>
>
> On Tue, Jun 11, 2013 at 3:57 PM, vijaya subramanian
> <vijaya65 at hotmail.com> wrote:
> > Hi Paolo
> > I am running an scf calculation on gold slabs. I have somewhat limited
> > resources on a supercomputer
> > and would like to optimize my runs. (Cray XT5 with 9,408 compute nodes
> > interconnected with the SeaStar router through HyperTransport. The SeaStars
> > are all interconnected in a 3-D torus topology. It is a massively parallel
> > processing (MPP) machine. Each compute node has two six-core 2.6 GHz AMD
> > Opterons for a total of 112,896 cores. All nodes have 16 Gbytes of DDR2
> > memory: 1.33 Gbytes of memory per core.)
> > A 54-gold-atom slab scf calculation worked best with 120 processors,
> > npool=2, ndiag=49, ntg=6.
> > With 240 processors I get very good speed; with 64 processors I get an
> > out-of-memory issue.
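> > (For reference, those parallelization levels go on the pw.x command line; a
> > rough sketch of the 120-processor run, with the input and output file names
> > only as placeholders:
> > aprun -n 120 pw.x -npool 2 -ndiag 49 -ntg 6 -inp scf.in > scf.out )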
> > When I use a larger unit cell I run into problems.
> > I have attached two files with different configurations of gold atoms in
> > slab calculations with larger unit cells.
> > The unit cells are different: one has six layers of gold atoms (unit cell
> > 16.12 x 48.36 x 60.8 Bohr) and the other has two layers of gold atoms (unit
> > cell 54. x 43. x 54. Bohr).
> > For some reason I cannot get the 160-atom problem to work (>2000 processors
> > still doesn't work). The six-layer, 162-atom problem does run with
> > nproc=720; if I use a smaller number of processors I get an out-of-memory
> > problem.
> > Do you have any suggestions as to what the problem may be?
> >
> > I have given partial output for the two calculations below:
> > 160 atoms, 1200 processors: the run failed before the diagonalization began.
> > Parallelization info
> > --------------------
> >      sticks:   dense  smooth     PW     G-vecs:    dense   smooth
> >      Min         105      31      8               24383     3975
> >      Max         106      32      9               24398     4042
> >      Sum       75823   22755   5881            17559633  2885465       37
> >
> >
> > bravais-lattice index = 0
> > lattice parameter (alat) = 54.5658 a.u.
> > unit-cell volume = 129972.7994 (a.u.)^3
> > number of atoms/cell = 160
> > number of atomic types = 1
> > number of electrons = 1760.00
> > number of Kohn-Sham states= 2112
> > kinetic-energy cutoff = 30.0000 Ry
> > charge density cutoff = 400.0000 Ry
> > convergence threshold = 1.0E-06
> > mixing beta = 0.7000
> > number of iterations used = 8 plain mixing
> > Exchange-correlation = SLA PW PBX PBC ( 1 4 3 4 0)
> > EXX-fraction = 0.00
> > ........
> > Dense grid: 17559633 G-vectors FFT dimensions: ( 360, 288, 360)
> >
> > Smooth grid: 2885465 G-vectors FFT dimensions: ( 192, 160, 192)
> >
> > Largest allocated arrays est. size (Mb) dimensions
> > Kohn-Sham Wavefunctions 32.87 Mb ( 1020, 2112)
> > NL pseudopotentials 42.33 Mb ( 510, 5440)
> > Each V/rho on FFT grid 1.58 Mb ( 103680)
> > Each G-vector array 0.19 Mb ( 24385)
> > G-vector shells 0.09 Mb ( 11350)
> > Largest temporary arrays est. size (Mb) dimensions
> > Auxiliary wavefunctions 131.48 Mb ( 1020, 8448)
> > Each subspace H/S matrix 3.36 Mb ( 469, 469)
> > Each <psi_i|beta_j> matrix 350.63 Mb ( 5440, 2, 2112)
> > Arrays for rho mixing 12.66 Mb ( 103680, 8)
> >
> > Initial potential from superposition of free atoms
> > Check: negative starting charge= -0.028620
> >
> > starting charge 1759.98221, renormalised to 1760.00000
> >
> > negative rho (up, down): 0.286E-01 0.000E+00
> > Starting wfc are 2880 randomized atomic wfcs
> > Application 5992317 exit signals: Killed
> >
> > 162 atom run:
> > Parallelization info
> > --------------------
> >      sticks:   dense  smooth     PW     G-vecs:    dense   smooth      PW
> >      Min          34      10      2                8950     1450     178
> >      Max          35      11      3                8981     1509     229
> >      Sum       24841    7453   2003             6454371  1060521  148169
> >
> >
> > bravais-lattice index = 0
> > lattice parameter (alat) = 16.1227 a.u.
> > unit-cell volume = 47776.5825 (a.u.)^3
> > number of atoms/cell = 162
> > number of atomic types = 1
> > number of electrons = 1782.00
> > number of Kohn-Sham states= 2138
> > kinetic-energy cutoff = 30.0000 Ry
> > charge density cutoff = 400.0000 Ry
> > convergence threshold = 1.0E-06
> > mixing beta = 0.7000
> > number of iterations used = 8 plain mixing
> > Exchange-correlation = SLA PW PBX PBC ( 1 4 3 4 0)
> > EXX-fraction = 0.00
> > Non magnetic calculation with spin-orbit
> >
> >
> > _______________________________________________
> > Pw_forum mailing list
> > Pw_forum at pwscf.org
> > http://pwscf.org/mailman/listinfo/pw_forum
> _______________________________________________
> Pw_forum mailing list
> Pw_forum at pwscf.org
> http://pwscf.org/mailman/listinfo/pw_forum