Duy Le
ttduyle at gmail.com
Wed Jun 12 16:14:11 CEST 2013
Just to be clear, I am not Paolo!
If you need more memory, you should not simply increase the number of
cores to a huge number. Instead, ask for more nodes but use fewer cores
per node, so that each process gets a larger share of the node's memory.
For instance, you can ask for 16 nodes and use 6 cores per node. Check
your environment, but it is highly likely that you need to use something like
size = 192
aprun -n 96 -N 6 pw.x ...
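To make the arithmetic behind those numbers explicit, here is a rough Python sketch. It assumes the figures from your message (12 cores and 16 GB per node) and that whole nodes are reserved. The function name aprun_layout is just something made up for illustration, and it ignores whatever memory the operating system takes.

```python
def aprun_layout(nodes, cores_per_node, ranks_per_node, mem_per_node_gb):
    """Return (reserved cores, MPI ranks, memory per rank in GB).

    Hypothetical helper: the names and the whole-node reservation model
    are assumptions, not part of aprun or Quantum ESPRESSO.
    """
    if not 1 <= ranks_per_node <= cores_per_node:
        raise ValueError("ranks_per_node must be between 1 and cores_per_node")
    size = nodes * cores_per_node        # cores reserved (whole nodes)
    ranks = nodes * ranks_per_node       # MPI processes actually started
    mem_per_rank_gb = mem_per_node_gb / ranks_per_node
    return size, ranks, mem_per_rank_gb


# 16 nodes, 12 cores and 16 GB per node, 6 MPI processes per node
size, ranks, mem = aprun_layout(16, 12, 6, 16)
print(f"size = {size}")
print(f"aprun -n {ranks} -N 6 pw.x ...")
print(f"memory per process: {mem:.2f} GB")
```

With 6 processes per node each one has about 2.67 GB to work with, instead of 1.33 GB when all 12 cores are used.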
If you would rather not waste half of each node, look into the OpenMP options.
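As a rough sketch of what a hybrid MPI/OpenMP layout could look like: this assumes pw.x was compiled with OpenMP enabled, and that aprun on your machine takes -d as the number of cores reserved per MPI process (please check the aprun man page). The helper hybrid_aprun is hypothetical, only there to show how the numbers fit together.

```python
def hybrid_aprun(nodes, cores_per_node, ranks_per_node, exe="pw.x"):
    """Build (env, argv) for a run that gives the idle cores to OpenMP threads.

    Hypothetical helper; assumes the executable honours OMP_NUM_THREADS.
    """
    if cores_per_node % ranks_per_node != 0:
        raise ValueError("cores_per_node must be a multiple of ranks_per_node")
    threads = cores_per_node // ranks_per_node   # OpenMP threads per MPI process
    ranks = nodes * ranks_per_node               # total MPI processes
    env = {"OMP_NUM_THREADS": str(threads)}
    argv = ["aprun", "-n", str(ranks), "-N", str(ranks_per_node),
            "-d", str(threads), exe]
    return env, argv


# 16 nodes with 12 cores each, 6 MPI processes per node, 2 threads per process
env, argv = hybrid_aprun(16, 12, 6)
print("export OMP_NUM_THREADS=" + env["OMP_NUM_THREADS"])
print(" ".join(argv))
```

The memory per MPI process stays the same as with 6 plain MPI processes per node; only the otherwise idle cores are put to work.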
Duy Le
Postdoctoral Associate
Department of Physics
University of Central Florida.
Website: http://www.physics.ucf.edu/~dle
On Tue, Jun 11, 2013 at 3:57 PM, vijaya subramanian
<vijaya65 at hotmail.com> wrote:
> Hi Paolo
> I am running an scf calculation on gold slabs. I have somewhat limited
> resources on a supercomputer
> and would like to optimize my runs. (Cray XT5 with 9,408 compute nodes
> interconnected with the SeaStar router through HyperTransport. The SeaStars
> are all interconnected in a 3-D torus topology. It is a massively parallel
> processing (MPP) machine. Each compute node has two six-core 2.6 GHz AMD
> Opterons for a total of 112,896 cores. All nodes have 16 Gbytes of DDR2
> memory: 1.33 Gbytes of memory per core.)
> A 54 gold atom slab scf calculation worked best with 120
> processors/npool=2/ndiag=49/ntg=6.
> With 240 processors I get very good speed; with 64 processors I get an
> out-of-memory error.
> When I use a larger unit cell I run into problems.
> I have attached two files with different configurations of gold atoms in a
> slab calculation with larger unit cells.
> The unit cells are different, one has six layers of gold atoms (unit cell -
> 16.12x48.36x60.8 in Bohr) and the other 2 layers of gold atoms (unit
> cell-54.x43.x54.).
> For some reason I cannot get the 160 atom problem to work (even >2000
> processors still doesn't work). The 6 layer 162 atom problem works with
> nproc=720. If I use fewer processors I get an out-of-memory
> problem.
> Do you have any suggestions for what the problem may be?
>
> I have given partial output for the two calcs below:
> 160 atoms-1200 processors-the run failed before the diagonalization began.
> Parallelization info
> --------------------
> sticks: dense smooth PW G-vecs: dense smooth
> Min 105 31 8 24383 3975
> Max 106 32 9 24398 4042
> Sum 75823 22755 5881 17559633 2885465 37
>
>
> bravais-lattice index = 0
> lattice parameter (alat) = 54.5658 a.u.
> unit-cell volume = 129972.7994 (a.u.)^3
> number of atoms/cell = 160
> number of atomic types = 1
> number of electrons = 1760.00
> number of Kohn-Sham states= 2112
> kinetic-energy cutoff = 30.0000 Ry
> charge density cutoff = 400.0000 Ry
> convergence threshold = 1.0E-06
> mixing beta = 0.7000
> number of iterations used = 8 plain mixing
> Exchange-correlation = SLA PW PBX PBC ( 1 4 3 4 0)
> EXX-fraction = 0.00
> ........
> Dense grid: 17559633 G-vectors FFT dimensions: ( 360, 288, 360)
>
> Smooth grid: 2885465 G-vectors FFT dimensions: ( 192, 160, 192)
>
> Largest allocated arrays est. size (Mb) dimensions
> Kohn-Sham Wavefunctions 32.87 Mb ( 1020, 2112)
> NL pseudopotentials 42.33 Mb ( 510, 5440)
> Each V/rho on FFT grid 1.58 Mb ( 103680)
> Each G-vector array 0.19 Mb ( 24385)
> G-vector shells 0.09 Mb ( 11350)
> Largest temporary arrays est. size (Mb) dimensions
> Auxiliary wavefunctions 131.48 Mb ( 1020, 8448)
> Each subspace H/S matrix 3.36 Mb ( 469, 469)
> Each <psi_i|beta_j> matrix 350.63 Mb ( 5440, 2, 2112)
> Arrays for rho mixing 12.66 Mb ( 103680, 8)
>
> Initial potential from superposition of free atoms
> Check: negative starting charge= -0.028620
>
> starting charge 1759.98221, renormalised to 1760.00000
>
> negative rho (up, down): 0.286E-01 0.000E+00
> Starting wfc are 2880 randomized atomic wfcs
> Application 5992317 exit signals: Killed
>
> 162 atom run:
> Parallelization info
> --------------------
> sticks:    dense  smooth      PW   G-vecs:    dense   smooth       PW
> Min           34      10       2               8950     1450      178
> Max           35      11       3               8981     1509      229
> Sum        24841    7453    2003            6454371  1060521   148169
>
>
> bravais-lattice index = 0
> lattice parameter (alat) = 16.1227 a.u.
> unit-cell volume = 47776.5825 (a.u.)^3
> number of atoms/cell = 162
> number of atomic types = 1
> number of electrons = 1782.00
> number of Kohn-Sham states= 2138
> kinetic-energy cutoff = 30.0000 Ry
> charge density cutoff = 400.0000 Ry
> convergence threshold = 1.0E-06
> mixing beta = 0.7000
> number of iterations used = 8 plain mixing
> Exchange-correlation = SLA PW PBX PBC ( 1 4 3 4 0)
> EXX-fraction = 0.00
> Non magnetic calculation with spin-orbit
>
>
