[Pw_forum] Fwd: nstep, npool and FFT grid

Nicola Marzari marzari at MIT.EDU
Sat Aug 25 23:35:35 CEST 2007



Dear Bhagawan,

you raise several relevant questions - hopefully someone can help as 
well. Testing your specific system, on your cluster with your
communication devices, is in any case the most important strategy.

1) the "***" come from a format declaration (i3, likely). We should 
switch to i4, but you should be able to fish in pw where that line is
printed, and change the format output.
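
Just to illustrate, a sketch of what to look for (the routine is most
likely PW/summary.f90; the exact statement and variable names may
differ):

   ! hypothetical excerpt from the summary routine that prints the grids;
   ! an i3 field overflows and prints "***" for dimensions >= 1000
   WRITE( stdout, '(5X,"FFT grid: (",i3,",",i3,",",i3,")")' ) nr1, nr2, nr3
   ! widening the fields to i4 handles dimensions up to 9999
   WRITE( stdout, '(5X,"FFT grid: (",i4,",",i4,",",i4,")")' ) nr1, nr2, nr3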

I'm not sure about the fine details of FFT parallelization.
Stefano de Gironcoli would probably be the person, but I think he is
currently travelling.

2) this is easily answered by looking at INPUT_PW.
Electronic minimizations (at fixed ions) are controlled by

electron_maxstep
                INTEGER ( default = 100 )
                maximum number of iterations in a scf step

conv_thr       REAL  ( default = 1.D-6 )
                Convergence threshold for selfconsistency:
                estimated energy error < conv_thr
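
In the input file these go in the &ELECTRONS namelist; a minimal
example (the values below are only illustrative, not recommendations):

   &electrons
      electron_maxstep = 200       ! up to 200 scf iterations per ionic step
      conv_thr         = 1.0d-8    ! tighter than the 1.D-6 default
   /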

For ionic relaxations, at each ionic step the electrons are relaxed as 
above, and the ionic relaxations are controlled by

etot_conv_thr  REAL ( default = 1.0D-4 )
                convergence threshold on total energy (a.u) for ionic
                minimization: the convergence criterion is satisfied
                when the total energy changes less than etot_conv_thr
                between two consecutive scf steps.
                See also forc_conv_thr - both criteria must be satisfied

forc_conv_thr  REAL ( default = 1.0D-3 )
                convergence threshold on forces (a.u) for ionic
                minimization: the convergence criterion is satisfied
                when all components of all forces are smaller than
                forc_conv_thr.
                See also etot_conv_thr - both criteria must be satisfied

The code will do a maximum of "nstep" ionic steps.
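
Putting it together, a relaxation input would contain something along
these lines (a sketch with illustrative values only):

   &control
      calculation   = 'relax'
      nstep         = 250          ! at most 250 ionic steps
      etot_conv_thr = 1.0d-4
      forc_conv_thr = 1.0d-3
   /
   &ions
      ion_dynamics = 'bfgs'
   /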

3) npools partitioning doesn't decrease the amount of memory each
processor needs, so it is useful for small systems with many k-points.
It also holds up well when communications are slow. If you have 12
k-points, run on 1, 2, 3, 4, 6, or 12 processors. G-parallelization is
a tad faster if your communications are excellent, but for larger and
larger systems it's difficult to keep perfect scaling. On the other
hand, a G-parallel job is partitioned into smaller chunks, so this
parallelization is necessary if you study systems that do not fit on 1
processor alone.
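
For instance, with 12 k-points on 12 processors one could compare the
two regimes (the mpirun syntax and file names are just an example for a
typical setup):

   # pure k-point parallelization: 12 pools of 1 processor each
   mpirun -np 12 pw.x -npool 12 < relax.in > relax.out

   # mixed: 4 pools of 3 processors, G-vectors distributed within each pool
   mpirun -np 12 pw.x -npool 4 < relax.in > relax.out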

Not sure about the other questions on FFT performance etc.

			nicola


brsahu at physics.utexas.edu wrote:
> Dear PWSCF users,
> 
> I submitted a query a few days ago. If you have any suggestions,
> please let me know.
> 
> Bhagawan
> 
> ----- Forwarded message from brsahu at physics.utexas.edu -----
>      Date: Wed, 22 Aug 2007 15:55:09 -0500
>      From: brsahu at physics.utexas.edu
> Reply-To: brsahu at physics.utexas.edu
>   Subject: nstep, npool and FFT grid
>        To: pw_forum at pwscf.org
> 
> Dear pwscf users,
> 
> I have two questions regarding
> 
> 1) FFT grid
> 
> In a system where the FFT grids along the x-, y- and z-directions are
> not equal, which one of them decides the PW parallelization efficiency?
> 
> I have a system with about 200 atoms. In the output, I get
> 
> G cutoff =  164.2214  (2183159 G-vectors)     FFT grid: ( 27,***,144)
> G cutoff =   43.7924  ( 301345 G-vectors)  smooth grid: ( 15,625, 72)
> 
> '***' is printed in place of the y-direction FFT grid dimension.
> 
> The latest (CVS) version of the pwscf code is compiled on a Linux
> cluster.
> 
> Can something be changed (an array declaration for the FFT?) so that
> the code prints the full FFT grid for large systems, and I can tune
> the number of processors for G-vector and/or k-point parallelization?
> 
> 2) nstep and npool
> 
> nstep (according to the INPUT_PW file in Doc) is the number of
> ionic+electronic steps: by default it is 1 for scf, nscf, and bands,
> 0 for neb etc., and 50 for "relax", vc-relax, etc.
> 
> I have a "relax" run. If I put nstep=250, does it mean it will do 250
> ionic steps? Is there a default for the number of electronic steps? I
> could not find a separate tag for electronic steps.
> 
> Also, how does one estimate whether k-point parallelization or
> G-vector (FFT) grid parallelization will be faster for a given system?
> The section on parallelization issues in the users' guide suggests
> choosing the number of processors so that the third dimension of the
> FFT grid is divisible by it, and choosing the number of pools so that
> the number of k-points is divisible by it. Is there a default number
> of pools that a running job assumes if it is not specified in the
> "job" submission script, or does one have to specify it explicitly?
> 
> Does the number of pools divide the number of processors specified in
> the "job" submission, so that (# of proc.)/(# of pools) is the number
> of processors that should be a divisor of the third FFT grid
> dimension?

-- 
---------------------------------------------------------------------
Prof Nicola Marzari   Department of Materials Science and Engineering
13-5066   MIT   77 Massachusetts Avenue   Cambridge MA 02139-4307 USA
tel 617.4522758 fax 2586534 marzari at mit.edu http://quasiamore.mit.edu


