[Pw_forum] Crash on wf_collect in multi-pool spin-polarized calcs
Peter Scherpelz
pscherpelz at uchicago.edu
Wed Feb 4 19:59:52 CET 2015
I'm hitting a crash that I've traced to a fairly particular set of
circumstances, and want to check if this is a known and/or reproducible
bug beyond what I've found.
In detail: I've been running parallelized, spin-polarized pw.x
calculations (scf and relax) with Quantum ESPRESSO v5.1, using the MPI
version on either a single node or a cluster. pw.x crashes with a
davcio error during the wf_collect stage of the computation, but only
if the number of pools equals the total number of k-points after spin
polarization is taken into account (e.g., gamma only with 2 pools, or
2 distinct k-point locations with 4 pools).
If I run on half that many pools, I do not get a crash. If I run on an
equal number of pools but double the number of k-points, I also do not
get a crash. If I set wf_collect to false, I also do not get a crash.
I've attached a toy model (Si crystal) that exhibits this behavior, and
I can include the successful runs with the alternate configurations if
that helps.
Thanks in advance for your help! And thanks overall to the developers
for the program - I'm a fairly new user and it's been working great.
Peter Scherpelz
&control
    calculation = 'scf'
    restart_mode = 'from_scratch'
    prefix = 'T0011_np4'
    pseudo_dir = './'
    outdir = './out/'
    wf_collect = .TRUE.
    tstress = .TRUE.
    tprnfor = .TRUE.
    verbosity = 'high'
/
&system
    ibrav = 0
    celldm(1) = 10.327
    nat = 2
    ntyp = 1
    ecutwfc = 20
    nosym = .FALSE.
    occupations = 'smearing'
    degauss = 0.001
    starting_magnetization(1) = 0.3
    nspin = 2
    nbnd = 8
/
&electrons
    electron_maxstep = 600
    conv_thr = 1.0d-6
    mixing_mode = 'plain'
    mixing_beta = 0.7
    diagonalization = 'david'
    startingpot = 'atomic'
    startingwfc = 'atomic+random'
/
ATOMIC_SPECIES
Si 28.086 Si.pbe-tm-new-gipaw-dc.UPF
K_POINTS automatic
2 1 1 0 0 0
CELL_PARAMETERS
0.0 0.5 0.5
0.5 0.0 0.5
0.5 0.5 0.0
ATOMIC_POSITIONS
Si 0.00 0.00 0.0
Si 0.25 0.25 0.25
Program PWSCF v.5.1 starts on 4Feb2015 at 11:56:32
This program is part of the open-source Quantum ESPRESSO suite
for quantum simulation of materials; please cite
"P. Giannozzi et al., J. Phys.:Condens. Matter 21 395502 (2009);
URL http://www.quantum-espresso.org",
in publications or presentations arising from this work. More details at
Parallel version (MPI), running on 4 processors
K-points division: npool = 4
Reading input from T0011_kpoints_np4.in
Current dimensions of program PWSCF are:
Max number of different atomic species (ntypx) = 10
Max number of k-points (npk) = 40000
Max angular momentum in pseudopotentials (lmaxx) = 3
WARNING: atomic wfc # 3 for atom type 1 has zero norm
WARNING: atomic wfc # 4 for atom type 1 has zero norm
Subspace diagonalization in iterative solution of the eigenvalue problem:
a serial algorithm will be used
G-vector sticks info
sticks: dense smooth PW G-vecs: dense smooth PW
Sum 301 301 91 3383 3383 561
Generating pointlists ...
new r_m : 0.1786 (alat units) 1.8446 (a.u.) for type 1
bravais-lattice index = 0
lattice parameter (alat) = 10.3270 a.u.
unit-cell volume = 275.3357 (a.u.)^3
number of atoms/cell = 2
number of atomic types = 1
number of electrons = 8.00
number of Kohn-Sham states= 8
kinetic-energy cutoff = 20.0000 Ry
charge density cutoff = 80.0000 Ry
convergence threshold = 1.0E-06
mixing beta = 0.7000
number of iterations used = 8 plain mixing
Exchange-correlation = SLA PW PBX PBC ( 1 4 3 4 0)
celldm(1)= 10.327000 celldm(2)= 0.000000 celldm(3)= 0.000000
celldm(4)= 0.000000 celldm(5)= 0.000000 celldm(6)= 0.000000
crystal axes: (cart. coord. in units of alat)
a(1) = ( 0.000000 0.500000 0.500000 )
a(2) = ( 0.500000 0.000000 0.500000 )
a(3) = ( 0.500000 0.500000 0.000000 )
reciprocal axes: (cart. coord. in units 2 pi/alat)
b(1) = ( -1.000000 1.000000 1.000000 )
b(2) = ( 1.000000 -1.000000 1.000000 )
b(3) = ( 1.000000 1.000000 -1.000000 )
PseudoPot. # 1 for Si read from file:
MD5 check sum: 1da412a37211db039ad8711d87301bf0
Pseudo is Norm-conserving, Zval = 4.0
Generated by new atomic code, or converted to UPF format
Using radial grid of 1525 points, 2 beta functions with:
l(1) = 0
l(2) = 1
atomic species valence mass pseudopotential
Si 4.00 28.08600 Si( 1.00)
Starting magnetic structure
atomic species magnetization
Si 0.300
48 Sym. Ops., with inversion, found (24 have fractional translation)
s frac. trans.
isym = 1 identity
cryst. s( 1) = ( 1 0 0 )
( 0 1 0 )
( 0 0 1 )
cart. s( 1) = ( 1.0000000 0.0000000 0.0000000 )
( 0.0000000 1.0000000 0.0000000 )
( 0.0000000 0.0000000 1.0000000 )
isym = 48 inv. 120 deg rotation - cart. axis [-1,-1,1]
cryst. s(48) = ( 0 1 0 ) f =( -0.2500000 )
( -1 1 0 ) ( -0.2500000 )
( 0 1 -1 ) ( -0.2500000 )
cart. s(48) = ( 0.0000000 0.0000000 1.0000000 ) f =( -0.2500000 )
( -1.0000000 0.0000000 0.0000000 ) ( -0.2500000 )
( 0.0000000 1.0000000 0.0000000 ) ( -0.2500000 )
point group O_h (m-3m)
there are 10 classes
the character table:
           E    8C3   6C2'    6C4    3C2      i    6S4    8S6   3s_h   6s_d
A_1g    1.00   1.00   1.00   1.00   1.00   1.00   1.00   1.00   1.00   1.00
A_2g    1.00   1.00  -1.00  -1.00   1.00   1.00  -1.00   1.00   1.00  -1.00
E_g     2.00  -1.00   0.00   0.00   2.00   2.00   0.00  -1.00   2.00   0.00
T_1g    3.00   0.00  -1.00   1.00  -1.00   3.00   1.00   0.00  -1.00  -1.00
T_2g    3.00   0.00   1.00  -1.00  -1.00   3.00  -1.00   0.00  -1.00   1.00
A_1u    1.00   1.00   1.00   1.00   1.00  -1.00  -1.00  -1.00  -1.00  -1.00
A_2u    1.00   1.00  -1.00  -1.00   1.00  -1.00   1.00  -1.00  -1.00   1.00
E_u     2.00  -1.00   0.00   0.00   2.00  -2.00   0.00   1.00  -2.00   0.00
T_1u    3.00   0.00  -1.00   1.00  -1.00  -3.00  -1.00   0.00   1.00   1.00
T_2u    3.00   0.00   1.00  -1.00  -1.00  -3.00   1.00   0.00   1.00  -1.00
the symmetry operations in each class:
E 1
3C2 2 4 3
6C2' 5 6 14 13 10 9
6C4 7 8 15 16 12 11
8C3 17 19 20 18 24 21 22 23
i 25
3s_h 26 28 27
6s_d 29 30 38 37 34 33
6S4 31 32 39 40 36 35
8S6 41 43 44 42 48 45 46 47
Cartesian axes
site n. atom positions (alat units)
1 Si tau( 1) = ( 0.0000000 0.0000000 0.0000000 )
2 Si tau( 2) = ( 0.2500000 0.2500000 0.2500000 )
Crystallographic axes
site n. atom positions (cryst. coord.)
1 Si tau( 1) = ( 0.0000000 0.0000000 0.0000000 )
2 Si tau( 2) = ( 0.2500000 0.2500000 0.2500000 )
number of k points= 4 gaussian smearing, width (Ry)= 0.0010
cart. coord. in units 2pi/alat
k( 1) = ( 0.0000000 0.0000000 0.0000000), wk = 0.5000000
k( 2) = ( 0.5000000 -0.5000000 -0.5000000), wk = 0.5000000
k( 3) = ( 0.0000000 0.0000000 0.0000000), wk = 0.5000000
k( 4) = ( 0.5000000 -0.5000000 -0.5000000), wk = 0.5000000
cryst. coord.
k( 1) = ( 0.0000000 0.0000000 0.0000000), wk = 0.5000000
k( 2) = ( -0.5000000 0.0000000 0.0000000), wk = 0.5000000
k( 3) = ( 0.0000000 0.0000000 0.0000000), wk = 0.5000000
k( 4) = ( -0.5000000 0.0000000 0.0000000), wk = 0.5000000
Dense grid: 3383 G-vectors FFT dimensions: ( 24, 24, 24)
Largest allocated arrays est. size (Mb) dimensions
Kohn-Sham Wavefunctions 0.05 Mb ( 411, 8)
NL pseudopotentials 0.05 Mb ( 411, 8)
Each V/rho on FFT grid 0.42 Mb ( 13824, 2)
Each G-vector array 0.03 Mb ( 3383)
G-vector shells 0.00 Mb ( 75)
Largest temporary arrays est. size (Mb) dimensions
Auxiliary wavefunctions 0.20 Mb ( 411, 32)
Each subspace H/S matrix 0.02 Mb ( 32, 32)
Each <psi_i|beta_j> matrix 0.00 Mb ( 8, 8)
Arrays for rho mixing 1.69 Mb ( 13824, 8)
Initial potential from superposition of free atoms
starting charge 6.94077, renormalised to 8.00000
Starting wfc are 18 randomized atomic wfcs
total cpu time spent up to now is 0.4 secs
Self-consistent Calculation
iteration # 1 ecut= 20.00 Ry beta=0.70
Davidson diagonalization with overlap
ethr = 1.00E-02, avg # of iterations = 3.0
Magnetic moment per site:
atom: 1 charge: 1.9492 magn: 0.1817 constr: 0.0000
atom: 2 charge: 1.9492 magn: 0.1817 constr: 0.0000
total cpu time spent up to now is 0.7 secs
total energy = -15.04118241 Ry
Harris-Foulkes estimate = -15.00380666 Ry
estimated scf accuracy < 0.13703973 Ry
total magnetization = 0.00 Bohr mag/cell
absolute magnetization = 0.04 Bohr mag/cell
iteration # 2 ecut= 20.00 Ry beta=0.70
Davidson diagonalization with overlap
ethr = 1.71E-03, avg # of iterations = 1.0
Magnetic moment per site:
atom: 1 charge: 1.9203 magn: 0.0177 constr: 0.0000
atom: 2 charge: 1.9203 magn: 0.0177 constr: 0.0000
total cpu time spent up to now is 0.7 secs
total energy = -15.05232519 Ry
Harris-Foulkes estimate = -15.04664425 Ry
estimated scf accuracy < 0.00936651 Ry
total magnetization = 0.00 Bohr mag/cell
absolute magnetization = 0.02 Bohr mag/cell
iteration # 3 ecut= 20.00 Ry beta=0.70
Davidson diagonalization with overlap
ethr = 1.17E-04, avg # of iterations = 1.2
Magnetic moment per site:
atom: 1 charge: 1.9119 magn: -0.0001 constr: 0.0000
atom: 2 charge: 1.9119 magn: -0.0001 constr: 0.0000
total cpu time spent up to now is 0.7 secs
total energy = -15.05301626 Ry
Harris-Foulkes estimate = -15.05293304 Ry
estimated scf accuracy < 0.00009041 Ry
total magnetization = 0.00 Bohr mag/cell
absolute magnetization = 0.00 Bohr mag/cell
iteration # 4 ecut= 20.00 Ry beta=0.70
Davidson diagonalization with overlap
ethr = 1.13E-06, avg # of iterations = 2.8
Magnetic moment per site:
atom: 1 charge: 1.9122 magn: -0.0002 constr: 0.0000
atom: 2 charge: 1.9122 magn: -0.0002 constr: 0.0000
total cpu time spent up to now is 0.8 secs
total energy = -15.05307551 Ry
Harris-Foulkes estimate = -15.05307374 Ry
estimated scf accuracy < 0.00000183 Ry
total magnetization = 0.00 Bohr mag/cell
absolute magnetization = 0.00 Bohr mag/cell
iteration # 5 ecut= 20.00 Ry beta=0.70
Davidson diagonalization with overlap
ethr = 2.29E-08, avg # of iterations = 1.8
Magnetic moment per site:
atom: 1 charge: 1.9121 magn: 0.0001 constr: 0.0000
atom: 2 charge: 1.9121 magn: 0.0001 constr: 0.0000
total cpu time spent up to now is 0.8 secs
End of self-consistent calculation
------ SPIN UP ------------
k = 0.0000 0.0000 0.0000 ( 411 PWs) bands (ev):
-5.3214 6.6406 6.6406 6.6406 9.0184 9.0184 9.0184 9.8255
occupation numbers
1.0000 1.0000 1.0000 1.0000 0.0000 0.0000 0.0000 0.0000
k = 0.5000-0.5000-0.5000 ( 410 PWs) bands (ev):
-2.9700 -0.4024 5.4062 5.4062 8.0157 9.7642 9.7642 13.9399
occupation numbers
1.0000 1.0000 1.0000 1.0000 0.0000 0.0000 0.0000 0.0000
------ SPIN DOWN ----------
k = 0.0000 0.0000 0.0000 ( 411 PWs) bands (ev):
-5.3228 6.6390 6.6390 6.6390 9.0174 9.0174 9.0174 9.8263
occupation numbers
1.0000 1.0000 1.0000 1.0000 0.0000 0.0000 0.0000 0.0000
k = 0.5000-0.5000-0.5000 ( 410 PWs) bands (ev):
-2.9712 -0.4041 5.4047 5.4047 8.0149 9.7625 9.7625 13.9377
occupation numbers
1.0000 1.0000 1.0000 1.0000 0.0000 0.0000 0.0000 0.0000
the Fermi energy is 6.7232 ev
! total energy = -15.05307617 Ry
Harris-Foulkes estimate = -15.05307578 Ry
estimated scf accuracy < 0.00000015 Ry
The total energy is the sum of the following terms:
one-electron contribution = 5.13830774 Ry
hartree contribution = 1.46507432 Ry
xc contribution = -4.96453051 Ry
ewald contribution = -16.69192772 Ry
smearing contrib. (-TS) = 0.00000000 Ry
total magnetization = 0.00 Bohr mag/cell
absolute magnetization = 0.00 Bohr mag/cell
convergence has been achieved in 5 iterations
Forces acting on atoms (Ry/au):
atom 1 type 1 force = 0.00000000 0.00000000 0.00000000
atom 2 type 1 force = 0.00000000 0.00000000 0.00000000
The non-local contrib. to forces
atom 1 type 1 force = 0.00000000 0.00000000 0.00000000
atom 2 type 1 force = 0.00000000 0.00000000 0.00000000
The ionic contribution to forces
atom 1 type 1 force = 0.00000000 0.00000000 0.00000000
atom 2 type 1 force = 0.00000000 0.00000000 0.00000000
The local contribution to forces
atom 1 type 1 force = 0.00000000 0.00000000 0.00000000
atom 2 type 1 force = 0.00000000 0.00000000 0.00000000
The core correction contribution to forces
atom 1 type 1 force = 0.00000000 0.00000000 0.00000000
atom 2 type 1 force = 0.00000000 0.00000000 0.00000000
The Hubbard contrib. to forces
atom 1 type 1 force = 0.00000000 0.00000000 0.00000000
atom 2 type 1 force = 0.00000000 0.00000000 0.00000000
The SCF correction term to forces
atom 1 type 1 force = 0.00000000 0.00000000 0.00000000
atom 2 type 1 force = 0.00000000 0.00000000 0.00000000
Total force = 0.000000 Total SCF correction = 0.000000
entering subroutine stress ...
total stress (Ry/bohr**3) (kbar) P= 217.56
0.00147894 0.00000000 0.00000000 217.56 0.00 0.00
0.00000000 0.00147894 0.00000000 0.00 217.56 0.00
0.00000000 0.00000000 0.00147894 0.00 0.00 217.56
kinetic stress (kbar) 2590.08 0.00 0.00
0.00 2590.08 0.00
0.00 0.00 2590.08
local stress (kbar) -1309.54 0.00 0.00
0.00 -1309.54 0.00
0.00 0.00 -1309.54
nonloc. stress (kbar) 2464.95 0.00 0.00
0.00 2464.95 0.00
0.00 0.00 2464.95
hartree stress (kbar) 260.92 0.00 0.00
0.00 260.92 0.00
0.00 0.00 260.92
exc-cor stress (kbar) -816.16 0.00 0.00
0.00 -816.16 0.00
0.00 0.00 -816.16
corecor stress (kbar) 0.00 0.00 0.00
0.00 0.00 0.00
0.00 0.00 0.00
ewald stress (kbar) -2972.69 0.00 0.00
0.00 -2972.69 0.00
0.00 0.00 -2972.69
hubbard stress (kbar) 0.00 0.00 0.00
0.00 0.00 0.00
0.00 0.00 0.00
london stress (kbar) 0.00 0.00 0.00
0.00 0.00 0.00
0.00 0.00 0.00
XDM stress (kbar) 0.00 0.00 0.00
0.00 0.00 0.00
0.00 0.00 0.00
dft-nl stress (kbar) 0.00 0.00 0.00
0.00 0.00 0.00
0.00 0.00 0.00
Writing output data file T0011_np4.save
Error in routine davcio (10):
error while reading from file "/scratch/midway/pscherpelz/Bulk_Test/./out/T0011_np4.wfc1"
stopping ...
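
To check other runs for the same pattern, a log like the one above can be
scanned for the pool count, the k-point count (which in this log already
includes the spin doubling), and the davcio error. What follows is a
minimal sketch: the function name `summarize_pw_log` is hypothetical, and
the regular expressions assume the line formats seen in this particular
output.

```python
import re


def summarize_pw_log(text):
    """Extract npool, the k-point count and any error routine from pw.x output."""

    def first_int(pattern):
        match = re.search(pattern, text)
        return int(match.group(1)) if match else None

    npool = first_int(r"K-points division:\s*npool\s*=\s*(\d+)")
    nks = first_int(r"number of k points=\s*(\d+)")
    err = re.search(r"Error in routine\s+(\w+)", text)
    routine = err.group(1) if err else None
    return {
        "npool": npool,
        "nks": nks,
        "error_routine": routine,
        # True only for a davcio error with as many pools as k-points.
        "fits_pattern": routine == "davcio"
        and npool is not None
        and npool == nks,
    }


# Three lines copied from the log in this post.
sample = """\
     K-points division:     npool     =       4
     number of k points=     4  gaussian smearing, width (Ry)=  0.0010
     Error in routine davcio (10):
"""
print(summarize_pw_log(sample))
```

If the "K-points division" line is missing from a log, `npool` comes back
as `None` and the run is reported as not fitting the pattern.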
