[Q-e-developers] bug image parallelization in PHonon
Thomas Brumme
thomas.brumme at mpsd.mpg.de
Wed Sep 14 13:30:55 CEST 2016
Dear all,
I think I found a bug in the image parallelization of PH, or I'm doing
something wrong. I used version 5.4, but the problem is also present in
the 6.0 beta. Maybe someone remembers my email to the regular mailing
list a few days ago concerning parallelization using the GRID technique;
the problem I encounter here is essentially the same. As an example, I
use a modified run_example_1 from the Recover_example directory of PH.
Description of the problem:
0. (Following the example) I did an scf calculation using 2 CPUs with:
&control
calculation='scf'
restart_mode='from_scratch',
prefix='aluminum',
pseudo_dir = './',
outdir='./tempdir/'
/
&system
ibrav= 2, celldm(1) =7.5, nat= 1, ntyp= 1,
ecutwfc =15.0,
occupations='smearing', smearing='methfessel-paxton', degauss=0.05,
la2F = .true.,
/
&electrons
conv_thr = 1.0d-8
mixing_beta = 0.7
/
ATOMIC_SPECIES
Al 26.98 Al.pz-vbc.UPF
ATOMIC_POSITIONS
Al 0.00 0.00 0.00
K_POINTS {automatic}
16 16 16 0 0 0
1. I then did a second scf calculation, also using 2 CPUs, this time on
the coarser 8x8x8 k-point grid and without la2F:
&control
calculation='scf'
restart_mode='from_scratch',
prefix='aluminum',
pseudo_dir = './',
outdir='./tempdir/'
/
&system
ibrav= 2, celldm(1) =7.5, nat= 1, ntyp= 1,
ecutwfc =15.0,
occupations='smearing', smearing='methfessel-paxton', degauss=0.05
/
&electrons
conv_thr = 1.0d-8
mixing_beta = 0.7
/
ATOMIC_SPECIES
Al 26.98 Al.pz-vbc.UPF
ATOMIC_POSITIONS
Al 0.00 0.00 0.00
K_POINTS {automatic}
8 8 8 0 0 0
2. I did a phonon calculation using images, storing the dvscf files.
More specifically, I used:
mpirun -np 4 ph.x -ni 2 < al.elph.in
with al.elph.in given by:
Electron-phonon coefficients for Al
&inputph
tr2_ph=1.0d-10,
prefix='aluminum',
fildvscf='aldv',
amass(1)=26.98,
outdir='./tempdir/',
fildyn='al.dyn',
! electron_phonon='interpolated',
! el_ph_sigma=0.005,
! el_ph_nsigma=10,
! recover=.true.
! trans=.false.,
ldisp=.true.
max_seconds=6,
nq1=4, nq2=4, nq3=4
/
I used max_seconds to simulate the finite run time we have on our HPC.
Restarting with recover=.true. works fine, i.e., I used:
Electron-phonon coefficients for Al
&inputph
tr2_ph=1.0d-10,
prefix='aluminum',
fildvscf='aldv',
amass(1)=26.98,
outdir='./tempdir/',
fildyn='al.dyn',
! electron_phonon='interpolated',
! el_ph_sigma=0.005,
! el_ph_nsigma=10,
recover=.true.
! trans=.false.,
ldisp=.true.
max_seconds=6,
nq1=4, nq2=4, nq3=4
/
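The two inputs in this step differ only in whether the recover=.true.
line is commented out. A minimal sketch of a helper that derives the
restart input from the first one is given here; the function name
enable_recover and the output file name are hypothetical, and it assumes
the input contains a commented line of the form "! recover=.true.":

```shell
#!/bin/sh
# Hypothetical helper (not part of Quantum ESPRESSO): write a copy of a
# ph.x input in which a commented-out "! recover=.true." line is active.
# Assumes the comment marker is "!" at the start of the line, as in the
# inputs shown in this post. All other lines are copied unchanged.
enable_recover() {
    src=$1
    dst=$2
    sed 's/^\([[:space:]]*\)![[:space:]]*\(recover[[:space:]]*=[[:space:]]*\.true\.\)/\1\2/' \
        "$src" > "$dst"
    # Return non-zero if the result has no active recover line at all.
    grep -q '^[[:space:]]*recover[[:space:]]*=' "$dst"
}
```

It could be used as follows, with the same ph.x command line as above:

    enable_recover al.elph.in al.elph.recover.in
    mpirun -np 4 ph.x -ni 2 < al.elph.recover.in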
3. Now I want to collect all data using no images:
mpirun -np 2 ph.x < al.elph.in
with the same input file as given in 2.
I get the error "Possibly too few bands at point ..." once the code
tries to recalculate the wave functions for the q points that were
calculated only on the second image, i.e., for q points 6, 7, and 8.
If I check the charge_density.dat files in the q-point subfolders of the
_ph0 directory, I find that they're empty. Thus, I copied the q
subfolders of the second image by hand to the folder of the first image
using:
cp -r _ph1/aluminum.q_* _ph0/
If I now restart without images, using the input of 2., it works and
everything is fine.
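The manual check and copy described in this step could be scripted
roughly as follows. This is only a sketch: the directory and file names
(_ph0, _ph1, <prefix>.q_<n>, charge_density.dat) are taken from this
post and may differ between versions, and the function names are
hypothetical. Unlike the plain cp above, the sketch copies a q-point
folder only when the copy in _ph0 has no non-empty density file, so that
data already present in _ph0 is not overwritten:

```shell
#!/bin/sh
# Sketch only, not part of Quantum ESPRESSO. The density file name can be
# overridden through the (hypothetical) DENSITY_FILE variable.
density_file=${DENSITY_FILE:-charge_density.dat}

# True if a non-empty density file exists anywhere below directory $1.
has_density() {
    [ -n "$(find "$1" -name "$density_file" -size +0c 2>/dev/null)" ]
}

# Print each q-point subfolder of an image directory that lacks a
# non-empty density file.
list_incomplete_q() {
    image_dir=$1
    prefix=$2
    for qdir in "$image_dir/$prefix".q_*; do
        [ -d "$qdir" ] || continue
        has_density "$qdir" || printf '%s\n' "$qdir"
    done
}

# Copy q-point subfolders from images _ph1, _ph2, ... into _ph0, but only
# those that are complete in the source and incomplete in _ph0.
merge_images() {
    outdir=$1
    prefix=$2
    for image_dir in "$outdir"/_ph[1-9]*; do
        [ -d "$image_dir" ] || continue
        for qdir in "$image_dir/$prefix".q_*; do
            [ -d "$qdir" ] || continue
            dest="$outdir/_ph0/$(basename "$qdir")"
            if has_density "$qdir" && ! has_density "$dest"; then
                cp -r "$qdir" "$outdir/_ph0/"
            fi
        done
    done
}
```

For the example in this post it would be called as:

    list_incomplete_q ./tempdir/_ph0 aluminum
    merge_images ./tempdir aluminum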
4. Now I can also calculate the el-ph parameters using the input:
Electron-phonon coefficients for Al
&inputph
tr2_ph=1.0d-10,
prefix='aluminum',
fildvscf='aldv',
amass(1)=26.98,
outdir='./tempdir/',
fildyn='al.dyn',
electron_phonon='interpolated',
el_ph_sigma=0.005,
el_ph_nsigma=10,
! recover=.true.
trans=.false.,
ldisp=.true.
! max_seconds=6,
nq1=4, nq2=4, nq3=4
/
5. Another problem I encounter is the following. Suppose the run time is
not enough to finish the el-ph calculations, i.e., instead of the input
in 4. I use:
Electron-phonon coefficients for Al
&inputph
tr2_ph=1.0d-10,
prefix='aluminum',
fildvscf='aldv',
amass(1)=26.98,
outdir='./tempdir/',
fildyn='al.dyn',
electron_phonon='interpolated',
el_ph_sigma=0.005,
el_ph_nsigma=10,
! recover=.true.
trans=.false.,
ldisp=.true.
max_seconds=6,
nq1=4, nq2=4, nq3=4
/
The code will stop at a certain point (in my case the 4th q point). If I
now restart the calculation using:
Electron-phonon coefficients for Al
&inputph
tr2_ph=1.0d-10,
prefix='aluminum',
fildvscf='aldv',
amass(1)=26.98,
outdir='./tempdir/',
fildyn='al.dyn',
electron_phonon='interpolated',
el_ph_sigma=0.005,
el_ph_nsigma=10,
recover=.true.
trans=.false.,
ldisp=.true.
! max_seconds=6,
nq1=4, nq2=4, nq3=4
/
I get (again) the error message "Possibly too few bands at point ..."
once the code tries to calculate the wave functions for the 4th q point
(the one at which it stopped before). All other points are fine.
I think that the whole problem is related to the storing of the wave
functions and the charge density. Maybe I'm doing something really
wrong, but I don't see any obvious error in the input. I also don't see
any input variable for ph which influences the saving of wave functions.
Regards
Thomas
--
Dr. rer. nat. Thomas Brumme
Max Planck Institute for the Structure and Dynamics of Matter
Luruper Chaussee 149
22761 Hamburg
Tel: +49 (0)40 8998 6557
email: Thomas.Brumme at mpsd.mpg.de