[Q-e-developers] bug image parallelization in PHonon
Paolo Giannozzi
p.giannozzi at gmail.com
Fri Sep 23 16:24:48 CEST 2016
Has anybody any idea? P.
On Wed, Sep 14, 2016 at 1:30 PM, Thomas Brumme <thomas.brumme at mpsd.mpg.de>
wrote:
> Dear all,
>
> I think I found a bug in the image parallelization of PH, or I'm doing
> something wrong. I used version 5.4, but the problem is also there with
> the 6.0 beta. Maybe someone remembers my email a few days ago to the
> normal mailing list concerning parallelization using the GRID technique;
> the problem I encounter here is essentially the same. As an example, I
> use a modified run_example_1 from the Recover_example directory of PH.
>
> Description of the problem:
>
> 0. (Following the example) I did an scf calculation using 2 CPUs with:
>
> &control
> calculation='scf'
> restart_mode='from_scratch',
> prefix='aluminum',
> pseudo_dir = './',
> outdir='./tempdir/'
> /
> &system
> ibrav= 2, celldm(1) =7.5, nat= 1, ntyp= 1,
> ecutwfc =15.0,
> occupations='smearing', smearing='methfessel-paxton', degauss=0.05,
> la2F = .true.,
> /
> &electrons
> conv_thr = 1.0d-8
> mixing_beta = 0.7
> /
> ATOMIC_SPECIES
> Al 26.98 Al.pz-vbc.UPF
> ATOMIC_POSITIONS
> Al 0.00 0.00 0.00
> K_POINTS {automatic}
> 16 16 16 0 0 0
>
>
> 1. I then did the scf calculation again using 2 CPUs, this time without
> la2F and with the coarser k-point grid:
>
> &control
> calculation='scf'
> restart_mode='from_scratch',
> prefix='aluminum',
> pseudo_dir = './',
> outdir='./tempdir/'
> /
> &system
> ibrav= 2, celldm(1) =7.5, nat= 1, ntyp= 1,
> ecutwfc =15.0,
> occupations='smearing', smearing='methfessel-paxton', degauss=0.05
> /
> &electrons
> conv_thr = 1.0d-8
> mixing_beta = 0.7
> /
> ATOMIC_SPECIES
> Al 26.98 Al.pz-vbc.UPF
> ATOMIC_POSITIONS
> Al 0.00 0.00 0.00
> K_POINTS {automatic}
> 8 8 8 0 0 0
>
>
> 2. I then did a phonon calculation, storing the dvscf files and using
> images. More specifically, I used:
>
> mpirun -np 4 ph.x -ni 2 < al.elph.in
>
> with al.elph.in given by:
>
> Electron-phonon coefficients for Al
> &inputph
> tr2_ph=1.0d-10,
> prefix='aluminum',
> fildvscf='aldv',
> amass(1)=26.98,
> outdir='./tempdir/',
> fildyn='al.dyn',
> ! electron_phonon='interpolated',
> ! el_ph_sigma=0.005,
> ! el_ph_nsigma=10,
> ! recover=.true.
> ! trans=.false.,
> ldisp=.true.
> max_seconds=6,
> nq1=4, nq2=4, nq3=4
> /
>
> I used max_seconds to simulate the finite run time we have on our HPC.
> Restarting with recover=.true. works fine, i.e., I used:
>
> Electron-phonon coefficients for Al
> &inputph
> tr2_ph=1.0d-10,
> prefix='aluminum',
> fildvscf='aldv',
> amass(1)=26.98,
> outdir='./tempdir/',
> fildyn='al.dyn',
> ! electron_phonon='interpolated',
> ! el_ph_sigma=0.005,
> ! el_ph_nsigma=10,
> recover=.true.
> ! trans=.false.,
> ldisp=.true.
> max_seconds=6,
> nq1=4, nq2=4, nq3=4
> /
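The two inputs in step 2 differ only in the recover line, so the restart input could be derived from the first one instead of being kept by hand. A minimal sketch, assuming the real input file has no "> " quote prefixes; make_recover_input and the file name al.elph.recover.in are hypothetical and not part of the example:

```shell
# Sketch only: derive a restart input by uncommenting "! recover=.true.".
# make_recover_input is a hypothetical helper, not part of Quantum ESPRESSO.
make_recover_input() {
    # $1: original ph.x input, $2: restart input to write
    sed 's/^\([[:space:]]*\)![[:space:]]*\(recover[[:space:]]*=[[:space:]]*\.true\.\)/\1\2/' "$1" > "$2"
}
```

It would be used as "make_recover_input al.elph.in al.elph.recover.in", followed by the same mpirun line as above reading the new file. All other lines, including the other commented-out variables, are passed through unchanged.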
>
>
> 3. Now I want to collect all data using no images:
>
> mpirun -np 2 ph.x < al.elph.in
>
> with the same input file as given in 2.
>
> I get the error "Possibly too few bands at point ..." once the code
> wants to recalculate the wave functions for the q points which were
> calculated only on the second image, i.e., for q points 6, 7, and 8.
>
> If I check the charge_density.dat files in the subfolders of the q
> points in the _ph0 directory, I find that they're empty. Thus, I copied
> the q subfolders of the second image by hand to the folder of the first
> image using:
>
> cp -r _ph1/aluminum.q_* _ph0/
>
> If I now restart without images, using the input of 2. it works...
> Everything is fine...
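The manual copy in step 3 could be scripted for runs with more than two images. This is only a sketch of the workaround described above, not a fix for the underlying problem. It assumes the further image directories follow the same naming as _ph0 and _ph1 (i.e. _ph2, _ph3, ...), which the example itself only shows for two images; merge_image_dirs is a hypothetical helper name:

```shell
# Sketch only: copy the q-point folders of every extra image into _ph0.
# merge_image_dirs is a hypothetical helper, not part of Quantum ESPRESSO.
merge_image_dirs() {
    # $1: outdir (e.g. ./tempdir), $2: prefix (e.g. aluminum)
    for img in "$1"/_ph[1-9]*; do
        [ -d "$img" ] || continue          # no extra image directories found
        for qdir in "$img/$2".q_*; do
            [ -d "$qdir" ] || continue     # this image holds no q folders
            cp -r "$qdir" "$1/_ph0/"       # same effect as the manual cp above
        done
    done
}
```

For the example above this would be called as "merge_image_dirs ./tempdir aluminum" before restarting without images. Files already present in _ph0 under the same names are overwritten, exactly as with the manual cp.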
>
>
> 4. Now I can also calculate the el-ph parameters using the input:
>
> Electron-phonon coefficients for Al
> &inputph
> tr2_ph=1.0d-10,
> prefix='aluminum',
> fildvscf='aldv',
> amass(1)=26.98,
> outdir='./tempdir/',
> fildyn='al.dyn',
> electron_phonon='interpolated',
> el_ph_sigma=0.005,
> el_ph_nsigma=10,
> ! recover=.true.
> trans=.false.,
> ldisp=.true.
> ! max_seconds=6,
> nq1=4, nq2=4, nq3=4
> /
>
>
> 5. Another problem I encounter is the following. Suppose the run time
> is not enough to finish the el-ph calculations, i.e., instead of the
> input in 4. I use:
>
> Electron-phonon coefficients for Al
> &inputph
> tr2_ph=1.0d-10,
> prefix='aluminum',
> fildvscf='aldv',
> amass(1)=26.98,
> outdir='./tempdir/',
> fildyn='al.dyn',
> electron_phonon='interpolated',
> el_ph_sigma=0.005,
> el_ph_nsigma=10,
> ! recover=.true.
> trans=.false.,
> ldisp=.true.
> max_seconds=6,
> nq1=4, nq2=4, nq3=4
> /
>
> The code will stop at a certain point (in my case the 4th q point). If I
> now restart the calculation using:
>
> Electron-phonon coefficients for Al
> &inputph
> tr2_ph=1.0d-10,
> prefix='aluminum',
> fildvscf='aldv',
> amass(1)=26.98,
> outdir='./tempdir/',
> fildyn='al.dyn',
> electron_phonon='interpolated',
> el_ph_sigma=0.005,
> el_ph_nsigma=10,
> recover=.true.
> trans=.false.,
> ldisp=.true.
> ! max_seconds=6,
> nq1=4, nq2=4, nq3=4
> /
>
> I get (again) the error message "Possibly too few bands at point ..."
> once the code wants to calculate the wave functions for the 4th q point
> (the one at which it stopped before). All other points are fine.
>
>
> I think that the whole problem is related to the storing of the wave
> functions and the charge density. Maybe I'm doing something really
> wrong, but I don't see any obvious error in the input. Also, I don't
> see any input variable for ph which influences the saving of wave
> functions.
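One way to check this suspicion before a restart is to list the files that were written with zero size in the per-image directories, as was found for the charge density files in step 3. A minimal sketch, assuming the _ph* directories sit directly under outdir as in the example; list_empty_files is a hypothetical helper name:

```shell
# Sketch only: list zero-size files in the per-image phonon directories.
# list_empty_files is a hypothetical helper, not part of Quantum ESPRESSO.
list_empty_files() {
    # $1: outdir (e.g. ./tempdir)
    # "-size 0" matches only files that contain no data at all.
    find "$1"/_ph* -type f -size 0 2>/dev/null
}
```

Calling "list_empty_files ./tempdir" after an interrupted run would show which q-point subfolders contain empty files. It only reports them and does not say why they were left empty.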
>
> Regards
>
> Thomas
>
> --
> Dr. rer. nat. Thomas Brumme
> Max Planck Institute for the Structure and Dynamics of Matter
> Luruper Chaussee 149
> 22761 Hamburg
>
> Tel: +49 (0)40 8998 6557
>
> email: Thomas.Brumme at mpsd.mpg.de
>
--
Paolo Giannozzi, Dip. Scienze Matematiche Informatiche e Fisiche,
Univ. Udine, via delle Scienze 208, 33100 Udine, Italy
Phone +39-0432-558216, fax +39-0432-558222