[Q-e-developers] bug image parallelization in PHonon

Thomas Brumme thomas.brumme at mpsd.mpg.de
Mon Sep 26 12:01:58 CEST 2016


Dear Paolo,

the "patch" is working: ph.x now stops when fildvscf is set and more than
one image is used. So I can no longer calculate the dvscf files using
images and have to use the GRID way instead.
It is not really a solution, but at least the problem won't occur in the
official version :)
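
A minimal sketch of the q-point splitting behind the GRID route mentioned
above: one ph.x input per q point, each restricted with start_q and last_q.
This is only an illustration. The function name, the file naming scheme and
the count of 8 q points (inferred from the quoted report, where q points 6, 7
and 8 ended up on the second image) are assumptions; the namelist repeats the
phonon input quoted below without max_seconds.

```python
# Sketch: write one ph.x input file per q point (q-point-only "GRID" splitting).
# Assumptions: 8 q points for the 4x4x4 grid and the file names al.ph.q<N>.in.

TEMPLATE = """Phonons for Al, q point {iq}
 &inputph
  tr2_ph=1.0d-10,
  prefix='aluminum',
  fildvscf='aldv',
  amass(1)=26.98,
  outdir='./tempdir/',
  fildyn='al.dyn',
  ldisp=.true.,
  nq1=4, nq2=4, nq3=4,
  start_q={iq},
  last_q={iq}
 /
"""


def make_q_inputs(n_q):
    """Return {file name: input text}, one entry per q point 1..n_q."""
    return {f"al.ph.q{iq}.in": TEMPLATE.format(iq=iq) for iq in range(1, n_q + 1)}


if __name__ == "__main__":
    for name, text in make_q_inputs(8).items():
        with open(name, "w") as handle:
            handle.write(text)
        # Each file is then run on its own, without -ni, e.g.:
        #   mpirun -np 2 ph.x < al.ph.q1.in
        print("wrote", name)
```

How the per-q results are collected afterwards is not covered by this sketch.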

Yet the 5th problem remains: restarting an el-ph calculation does not
work, regardless of whether I read the phonon calculation from file
(trans=.false.) or not. At least for me it does not work...
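
A quick check that can be run before such a restart: in the quoted report
below, the "too few bands" error coincided with empty charge_density.dat
files in the q-point subfolders of _ph0. The following sketch lists such
files. The function name is hypothetical, and the folder layout and file name
are taken as written in that report, so they may need adjusting to what is
actually on disk.

```python
# Sketch: list empty charge-density files under the per-q folders of _ph0.
# Assumptions: folder layout <outdir>/_ph0/<prefix>.q_*/ and the file name
# "charge_density.dat" as written in the report; adjust both if they differ.
from pathlib import Path


def find_empty_density_files(outdir, prefix, filename="charge_density.dat"):
    """Return the sorted list of zero-size density files below _ph0."""
    ph0 = Path(outdir) / "_ph0"
    empty = []
    for qdir in sorted(ph0.glob(f"{prefix}.q_*")):
        # Search recursively, since the file may sit in a nested save folder.
        for path in sorted(qdir.rglob(filename)):
            if path.is_file() and path.stat().st_size == 0:
                empty.append(path)
    return empty


if __name__ == "__main__":
    for path in find_empty_density_files("./tempdir", "aluminum"):
        print("empty:", path)
```

An empty result does not prove that a restart will work; it only rules out
this one symptom.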

Regards

Thomas


On 09/23/2016 06:21 PM, Paolo Giannozzi wrote:
> Is the following patch doing (that is: stopping) the job?
> ---
> Index: /home/giannozz/trunk/espresso/PHonon/PH/phq_readin.f90
> ===================================================================
> --- /home/giannozz/trunk/espresso/PHonon/PH/phq_readin.f90 (revision 13008)
> +++ /home/giannozz/trunk/espresso/PHonon/PH/phq_readin.f90 (working copy)
> @@ -679,6 +679,8 @@
>
>    IF(elph.and.nimage>1) call errore('phq_readin',&
>         'el-ph with images not implemented',1)
> +  IF( fildvscf /= ' ' .and. nimage > 1 ) call errore('phq_readin',&
> +       'saving dvscf to file images not implemented',1)
>
>    IF (elph.OR.fildvscf /= ' ') lqdir=.TRUE.
> ---
>
> Paolo
>
> On Fri, Sep 23, 2016 at 4:45 PM, Thomas Brumme
> <thomas.brumme at mpsd.mpg.de> wrote:
>
>     I meanwhile had a discussion with Lorenzo Paulatto about a similar
>     problem.
>
>     I think that it might be a rather specific problem. As soon as I
>     parallelize only over q points using start_q and last_q there is no
>     problem, also for restarting.
>
>     Using images I can, in principle, even create the full dvscf files
>     without having to rerun the calculation without images, by using
>     split and cat on the different dvscf files in the different temp
>     folders. It's tedious but it works. Yet, in future I will use only
>     the parallelization over q points for the calculation of the dvscf.
>
>     In summary, the parallelization for PH is not straightforward, and I
>     think that it might help to store, e.g., the dvscf files for
>     different representations separately. But Lorenzo mentioned that
>     system administrators complain if the number of written files is
>     large... It would be helpful to have a kind of summary of what can
>     and cannot be done using images. For example, dvscf (and el-ph) does
>     not work if image parallelization is used, especially if the
>     different representations of one q point are split across different
>     images. For el-ph the code does not start, but maybe a similar check
>     can be added for the dvscf files?
>
>     Well, or maybe not, I don't know :)
>
>
>     On 09/23/2016 04:24 PM, Paolo Giannozzi wrote:
>>     Does anybody have any idea? P.
>>
>>     On Wed, Sep 14, 2016 at 1:30 PM, Thomas Brumme
>>     <thomas.brumme at mpsd.mpg.de> wrote:
>>
>>         Dear all,
>>
>>         I think I found a bug in the image parallelization of PH, or
>>         I'm doing something wrong. I used version 5.4, but the problem
>>         is also there if I use the 6.0 beta. Maybe someone remembers my
>>         email to the normal mailing list a few days ago concerning the
>>         parallelization using the GRID technique; the problem I
>>         encounter here is essentially the same. As an example, I use a
>>         modified run_example_1 of the Recover_example directory of PH.
>>
>>         Description of the problem:
>>
>>         0. (Following the example) I did an scf calculation using 2
>>         CPUs with:
>>
>>           &control
>>              calculation='scf'
>>              restart_mode='from_scratch',
>>              prefix='aluminum',
>>              pseudo_dir = './',
>>              outdir='./tempdir/'
>>           /
>>           &system
>>              ibrav=  2, celldm(1) =7.5, nat= 1, ntyp= 1,
>>              ecutwfc =15.0,
>>              occupations='smearing', smearing='methfessel-paxton',
>>              degauss=0.05,
>>              la2F = .true.,
>>           /
>>           &electrons
>>              conv_thr =  1.0d-8
>>              mixing_beta = 0.7
>>           /
>>         ATOMIC_SPECIES
>>           Al  26.98 Al.pz-vbc.UPF
>>         ATOMIC_POSITIONS
>>           Al 0.00 0.00 0.00
>>         K_POINTS {automatic}
>>           16 16 16  0 0 0
>>
>>
>>         1. I then did a second scf calculation using 2 CPUs with:
>>
>>           &control
>>              calculation='scf'
>>              restart_mode='from_scratch',
>>              prefix='aluminum',
>>              pseudo_dir = './',
>>              outdir='./tempdir/'
>>           /
>>           &system
>>              ibrav=  2, celldm(1) =7.5, nat= 1, ntyp= 1,
>>              ecutwfc =15.0,
>>              occupations='smearing', smearing='methfessel-paxton',
>>              degauss=0.05
>>           /
>>           &electrons
>>              conv_thr =  1.0d-8
>>              mixing_beta = 0.7
>>           /
>>         ATOMIC_SPECIES
>>           Al  26.98 Al.pz-vbc.UPF
>>         ATOMIC_POSITIONS
>>           Al 0.00 0.00 0.00
>>         K_POINTS {automatic}
>>           8 8 8  0 0 0
>>
>>
>>         2. I did a phonon calculation, storing the dvscf files and using
>>         images. More specifically, I used:
>>
>>         mpirun -np 4 ph.x -ni 2 < al.elph.in
>>
>>         with al.elph.in given by:
>>
>>         Electron-phonon coefficients for Al
>>           &inputph
>>            tr2_ph=1.0d-10,
>>            prefix='aluminum',
>>            fildvscf='aldv',
>>            amass(1)=26.98,
>>            outdir='./tempdir/',
>>            fildyn='al.dyn',
>>         !  electron_phonon='interpolated',
>>         !  el_ph_sigma=0.005,
>>         !  el_ph_nsigma=10,
>>         !  recover=.true.
>>         !  trans=.false.,
>>            ldisp=.true.
>>            max_seconds=6,
>>            nq1=4, nq2=4, nq3=4
>>           /
>>
>>         I used max_seconds in order to simulate the finite run time we
>>         have on our HPC. Restarting with recover=.true. works fine, i.e.,
>>         I used:
>>
>>         Electron-phonon coefficients for Al
>>           &inputph
>>            tr2_ph=1.0d-10,
>>            prefix='aluminum',
>>            fildvscf='aldv',
>>            amass(1)=26.98,
>>            outdir='./tempdir/',
>>            fildyn='al.dyn',
>>         !  electron_phonon='interpolated',
>>         !  el_ph_sigma=0.005,
>>         !  el_ph_nsigma=10,
>>            recover=.true.
>>         !  trans=.false.,
>>            ldisp=.true.
>>            max_seconds=6,
>>            nq1=4, nq2=4, nq3=4
>>           /
>>
>>
>>         3. Now I want to collect all data using no images:
>>
>>         mpirun -np 2 ph.x < al.elph.in
>>
>>         with the same input file as given in 2.
>>
>>         I get the error "Possibly too few bands at point ..." once the
>>         code wants to recalculate the wave functions for the q points
>>         which were calculated only on the second image, i.e., for q
>>         points 6, 7, and 8.
>>
>>         If I check the charge_density.dat files in the subfolders of the
>>         q points in the _ph0 directory, I find that they're empty. Thus,
>>         I copied the q subfolders of the second image by hand to the
>>         folder of the first image using:
>>
>>         cp -r _ph1/aluminum.q_* _ph0/
>>
>>         If I now restart without images, using the input of 2., it
>>         works... Everything is fine...
>>
>>
>>         4. Now I can also calculate the el-ph parameters using the input:
>>
>>         Electron-phonon coefficients for Al
>>           &inputph
>>            tr2_ph=1.0d-10,
>>            prefix='aluminum',
>>            fildvscf='aldv',
>>            amass(1)=26.98,
>>            outdir='./tempdir/',
>>            fildyn='al.dyn',
>>            electron_phonon='interpolated',
>>            el_ph_sigma=0.005,
>>            el_ph_nsigma=10,
>>         !  recover=.true.
>>            trans=.false.,
>>            ldisp=.true.
>>         !  max_seconds=6,
>>            nq1=4, nq2=4, nq3=4
>>           /
>>
>>
>>         5. Another problem I encounter is the following... Suppose the
>>         run time is not enough to finish the el-ph calculations, i.e.,
>>         instead of the input in 4. I use:
>>
>>         Electron-phonon coefficients for Al
>>           &inputph
>>            tr2_ph=1.0d-10,
>>            prefix='aluminum',
>>            fildvscf='aldv',
>>            amass(1)=26.98,
>>            outdir='./tempdir/',
>>            fildyn='al.dyn',
>>            electron_phonon='interpolated',
>>            el_ph_sigma=0.005,
>>            el_ph_nsigma=10,
>>         !  recover=.true.
>>            trans=.false.,
>>            ldisp=.true.
>>            max_seconds=6,
>>            nq1=4, nq2=4, nq3=4
>>           /
>>
>>         The code will stop at a certain point (in my case the 4th q
>>         point). If I now restart the calculation using:
>>
>>         Electron-phonon coefficients for Al
>>           &inputph
>>            tr2_ph=1.0d-10,
>>            prefix='aluminum',
>>            fildvscf='aldv',
>>            amass(1)=26.98,
>>            outdir='./tempdir/',
>>            fildyn='al.dyn',
>>            electron_phonon='interpolated',
>>            el_ph_sigma=0.005,
>>            el_ph_nsigma=10,
>>            recover=.true.
>>            trans=.false.,
>>            ldisp=.true.
>>         !  max_seconds=6,
>>            nq1=4, nq2=4, nq3=4
>>           /
>>
>>         I get (again) the error message "Possibly too few bands at
>>         point ..." once the code wants to calculate the wave functions
>>         for the 4th q point (the one where it stopped before)...
>>         All other points are fine...
>>
>>
>>         I think that the whole problem is related to the storing of the
>>         wave functions and the charge density. Maybe I'm doing something
>>         really wrong, but I don't see any obvious error in the input...
>>         Also, I don't see any input variable for ph which influences the
>>         saving of wave functions...
>>
>>         Regards
>>
>>         Thomas
>>
>>         --
>>         Dr. rer. nat. Thomas Brumme
>>         Max Planck Institute for the Structure and Dynamics of Matter
>>         Luruper Chaussee 149
>>         22761 Hamburg
>>
>>         Tel: +49 (0)40 8998 6557
>>
>>         email: Thomas.Brumme at mpsd.mpg.de
>>
>>         _______________________________________________
>>         Q-e-developers mailing list
>>         Q-e-developers at qe-forge.org
>>         http://qe-forge.org/mailman/listinfo/q-e-developers
>>
>>
>>
>>
>>     -- 
>>     Paolo Giannozzi, Dip. Scienze Matematiche Informatiche e Fisiche,
>>     Univ. Udine, via delle Scienze 208, 33100 Udine, Italy
>>     Phone +39-0432-558216, fax +39-0432-558222
>>
>
>
-- 
Dr. rer. nat. Thomas Brumme
Max Planck Institute for the Structure and Dynamics of Matter
Luruper Chaussee 149
22761 Hamburg

Tel:  +49 (0)40 8998 6557

email: Thomas.Brumme at mpsd.mpg.de