[Q-e-developers] bug image parallelization in PHonon
Thomas Brumme
thomas.brumme at mpsd.mpg.de
Mon Sep 26 12:01:58 CEST 2016
Dear Paolo,
The patch works: I can no longer calculate the dvscf files using images
and have to use the GRID approach instead.
It is not really a solution, but at least the problem won't occur in the
official version :)
However, the fifth problem from my original report remains: restarting an
el-ph calculation does not work, whether or not I use the calculation
from file (trans=.false.). At least for me it does not work...
Regards
Thomas
On 09/23/2016 06:21 PM, Paolo Giannozzi wrote:
> Is the following patch doing (that is: stopping) the job?
> ---
> Index: /home/giannozz/trunk/espresso/PHonon/PH/phq_readin.f90
> ===================================================================
> --- /home/giannozz/trunk/espresso/PHonon/PH/phq_readin.f90 (revision 13008)
> +++ /home/giannozz/trunk/espresso/PHonon/PH/phq_readin.f90 (working copy)
> @@ -679,6 +679,8 @@
>
> IF(elph.and.nimage>1) call errore('phq_readin',&
> 'el-ph with images not implemented',1)
> + IF( fildvscf /= ' ' .and. nimage > 1 ) call errore('phq_readin',&
> + 'saving dvscf to file images not implemented',1)
>
> IF (elph.OR.fildvscf /= ' ') lqdir=.TRUE.
> ---
>
> Paolo
>
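The two checks in the patch above can also be mirrored in a small pre-flight script that is run before submitting a job. The following is only a sketch under stated assumptions: the function name `image_conflicts` is hypothetical, the namelist parsing is deliberately naive (it strips everything after `!` and looks for quoted values), and it treats any non-empty `electron_phonon` value as "el-ph requested", which simplifies what ph.x actually does.

```python
import re


def image_conflicts(input_text, nimage):
    """Return messages for settings the patched phq_readin would reject
    when more than one image is requested (sketch, not the ph.x logic)."""
    if nimage <= 1:
        return []
    # Drop namelist comments: everything after '!' on each line.
    # Assumption: no '!' appears inside a quoted string.
    active = "\n".join(line.split("!", 1)[0] for line in input_text.splitlines())
    problems = []
    elph = re.search(r"electron_phonon\s*=\s*['\"]([^'\"]*)['\"]", active, re.IGNORECASE)
    if elph and elph.group(1).strip():
        problems.append("el-ph with images not implemented")
    dvscf = re.search(r"fildvscf\s*=\s*['\"]([^'\"]*)['\"]", active, re.IGNORECASE)
    if dvscf and dvscf.group(1).strip():
        problems.append("saving dvscf to file images not implemented")
    return problems
```

Calling `image_conflicts(open("al.elph.in").read(), 2)` on the input from the original report would flag the `fildvscf='aldv'` line, while the commented-out `electron_phonon` line is ignored.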
> On Fri, Sep 23, 2016 at 4:45 PM, Thomas Brumme <thomas.brumme at mpsd.mpg.de> wrote:
>
> In the meantime I discussed a similar problem with Lorenzo Paulatto.
>
> I think it might be a rather specific problem. As long as I parallelize
> only over q points using start_q and last_q, there is no problem, also
> when restarting.
>
> Using images I can, in principle, even create the full dvscf files
> without rerunning the calculation without images, by using split and cat
> on the different dvscf files in the different temp folders. It's tedious
> but it works. Still, in future I will use only the parallelization over
> q points for the calculation of the dvscf.
>
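As an illustration of the q-point parallelization described above, the sketch below writes one ph.x input file per contiguous range of q points, using the `start_q` and `last_q` variables mentioned in the message. The template is adapted from the `al.elph.in` input quoted later in this thread; the output file names, the helper names `q_chunks` and `write_inputs`, and the choice of 8 q points (the thread refers to q points up to 8) are assumptions made for the example. How the separate jobs should share or split `outdir` is not addressed here.

```python
from pathlib import Path

# Template adapted from the al.elph.in input quoted in this thread.
# start_q / last_q select the range of q points handled by one job.
TEMPLATE = """Phonons for Al, q points {start} to {last}
&inputph
  tr2_ph=1.0d-10,
  prefix='aluminum',
  fildvscf='aldv',
  amass(1)=26.98,
  outdir='./tempdir/',
  fildyn='al.dyn',
  ldisp=.true.,
  nq1=4, nq2=4, nq3=4,
  start_q={start},
  last_q={last}
/
"""


def q_chunks(n_q, n_jobs):
    """Split q-point indices 1..n_q into at most n_jobs contiguous
    (start, last) ranges of nearly equal size."""
    base, extra = divmod(n_q, n_jobs)
    chunks, start = [], 1
    for job in range(n_jobs):
        size = base + (1 if job < extra else 0)
        if size == 0:
            continue  # more jobs than q points: skip empty ranges
        chunks.append((start, start + size - 1))
        start += size
    return chunks


def write_inputs(n_q, n_jobs, directory="."):
    """Write one input file per q-point range and return the paths.
    The file naming scheme is hypothetical."""
    paths = []
    for start, last in q_chunks(n_q, n_jobs):
        path = Path(directory) / f"al.ph.q{start}-{last}.in"
        path.write_text(TEMPLATE.format(start=start, last=last))
        paths.append(path)
    return paths
```

For example, `write_inputs(8, 2)` produces two files covering q points 1 to 4 and 5 to 8, each of which can then be passed to a separate ph.x run.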
> In summary, the parallelization for PH is not straightforward, and I
> think it might help to store, e.g., the dvscf files for different
> representations separately. But Lorenzo mentioned that system
> administrators complain if the number of written files is large...
> It would be helpful to have a kind of summary of what can and cannot be
> done using images. For example, dvscf (and el-ph) does not work if image
> parallelization is used, especially if the different representations of
> one q point are split across different images. For el-ph the code does
> not start, but maybe a similar check can be added for the dvscf files?
>
> Well, or maybe not, I don't know :)
>
>
> On 09/23/2016 04:24 PM, Paolo Giannozzi wrote:
>> has anybody any idea? P.
>>
>> On Wed, Sep 14, 2016 at 1:30 PM, Thomas Brumme <thomas.brumme at mpsd.mpg.de> wrote:
>>
>> Dear all,
>>
>> I think I found a bug in the image parallelization of PH, or I'm doing
>> something wrong.
>> I used version 5.4, but the problem is also there with the 6.0 beta.
>> Maybe someone remembers my email a few days ago to the regular mailing
>> list concerning the parallelization using the GRID technique; the
>> problem I encounter here is essentially the same. As an example, I use
>> a modified run_example_1 from the Recover_example directory of PH.
>>
>> Description of the problem:
>>
>> 0. (Following the example) I did an scf calculation using 2 CPUs with:
>>
>> &control
>> calculation='scf'
>> restart_mode='from_scratch',
>> prefix='aluminum',
>> pseudo_dir = './',
>> outdir='./tempdir/'
>> /
>> &system
>> ibrav= 2, celldm(1) =7.5, nat= 1, ntyp= 1,
>> ecutwfc =15.0,
>> occupations='smearing', smearing='methfessel-paxton',
>> degauss=0.05,
>> la2F = .true.,
>> /
>> &electrons
>> conv_thr = 1.0d-8
>> mixing_beta = 0.7
>> /
>> ATOMIC_SPECIES
>> Al 26.98 Al.pz-vbc.UPF
>> ATOMIC_POSITIONS
>> Al 0.00 0.00 0.00
>> K_POINTS {automatic}
>> 16 16 16 0 0 0
>>
>>
>> 1. I then did a second scf calculation using 2 CPUs with:
>> &control
>> calculation='scf'
>> restart_mode='from_scratch',
>> prefix='aluminum',
>> pseudo_dir = './',
>> outdir='./tempdir/'
>> /
>> &system
>> ibrav= 2, celldm(1) =7.5, nat= 1, ntyp= 1,
>> ecutwfc =15.0,
>> occupations='smearing', smearing='methfessel-paxton',
>> degauss=0.05
>> /
>> &electrons
>> conv_thr = 1.0d-8
>> mixing_beta = 0.7
>> /
>> ATOMIC_SPECIES
>> Al 26.98 Al.pz-vbc.UPF
>> ATOMIC_POSITIONS
>> Al 0.00 0.00 0.00
>> K_POINTS {automatic}
>> 8 8 8 0 0 0
>>
>>
>> 2. I then did a phonon calculation using images and storing the dvscf
>> files. More specifically, I used:
>>
>> mpirun -np 4 ph.x -ni 2 < al.elph.in
>>
>> with al.elph.in given by:
>>
>> Electron-phonon coefficients for Al
>> &inputph
>> tr2_ph=1.0d-10,
>> prefix='aluminum',
>> fildvscf='aldv',
>> amass(1)=26.98,
>> outdir='./tempdir/',
>> fildyn='al.dyn',
>> ! electron_phonon='interpolated',
>> ! el_ph_sigma=0.005,
>> ! el_ph_nsigma=10,
>> ! recover=.true.
>> ! trans=.false.,
>> ldisp=.true.
>> max_seconds=6,
>> nq1=4, nq2=4, nq3=4
>> /
>>
>> I used max_seconds to simulate the finite run time we have on our HPC
>> system. Restarting with recover=.true. works fine, i.e., I used:
>>
>> Electron-phonon coefficients for Al
>> &inputph
>> tr2_ph=1.0d-10,
>> prefix='aluminum',
>> fildvscf='aldv',
>> amass(1)=26.98,
>> outdir='./tempdir/',
>> fildyn='al.dyn',
>> ! electron_phonon='interpolated',
>> ! el_ph_sigma=0.005,
>> ! el_ph_nsigma=10,
>> recover=.true.
>> ! trans=.false.,
>> ldisp=.true.
>> max_seconds=6,
>> nq1=4, nq2=4, nq3=4
>> /
>>
>>
>> 3. Now I want to collect all data using no images:
>>
>> mpirun -np 2 ph.x < al.elph.in
>>
>> with the same input file as given in 2.
>>
>> I get the error "Possibly too few bands at point ..." once the code
>> wants to recalculate the wave functions for the q points which were
>> calculated only on the second image, i.e., for q points 6, 7, and 8.
>>
>> If I check the charge_density.dat files in the subfolders of the q
>> points in the _ph0 directory, I find that they're empty. Thus, I copied
>> the q subfolders of the second image by hand to the folder of the first
>> image using:
>>
>> cp -r _ph1/aluminum.q_* _ph0/
>>
>> If I now restart without images, using the input of 2., it works...
>> Everything is fine...
>>
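The manual copy step above can be generalized to any number of images. The following is a minimal sketch, assuming that the image folders are named `_ph0`, `_ph1`, `_ph2`, ... inside a common directory and that the q-point subfolders follow the `<prefix>.q_*` pattern seen in the `cp` command; the function name `merge_image_q_dirs` and the dry-run switch are hypothetical additions. Like `cp -r`, it overwrites files that already exist in `_ph0`.

```python
import shutil
from pathlib import Path


def merge_image_q_dirs(outdir, prefix, dry_run=True):
    """Copy <prefix>.q_* folders from _ph1, _ph2, ... into _ph0.

    Returns the list of (source, destination) pairs. Nothing is copied
    while dry_run is True, so the plan can be inspected first.
    """
    outdir = Path(outdir)
    target = outdir / "_ph0"
    pairs = []
    # _ph[1-9]* matches every image folder except _ph0 itself.
    for image_dir in sorted(outdir.glob("_ph[1-9]*")):
        for q_dir in sorted(image_dir.glob(f"{prefix}.q_*")):
            if not q_dir.is_dir():
                continue
            dest = target / q_dir.name
            pairs.append((q_dir, dest))
            if not dry_run:
                # dirs_exist_ok (Python 3.8+) lets existing, possibly
                # empty, folders in _ph0 be filled in place.
                shutil.copytree(q_dir, dest, dirs_exist_ok=True)
    return pairs
```

Running `merge_image_q_dirs("./tempdir", "aluminum")` first lists what would be copied; passing `dry_run=False` performs the copy. The location `./tempdir` is an assumption based on the `outdir` used in the inputs of this thread.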
>>
>> 4. Now I can also calculate the el-ph parameters using the input:
>>
>> Electron-phonon coefficients for Al
>> &inputph
>> tr2_ph=1.0d-10,
>> prefix='aluminum',
>> fildvscf='aldv',
>> amass(1)=26.98,
>> outdir='./tempdir/',
>> fildyn='al.dyn',
>> electron_phonon='interpolated',
>> el_ph_sigma=0.005,
>> el_ph_nsigma=10,
>> ! recover=.true.
>> trans=.false.,
>> ldisp=.true.
>> ! max_seconds=6,
>> nq1=4, nq2=4, nq3=4
>> /
>>
>>
>> 5. Another problem I encounter is the following... Suppose the run time
>> is not enough to finish the el-ph calculations, i.e., instead of the
>> input in 4. I use:
>>
>> Electron-phonon coefficients for Al
>> &inputph
>> tr2_ph=1.0d-10,
>> prefix='aluminum',
>> fildvscf='aldv',
>> amass(1)=26.98,
>> outdir='./tempdir/',
>> fildyn='al.dyn',
>> electron_phonon='interpolated',
>> el_ph_sigma=0.005,
>> el_ph_nsigma=10,
>> ! recover=.true.
>> trans=.false.,
>> ldisp=.true.
>> max_seconds=6,
>> nq1=4, nq2=4, nq3=4
>> /
>>
>> The code will stop at a certain point (in my case the 4th q point). If
>> I now restart the calculation using:
>>
>> Electron-phonon coefficients for Al
>> &inputph
>> tr2_ph=1.0d-10,
>> prefix='aluminum',
>> fildvscf='aldv',
>> amass(1)=26.98,
>> outdir='./tempdir/',
>> fildyn='al.dyn',
>> electron_phonon='interpolated',
>> el_ph_sigma=0.005,
>> el_ph_nsigma=10,
>> recover=.true.
>> trans=.false.,
>> ldisp=.true.
>> ! max_seconds=6,
>> nq1=4, nq2=4, nq3=4
>> /
>>
>> I get (again) the error message "Possibly too few bands at point ..."
>> once the code wants to calculate the wave functions for the 4th q point
>> (the one where it stopped before)... All other points are fine...
>>
>>
>> I think that the whole problem is related to the storing of the wave
>> functions and the charge density. Maybe I'm doing something really
>> wrong, but I don't see any obvious error in the input... Also, I don't
>> see any input variable for ph which influences the saving of wave
>> functions...
>>
>> Regards
>>
>> Thomas
>>
>> --
>> Dr. rer. nat. Thomas Brumme
>> Max Planck Institute for the Structure and Dynamics of Matter
>> Luruper Chaussee 149
>> 22761 Hamburg
>>
>> Tel: +49 (0)40 8998 6557
>>
>> email: Thomas.Brumme at mpsd.mpg.de
>>
>> _______________________________________________
>> Q-e-developers mailing list
>> Q-e-developers at qe-forge.org
>> http://qe-forge.org/mailman/listinfo/q-e-developers
>>
>>
>>
>>
>> --
>> Paolo Giannozzi, Dip. Scienze Matematiche Informatiche e Fisiche,
>> Univ. Udine, via delle Scienze 208, 33100 Udine, Italy
>> Phone +39-0432-558216, fax +39-0432-558222
>>
>
--
Dr. rer. nat. Thomas Brumme
Max Planck Institute for the Structure and Dynamics of Matter
Luruper Chaussee 149
22761 Hamburg
Tel: +49 (0)40 8998 6557
email: Thomas.Brumme at mpsd.mpg.de