<html>
<head>
<meta content="text/html; charset=windows-1252"
http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
<p>In the meantime I had a discussion with Lorenzo Paulatto about a
similar problem.<br>
</p>
<p>I think it might be a rather specific problem. As soon as I
parallelize only over q points, using start_q and last_q, there is no
problem, including for restarting.</p>
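For example, splitting the 8 q points of the 4x4x4 grid over two separate jobs would look roughly like this (a sketch only: the input-file names and the 1-4/5-8 split are illustrative; each input is the usual &amp;inputph with start_q/last_q added):

```
! al.elph.q1-4.in : same &inputph as in the quoted mail, plus
!   start_q=1, last_q=4
! al.elph.q5-8.in : same &inputph, plus
!   start_q=5, last_q=8
mpirun -np 2 ph.x < al.elph.q1-4.in > al.elph.q1-4.out
mpirun -np 2 ph.x < al.elph.q5-8.in > al.elph.q5-8.out
```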
<p>Using images I can, in principle, even create the full dvscf files,
without having to rerun the calculation without images, by using split
and cat on the dvscf files in the different temp folders. It's
tedious, but it works. Still, in the future I will use only the
parallelization over q points for the calculation of the dvscf.</p>
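The merging step is plain concatenation of the per-image pieces. As a toy illustration (dummy text files stand in for the per-image dvscf pieces; the name aluminum.aldv1 is only an assumed example, and real dvscf files are binary):

```shell
# Toy sketch: two dummy "dvscf pieces", as written by images 0 and 1.
mkdir -p demo/_ph0 demo/_ph1
printf 'q1-q5-part' > demo/_ph0/aluminum.aldv1
printf 'q6-q8-part' > demo/_ph1/aluminum.aldv1
# Concatenate the pieces in image order into one full dvscf file:
cat demo/_ph0/aluminum.aldv1 demo/_ph1/aluminum.aldv1 > demo/aluminum.aldv1
```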
<p>In summary, the parallelization for PH is not straightforward, and
I think it might help to store, e.g., the dvscf files for the
different representations separately. But Lorenzo mentioned that
system administrators complain if the number of written files is
large... It could be helpful to have a kind of summary of what can and
cannot be done using images, i.e., dvscf (and el-ph) does not work if
image parallelization is used, especially if the different
representations of one q point are split across different images. For
el-ph the code does not start; maybe a similar check could be added
for the dvscf files?</p>
<p>Well, or maybe not, I don't know :)<br>
</p>
<br>
<div class="moz-cite-prefix">On 09/23/2016 04:24 PM, Paolo Giannozzi
wrote:<br>
</div>
<blockquote
cite="mid:CAPMgbCsV0BYAcyU87tVjuKOFGowBHmBzR2yQRuNJ5GbD-HO5kQ@mail.gmail.com"
type="cite">
<div dir="ltr">has anybody any idea? P.<br>
</div>
<div class="gmail_extra"><br>
<div class="gmail_quote">On Wed, Sep 14, 2016 at 1:30 PM, Thomas
Brumme <span dir="ltr"><<a moz-do-not-send="true"
href="mailto:thomas.brumme@mpsd.mpg.de" target="_blank">thomas.brumme@mpsd.mpg.de</a>></span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">Dear all,<br>
<br>
I think I found a bug in the image parallelization of PH - or I'm
doing something wrong.<br>
I used version 5.4, but the problem is also there if I use the 6.0
beta.<br>
Maybe someone remembers my email a few days ago to the regular mailing
list concerning the parallelization using the GRID technique - the
problem I encounter here is essentially the same. As an example, I use
a modified run_example_1 of the Recover_example directory of PH.<br>
<br>
Description of the problem:<br>
<br>
0. (Following the example) I did an scf calculation using 2
CPUs with:<br>
<br>
&control<br>
calculation='scf'<br>
restart_mode='from_scratch',<br>
prefix='aluminum',<br>
pseudo_dir = './',<br>
outdir='./tempdir/'<br>
/<br>
&system<br>
ibrav= 2, celldm(1) =7.5, nat= 1, ntyp= 1,<br>
ecutwfc =15.0,<br>
occupations='smearing', smearing='methfessel-paxton',
degauss=0.05,<br>
la2F = .true.,<br>
/<br>
&electrons<br>
conv_thr = 1.0d-8<br>
mixing_beta = 0.7<br>
/<br>
ATOMIC_SPECIES<br>
Al 26.98 Al.pz-vbc.UPF<br>
ATOMIC_POSITIONS<br>
Al 0.00 0.00 0.00<br>
K_POINTS {automatic}<br>
16 16 16 0 0 0<br>
<br>
<br>
1. I'll do the scf calculation using 2 CPUs and:<br>
<br>
&control<br>
calculation='scf'<br>
restart_mode='from_scratch',<br>
prefix='aluminum',<br>
pseudo_dir = './',<br>
outdir='./tempdir/'<br>
/<br>
&system<br>
ibrav= 2, celldm(1) =7.5, nat= 1, ntyp= 1,<br>
ecutwfc =15.0,<br>
occupations='smearing', smearing='methfessel-paxton',
degauss=0.05<br>
/<br>
&electrons<br>
conv_thr = 1.0d-8<br>
mixing_beta = 0.7<br>
/<br>
ATOMIC_SPECIES<br>
Al 26.98 Al.pz-vbc.UPF<br>
ATOMIC_POSITIONS<br>
Al 0.00 0.00 0.00<br>
K_POINTS {automatic}<br>
8 8 8 0 0 0<br>
<br>
<br>
2. I'll do a phonon calculation, storing the dvscf files and using
images.<br>
More specifically, I used:<br>
<br>
mpirun -np 4 ph.x -ni 2 < <a moz-do-not-send="true"
href="http://al.elph.in" rel="noreferrer" target="_blank">al.elph.in</a><br>
<br>
with <a moz-do-not-send="true" href="http://al.elph.in"
rel="noreferrer" target="_blank">al.elph.in</a> given by:<br>
<br>
Electron-phonon coefficients for Al<br>
&inputph<br>
tr2_ph=1.0d-10,<br>
prefix='aluminum',<br>
fildvscf='aldv',<br>
amass(1)=26.98,<br>
outdir='./tempdir/',<br>
fildyn='al.dyn',<br>
! electron_phonon='interpolated',<br>
! el_ph_sigma=0.005,<br>
! el_ph_nsigma=10,<br>
! recover=.true.<br>
! trans=.false.,<br>
ldisp=.true.<br>
max_seconds=6,<br>
nq1=4, nq2=4, nq3=4<br>
/<br>
<br>
I used max_seconds in order to simulate the finite run time we have on
our HPC.<br>
Restarting with recover=.true. works fine, i.e., I used:<br>
<br>
Electron-phonon coefficients for Al<br>
&inputph<br>
tr2_ph=1.0d-10,<br>
prefix='aluminum',<br>
fildvscf='aldv',<br>
amass(1)=26.98,<br>
outdir='./tempdir/',<br>
fildyn='al.dyn',<br>
! electron_phonon='interpolated',<br>
! el_ph_sigma=0.005,<br>
! el_ph_nsigma=10,<br>
recover=.true.<br>
! trans=.false.,<br>
ldisp=.true.<br>
max_seconds=6,<br>
nq1=4, nq2=4, nq3=4<br>
/<br>
<br>
<br>
3. Now I want to collect all the data without using images:<br>
<br>
mpirun -np 2 ph.x < <a moz-do-not-send="true"
href="http://al.elph.in" rel="noreferrer" target="_blank">al.elph.in</a><br>
<br>
with the same input file as given in 2.<br>
<br>
I get the error "Possibly too few bands at point ..." as soon as the
code wants to recalculate the wave functions for the q points that
were calculated only on the second image, i.e., for q points 6, 7,
and 8.<br>
<br>
If I check the charge_density.dat files in the q-point subfolders of
the _ph0 directory, I find that they're empty. Thus, I copied the q
subfolders of the second image by hand to the folder of the first
image using:<br>
<br>
cp -r _ph1/aluminum.q_* _ph0/<br>
<br>
If I now restart without images, using the input of 2., it works...
Everything is fine...<br>
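(Incidentally, the empty files are easy to spot with find; a toy sketch, with a dummy tree standing in for the real _ph0 directory:)

```shell
# Dummy layout mimicking what I observed: the charge_density.dat of a
# q point computed on the other image is present but empty.
mkdir -p demo_ph/_ph0/aluminum.q_6
: > demo_ph/_ph0/aluminum.q_6/charge_density.dat
# List all empty charge_density.dat files under _ph0:
find demo_ph/_ph0 -name charge_density.dat -empty
```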
<br>
<br>
4. Now I can also calculate the el-ph parameters using the
input:<br>
<br>
Electron-phonon coefficients for Al<br>
&inputph<br>
tr2_ph=1.0d-10,<br>
prefix='aluminum',<br>
fildvscf='aldv',<br>
amass(1)=26.98,<br>
outdir='./tempdir/',<br>
fildyn='al.dyn',<br>
electron_phonon='interpolated',<br>
el_ph_sigma=0.005,<br>
el_ph_nsigma=10,<br>
! recover=.true.<br>
trans=.false.,<br>
ldisp=.true.<br>
! max_seconds=6,<br>
nq1=4, nq2=4, nq3=4<br>
/<br>
<br>
<br>
5. Another problem I encounter is the following... Suppose
the run time<br>
is not enough to<br>
finish the el-ph calculations, i.e., instead of the input in
4. I use:<br>
<br>
Electron-phonon coefficients for Al<br>
&inputph<br>
tr2_ph=1.0d-10,<br>
prefix='aluminum',<br>
fildvscf='aldv',<br>
amass(1)=26.98,<br>
outdir='./tempdir/',<br>
fildyn='al.dyn',<br>
electron_phonon='interpolated',<br>
el_ph_sigma=0.005,<br>
el_ph_nsigma=10,<br>
! recover=.true.<br>
trans=.false.,<br>
ldisp=.true.<br>
max_seconds=6,<br>
nq1=4, nq2=4, nq3=4<br>
/<br>
<br>
The code will stop at a certain point (in my case the 4th q
point). If I<br>
now restart the calculation<br>
using:<br>
<br>
Electron-phonon coefficients for Al<br>
&inputph<br>
tr2_ph=1.0d-10,<br>
prefix='aluminum',<br>
fildvscf='aldv',<br>
amass(1)=26.98,<br>
outdir='./tempdir/',<br>
fildyn='al.dyn',<br>
electron_phonon='interpolated',<br>
el_ph_sigma=0.005,<br>
el_ph_nsigma=10,<br>
recover=.true.<br>
trans=.false.,<br>
ldisp=.true.<br>
! max_seconds=6,<br>
nq1=4, nq2=4, nq3=4<br>
/<br>
<br>
I get (again) the error message "Possibly too few bands at point ..."
as soon as the code wants to calculate the wave functions for the 4th
q point (the one at which it stopped before)... All other points are
fine...<br>
<br>
<br>
I think the whole problem is related to how the wave functions and the
charge density are stored.<br>
Maybe I'm doing something really wrong, but I don't see any obvious
error in the input... Also, I don't see any input variable for ph.x
that influences the saving of the wave functions...<br>
<br>
Regards<br>
<br>
Thomas<br>
<br>
--<br>
Dr. rer. nat. Thomas Brumme<br>
Max Planck Institute for the Structure and Dynamics of
Matter<br>
Luruper Chaussee 149<br>
22761 Hamburg<br>
<br>
Tel: +49 (0)40 8998 6557<br>
<br>
email: <a moz-do-not-send="true"
href="mailto:Thomas.Brumme@mpsd.mpg.de">Thomas.Brumme@mpsd.mpg.de</a><br>
<br>
_______________________________________________<br>
Q-e-developers mailing list<br>
<a moz-do-not-send="true"
href="mailto:Q-e-developers@qe-forge.org">Q-e-developers@qe-forge.org</a><br>
<a moz-do-not-send="true"
href="http://qe-forge.org/mailman/listinfo/q-e-developers"
rel="noreferrer" target="_blank">http://qe-forge.org/mailman/listinfo/q-e-developers</a><br>
</blockquote>
</div>
<br>
<br clear="all">
<br>
-- <br>
<div class="gmail_signature" data-smartmail="gmail_signature">
<div dir="ltr">
<div>
<div dir="ltr">
<div>Paolo Giannozzi, Dip. Scienze Matematiche
Informatiche e Fisiche,<br>
Univ. Udine, via delle Scienze 208, 33100 Udine, Italy<br>
Phone +39-0432-558216, fax +39-0432-558222<br>
<br>
</div>
</div>
</div>
</div>
</div>
</div>
<br>
<fieldset class="mimeAttachmentHeader"></fieldset>
<br>
</blockquote>
<br>
<pre class="moz-signature" cols="72">--
Dr. rer. nat. Thomas Brumme
Max Planck Institute for the Structure and Dynamics of Matter
Luruper Chaussee 149
22761 Hamburg
Tel: +49 (0)40 8998 6557
email: <a class="moz-txt-link-abbreviated" href="mailto:Thomas.Brumme@mpsd.mpg.de">Thomas.Brumme@mpsd.mpg.de</a>
</pre>
</body>
</html>