<div dir="ltr"><div><div>In my opinion 93 Ry is a lot. You may use a lower cutoff for optimizing the structure, refining it later (discovering maybe that very little changes). Also note that convergence thresholds <span class="im"> etot_conv_thr = 1.0D-5 , forc_conv_thr = 1.95D-6 are very strict (too much in my opinion: the code will go on forever performing tiny steps very close to convergence). Also scf convergence threshold</span><span class="im"> conv_thr =
<div class="gmail_extra"><br><div class="gmail_quote">On Wed, Sep 2, 2015 at 9:55 PM, Bang C. Huynh <span dir="ltr"><<a href="mailto:cbh31@cam.ac.uk" target="_blank">cbh31@cam.ac.uk</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><u></u>
<div style="font-size:10pt">
<p>Dear Pascal & Paolo,</p>
<p>Thank you for your replies. Although it's true that I'm using USPP, there's uranium involved and the pseudopotential file (generated from the <span>PSlibrary for the ld1.x atomic code)</span> suggests a minimum value of 93 for Ecutwfc, hence the high Ecutwfc used in my input. I'm a bit hesitant to lower this, as I'm not sure how much accuracy will be compromised.</p>
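<p>A minimal sketch of how such a cutoff check might be scripted is given below. The file names, the list of cutoffs and the 8x ecutrho ratio are illustrative assumptions, and the input is assumed to have one keyword per line; for a relaxation one would also watch forces and stress, not just the total energy:</p>
<pre># single-point scf at a few plane-wave cutoffs; compare the '!' total-energy lines
for ec in 60 75 93; do
    sed -e "s/ecutwfc *=.*/ecutwfc = $ec ,/" \
        -e "s/ecutrho *=.*/ecutrho = $((8 * ec)) ,/" scf.in > scf_ec${ec}.in
    mpirun -np 40 pw.x -npool 2 < scf_ec${ec}.in > scf_ec${ec}.out
    grep '^!' scf_ec${ec}.out
done</pre>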
<p>I'll play around with the k-grid to see if I can use a coarser one, perhaps 2x2x1. Currently, with 4x4x2, there are 18 non-equivalent k-points in total, which I agree is a bit much.</p>
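<p>For reference, the coarser grid would only change the K_POINTS card (a sketch; whether 2x2x1 is accurate enough still has to be checked against energies and forces obtained on the denser grid):</p>
<pre>K_POINTS automatic
  2 2 1  0 0 0</pre>
<p>The number of non-equivalent k-points actually used is reported on the "number of k points" line near the top of the output.</p>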
<p>If you are interested, here's the (incomplete) output file: <a href="https://dl.dropboxusercontent.com/u/21657676/skutU0.125_vc.vc2.out" target="_blank">https://dl.dropboxusercontent.com/u/21657676/skutU0.125_vc.vc2.out</a></p>
<p> </p>
<p>Regards,</p>
<div>---<span class=""><br>
<pre><strong>Bang C. Huynh<br></strong>Peterhouse<br>University of Cambridge<br>CB2 1RD<br>The United Kingdom</pre>
</span></div>
<p>On 02-09-2015 12:00, <a href="mailto:pw_forum-request@pwscf.org" target="_blank">pw_forum-request@pwscf.org</a> wrote:</p>
<blockquote type="cite" style="padding-left:5px;border-left:#1010ff 2px solid;margin-left:5px">
<pre> </pre>
<pre>Message: 6
Date: Tue, 1 Sep 2015 22:03:49 +0200
From: Pascal Boulet <<a href="mailto:pascal.boulet@univ-amu.fr" target="_blank">pascal.boulet@univ-amu.fr</a>>
Subject: Re: [Pw_forum] Effective parallel implementation?
To: PWSCF Forum <<a href="mailto:pw_forum@pwscf.org" target="_blank">pw_forum@pwscf.org</a>>
Message-ID: <<a href="mailto:4D46F061-4B8E-46E4-988C-E3739DCBC1B0@univ-amu.fr" target="_blank">4D46F061-4B8E-46E4-988C-E3739DCBC1B0@univ-amu.fr</a>>
Content-Type: text/plain; charset="windows-1252"
Hello Bang,
For comparison, in my case I have been able to run a vc-relax job on 92 atoms, 48 cores in 12 hours. I am using USPP, gamma point calculation and no symmetry. This is a slab. I am using a basic command line: mpirun -np 40 pw.x < input > output.
Of course, I do not know whether our computers are comparable, but it seems that your performance could be improved, probably through compilation optimization. I am using a national supercomputer facility and QE was installed by a system manager, so I cannot give compilation details.
I have noted a few things in your input that I (personally) would change:
Energy convergence=1d-7
Force convergence=1d-4, or possibly 1d-5 if you plan to compute phonons (in any case it should not be smaller than the energy convergence)
conv_thr=1d-8 (or smaller, 1d-9 or 1d-10, in the case of phonon calculations only)
It seems that you are using USPPs, so in this case I think you can reduce Ecutwfc to between 30 and 50 Ry, but you have to test this, and Ecutrho should be between 8 and 12 times Ecutwfc.
Check also if you can reduce the number of k-points: your cell seems to be rather large.
HTH
Pascal</pre>
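<p>As a namelist fragment, the thresholds Pascal suggests above would look roughly like this (a sketch only; the exact values still need to be tested for this particular system):</p>
<pre>&CONTROL
  etot_conv_thr = 1.0D-7 ,
  forc_conv_thr = 1.0D-4 ,   ! ~1.0D-5 if phonons are planned
/
&ELECTRONS
  conv_thr = 1.0D-8 ,        ! tighter (down to 1.0D-10) only for phonon calculations
/</pre>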
<pre>On 1 September 2015 at 21:18, Bang C. Huynh <<a href="mailto:cbh31@cam.ac.uk" target="_blank">cbh31@cam.ac.uk</a>> wrote:</pre>
<blockquote type="cite" style="padding-left:5px;border-left:#1010ff 2px solid;margin-left:5px">Dear all, I am currently attempting to perform several structural relaxation calculations on supercells that contain at least ~70 atoms (and even more, say hundreds). My calculations are being done on a single node with 40 cores, 2.4 GHz, Intel Xeon E5-2676v3 and 160 GiB memory (m4.10xlarge Amazon EC2). I am just wondering if my implementation for parallelism is 'reasonable' in the sense that the resources are fully utilised, and not in some way underutilised or poorly distributed. I'm pretty new to this so I'm not sure what to expect... Should I be happy with the current performance, can it be better, or should I consider deploying more resources and is it worth it? Currently one scf iteration takes around 5-7 minutes. Scf-convergence is achieved after around 50 scf iterations, and I'm not sure how long it's going to take for the vc-relax iterations to converge... The input file is shown below. I use this command to initiate the job:
<span class=""><blockquote type="cite" style="padding-left:5px;border-left:#1010ff 2px solid;margin-left:5px">mpirun -np 40 pw.x -npool 2 -ndiag 36 < skutU0.125_vc.vc2 > skutU0.125_vc.vc2.out</blockquote>
Thank you for your help.<br><br>Regards,<br>--<br>Bang C. Huynh<br>Peterhouse<br>University of Cambridge<br>CB2 1RD<br>The United Kingdom<br><br>========input=======
<blockquote type="cite" style="padding-left:5px;border-left:#1010ff 2px solid;margin-left:5px">&CONTROL title = skutterudite-U-doped , calculation = 'vc-relax' , outdir = './' , wfcdir = './' , pseudo_dir = '../pseudo/' , prefix = 'skutU0.125_vc' , etot_conv_thr = 1.0D-5 , forc_conv_thr = 1.95D-6 , nstep = 250 , dt = 150 , / &SYSTEM ibrav = 6, celldm(1) = 17.06 celldm(3) = 2, nat = 66, ntyp = 3, ecutwfc = 93 , ecutrho = 707 , occupations = 'smearing' , starting_spin_angle = .false. , degauss = 0.02 , smearing = 'methfessel-paxton' , / &ELECTRONS conv_thr = 1D-10 , / &IONS / &CELL cell_dynamics = 'damp-w' , cell_dofree = 'all' , / ATOMIC_SPECIES Co 58.93000 Co.pz-nd-rrkjus.UPF Sb 121.76000 Sb.pz-bhs.UPF U 238.02891 U.pz-spfn-rrkjus_psl.1.0.0.UPF ATOMIC_POSITIONS crystal Co 0.250000000 0.250000000 0.125000000 0 0 0 Co 0.250000000 0.250000000 0.625000000 0 0 0 Co 0.750000000 0.750000000 0.375000000 0 0 0 Co 0.750000000 0.750000000 0.875000000 0 0 0 Co 0.750000000 0.750000000 0.125000000 0 0 0 Co 0.750000000 0.750000000 0.625000000 0 0 0 Co 0.250000000 0.250000000 0.375000000 0 0 0 Co 0.250000000 0.250000000 0.875000000 0 0 0 Co 0.750000000 0.250000000 0.375000000 0 0 0 Co 0.750000000 0.250000000 0.875000000 0 0 0 Co 0.250000000 0.750000000 0.125000000 0 0 0 Co 0.250000000 0.750000000 0.625000000 0 0 0 Co 0.250000000 0.750000000 0.375000000 0 0 0 Co 0.250000000 0.750000000 0.875000000 0 0 0 Co 0.750000000 0.250000000 0.125000000 0 0 0 Co 0.750000000 0.250000000 0.625000000 0 0 0 Sb 0.000000000 0.337592989 0.078553498 Sb 0.000000000 0.337592989 0.578553498 Sb 0.000000000 0.662407041 0.421446502 Sb 0.000000000 0.662407041 0.921446502 Sb 0.000000000 0.662407041 0.078553498 Sb 0.000000000 0.662407041 0.578553498 Sb 0.000000000 0.337592989 0.421446502 Sb 0.000000000 0.337592989 0.921446502 Sb 0.157106996 0.000000000 0.168796495 Sb 0.157106996 0.000000000 0.668796480 Sb 0.842893004 0.000000000 0.331203520 Sb 0.842893004 0.000000000 0.831203520 Sb 0.157106996 0.000000000 0.331203520 Sb 0.157106996 0.000000000 0.831203520 Sb 0.842893004 0.000000000 0.168796495 Sb 0.842893004 0.000000000 0.668796480 Sb 0.337592989 0.157106996 0.000000000 Sb 0.337592989 0.157106996 0.500000000 Sb 0.662407041 0.842893004 0.000000000 Sb 0.662407041 0.842893004 0.500000000 Sb 0.662407041 0.157106996 0.000000000 Sb 0.662407041 0.157106996 0.500000000 Sb 0.337592989 0.842893004 0.000000000 Sb 0.337592989 0.842893004 0.500000000 Sb 0.500000000 0.837592959 0.328553498 Sb 0.500000000 0.837592959 0.828553498 Sb 0.500000000 0.162407011 0.171446502 Sb 0.500000000 0.162407011 0.671446502 Sb 0.500000000 0.162407011 0.328553498 Sb 0.500000000 0.162407011 0.828553498 Sb 0.500000000 0.837592959 0.171446502 Sb 0.500000000 0.837592959 0.671446502 Sb 0.657106996 0.500000000 0.418796480 Sb 0.657106996 0.500000000 0.918796480 Sb 0.342893004 0.500000000 0.081203505 Sb 0.342893004 0.500000000 0.581203520 Sb 0.657106996 0.500000000 0.081203505 Sb 0.657106996 0.500000000 0.581203520 Sb 0.342893004 0.500000000 0.418796480 Sb 0.342893004 0.500000000 0.918796480 Sb 0.837592959 0.657106996 0.250000000 Sb 0.837592959 0.657106996 0.750000000 Sb 0.162407011 0.342893004 0.250000000 Sb 0.162407011 0.342893004 0.750000000 Sb 0.162407011 0.657106996 0.250000000 Sb 0.162407011 0.657106996 0.750000000 Sb 0.837592959 0.342893004 0.250000000 Sb 0.837592959 0.342893004 0.750000000 U 0.000000000 0.000000000 0.000000000 U 0.000000000 0.000000000 0.500000000 K_POINTS automatic 4 4 2 0 0 0</blockquote>
</span></blockquote>
<pre>--
Pascal Boulet - MCF HDR, Resp. L1 MPCI - DEPARTEMENT CHIMIE
Aix-Marseille Université - ST JEROME - Avenue Escadrille Normandie Niemen - 13013 Marseille
Tél: <a href="tel:%2B33%280%294%2013%2055%2018%2010" value="+33413551810" target="_blank">+33(0)4 13 55 18 10</a> - Fax : <a href="tel:%2B33%280%294%2013%2055%2018%2050" value="+33413551850" target="_blank">+33(0)4 13 55 18 50</a><span class="">
Site : <a href="http://allos.up.univ-mrs.fr/pascal" target="_blank">http://allos.up.univ-mrs.fr/pascal</a> - Email : <a href="mailto:pascal.boulet@univ-amu.fr" target="_blank">pascal.boulet@univ-amu.fr</a></span>
To respect the environment, please print this email only if necessary.
------------------------------
Message: 7
Date: Tue, 1 Sep 2015 22:41:53 +0200
From: Paolo Giannozzi <<a href="mailto:p.giannozzi@gmail.com" target="_blank">p.giannozzi@gmail.com</a>>
Subject: Re: [Pw_forum] Effective parallel implementation?
To: PWSCF Forum <<a href="mailto:pw_forum@pwscf.org" target="_blank">pw_forum@pwscf.org</a>>
Message-ID:
<<a href="mailto:CAPMgbCuVHov+w+7sG=iYNZ7Lff441P+uRr5nm5nQeR6Mm7g_Pg@mail.gmail.com" target="_blank">CAPMgbCuVHov+w+7sG=iYNZ7Lff441P+uRr5nm5nQeR6Mm7g_Pg@mail.gmail.com</a>>
Content-Type: text/plain; charset="utf-8"
On Tue, Sep 1, 2015 at 9:18 PM, Bang C. Huynh <<a href="mailto:cbh31@cam.ac.uk" target="_blank">cbh31@cam.ac.uk</a>> wrote:</pre><div><div class="h5">
<blockquote type="cite" style="padding-left:5px;border-left:#1010ff 2px solid;margin-left:5px">The input file is shown below</blockquote>
<pre>performances are better estimated from the output, rather than the input.
What is useful in particular is the final printout with timings. It is
sufficient to do it for a single scf step.
Paolo
--
Paolo Giannozzi, Dept. Chemistry&Physics&Environment,
Univ. Udine, via delle Scienze 208, 33100 Udine, Italy
Phone <a href="tel:%2B39-0432-558216" value="+390432558216" target="_blank">+39-0432-558216</a>, fax <a href="tel:%2B39-0432-558222" value="+390432558222" target="_blank">+39-0432-558222</a>
</pre>
</div></div></blockquote>
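<p>A rough sketch of how the final timing report might be inspected for such a single-step test (the exact layout of the timing lines differs between QE versions):</p>
<pre># the per-routine CPU/WALL report is printed at the very end of the run
tail -n 60 skutU0.125_vc.vc2.out

# or pick out just the timing lines
grep 'WALL' skutU0.125_vc.vc2.out</pre>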
</div>
<br>_______________________________________________<br>
Pw_forum mailing list<br>
<a href="mailto:Pw_forum@pwscf.org">Pw_forum@pwscf.org</a><br>
<a href="http://pwscf.org/mailman/listinfo/pw_forum" rel="noreferrer" target="_blank">http://pwscf.org/mailman/listinfo/pw_forum</a><br></blockquote></div><br><br clear="all"><br>-- <br><div class="gmail_signature"><div dir="ltr"><div><div dir="ltr"><span><span><font color="#888888">Paolo Giannozzi, Dept. Chemistry&Physics&Environment,<br>
Univ. Udine, via delle Scienze 208, 33100 Udine, Italy<br>
Phone <a href="tel:%2B39-0432-558216" value="+390432558216" target="_blank">+39-0432-558216</a>, fax <a href="tel:%2B39-0432-558222" value="+390432558222" target="_blank">+39-0432-558222</a></font></span></span></div></div></div></div>
</div>