<div dir="ltr"><div>If I understand correctly, you are parallelizing over k points with 32 processors, but you have just 20 k points. As a consequence, in all loops over k points, 12 processors will do nothing. While I am quite sure that such a wasteful parallelization still works for the self-consistent code, I am not equally sure that it does for the phonon code. It is presumably not difficult to fix, but I would move to a more sensible parallelization. For 20 k points and 32 processors, I would try 4 pools of 8 processors each, so that every pool gets 5 k points:<br></div><div>mpirun -np 32 ph.x -nk 4 ...<br></div><div>Paolo<br></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Tue, May 19, 2020 at 2:12 PM M.J. Hutcheon <<a href="mailto:mjh261@cam.ac.uk">mjh261@cam.ac.uk</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div style="font-size:10pt">
<p>Dear QE users/developers,</p>
<p>Following up on my previous request, I've switched to a newer MPI library, which gives a little more error information. Specifically, the calculation now crashes with the following message:</p>
<p>An error occurred in MPI_Allreduce<br>reported by process [1564540929,0]<br>on communicator MPI COMMUNICATOR 6 SPLIT FROM 3<br>MPI_ERR_TRUNCATE: message truncated<br>MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort, and potentially your MPI job)</p>
<p>It appears that this is thrown at the end of a self-consistent DFPT calculation (see the attached output file - it appears the final iteration has converged). I'm using the development version of QE, so I suspect that the error arises from somewhere inside <a href="https://gitlab.com/QEF/q-e/-/blob/develop/PHonon/PH/solve_linter.f90" target="_blank">https://gitlab.com/QEF/q-e/-/blob/develop/PHonon/PH/solve_linter.f90</a>.</p>
<p>I don't really know how to debug or work around this further; any ideas or suggestions would be most welcome.</p>
<p>Best,</p>
<p>Michael Hutcheon</p>
<p>TCM group, University of Cambridge</p>
<p><br></p>
<p><br></p>
<p id="gmail-m_6701944341264809987reply-intro">On 2020-05-12 13:29, M.J. Hutcheon wrote:</p>
<blockquote type="cite" style="padding:0px 0.4em;border-left:2px solid rgb(16,16,255);margin:0px">
<div id="gmail-m_6701944341264809987replybody1">
<div style="font-size:10pt">
<p>Dear QE users/developers,</p>
<p>I am running an electron-phonon coupling calculation at the gamma point for a calcium hydride with a large unit cell (output file attached). The calculation appears to get stuck during the DFPT stage. It does not crash, produce any error files or output of any sort, or run out of walltime, but it does not progress either. I have tried different parameter sets (k-point grids and cutoffs), which changes the representation at which the calculation gets stuck, but it still gets stuck. I don't really know what to try next, short of compiling QE in debug mode and running under a debugger to see where it gets stuck. Any ideas before I head down this laborious route?</p>
<p>Many thanks,</p>
<p>Michael Hutcheon</p>
<p>TCM group, University of Cambridge</p>
</div>
</div>
</blockquote>
<p><br></p>
</div>
_______________________________________________<br>
Quantum ESPRESSO is supported by MaX (<a href="http://www.max-centre.eu/quantum-espresso" rel="noreferrer" target="_blank">www.max-centre.eu/quantum-espresso</a>)<br>
users mailing list <a href="mailto:users@lists.quantum-espresso.org" target="_blank">users@lists.quantum-espresso.org</a><br>
<a href="https://lists.quantum-espresso.org/mailman/listinfo/users" rel="noreferrer" target="_blank">https://lists.quantum-espresso.org/mailman/listinfo/users</a></blockquote></div><br clear="all"><br>-- <br><div dir="ltr" class="gmail_signature"><div dir="ltr"><div><div dir="ltr"><div>Paolo Giannozzi, Dip. Scienze Matematiche Informatiche e Fisiche,<br>Univ. Udine, via delle Scienze 208, 33100 Udine, Italy<br>Phone +39-0432-558216, fax +39-0432-558222<br><br></div></div></div></div></div>