[QE-users] DFPT getting stuck [MPI_ERR_TRUNCATE]
M.J. Hutcheon
mjh261 at cam.ac.uk
Tue May 19 14:12:02 CEST 2020
Dear QE users/developers,
Following up on my previous request, I've switched to a newer MPI library,
which gives a little more error information. Specifically, the calculation
now crashes with the following message:
An error occurred in MPI_Allreduce
reported by process [1564540929,0]
on communicator MPI COMMUNICATOR 6 SPLIT FROM 3
MPI_ERR_TRUNCATE: message truncated
MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort, and
potentially your MPI job)
It appears that this is thrown at the end of a self-consistent DFPT
calculation (see the attached output file; the final iteration appears
to have converged). I'm using the development version of QE, so I
suspect that the error arises somewhere inside
https://gitlab.com/QEF/q-e/-/blob/develop/PHonon/PH/solve_linter.f90.
I don't really know how to debug or work around this further, so any
ideas or suggestions would be most welcome.
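For context, a common cause of MPI_ERR_TRUNCATE in a collective such as
MPI_Allreduce is that the ranks in a communicator do not all pass the same
element count (or do not call the same collectives in the same order). As a
rough illustration only, and not QE code, the sketch below flags ranks whose
count differs from the majority. The function name and the rank/count values
are hypothetical; in practice the counts would have to be collected somehow,
for example by printing them on each rank just before the reduction.

```python
from collections import Counter


def find_count_mismatches(counts_by_rank):
    """Return {rank: count} for ranks whose count differs from the most common one.

    counts_by_rank maps an MPI rank to the number of elements that rank
    intends to pass to the collective (hypothetical, user-collected data).
    """
    if not counts_by_rank:
        return {}
    # Treat the most frequent count as the "expected" one.
    expected, _ = Counter(counts_by_rank.values()).most_common(1)[0]
    return {rank: n for rank, n in counts_by_rank.items() if n != expected}


# Made-up example: rank 2 disagrees with the other three ranks.
counts = {0: 1024, 1: 1024, 2: 1000, 3: 1024}
print(find_count_mismatches(counts))
```

An empty result would suggest that the counts agree and that the mismatch
lies elsewhere, for instance in the order of the collective calls.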
Best,
Michael Hutcheon
TCM group, University of Cambridge
On 2020-05-12 13:29, M.J. Hutcheon wrote:
> Dear QE users/developers,
>
> I am running an electron-phonon coupling calculation at the gamma point for calcium hydride with a large unit cell (output file attached). The calculation appears to get stuck during the DFPT stage. It does not crash, produce any error files or output of any sort, or run out of walltime, but it does not progress either. I have tried different parameter sets (k-point grids and cutoffs), which changes the representation at which the calculation gets stuck, but it still gets stuck. I don't really know what to try next, short of compiling QE in debug mode and running under a debugger to see where it gets stuck. Any ideas before I head down this laborious route?
>
> Many thanks,
>
> Michael Hutcheon
>
> TCM group, University of Cambridge
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: elph.out
URL: <http://lists.quantum-espresso.org/pipermail/users/attachments/20200519/2a7ff566/attachment.ksh>