[Pw_forum] QE-5.1.2: NEB.X Crashes with MPI errors

Paolo Giannozzi p.giannozzi at gmail.com
Tue Jun 16 11:42:25 CEST 2015


On Mon, Jun 15, 2015 at 4:13 PM, Mauro Sgroi <maurofrancesco.sgroi at gmail.com> wrote:


> Inserting the cobalt atoms neb.x crashes during the calculation on the
> second image.
> No error is printed in the log but in the standard output I obtain:
>
>           2           2 INTERMEDIATE_IMAGE
> [t12node084:17231] 55 more processes have sent help message
> help-mpi-btl-openib.txt / reg mem limit low
> [t12node084:17231] Set MCA parameter "orte_base_help_aggregate" to 0 to
> see all help / error messages
> [t12node082:28692] *** An error occurred in MPI_Bcast
> [t12node082:28692] *** on communicator MPI COMMUNICATOR 9 SPLIT FROM 7
> [t12node082:28692] *** MPI_ERR_TRUNCATE: message truncated
> [t12node082:28692] *** MPI_ERRORS_ARE_FATAL: your MPI job will now abort
>

Check whether there is a CRASH file somewhere, in particular in the output
directory of image 2. If there is none, the code did not stop in one of
its many error checks. If there is one, the error message in it should give
some indication of what went wrong.
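
A quick way to run that check is sketched below. It only assumes that the file is literally named CRASH and sits somewhere under the directory you point it at; the function name find_crash_files and the choice of "." as the starting directory are illustrative, so adapt the path to wherever your run and its image directories actually live.

```python
import os


def find_crash_files(root="."):
    """Return the sorted paths of all files named CRASH found under root."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        if "CRASH" in filenames:
            hits.append(os.path.join(dirpath, "CRASH"))
    return sorted(hits)


if __name__ == "__main__":
    # "." is an assumption: start from the directory the job was run in.
    paths = find_crash_files(".")
    if not paths:
        print("No CRASH file found")
    for path in paths:
        print("==>", path)
        # errors="replace" avoids failing on unexpected bytes in the file.
        with open(path, errors="replace") as handle:
            print(handle.read())
```

If nothing is found, the abort most likely did not go through one of the code's own error checks.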

It is very difficult to figure out from such obscure messages what went
wrong and where. Typically, crashes of this kind are due either to a
library or compiler bug, or to subtle code bugs that lead different
processors to take different paths (for instance: some processors find
convergence, some don't).
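
To illustrate the last point, here is a toy sketch in plain Python (no MPI involved, and not taken from the Quantum ESPRESSO sources). The residual values, the threshold and both function names are made up for the example. It shows how ranks holding values that differ only by round-off can disagree on a convergence test, and how letting a single rank decide for everybody removes the disagreement. In a real MPI code, ranks that disagree may go on to issue collective calls that no longer match, which is one plausible way to end up with an error such as MPI_ERR_TRUNCATE.

```python
def local_decisions(residuals, threshold):
    """Each rank tests convergence on its own copy of the residual."""
    return [r < threshold for r in residuals]


def agreed_decision(residuals, threshold, root=0):
    """Only the root rank tests; all ranks adopt its answer,
    as they would after a broadcast of the flag."""
    flag = residuals[root] < threshold
    return [flag for _ in residuals]


# Hypothetical residuals on 4 ranks, differing only by round-off.
residuals = [9.9999e-7, 1.00001e-6, 9.9998e-7, 1.00002e-6]
threshold = 1.0e-6

print(local_decisions(residuals, threshold))  # ranks disagree
print(agreed_decision(residuals, threshold))  # all ranks take the same path
```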

Paolo
-- 
Paolo Giannozzi, Dept. Chemistry&Physics&Environment,
Univ. Udine, via delle Scienze 208, 33100 Udine, Italy
Phone +39-0432-558216, fax +39-0432-558222