[QE-users] need help, QE parallel execution is not working properly
Mainak Ghosh
mainakghosh555555 at gmail.com
Fri Dec 25 20:58:32 CET 2020
Dear all,
Today I tried to run a Quantum ESPRESSO calculation in parallel from the
command line as follows:
" mpirun -np 32 '/home/mainak/Desktop/qe-6.5/bin/pw.x' -npool 4 -bgrp 4 -ndiag 36 < pha_nscf.in | tee pha_nscf.out "
The run completed and the results are correct, but the terminal
shows:
" PWSCF : 41m 7.66s CPU 7h22m WALL
This run was terminated on: 20:59:32 25Dec2020
=------------------------------------------------------------------------------=
JOB DONE.
=------------------------------------------------------------------------------=
[mainak-linpc:26225] [[38684,1],0]-[[38684,0],0] mca_oob_tcp_msg_recv:
readv failed: Connection timed out (110)
[mainak-linpc:26227] [[38684,1],2]-[[38684,0],0] mca_oob_tcp_msg_recv:
readv failed: Connection timed out (110)
[mainak-linpc:26229] [[38684,1],4]-[[38684,0],0] mca_oob_tcp_msg_recv:
readv failed: Connection timed out (110)
--------------------------------------------------------------------------
mpirun has exited due to process rank 2 with PID 26227 on
node mainak-linpc exiting improperly. There are two reasons this could
occur:
1. this process did not call "init" before exiting, but others in
the job did. This can cause a job to hang indefinitely while it waits
for all processes to call "init". By rule, if one process calls "init",
then ALL processes must call "init" prior to termination.
2. this process called "init", but exited without calling "finalize".
By rule, all processes that call "init" MUST call "finalize" prior to
exiting or it will be considered an "abnormal termination"
This may have caused other processes in the application to be
terminated by signals sent by mpirun (as reported here).
--------------------------------------------------------------------------
"
As you can see in the last line of the pw.x output, the wall time is far
longer than the CPU time:
" PWSCF : 41m 7.66s CPU 7h22m WALL "
I am personally unable to find out what went wrong. Any advice would be
very helpful. For reference, my CPU has 4 physical cores. I am also
attaching the '.out' file below.
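
The figures quoted above can be put into numbers. The following is a minimal sketch, not part of the original run: it parses the PWSCF timing line as it appears in this message and compares CPU time with wall time, then divides the 32 MPI processes requested with -np by the 4 physical cores stated above. The helper name to_seconds is hypothetical and only handles durations written with d/h/m/s suffixes.

```python
import re


def to_seconds(stamp: str) -> float:
    """Convert a duration such as '7h22m' or '41m 7.66s' to seconds."""
    units = {"d": 86400.0, "h": 3600.0, "m": 60.0, "s": 1.0}
    parts = re.findall(r"([\d.]+)\s*([dhms])", stamp)
    return sum(float(value) * units[unit] for value, unit in parts)


# Timing line copied from the terminal output in this message.
timing_line = "PWSCF : 41m 7.66s CPU 7h22m WALL"

match = re.search(r"PWSCF\s*:\s*(.+?)\s*CPU\s*(.+?)\s*WALL", timing_line)
cpu_s = to_seconds(match.group(1))
wall_s = to_seconds(match.group(2))

# Fraction of the elapsed (wall) time reported as CPU time.
cpu_fraction = cpu_s / wall_s

# MPI processes per physical core: -np 32 on a CPU with 4 physical cores.
procs_per_core = 32 / 4

print(f"CPU time : {cpu_s:.2f} s")
print(f"WALL time: {wall_s:.2f} s")
print(f"CPU/WALL : {cpu_fraction:.1%}")
print(f"MPI processes per physical core: {procs_per_core:.0f}")
```

With these inputs the reported CPU time is under a tenth of the wall time, and each physical core is shared by 8 MPI processes.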
Thanks in advance.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: pha_nscf.out
Type: application/octet-stream
Size: 6318 bytes
Desc: not available
URL: <http://lists.quantum-espresso.org/pipermail/users/attachments/20201226/a41b5e09/attachment.obj>