[Pw_forum] Problem with MPI parallelization: Error in routine zsqmred
Jan Oliver Oelerich
jan.oliver.oelerich at physik.uni-marburg.de
Fri Sep 2 09:43:59 CEST 2016
Hi QE users,
I am trying to run QE 5.4.0 with MPI parallelization on a mid-size
cluster. I successfully tested the installation with 8 processes
distributed across 2 nodes, so communication between nodes is not the
problem. However, when I run the same calculation on 64 cores, I get
the following error messages on stdout:
iteration # 1 ecut= 30.00 Ry beta=0.70
Davidson diagonalization with overlap
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Error in routine zsqmred (8):
somthing wrong with row 3
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
stopping ...
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Error in routine zsqmred (4):
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
somthing wrong with row 3
Error in routine zsqmred (12):
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
somthing wrong with row 3
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
stopping ...
stopping ...
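For completeness, the job is launched with a plain mpirun of pw.x,
more or less like this (scf.in and scf.out are placeholders for the
actual file names; no other pw.x options are passed, as far as I can
tell):

    mpirun -np 64 pw.x -input scf.in > scf.out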
The stderr collected by the cluster's queueing system shows that some
MPI processes exited:
PSIlogger: Child with rank 28 exited with status 12.
PSIlogger: Child with rank 8 exited with status 4.
application called MPI_Abort(MPI_COMM_WORLD, 12) - process 28
application called MPI_Abort(MPI_COMM_WORLD, 4) - process 8
application called MPI_Abort(MPI_COMM_WORLD, 8) - process 18
kvsprovider[12375]: sighandler: Terminating the job.
PSIlogger: Child with rank 18 exited with status 8.
PSIlogger: Child with rank 4 exited with status 1.
PSIlogger: Child with rank 15 exited with status 1.
PSIlogger: Child with rank 53 exited with status 1.
PSIlogger: Child with rank 30 exited with status 1.
The cluster runs some variant of Sun Grid Engine, and I used Intel
MPI. I see no other error messages. Could you give me a hint on how to
debug this further? Verbosity is already set to 'high'.
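If I read the sources correctly, zsqmred belongs to the distributed
(ScaLAPACK-style) subspace diagonalization, so one test I could
imagine is forcing a serial diagonalization by shrinking the
linear-algebra group to a single process, e.g. (file names again
placeholders):

    mpirun -np 64 pw.x -ndiag 1 -input scf.in > scf.out

Would that be a sensible way to narrow the problem down, or am I on
the wrong track?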
Thank you very much and best regards,
Jan Oliver Oelerich
--
Dr. Jan Oliver Oelerich
Faculty of Physics and Material Sciences Center
Philipps-Universität Marburg
Addr.: Room 02D35, Hans-Meerwein-Straße 6, 35032 Marburg, Germany
Phone: +49 6421 2822260
Mail : jan.oliver.oelerich at physik.uni-marburg.de
Web : http://academics.oelerich.org