[Pw_forum] Fwd: error in running pw.x command

mohaddeseh abbasnejad m.abbasnejad at gmail.com
Mon Jul 20 05:28:45 CEST 2015


---------- Forwarded message ----------
From: "mohaddeseh abbasnejad" <m.abbasnejad at gmail.com>
Date: Jul 16, 2015 4:55 PM
Subject: error in running pw.x command
To: "PWSCF Forum" <pw_forum at pwscf.org>
Cc:


Dear all,

I have recently installed PWscf (version 5.1) on our cluster (4 nodes, 32
cores).
ifort and MKL (version 11.1) are installed.
When I run pw.x on each node individually, both of the following commands
work properly:
1- /opt/exp_soft/espresso-5.1/bin/pw.x -in scf.in
2- mpirun -n 4 /opt/exp_soft/espresso-5.1/bin/pw.x -in scf.in
However, when I use the following command (again on each node separately):
3- mpirun -n 8 /opt/exp_soft/espresso-5.1/bin/pw.x -in scf.in
I get this error:

[cluster:14752] *** Process received signal ***
[cluster:14752] Signal: Segmentation fault (11)
[cluster:14752] Signal code:  (128)
[cluster:14752] Failing at address: (nil)
[cluster:14752] [ 0] /lib64/libpthread.so.0() [0x3a78c0f710]
[cluster:14752] [ 1] /opt/intel/Compiler/11.1/064/mkl/lib/em64t/libmkl_mc3.so(mkl_blas_zdotc+0x79) [0x2b5e8e37d4f9]
[cluster:14752] *** End of error message ***
--------------------------------------------------------------------------
mpirun noticed that process rank 4 with PID 14752 on node
cluster.khayam.local exited on signal 11 (Segmentation fault).
--------------------------------------------------------------------------

The same error occurs when I run on all nodes together in parallel, using
the following command:
4- mpirun -n 32 -hostfile testhost /opt/exp_soft/espresso-5.1/bin/pw.x -in scf.in
The error:

[cluster:14838] *** Process received signal ***
[cluster:14838] Signal: Segmentation fault (11)
[cluster:14838] Signal code:  (128)
[cluster:14838] Failing at address: (nil)
[cluster:14838] [ 0] /lib64/libpthread.so.0() [0x3a78c0f710]
[cluster:14838] [ 1] /opt/intel/Compiler/11.1/064/mkl/lib/em64t/libmkl_mc3.so(mkl_blas_zdotc+0x79) [0x2b04082cf4f9]
[cluster:14838] *** End of error message ***
--------------------------------------------------------------------------
mpirun noticed that process rank 24 with PID 14838 on node
cluster.khayam.local exited on signal 11 (Segmentation fault).
--------------------------------------------------------------------------

Any help will be appreciated.

Regards,
Mohaddeseh

---------------------------------------------------------

Mohaddeseh Abbasnejad,
Room No. 323, Department of Physics,
University of Tehran, North Karegar Ave.,
Tehran, P.O. Box: 14395-547- IRAN
Tel. No.: +98 21 6111 8634  & Fax No.: +98 21 8800 4781
Cellphone: +98 917 731 7514
E-Mail:     m.abbasnejad at gmail.com
Website:  http://physics.ut.ac.ir

---------------------------------------------------------