Dear Mohaddeseh et co.,

Installing an older version of MPI could solve the problem.
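
If you rebuild QE against that MPI, something along these lines should do it. This is only a sketch: the Open MPI path below is a placeholder for wherever the older MPI is actually installed on your cluster, and the configure variables may need adjusting to your setup.

    # Put the older MPI's compiler wrappers and mpirun first in the PATH
    # (placeholder path -- adjust to your installation).
    export PATH=/opt/openmpi-1.6/bin:$PATH

    cd /opt/exp_soft/espresso-5.1
    make veryclean                 # or 'make clean' if veryclean is not available
    ./configure MPIF90=mpif90 F90=ifort CC=icc
    make pw                        # rebuild pw.x against the selected MPI

Make sure the same mpirun is then used at run time, otherwise the launcher and the MPI library will not match.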

2015-07-20 10:06 GMT+03:00 Ari P Seitsonen <Ari.P.Seitsonen@iki.fi>:

Dear Mohaddeseh et co.,

Just a note: I used to have such problems when I had compiled with an old version of MKL-ScaLAPACK, indeed around 11.1, and ran with more than four cores. I think I managed to run when I disabled ScaLAPACK. Of course this might be entirely unrelated to your problem.
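
If you want to try the same, ScaLAPACK can be switched off when configuring QE. A minimal sketch (check './configure --help' for the exact option name in your version):

    cd /opt/exp_soft/espresso-5.1
    ./configure --with-scalapack=no F90=ifort
    make pw

Without ScaLAPACK the code uses the default LAPACK-based diagonalisation instead, which can be slower for large systems but avoids the library that was causing my crashes.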

Greetings from Lappeenranta,

apsi

-=*=-=*=-=*=-=*=-=*=-=*=-=*=-=*=-=*=-=*=-=*=-=*=-=*=-=*=-=*=-=*=-=*=-=*=-=*=-
Ari Paavo Seitsonen / Ari.P.Seitsonen@iki.fi / http://www.iki.fi/~apsi/
Ecole Normale Supérieure (ENS), Département de Chimie, Paris
Mobile (F) : +33 789 37 24 25 (CH) : +41 79 71 90 935


On Mon, 20 Jul 2015, Paolo Giannozzi wrote:

This is not a QE problem: the Fortran code knows nothing about nodes and cores. It is the software setup for parallel execution on your machine that has a problem.
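
A quick sanity check (the path is the one from your own commands below): make sure the mpirun you launch with belongs to the same MPI installation that pw.x was compiled with, and see which MPI libraries the binary actually loads:

    which mpirun
    mpirun --version          # if your mpirun supports it
    ldd /opt/exp_soft/espresso-5.1/bin/pw.x | grep -i mpi

A mismatch between the launcher and the MPI library the code was linked against can produce exactly this kind of crash.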

Paolo

On Thu, Jul 16, 2015 at 2:25 PM, mohaddeseh abbasnejad <m.abbasnejad@gmail.com> wrote:

Dear all,

I have recently installed PWscf (version 5.1) on our cluster (4 nodes, 32 cores).
ifort & MKL version 11.1 have been installed.
When I run pw.x on each node individually, both of the following commands work properly:
1- /opt/exp_soft/espresso-5.1/bin/pw.x -in scf.in
2- mpirun -n 4 /opt/exp_soft/espresso-5.1/bin/pw.x -in scf.in
However, when I use the following command (again on each node separately),
3- mpirun -n 8 /opt/exp_soft/espresso-5.1/bin/pw.x -in scf.in
it gives the following error:

[cluster:14752] *** Process received signal ***
[cluster:14752] Signal: Segmentation fault (11)
[cluster:14752] Signal code: (128)
[cluster:14752] Failing at address: (nil)
[cluster:14752] [ 0] /lib64/libpthread.so.0() [0x3a78c0f710]
[cluster:14752] [ 1] /opt/intel/Compiler/11.1/064/mkl/lib/em64t/libmkl_mc3.so(mkl_blas_zdotc+0x79) [0x2b5e8e37d4f9]
[cluster:14752] *** End of error message ***
--------------------------------------------------------------------------
mpirun noticed that process rank 4 with PID 14752 on node cluster.khayam.local exited on signal 11 (Segmentation fault).
--------------------------------------------------------------------------

This error also appears when I use all the nodes together in parallel (using the following command):
4- mpirun -n 32 -hostfile testhost /opt/exp_soft/espresso-5.1/bin/pw.x -in scf.in
The error:

[cluster:14838] *** Process received signal ***
[cluster:14838] Signal: Segmentation fault (11)
[cluster:14838] Signal code: (128)
[cluster:14838] Failing at address: (nil)
[cluster:14838] [ 0] /lib64/libpthread.so.0() [0x3a78c0f710]
[cluster:14838] [ 1] /opt/intel/Compiler/11.1/064/mkl/lib/em64t/libmkl_mc3.so(mkl_blas_zdotc+0x79) [0x2b04082cf4f9]
[cluster:14838] *** End of error message ***
--------------------------------------------------------------------------
mpirun noticed that process rank 24 with PID 14838 on node cluster.khayam.local exited on signal 11 (Segmentation fault).
--------------------------------------------------------------------------
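
In case it is useful: as the backtrace shows, the binary picks up MKL from /opt/intel/Compiler/11.1/064/mkl/lib/em64t at run time. This can be checked with, for example:

    ldd /opt/exp_soft/espresso-5.1/bin/pw.x | grep -i mkl
    echo $LD_LIBRARY_PATH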

Any help will be appreciated.

Regards,
Mohaddeseh

---------------------------------------------------------

Mohaddeseh Abbasnejad,
Room No. 323, Department of Physics,
University of Tehran, North Karegar Ave.,
Tehran, P.O. Box: 14395-547, IRAN
Tel. No.: +98 21 6111 8634 & Fax No.: +98 21 8800 4781
Cellphone: +98 917 731 7514
E-Mail: m.abbasnejad@gmail.com
Website: http://physics.ut.ac.ir

---------------------------------------------------------

--
Paolo Giannozzi, Dept. Chemistry&Physics&Environment,
Univ. Udine, via delle Scienze 208, 33100 Udine, Italy
Phone +39-0432-558216, fax +39-0432-558222

_______________________________________________
Pw_forum mailing list
Pw_forum@pwscf.org
http://pwscf.org/mailman/listinfo/pw_forum