[Pw_forum] error in running pw.x command

nicola varini nicola.varini at gmail.com
Mon Jul 20 10:27:16 CEST 2015


Dear all, if you use MKL you can rely on the Intel MKL Link Line Advisor for
proper linking:
https://software.intel.com/en-us/articles/intel-mkl-link-line-advisor
If you open the file $MKLROOT/include/mkl.h you will see the version number.
It should be something like

#define __INTEL_MKL__ 11

#define __INTEL_MKL_MINOR__ 2

#define __INTEL_MKL_UPDATE__ 2
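
A quick way to print just those lines, assuming the MKLROOT variable has been
set by Intel's mklvars/compilervars environment script, is for example:

  grep '__INTEL_MKL' $MKLROOT/include/mkl.h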

In the advisor linked above, enter your version number, OS, and the other options.

The advisor will then output the link options that you should use.
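
For illustration only (a sketch, not necessarily your configuration): for a
64-bit Linux build with ifort, 32-bit integers (LP64) and sequential MKL
without ScaLAPACK, the advisor typically suggests something along the lines of

  -L${MKLROOT}/lib/intel64 -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lpthread -lm

Your exact line depends on compiler, MPI, threading, and whether you want
ScaLAPACK, so use the advisor's output for your own setup.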

HTH


Nicola




2015-07-20 9:57 GMT+02:00 Bahadır salmankurt <bsalmankurt at gmail.com>:

> Dear Mohaddeseh et co,
>
> Installing one of the older versions of MPI could solve the problem.
>
> 2015-07-20 10:06 GMT+03:00 Ari P Seitsonen <Ari.P.Seitsonen at iki.fi>:
>
>>
>> Dear Mohaddeseh et co,
>>
>>   Just a note: I used to have such problems when I had compiled with an
>> old version of MKL-ScaLAPACK, indeed around 11.1, and ran with more than
>> four cores. I think I managed to run once I disabled ScaLAPACK. Of course
>> this might be fully unrelated to your problem.
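>>
>>   If you want to test that, ScaLAPACK can be disabled when configuring QE;
>> a minimal sketch (the exact flag may differ between QE versions) is
>>
>>   ./configure --with-scalapack=no
>>   make clean
>>   make pw
>>
>> followed by rerunning the failing cases.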
>>
>>     Greetings from Lappeenranta,
>>
>>        apsi
>>
>>
>> -=*=-=*=-=*=-=*=-=*=-=*=-=*=-=*=-=*=-=*=-=*=-=*=-=*=-=*=-=*=-=*=-=*=-=*=-=*=-
>>   Ari Paavo Seitsonen / Ari.P.Seitsonen at iki.fi / http://www.iki.fi/~apsi/
>>   Ecole Normale Supérieure (ENS), Département de Chimie, Paris
>>   Mobile (F) : +33 789 37 24 25    (CH) : +41 79 71 90 935
>>
>>
>>
>> On Mon, 20 Jul 2015, Paolo Giannozzi wrote:
>>
>>> This is not a QE problem: the Fortran code knows nothing about nodes and
>>> cores. It's the software setup for parallel execution on your machine that
>>> has a problem.
>>>
>>> Paolo
>>>
>>> On Thu, Jul 16, 2015 at 2:25 PM, mohaddeseh abbasnejad <
>>> m.abbasnejad at gmail.com> wrote:
>>>
>>>       Dear all,
>>>
>>> I have recently installed PWscf (version 5.1) on our cluster (4 nodes,
>>> 32 cores).
>>> ifort & MKL version 11.1 have been installed.
>>> When I run the pw.x command on each node individually, it works properly
>>> with both of the following commands:
>>> 1- /opt/exp_soft/espresso-5.1/bin/pw.x -in scf.in
>>> 2- mpirun -n 4 /opt/exp_soft/espresso-5.1/bin/pw.x -in scf.in
>>> However, when I use the following command (again on each node separately),
>>> 3- mpirun -n 8 /opt/exp_soft/espresso-5.1/bin/pw.x -in scf.in
>>> it gives me the following error:
>>>
>>> [cluster:14752] *** Process received signal ***
>>> [cluster:14752] Signal: Segmentation fault (11)
>>> [cluster:14752] Signal code:  (128)
>>> [cluster:14752] Failing at address: (nil)
>>> [cluster:14752] [ 0] /lib64/libpthread.so.0() [0x3a78c0f710]
>>> [cluster:14752] [ 1]
>>> /opt/intel/Compiler/11.1/064/mkl/lib/em64t/libmkl_mc3.so(mkl_blas_zdotc+0x79)
>>> [0x2b5e8e37d4f9]
>>> [cluster:14752] *** End of error message ***
>>>
>>> --------------------------------------------------------------------------
>>> mpirun noticed that process rank 4 with PID 14752 on node
>>> cluster.khayam.local exited on signal 11 (Segmentation fault).
>>>
>>> --------------------------------------------------------------------------
>>>
>>> This error also occurs when I use all the nodes together in parallel
>>> mode (using the following command):
>>> 4- mpirun -n 32 -hostfile testhost /opt/exp_soft/espresso-5.1/bin/pw.x
>>> -in scf.in
>>> The error:
>>>
>>> [cluster:14838] *** Process received signal ***
>>> [cluster:14838] Signal: Segmentation fault (11)
>>> [cluster:14838] Signal code:  (128)
>>> [cluster:14838] Failing at address: (nil)
>>> [cluster:14838] [ 0] /lib64/libpthread.so.0() [0x3a78c0f710]
>>> [cluster:14838] [ 1]
>>> /opt/intel/Compiler/11.1/064/mkl/lib/em64t/libmkl_mc3.so(mkl_blas_zdotc+0x79)
>>> [0x2b04082cf4f9]
>>> [cluster:14838] *** End of error message ***
>>>
>>> --------------------------------------------------------------------------
>>> mpirun noticed that process rank 24 with PID 14838 on node
>>> cluster.khayam.local exited on signal 11 (Segmentation fault).
>>>
>>> --------------------------------------------------------------------------
>>>
>>> Any help will be appreciated.
>>>
>>> Regards,
>>> Mohaddeseh
>>>
>>> ---------------------------------------------------------
>>>
>>> Mohaddeseh Abbasnejad,
>>> Room No. 323, Department of Physics,
>>> University of Tehran, North Karegar Ave.,
>>> Tehran, P.O. Box: 14395-547- IRAN
>>> Tel. No.: +98 21 6111 8634  & Fax No.: +98 21 8800 4781
>>> Cellphone: +98 917 731 7514
>>> E-Mail:     m.abbasnejad at gmail.com
>>> Website:  http://physics.ut.ac.ir
>>>
>>> ---------------------------------------------------------
>>>
>>>
>>>
>>>
>>>
>>> --
>>> Paolo Giannozzi, Dept. Chemistry&Physics&Environment,
>>> Univ. Udine, via delle Scienze 208, 33100 Udine, Italy
>>> Phone +39-0432-558216, fax +39-0432-558222
>>>
>>>
>>
>
>
>