[Pw_forum] On the run of GPU version of QE

Alexander G. Kvashnin agkvashnin at gmail.com
Tue Mar 25 13:29:02 CET 2014


Dear QE users and developers,

Recently I compiled QE-GPU on my system with two NVIDIA Tesla K20Xm GPUs.
The relevant software versions are:
CUDA 5.5
Intel MPI 4.1.3.048
Intel Compiler 14.0

I compiled QE-GPU following the instructions on the website: I downloaded
espresso-5.0.2, then downloaded QE-GPU-14.01.0 together with its patch file.
After applying the patch, I ran the following configure command:

./configure \
  LAPACK_LIBS=/opt/intel/composer_xe_2013_sp1.2.144/mkl/lib/intel64/libmkl_lapack95_lp64.a \
  --enable-openmp --enable-parallel --enable-cuda --with-gpu-arch=30 \
  --with-cuda-dir=/opt/cuda/5.5 --enable-phigemm --enable-magma
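
For completeness, the build step after configuration was, as far as I
remember from the QE-GPU instructions (quoted from memory, so the exact
target name may differ):

        # from the top of the patched espresso-5.0.2 tree
        make -f Makefile.gpu pw-gpu    # builds bin/pw-gpu.x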

After a successful configuration I compiled pw-gpu.x, and everything was
fine. Then I tried to test the performance of this system compared with a
common HPC cluster.
But I ran into trouble. As I understand it, one MPI process is normally
mapped to one GPU, so on my host with 2 GPUs I need to run 2 MPI processes;
if I am wrong, please tell me where. The exact launch command I use is
sketched right after the error message below. After the job starts, I get
this error:

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
    task #         0
    from cdiaghg : error #        61
    diagonalization (ZHEGV*) failed
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
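
For reference, this is roughly how I launch the job (a minimal sketch;
scf.in is just a placeholder for my actual input file):

        mpirun -np 2 ./pw-gpu.x -inp scf.in > scf.out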

It does not matter which input file I run; I double-checked all of them, and
in every case I get the same error. Moreover, when I run the same input with
the non-GPU build of QE, it finishes successfully. Would you be so kind as to
tell me where I made a mistake, and what I should do to get a successful run?
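For example, do I need to bind each MPI rank to its own GPU explicitly? If
so, I imagine a small wrapper script along these lines (just a sketch; I am
assuming Intel MPI exports MPI_LOCALRANKID as the per-node rank, which other
launchers name differently):

        #!/bin/bash
        # gpu-bind.sh (hypothetical): give each local MPI rank its own GPU
        export CUDA_VISIBLE_DEVICES=${MPI_LOCALRANKID:-0}
        exec ./pw-gpu.x "$@"

and then launch with: mpirun -np 2 ./gpu-bind.sh -inp scf.in > scf.out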

The output of the ldd command for pw-gpu.x is shown below:
        linux-vdso.so.1 =>  (0x00007ffff1dbb000)
        libmkl_scalapack_lp64.so => /opt/intel/composer_xe_2013_sp1.2.144/mkl/lib/intel64/libmkl_scalapack_lp64.so (0x00007f549bc79000)
        libmkl_blacs_intelmpi_lp64.so => /opt/intel/composer_xe_2013_sp1.2.144/mkl/lib/intel64/libmkl_blacs_intelmpi_lp64.so (0x00007f549ba3d000)
        libmkl_gf_lp64.so => /opt/intel/composer_xe_2013_sp1.2.144/mkl/lib/intel64/libmkl_gf_lp64.so (0x00007f549b2f7000)
        libmkl_gnu_thread.so => /opt/intel/composer_xe_2013_sp1.2.144/mkl/lib/intel64/libmkl_gnu_thread.so (0x00007f549a811000)
        libmkl_core.so => /opt/intel/composer_xe_2013_sp1.2.144/mkl/lib/intel64/libmkl_core.so (0x00007f5499133000)
        libcudart.so.5.5 => /opt/cuda/5.5/lib64/libcudart.so.5.5 (0x00007f5498ee6000)
        libcublas.so.5.5 => /opt/cuda/5.5/lib64/libcublas.so.5.5 (0x00007f5495a2e000)
        libcufft.so.5.5 => /opt/cuda/5.5/lib64/libcufft.so.5.5 (0x00007f5490f0d000)
        libmpigf.so.4 => /opt/intel//impi/4.1.3.048/intel64/lib/libmpigf.so.4 (0x00007f5490cdd000)
        libmpi_dbg_mt.so.4 => /opt/intel//impi/4.1.3.048/intel64/lib/libmpi_dbg_mt.so.4 (0x00007f549044c000)
        libdl.so.2 => /lib64/libdl.so.2 (0x0000003c8d800000)
        librt.so.1 => /lib64/librt.so.1 (0x0000003c8e800000)
        libpthread.so.0 => /lib64/libpthread.so.0 (0x0000003c8e000000)
        libgfortran.so.3 => /usr/lib64/libgfortran.so.3 (0x00007f549013c000)
        libm.so.6 => /lib64/libm.so.6 (0x0000003c8dc00000)
        libgomp.so.1 => /usr/lib64/libgomp.so.1 (0x0000003c8fa00000)
        libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x0000003c91000000)
        libc.so.6 => /lib64/libc.so.6 (0x0000003c8d400000)
        libstdc++.so.6 => /usr/lib64/libstdc++.so.6 (0x0000003c94000000)
        /lib64/ld-linux-x86-64.so.2 (0x0000003c8d000000)

Thank you in advance!

--
Sincerely yours,
Alexander G. Kvashnin

=====================================================
PhD Student
Moscow Institute of Physics and Technology          http://mipt.ru/
141700, Institutsky lane 9, Dolgoprudny, Moscow Region, Russia

Junior research scientist
Technological Institute for Superhard and Novel Carbon Materials          http://www.tisnum.ru/
142190, Central'naya St. 7a, Troitsk, Moscow Region, Russia
=====================================================