[QE-users] Installing QE_GPU

Mohammad Moaddeli mohammad.moaddeli at gmail.com
Sun Oct 27 15:08:17 CET 2019


Dear all,

I am trying to install q-e-gpu-qe-gpu-6.4.1a1. The following environment
variables are set in /etc/bashrc:

####  CUDA  ####
export PATH=/usr/local/cuda-10.1/bin:$PATH
export PATH=/usr/local/cuda-10.1/include:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda-10.1/lib64:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH=/usr/local/cuda-10.1/extras/CUPTI/lib64:$LD_LIBRARY_PATH

####  PGI  ####
PGI=/opt/pgi
export PGI
PATH=/opt/pgi/linux86-64/19.4/bin:$PATH
export PATH
PATH=/opt/pgi/linux86-64/19.4/mpi/openmpi-3.1.3/bin:$PATH
export PATH
PATH=/opt/pgi/linux86-64/19.4/mpi/openmpi-3.1.3/include:$PATH
export PATH
PATH=/opt/pgi/linux86-64/19.4/mpi/openmpi-3.1.3/lib:$PATH
export PATH
MANPATH=$MANPATH:/opt/pgi/linux86-64/19.4/man
export MANPATH
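
Whether a new shell actually picks up these settings can be checked with a small sketch like the one below. The helper name check_tools is made up for illustration, and the tool names passed to it (nvcc, pgfortran, mpif90, mpirun) are assumptions about what the CUDA and PGI installations above provide.

```shell
# Report where each named tool resolves on the current PATH.
# check_tools is a hypothetical helper, not part of QE, CUDA or PGI.
check_tools() {
    for tool in "$@"; do
        if path=$(command -v "$tool" 2>/dev/null); then
            printf '%s: %s\n' "$tool" "$path"
        else
            printf '%s: not found\n' "$tool"
        fi
    done
}

# Assumed tool names for the CUDA and PGI/Open MPI installs listed above.
check_tools nvcc pgfortran mpif90 mpirun
```

If mpif90 or mpirun resolves to a location outside /opt/pgi/linux86-64/19.4/mpi/openmpi-3.1.3, a different MPI installation is being picked up first.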

The graphics card driver is also installed:

[moaddeli at localhost ~]$ nvidia-smi
Sun Oct 27 16:45:25 2019
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 430.50       Driver Version: 430.50       CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 108...  Off  | 00000000:04:00.0  On |                  N/A |
| 50%   28C    P8    10W / 250W |     58MiB / 11175MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0     14528      G   /usr/bin/X                                    39MiB |
|    0     14584      G   /usr/bin/gnome-shell                          16MiB |
+-----------------------------------------------------------------------------+


[root at localhost moaddeli]# lshw -numeric -C display
  *-display
       description: VGA compatible controller
       product: GP102 [GeForce GTX 1080 Ti] [10DE:1B06]
       vendor: NVIDIA Corporation [10DE]
       physical id: 0
       bus info: pci at 0000:04:00.0
       version: a1
       width: 64 bits
       clock: 33MHz
       capabilities: pm msi pciexpress vga_controller bus_master cap_list rom
       configuration: driver=nvidia latency=0
       resources: iomemory:27f0-27ef iomemory:27f0-27ef irq:89 memory:c4000000-c4ffffff memory:27fe0000000-27fefffffff memory:27ff0000000-27ff1ffffff ioport:c000(size=128) memory:c5000000-c507ffff
  *-display
       description: VGA compatible controller
       product: ASPEED Graphics Family [1A03:2000]
       vendor: ASPEED Technology, Inc. [1A03]
       physical id: 0
       bus info: pci at 0000:12:00.0
       version: 30
       width: 32 bits
       clock: 33MHz
       capabilities: pm msi vga_controller cap_list
       configuration: driver=ast latency=0
       resources: irq:16 memory:c6000000-c6ffffff memory:c7000000-c701ffff ioport:b000(size=128)

When I compile the serial version, the executable pw.x is created in the
bin directory and starts normally:

[moaddeli at localhost test]$ /codes/qe4/q-e-gpu-qe-gpu-6.4.1a1/bin/pw.x

     Program PWSCF v.6.4.1 starts on 27Oct2019 at 16:48: 2

     This program is part of the open-source Quantum ESPRESSO suite
     for quantum simulation of materials; please cite
         "P. Giannozzi et al., J. Phys.:Condens. Matter 21 395502 (2009);
         "P. Giannozzi et al., J. Phys.:Condens. Matter 29 465901 (2017);
          URL http://www.quantum-espresso.org",
     in publications or presentations arising from this work. More details at
     http://www.quantum-espresso.org/quote

     Serial version
     Waiting for input...

However, when I run it with an input file, the following error appears:

[moaddeli at localhost test]$ /codes/qe4/q-e-gpu-qe-gpu-6.4.1a1/bin/pw.x <c.in | tee c.out
0: ALLOCATE: copyin Symbol Memcpy FAILED:13(invalid device symbol)
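
One possible cause worth ruling out for an "invalid device symbol" error is a build that targets a different compute capability than the card has. The GeForce GTX 1080 Ti shown above has compute capability 6.1. The sketch below only prints a candidate configure line. The helper cc_to_flag is hypothetical, and the --with-cuda, --with-cuda-runtime and --with-cuda-cc options are the ones described in the QE-GPU README as I understand it, so they should be checked against ./configure --help before use.

```shell
# Convert a compute capability such as "6.1" into the two-digit form "61".
# cc_to_flag is a hypothetical helper used only for this illustration.
cc_to_flag() {
    printf '%s\n' "$1" | tr -d '.'
}

# Assumed values: CUDA 10.1 installed under /usr/local/cuda-10.1 (as in the
# bashrc above) and compute capability 6.1 for the GTX 1080 Ti.
cuda_home=/usr/local/cuda-10.1
echo "./configure --with-cuda=$cuda_home --with-cuda-runtime=10.1 --with-cuda-cc=$(cc_to_flag 6.1)"
```

If the existing build used another value for the compute capability, reconfiguring and rebuilding from a clean tree would show whether the mismatch is the cause.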


When I compile the parallel version, the executable pw.x is also created
in the bin directory, but it fails at startup:

[moaddeli at localhost ~]$ /codes/qe4/q-e-gpu-qe-gpu-6.4.1a1/bin/pw.x
[localhost.localdomain:40889] [[INVALID],INVALID] ORTE_ERROR_LOG: A system-required executable either could not be found or was not executable by this user in file ../../../../../orte/mca/ess/singleton/ess_singleton_module.c at line 388
[localhost.localdomain:40889] [[INVALID],INVALID] ORTE_ERROR_LOG: A system-required executable either could not be found or was not executable by this user in file ../../../../../orte/mca/ess/singleton/ess_singleton_module.c at line 166
--------------------------------------------------------------------------
Sorry!  You were supposed to get help about:
    orte_init:startup:internal-failure
But I couldn't open the help file:
    /proj/pgi/linux86-64-llvm/2019/mpi/openmpi-3.1.3/share/openmpi/help-orte-runtime: No such file or directory.  Sorry!
--------------------------------------------------------------------------
--------------------------------------------------------------------------
Sorry!  You were supposed to get help about:
    mpi_init:startup:internal-failure
But I couldn't open the help file:
    /proj/pgi/linux86-64-llvm/2019/mpi/openmpi-3.1.3/share/openmpi/help-mpi-runtime.txt: No such file or directory.  Sorry!
--------------------------------------------------------------------------
--------------------------------------------------------------------------
*** An error occurred in MPI_Init
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
***    and potentially your MPI job)
[localhost.localdomain:40889] Local abort before MPI_INIT completed
completed successfully, but am not able to aggregate error messages, and
not able to guarantee that all other processes were killed!
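
The help-file paths in this output point to /proj/pgi/linux86-64-llvm/2019/mpi/openmpi-3.1.3, which is not where this Open MPI is installed here (/opt/pgi/linux86-64/19.4/mpi/openmpi-3.1.3). Open MPI supports relocated installations through the OPAL_PREFIX environment variable, so one thing to try is sketched below. The helper name use_openmpi_prefix is made up, and the assumption that the missing "system-required executable" is orted under the install prefix is a guess, not something the log confirms.

```shell
# Point Open MPI at its real install prefix, but only if the runtime
# daemon (orted) is actually present there.
# use_openmpi_prefix is a hypothetical helper for illustration.
use_openmpi_prefix() {
    prefix=$1
    if [ -x "$prefix/bin/orted" ]; then
        export OPAL_PREFIX="$prefix"
        echo "OPAL_PREFIX=$OPAL_PREFIX"
    else
        echo "no executable orted under $prefix/bin" >&2
        return 1
    fi
}

# Assumed prefix, taken from the PATH entries in /etc/bashrc above.
use_openmpi_prefix /opt/pgi/linux86-64/19.4/mpi/openmpi-3.1.3 || true
```

With OPAL_PREFIX set, the parallel pw.x can then be started again, either directly or through mpirun, to see whether MPI_Init still aborts.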

Any help will be greatly appreciated.

Mohammad Moaddeli

Shiraz University