[Pw_forum] Memory problems with a job using PWSCF from qe-gpu

Reinaldo Pis Diez reinaldo.pisdiez at gmail.com
Fri Feb 9 01:01:55 CET 2018


Dear folks,

I've managed to compile the GPU-enabled version of PWSCF from the 
sources provided by Filippo Spiga using the Portland (PGI) compilers 
and MKL libraries. The node runs CentOS 6.6 and has 16 GB of RAM. I 
ran some small tests and the results were the same as those obtained 
with a non-GPU version of qe-6.1 compiled with the GNU compilers.

When I try to test the executable with a more realistic job (a Cu 
surface made of 75 atoms with 1 to 6 carbon atoms on it), an "out of 
memory" error occurs and the job terminates. I should add that this 
job ran successfully on another, similar node (similar except that 
it doesn't have a GPU card). When I use "mpirun -np 1" before 
invoking pw.x, I get

...
      Estimated max dynamical RAM per process >   10128.95MB
      Generating pointlists ...
      new r_m :   0.0689 (alat units)  1.6647 (a.u.) for type    1
      new r_m :   0.0689 (alat units)  1.6647 (a.u.) for type    2

0: ALLOCATE: 2525186688 bytes requested; status = 2(out of memory)
   /opt/pgi/linux86-64/17.4/lib/libpgf90_rpm1.so(__fort_abortx+0x17) [0x2b646f7f2af7]
   /opt/pgi/linux86-64/17.4/lib/libpgf90.so(__fort_abort+0x5e) [0x2b646f41897e]
   /opt/pgi/linux86-64/17.4/lib/libcudafor.so(+0x5ac38) [0x2b6456f6cc38]
   /opt/pgi/linux86-64/17.4/lib/libcudafor.so(pgf90_dev_mod_alloc04+0xc9) [0x2b6456f6d70e]
   /usr/local/fspiga-qe-gpu-7e1de44/bin/pw.x() [0x5e16d7]
   /usr/local/fspiga-qe-gpu-7e1de44/bin/pw.x() [0x497953]
   /usr/local/fspiga-qe-gpu-7e1de44/bin/pw.x() [0x52b05d]
   /usr/local/fspiga-qe-gpu-7e1de44/bin/pw.x() [0x40d82c]
   /usr/local/fspiga-qe-gpu-7e1de44/bin/pw.x() [0x40d704]
   /lib64/libc.so.6(__libc_start_main+0xfd) [0x36f741ed5d]
   /usr/local/fspiga-qe-gpu-7e1de44/bin/pw.x() [0x40a1c9]
--------------------------------------------------------------------------
mpirun noticed that process rank 0 with PID 23370 on node n13 exited 
on signal 6 (Aborted).
--------------------------------------------------------------------------

When I run the same job with "mpirun -np 8", I get

...
     Estimated total allocated dynamical RAM >   11048.12MB
      Generating pointlists ...
      new r_m :   0.0689 (alat units)  1.6647 (a.u.) for type    1
      new r_m :   0.0689 (alat units)  1.6647 (a.u.) for type    2
0: ALLOCATE: 315564672 bytes requested; status = 2(out of memory)
[a lot of error messages]

I cannot identify the source of the error, but I suspect that it is 
related to the GPU card. Running the deviceQuery program that comes 
with CUDA, I get (among a lot of other information)

Device 0: "TITAN X (Pascal)"
   CUDA Driver Version / Runtime Version          8.0 / 8.0
   CUDA Capability Major/Minor version number:    6.1
   Total amount of global memory:                 12189 MBytes (12781158400 bytes)
   (28) Multiprocessors, (128) CUDA Cores/MP:     3584 CUDA Cores
   GPU Max Clock rate:                            1531 MHz (1.53 GHz)
   Memory Clock rate:                             5005 Mhz
   Memory Bus Width:                              384-bit
   L2 Cache Size:                                 3145728 bytes
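
As a rough cross-check, the figures quoted above can be put side by side with a few lines of Python. This is only a unit-conversion sketch, not a diagnosis. It assumes that "MB" in the pw.x estimate means 2**20 bytes, and it says nothing about how much device memory was already in use when the allocation failed. The names `to_mib` and `report` are made up for this illustration.

```python
# Figures copied from the pw.x log and the deviceQuery output in this post.
MIB = 1024 ** 2  # bytes per MiB

gpu_total_bytes = 12781158400    # "Total amount of global memory" (deviceQuery)
failed_alloc_np1 = 2525186688    # failed ALLOCATE request with "mpirun -np 1"
failed_alloc_np8 = 315564672     # failed ALLOCATE request with "mpirun -np 8"
estimated_ram_np1_mb = 10128.95  # "Estimated max dynamical RAM per process"
                                 # (assumed here to be in units of 2**20 bytes)


def to_mib(n_bytes):
    """Convert a byte count to MiB (2**20 bytes)."""
    return n_bytes / MIB


def report():
    """Return the comparison as a list of printable lines."""
    return [
        f"GPU global memory:          {to_mib(gpu_total_bytes):10.2f} MiB",
        f"pw.x RAM estimate (np=1):   {estimated_ram_np1_mb:10.2f} MB",
        f"Failed allocation (np=1):   {to_mib(failed_alloc_np1):10.2f} MiB",
        f"Failed allocation (np=8):   {to_mib(failed_alloc_np8):10.2f} MiB",
        f"8 x failed alloc. (np=8):   {to_mib(8 * failed_alloc_np8):10.2f} MiB",
    ]


if __name__ == "__main__":
    print("\n".join(report()))
```

The failing request with 8 processes is close to one eighth of the request with a single process. That would be consistent with the same array being distributed over the MPI ranks, although the log alone does not confirm it.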

Any help is welcome. I can provide the proper input file with the 
corresponding pseudopotentials if requested.

Thanks in advance

Reinaldo Pis Diez
Center of Inorganic Chemistry
Natl Univ of La Plata
Argentina


