[QE-users] [QE-GPU] How to "Fill the CPU with OpenMP threads" to run QE-GPU

Pietro Bonfa' pietro.bonfa at unipr.it
Fri Oct 8 15:50:38 CEST 2021


Dear Anson,

I guess there is something wrong in your scripts.
Look at the following lines taken from your email:

----
Parallel version (MPI & OpenMP), running on     784 processor cores
Number of MPI processes:                28
Threads/MPI process:                    28
MPI processes distributed on     1 nodes
----

That is far too many processes, each spawning far too many threads, for a
single node: 28 MPI ranks x 28 OpenMP threads = 784, while your two 14-core
Xeon Gold 5120 CPUs provide only 28 physical cores in total.

Start simple: one MPI process running on one GPU and no OpenMP threads
(export OMP_NUM_THREADS=1). Once that works, increase the number of threads
step by step; see the sketch below.
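
A minimal sketch of that progression, assuming an OpenMPI-style mpirun, a
GPU-enabled pw.x in the current directory, and the 001.in input file from
your log (the output file name and the thread counts are only examples;
adapt paths and launcher options to your installation):

  # step 1: one MPI rank, one GPU, no OpenMP threading
  export OMP_NUM_THREADS=1
  mpirun -np 1 ./pw.x -input 001.in > 001.out

  # step 2: still one MPI rank per GPU, but fill the otherwise idle CPU
  # cores with OpenMP threads (e.g. one socket's 14 physical cores first,
  # then up to all 28 if it keeps helping)
  export OMP_NUM_THREADS=14
  mpirun -np 1 ./pw.x -input 001.in > 001.out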

Kind regards,
Pietro




On 10/8/21 15:17, Anson Thomas wrote:
> I am trying to run GPU-enabled QE (QE 6.8) on Ubuntu 18.04.5 LTS
> (GNU/Linux 4.15.0-135-generic x86_64). System configuration: Processor:
> Intel Xeon Gold 5120 CPU @ 2.20 GHz (2 processors); RAM: 96 GB; HDD: 6 TB;
> Graphics card: NVIDIA Quadro P5000 (16 GB).
>
> I am able to run small jobs (estimated dynamical RAM ~1 GB) successfully.
> However, when going to larger systems (still less than 16 GB), the output
> abruptly stops during the first iteration (attached below):
>
>       Program PWSCF v.6.8 starts on  8Oct2021 at 10:33:9
>
>       This program is part of the open-source Quantum ESPRESSO suite
>       for quantum simulation of materials; please cite
>           "P. Giannozzi et al., J. Phys.:Condens. Matter 21 395502 (2009);
>           "P. Giannozzi et al., J. Phys.:Condens. Matter 29 465901 (2017);
>           "P. Giannozzi et al., J. Chem. Phys. 152 154105 (2020);
>            URL http://www.quantum-espresso.org
>
>       in publications or presentations arising from this work. More
>       details at http://www.quantum-espresso.org/quote
>       Parallel version (MPI & OpenMP), running on     784 processor cores
>       Number of MPI processes:                28
>       Threads/MPI process:                    28
>
>       MPI processes distributed on     1 nodes
>       R & G space division:  proc/nbgrp/npool/nimage =      28
>       43440 MiB available memory on the printing compute node when the
> environment starts
>       Reading input from 001.in
>       Warning: card &CELL ignored
>       Warning: card / ignored
>
>       Current dimensions of program PWSCF are:
>       Max number of different atomic species (ntypx) = 10
>       Max number of k-points (npk) =  40000
>       Max angular momentum in pseudopotentials (lmaxx) =  4
>       file Ti.pbe-spn-rrkjus_psl.1.0.0.upf: wavefunction(s)  3S 3D
> renormalized
>
>       gamma-point specific algorithms are used
>       Found symmetry operation: I + ( -0.0000 -0.5000  0.0000)
>       This is a supercell, fractional translations are disabled
>
>       Subspace diagonalization in iterative solution of the eigenvalue
> problem:
>       a serial algorithm will be used
>
>       Parallelization info
>       --------------------
>       sticks:   dense  smooth     PW     G-vecs:    dense   smooth      PW
>       Min         637     232     57                81572    18102    2258
>       Max         640     234     60                81588    18118    2266
>       Sum       17865    6549   1633              2284245   507201   63345
>       Using Slab Decomposition
>
>
>       bravais-lattice index     =           14
>       lattice parameter (alat)  =      21.0379  a.u.
>       unit-cell volume          =    9204.2807 (a.u.)^3
>       number of atoms/cell      =           36
>       number of atomic types    =            2
>       number of electrons       =       288.00
>       number of Kohn-Sham states=          173
>       kinetic-energy cutoff     =      55.0000  Ry
>       charge density cutoff     =     600.0000  Ry
>       scf convergence threshold =      1.0E-06
>       mixing beta               =       0.4000
>       number of iterations used =            8  local-TF  mixing
>       energy convergence thresh.=      1.0E-04
>       force convergence thresh. =      1.0E-03
>       Exchange-correlation= PBE
>                             (   1   4   3   4   0   0   0)
>       nstep                     =          500
>
>
>       GPU acceleration is ACTIVE.
>
>       Message from routine print_cuda_info:
>       High GPU oversubscription detected. Are you sure this is what you
> want?
>
>       GPU used by master process:
>
>          Device Number: 0
>          Device name: Quadro P5000
>          Compute capability : 61
>          Ratio of single to double precision performance  : 32
>          Memory Clock Rate (KHz): 4513000
>          Memory Bus Width (bits): 256
>          Peak Memory Bandwidth (GB/s): 288.83
>
>       celldm(1)=  21.037943  celldm(2)=   1.000000  celldm(3)=   2.419041
>       celldm(4)=  -0.766650  celldm(5)=  -0.766650  celldm(6)=   0.533303
>
>       crystal axes: (cart. coord. in units of alat)
>                 a(1) = (   1.000000   0.000000   0.000000 )
>                 a(2) = (   0.533303   0.845924   0.000000 )
>                 a(3) = (  -1.854558  -1.023161   1.168553 )
>
>       reciprocal axes: (cart. coord. in units 2 pi/alat)
>                 b(1) = (  1.000000 -0.630438  1.035056 )
>                 b(2) = ( -0.000000  1.182139  1.035056 )
>                 b(3) = (  0.000000  0.000000  0.855759 )
>
>
>       PseudoPot. # 1 for Ti read from file:
>       ../Ti.pbe-spn-rrkjus_psl.1.0.0.upf
>       MD5 check sum: e281089c08e14b8efcf92e44a67ada65
>       Pseudo is Ultrasoft + core correction, Zval = 12.0
>       Generated using "atomic" code by A. Dal Corso  v.6.2.2
>       Using radial grid of 1177 points,  6 beta functions with:
>                  l(1) =   0
>                  l(2) =   0
>                  l(3) =   1
>                  l(4) =   1
>                  l(5) =   2
>                  l(6) =   2
>       Q(r) pseudized with 0 coefficients
>
>
>       PseudoPot. # 2 for O  read from file:
>       ../O.pbe-n-rrkjus_psl.1.0.0.upf
>       MD5 check sum: 91400c9766925bcf19f520983a725ff0
>       Pseudo is Ultrasoft + core correction, Zval =  6.0
>       Generated using "atomic" code by A. Dal Corso  v.6.3MaX
>       Using radial grid of 1095 points,  4 beta functions with:
>                  l(1) =   0
>                  l(2) =   0
>                  l(3) =   1
>                  l(4) =   1
>       Q(r) pseudized with 0 coefficients
>
>
>       atomic species   valence    mass     pseudopotential
>          Ti            12.00    47.86700     Ti( 1.00)
>          O              6.00    15.99940     O ( 1.00)
>
>       Starting magnetic structure
>       atomic species   magnetization
>          Ti           0.200
>          O            0.000
>
>       No symmetry found
>
>
>                                      s                        frac. trans.
>
>        isym =  1     identity
>
>   cryst.   s( 1) = (     1          0          0      )
>                    (     0          1          0      )
>                    (     0          0          1      )
>
>   cart.    s( 1) = (  1.0000000  0.0000000  0.0000000 )
>                    (  0.0000000  1.0000000  0.0000000 )
>                    (  0.0000000  0.0000000  1.0000000 )
>
>
>       point group C_1 (1)
>       there are  1 classes
>       the character table:
>
>         E
> A      1.00
>
>       the symmetry operations in each class and the name of the first
> element:
>
>       E        1
>            identity
>
>     Cartesian axes
>
>       site n.     atom                  positions (alat units)
>           1           O   tau(   1) = (  -0.8353365  -0.5987815
>   0.7050395  )
>           2           Ti  tau(   2) = (  -0.6772809  -0.5115821
>   0.7050395  )
>           3           O   tau(   3) = (  -0.5192254  -0.4243827
>   0.7050395  )
>           4           Ti  tau(   4) = (  -0.9272815  -0.5115821
>   0.5842738  )
>           5           O   tau(   5) = (  -0.7692260  -0.4243827
>   0.5842738  )
>           6           O   tau(   6) = (  -0.3186838  -0.1758181
>   0.5842738  )
>           7           O   tau(   7) = (  -0.4520098  -0.3872999
>   0.4635080  )
>           8           Ti  tau(   8) = (  -0.2939543  -0.3001004
>   0.4635080  )
>           9           O   tau(   9) = (  -0.1358987  -0.2129011
>   0.4635080  )
>          10           O   tau(  10) = (  -0.5686844  -0.1758181
>   0.7050395  )
>          11           Ti  tau(  11) = (  -0.4106289  -0.0886188
>   0.7050395  )
>          12           O   tau(  12) = (  -0.2525734  -0.0014194
>   0.7050395  )
>          13           Ti  tau(  13) = (  -0.6606296  -0.0886188
>   0.5842738  )
>          14           O   tau(  14) = (  -0.5025740  -0.0014194
>   0.5842738  )
>          15           O   tau(  15) = (  -0.0520318   0.2471452
>   0.5842738  )
>          16           O   tau(  16) = (  -0.1853578   0.0356635
>   0.4635080  )
>          17           Ti  tau(  17) = (  -0.0273023   0.1228629
>   0.4635080  )
>          18           O   tau(  18) = (   0.1307533   0.2100623
>   0.4635080  )
>          19           O   tau(  19) = (  -0.3353351  -0.5987815
>   0.7050395  )
>          20           Ti  tau(  20) = (  -0.1772797  -0.5115821
>   0.7050395  )
>          21           O   tau(  21) = (  -0.0192241  -0.4243827
>   0.7050395  )
>          22           Ti  tau(  22) = (  -0.4272803  -0.5115821
>   0.5842738  )
>          23           O   tau(  23) = (  -0.2692247  -0.4243827
>   0.5842738  )
>          24           O   tau(  24) = (   0.1813175  -0.1758181
>   0.5842738  )
>          25           O   tau(  25) = (   0.0479915  -0.3872999
>   0.4635080  )
>          26           Ti  tau(  26) = (   0.2060470  -0.3001004
>   0.4635080  )
>          27           O   tau(  27) = (   0.3641026  -0.2129011
>   0.4635080  )
>          28           O   tau(  28) = (  -0.0686832  -0.1758181
>   0.7050395  )
>          29           Ti  tau(  29) = (   0.0893724  -0.0886188
>   0.7050395  )
>          30           O   tau(  30) = (   0.2474280  -0.0014194
>   0.7050395  )
>          31           Ti  tau(  31) = (  -0.1606282  -0.0886188
>   0.5842738  )
>          32           O   tau(  32) = (  -0.0025728  -0.0014194
>   0.5842738  )
>          33           O   tau(  33) = (   0.4479695   0.2471452
>   0.5842738  )
>          34           O   tau(  34) = (   0.3146435   0.0356635
>   0.4635080  )
>          35           Ti  tau(  35) = (   0.4726991   0.1228629
>   0.4635080  )
>          36           O   tau(  36) = (   0.6307546   0.2100623
>   0.4635080  )
>
>     Crystallographic axes
>
>       site n.     atom                  positions (cryst. coord.)
>           1           O   tau(   1) = (  0.2719137  0.0219125  0.6033439  )
>           2           Ti  tau(   2) = (  0.3749954  0.1249943  0.6033439  )
>           3           O   tau(   3) = (  0.4780771  0.2280761  0.6033439  )
>           4           Ti  tau(   4) = ( -0.0000046 -0.0000050  0.4999975  )
>           5           O   tau(   5) = (  0.1030772  0.1030768  0.4999975  )
>           6           O   tau(   6) = (  0.3969147  0.3969146  0.4999975  )
>           7           O   tau(   7) = (  0.2719156  0.0219145  0.3966511  )
>           8           Ti  tau(   8) = (  0.3749973  0.1249964  0.3966511  )
>           9           O   tau(   9) = (  0.4780790  0.2280781  0.3966511  )
>          10           O   tau(  10) = (  0.2719134  0.5219140  0.6033439  )
>          11           Ti  tau(  11) = (  0.3749952  0.6249957  0.6033439  )
>          12           O   tau(  12) = (  0.4780769  0.7280775  0.6033439  )
>          13           Ti  tau(  13) = ( -0.0000048  0.4999964  0.4999975  )
>          14           O   tau(  14) = (  0.1030769  0.6030781  0.4999975  )
>          15           O   tau(  15) = (  0.3969145  0.8969160  0.4999975  )
>          16           O   tau(  16) = (  0.2719153  0.5219160  0.3966511  )
>          17           Ti  tau(  17) = (  0.3749970  0.6249978  0.3966511  )
>          18           O   tau(  18) = (  0.4780787  0.7280796  0.3966511  )
>          19           O   tau(  19) = (  0.7719150  0.0219125  0.6033439  )
>          20           Ti  tau(  20) = (  0.8749966  0.1249943  0.6033439  )
>          21           O   tau(  21) = (  0.9780784  0.2280761  0.6033439  )
>          22           Ti  tau(  22) = (  0.4999967 -0.0000050  0.4999975  )
>          23           O   tau(  23) = (  0.6030784  0.1030768  0.4999975  )
>          24           O   tau(  24) = (  0.8969160  0.3969146  0.4999975  )
>          25           O   tau(  25) = (  0.7719169  0.0219145  0.3966511  )
>          26           Ti  tau(  26) = (  0.8749985  0.1249964  0.3966511  )
>          27           O   tau(  27) = (  0.9780803  0.2280781  0.3966511  )
>          28           O   tau(  28) = (  0.7719147  0.5219140  0.6033439  )
>          29           Ti  tau(  29) = (  0.8749965  0.6249957  0.6033439  )
>          30           O   tau(  30) = (  0.9780782  0.7280775  0.6033439  )
>          31           Ti  tau(  31) = (  0.4999965  0.4999964  0.4999975  )
>          32           O   tau(  32) = (  0.6030782  0.6030781  0.4999975  )
>          33           O   tau(  33) = (  0.8969158  0.8969160  0.4999975  )
>          34           O   tau(  34) = (  0.7719166  0.5219160  0.3966511  )
>          35           Ti  tau(  35) = (  0.8749983  0.6249978  0.3966511  )
>          36           O   tau(  36) = (  0.9780801  0.7280796  0.3966511  )
>
>       number of k points=     1  Gaussian smearing, width (Ry)=  0.0100
>                         cart. coord. in units 2pi/alat
>          k(    1) = (   0.0000000   0.0000000   0.0000000), wk =   1.0000000
>
>                         cryst. coord.
>          k(    1) = (   0.0000000   0.0000000   0.0000000), wk =   1.0000000
>
>       Dense  grid:  1142123 G-vectors     FFT dimensions: ( 180, 180, 400)
>
>       Smooth grid:   253601 G-vectors     FFT dimensions: ( 100, 100, 243)
>
>       Dynamical RAM for                 wfc:       2.99 MB
>
>       Dynamical RAM for     wfc (w. buffer):       2.99 MB
>
>       Dynamical RAM for           str. fact:       1.24 MB
>
>       Dynamical RAM for           local pot:       0.00 MB
>
>       Dynamical RAM for          nlocal pot:       7.05 MB
>
>       Dynamical RAM for                qrad:       3.93 MB
>
>       Dynamical RAM for          rho,v,vnew:      25.98 MB
>
>       Dynamical RAM for               rhoin:       8.66 MB
>
>       Dynamical RAM for           G-vectors:       2.40 MB
>
>       Dynamical RAM for          h,s,v(r/c):       2.74 MB
>
>       Dynamical RAM for          <psi|beta>:       0.54 MB
>
>       Dynamical RAM for                 psi:       5.98 MB
>
>       Dynamical RAM for                hpsi:       5.98 MB
>
>       Dynamical RAM for                spsi:       5.98 MB
>
>       Dynamical RAM for      wfcinit/wfcrot:       8.53 MB
>
>       Dynamical RAM for           addusdens:     131.34 MB
>
>       Dynamical RAM for          addusforce:     160.16 MB
>
>       Estimated static dynamical RAM per process >      76.37 MB
>
>       Estimated max dynamical RAM per process >     236.53 MB
>
>       Estimated total dynamical RAM >       6.47 GB
>
>       Check: negative core charge=   -0.000001
>       Generating pointlists ...
>       new r_m :   0.0722 (alat units)  1.5191 (a.u.) for type    1
>       new r_m :   0.0722 (alat units)  1.5191 (a.u.) for type    2
>
>       Initial potential from superposition of free atoms
>
>       starting charge  287.98222, renormalised to  288.00000
>
>       negative rho (up, down):  9.119E-05 6.477E-05
>       Starting wfcs are  216 randomized atomic wfcs
>
>       total cpu time spent up to now is       14.0 secs
>
>       Self-consistent Calculation
> [tb_dev] Currently allocated     2.23E+01 Mbytes, locked:    0 /   9
> [tb_pin] Currently allocated     0.00E+00 Mbytes, locked:    0 /   0
>
>       iteration #  1     ecut=    55.00 Ry     beta= 0.40
>       Davidson diagonalization with overlap
>
> ---- Real-time Memory Report at c_bands before calling an iterative solver
>             980 MiB given to the printing process from OS
>               0 MiB allocation reported by mallinfo(arena+hblkhd)
>           32000 MiB available memory on the node where the printing
> process lives
>       GPU memory used/free/total (MiB): 11117 / 5152 / 16270
> ------------------
>       ethr =  1.00E-02,  avg # of iterations =  1.5
> The CRASH file generated says
>
>   %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
>       task #        24
>       from  addusdens_gpu  : error #         1
>        cannot allocate aux2_d
>   %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
>
>
>   %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
>       task #        14
>       from  addusdens_gpu  : error #         1
>        cannot allocate aux2_d
>   %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
>
>
>   %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
>       task #         5
>       from  addusdens_gpu  : error #         1
>        cannot allocate aux2_d
>   %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
>
>
>   %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
>       task #         7
>       from  addusdens_gpu  : error #         1
>        cannot allocate aux2_d
>   %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
>
>
>   %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
>       task #        15
>       from  addusdens_gpu  : error #         1
>        cannot allocate aux2_d
>   %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
>
>
>   %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
>       task #        17
>       from  addusdens_gpu  : error #         1
>        cannot allocate aux2_d
>   %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
>
>
>   %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
>       task #        10
>       from  addusdens_gpu  : error #         1
>        cannot allocate aux2_d
>   %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
>
>
>   %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
>       task #         9
>       from  addusdens_gpu  : error #         1
>        cannot allocate aux2_d
>   %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
>
>
>   %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
>       task #        12
>       from  addusdens_gpu  : error #         1
>        cannot allocate aux2_d
>   %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
>
>
>   %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
>       task #         4
>       from  addusdens_gpu  : error #         1
>        cannot allocate aux2_d
>   %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
>
>
>   %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
>       task #        13
>       from  addusdens_gpu  : error #         1
>        cannot allocate aux2_d
>   %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
>
>
>   %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
>       task #        19
>       from  addusdens_gpu  : error #         1
>        cannot allocate aux2_d
>   %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
>
> Using -ndiag 1 and -ntg 1 with pw.x also gave a similar output, with the
> following additional lines:
>
>       negative rho (up, down):  9.119E-05 6.477E-05
>       Starting wfcs are  216 randomized atomic wfcs
>
>       total cpu time spent up to now is       11.9 secs
>
>       Self-consistent Calculation
> [tb_dev] Currently allocated     3.21E+01 Mbytes, locked:    0 /   9
> [tb_pin] Currently allocated     0.00E+00 Mbytes, locked:    0 /   0
>
>       iteration #  1     ecut=    55.00 Ry     beta= 0.40
>       Davidson diagonalization with overlap
>
> ---- Real-time Memory Report at c_bands before calling an iterative solver
>            1036 MiB given to the printing process from OS
>               0 MiB allocation reported by mallinfo(arena+hblkhd)
>           36041 MiB available memory on the node where the printing
> process lives
>       GPU memory used/free/total (MiB): 8915 / 7354 / 16270
> ------------------
>       ethr =  1.00E-02,  avg # of iterations =  1.5
> 0: ALLOCATE: 156244752 bytes requested; status = 2(out of memory)
> 0: ALLOCATE: 156239280 bytes requested; status = 2(out of memory)
> 0: ALLOCATE: 156239280 bytes requested; status = 2(out of memory)
> 0: ALLOCATE: 156244752 bytes requested; status = 2(out of memory)
> 0: ALLOCATE: 156239280 bytes requested; status = 2(out of memory)
> 0: ALLOCATE: 156239280 bytes requested; status = 2(out of memory)
> 0: ALLOCATE: 156244752 bytes requested; status = 2(out of memory)
> 0: ALLOCATE: 156244752 bytes requested; status = 2(out of memory)
> 0: ALLOCATE: 156244752 bytes requested; status = 2(out of memory)
> 0: ALLOCATE: 156244752 bytes requested; status = 2(out of memory)
> 0: ALLOCATE: 156239280 bytes requested; status = 2(out of memory)
> 0: ALLOCATE: 156239280 bytes requested; status = 2(out of memory)
> 0: ALLOCATE: 156244752 bytes requested; status = 2(out of memory)
> 0: ALLOCATE: 156239280 bytes requested; status = 2(out of memory)
> 0: ALLOCATE: 156244752 bytes requested; status = 2(out of memory)
> 0: ALLOCATE: 156239280 bytes requested; status = 2(out of memory)
> --------------------------------------------------------------------------
> Primary job  terminated normally, but 1 process returned
> a non-zero exit code. Per user-direction, the job has been aborted.
> --------------------------------------------------------------------------
> --------------------------------------------------------------------------
> mpirun detected that one or more processes exited with non-zero status,
> thus causing
> the job to be terminated. The first process to do so was:
>
>    Process name: [[58344,1],12]
>    Exit code:    127
> --------------------------------------------------------------------------
> I believe I am not "filling the CPUs with OpenMP threads", or running 1
> MPI on 1 GPU, as suggested in this document.
>
> Can someone please give some suggestions? Sorry for the long post; I am
> totally new to this field. Any help would be appreciated. Thanks in advance!
> --
> Sent by *ANSON THOMAS*
> *M.Sc. Chemistry, IIT Roorkee, India*
>
> _______________________________________________
> Quantum ESPRESSO is supported by MaX (www.max-centre.eu)
> users mailing list users at lists.quantum-espresso.org
> https://lists.quantum-espresso.org/mailman/listinfo/users
>



