[QE-users] Error while running QE

Chandan Kumar Choudhury ckchoud at g.clemson.edu
Tue Mar 16 07:13:59 CET 2021


Thank you, Pietro.
With export OMP_NUM_THREADS=1 set, I still get similar errors:

     


     Program PWSCF v.6.7MaX starts on 16Mar2021 at  4:15:30

     This program is part of the open-source Quantum ESPRESSO suite
     for quantum simulation of materials; please cite
         "P. Giannozzi et al., J. Phys.:Condens. Matter 21 395502 (2009);
         "P. Giannozzi et al., J. Phys.:Condens. Matter 29 465901 (2017);
          URL http://www.quantum-espresso.org",
     in publications or presentations arising from this work. More details at
     http://www.quantum-espresso.org/quote

     Parallel version (MPI & OpenMP), running on      48 processor cores
     Number of MPI processes:                48
     Threads/MPI process:                     1

     MPI processes distributed on     1 nodes
     R & G space division:  proc/nbgrp/npool/nimage =      48
     Waiting for input...
     Reading input from standard input
Warning: card &CELL ignored
Warning: card / ignored

     Current dimensions of program PWSCF are:
     Max number of different atomic species (ntypx) = 10
     Max number of k-points (npk) =  40000
     Max angular momentum in pseudopotentials (lmaxx) =  3

     gamma-point specific algorithms are used

     Subspace diagonalization in iterative solution of the eigenvalue problem:
     one sub-group per band group will be used
     scalapack distributed-memory algorithm (size of sub-group:  6*  6 procs)


     Parallelization info
     --------------------
     sticks:   dense  smooth     PW     G-vecs:    dense   smooth      PW
     Min         923     369     91               146631    37087    4636
     Max         926     372     94               146640    37098    4642
     Sum       44393   17785   4445              7038461  1780431  222607



     bravais-lattice index     =            8
     lattice parameter (alat)  =      47.2432  a.u.
     unit-cell volume          =  105442.7265 (a.u.)^3
     number of atoms/cell      =           30
     number of atomic types    =            3
     number of electrons       =        74.00
     number of Kohn-Sham states=           44
     kinetic-energy cutoff     =      25.0000  Ry
     charge density cutoff     =     250.0000  Ry
     scf convergence threshold =      1.0E-06
     mixing beta               =       0.7000
     number of iterations used =            8  plain     mixing
     energy convergence thresh.=      1.0E-04
     force convergence thresh. =      1.0E-03
     Exchange-correlation= PBE
                           (   1   4   3   4   0   0   0)
     nstep                     =          500


     celldm(1)=  47.243153  celldm(2)=   1.000000  celldm(3)=   1.000000
     celldm(4)=   0.000000  celldm(5)=   0.000000  celldm(6)=   0.000000

...
...
free(): invalid next size (fast)
[qm-qe-1:06059] *** Process received signal ***
[qm-qe-1:06059] Signal: Aborted (6)
[qm-qe-1:06059] Signal code:  (-6)
[qm-qe-1:06059] [ 0] /usr/lib64/libpthread.so.0(+0x12b30)[0x7fed3ed71b30]
[qm-qe-1:06059] [ 1] /usr/lib64/libc.so.6(gsignal+0x10f)[0x7fed3e9d384f]
[qm-qe-1:06059] [ 2] /usr/lib64/libc.so.6(abort+0x127)[0x7fed3e9bdc45]
[qm-qe-1:06059] [ 3] /usr/lib64/libc.so.6(+0x7a9d7)[0x7fed3ea169d7]
[qm-qe-1:06059] [ 4] /usr/lib64/libc.so.6(+0x81ddc)[0x7fed3ea1dddc]
[qm-qe-1:06059] [ 5] /usr/lib64/libc.so.6(+0x83778)[0x7fed3ea1f778]
[qm-qe-1:06059] [ 6] /home/chandan_prescience_in/softwares/aocc-compiler-2.3.0/lib/libflang.so(f90_dealloc03a_i8+0xad)[0x7fed4031a9dd]
[qm-qe-1:06059] [ 7] pw.x[0x12db14a]
[qm-qe-1:06059] [ 8] pw.x[0x1248b93]
[qm-qe-1:06059] [ 9] pw.x[0x95cdd0]
[qm-qe-1:06059] [10] pw.x[0x95cba8]
[qm-qe-1:06059] [11] pw.x[0x95cad4]
[qm-qe-1:06059] [12] pw.x[0x95b20b]
[qm-qe-1:06059] [13] pw.x[0x9764e0]
[qm-qe-1:06059] [14] pw.x[0x70091a]
[qm-qe-1:06059] [15] pw.x[0x6fa425]
[qm-qe-1:06059] [16] pw.x[0x72c5ec]
[qm-qe-1:06059] [17] pw.x[0x4caa55]
[qm-qe-1:06059] [18] pw.x[0x1a03326]
[qm-qe-1:06059] [19] /usr/lib64/libc.so.6(__libc_start_main+0xf3)[0x7fed3e9bf803]
[qm-qe-1:06059] [20] pw.x[0x4ca7fe]
[qm-qe-1:06059] *** End of error message ***
--------------------------------------------------------------------------
Primary job  terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun noticed that process rank 0 with PID 0 on node qm-qe-1 exited on signal 6 (Aborted).

   

Please assist.

Thank you!
--

Chandan Kumar Choudhury, PhD
Senior Scientist (Computational Science)
Prescience.in

> On Mar 15, 2021, at 6:13 PM, Pietro Bonfa' <pietro.bonfa at unipr.it> wrote:
> 
> Dear Chandan,
> 
> your problem is likely due to the massive over-subscription of your resources, as shown by the following line
> 
> On 3/15/21 12:55 PM, Chandan Kumar Choudhury wrote:
>>      Parallel version (MPI & OpenMP), running on    2304 processor cores
> 
> 
> I would start with pure MPI parallelism by prepending
> 
> export OMP_NUM_THREADS=1
> 
> in your job scripts.
> 
> Best,
> Pietro
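
For reference, the suggestion above can be sketched as a small job-script fragment. This is only an illustration under stated assumptions: the launcher is taken to be an Open MPI-style mpirun, the node is assumed to have 48 cores, and the file names pw.in and pw.out are hypothetical (the original job script is not shown in the thread). The 2304 "processor cores" reported in the first run would be consistent with 48 MPI processes each starting 48 OpenMP threads (48 x 48 = 2304), which is the over-subscription the check below guards against.

```shell
#!/bin/sh
# Sketch only: NP, CORES, pw.in and pw.out are illustrative assumptions,
# not values taken from the original job script.
NP=48      # MPI processes (matches "Number of MPI processes" in the log)
CORES=48   # cores available on the node (assumed)

# One OpenMP thread per MPI process, i.e. pure MPI parallelism.
export OMP_NUM_THREADS=1

# Total workers = MPI processes x OpenMP threads per process.
TOTAL=$((NP * OMP_NUM_THREADS))
echo "MPI processes: $NP  threads/process: $OMP_NUM_THREADS  total: $TOTAL"

# Refuse to run if the node would be over-subscribed.
if [ "$TOTAL" -gt "$CORES" ]; then
    echo "over-subscribed: $TOTAL workers on $CORES cores" >&2
    exit 1
fi

# Launch only if the executables and the (hypothetical) input file exist.
if command -v mpirun >/dev/null 2>&1 && command -v pw.x >/dev/null 2>&1 \
   && [ -f pw.in ]; then
    mpirun -np "$NP" pw.x < pw.in > pw.out
fi
```

With these values the fragment reports a total of 48 workers, which matches the "running on 48 processor cores" line in the second run. Note that this only rules out over-subscription; the free() abort shown in the newer log still occurs with one thread per process, so it likely has a different cause.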
