[QE-users] HSE calculation for ZnO band structure converges slow

Tan Hengxin tanhx90 at gmail.com
Wed Aug 15 04:43:17 CEST 2018

Dear Users,

I am doing HSE calculations on wurtzite ZnO to obtain the *band structure*.
However, the run is very slow.
I use a 6*6*6 k-mesh and a cutoff energy of 60 Ry. Norm-conserving
pseudopotentials are used, with 12 and 6 valence electrons for Zn
and O respectively.
The job was run on 2 nodes (48 processors) with npool = 2. The run took
about 45 hours.

Does this seem normal? If not, what can be done to reduce the run time? Are
there any issues specific to ZnO that need special attention?
Thanks for your help.

The input parameters and the tail of the output are copied below.
  calculation   = 'scf',
  prefix        = 'ZnO',
  pseudo_dir    = './',
  verbosity     = 'high',
  wf_collect    = .true.,
  etot_conv_thr = 1.0D-6,
  forc_conv_thr = 1.0D-4,
  restart_mode  = 'from_scratch',
  outdir        = './temp_out',
  ibrav         = 0,
  nat           = 4,
  ntyp          = 2,
  ecutwfc       = 60,
  nbnd          = 36,
  input_dft     = 'hse',
  exx_fraction  = 0.25,
  nqx1 = 6, nqx2 = 6, nqx3 = 6,
  mixing_mode   = 'plain',
  mixing_beta   = 0.7,
  conv_thr      = 1.D-8,

 Zn 65.38     Zn_fan_nc_pbe_srl.upf
 O  15.999    O_web_nc_pbe.upf

     3.249         0.000         0.000
    -1.625         2.814         0.000
     0.000         0.000         5.205

Zn    0.333333343         0.666666687         0.000000000
Zn    0.666666627         0.333333313         0.500000000
O     0.333333343         0.666666687         0.382600009
O     0.666666627         0.333333313         0.882600009

K_POINTS {crystal}
  (*The 216 k points with weights (from 6*6*6 k-mesh)*)
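
As a quick consistency check on the setup, here is a minimal Python sketch (not part of the pw.x input; the variable names are only illustrative). It assumes the full 6*6*6 mesh is used without symmetry reduction and that k-points and processors are split evenly over the pools.

```python
# Values taken from the post: 6*6*6 mesh, npool = 2, 48 processors,
# 2 Zn atoms with 12 valence electrons and 2 O atoms with 6 each.
mesh = (6, 6, 6)
npool = 2
nprocs = 48

nk_total = mesh[0] * mesh[1] * mesh[2]   # k-points in the full mesh
nk_per_pool = nk_total // npool          # k-points handled by each pool
procs_per_pool = nprocs // npool         # processors sharing each k-point

n_electrons = 2 * 12 + 2 * 6             # valence electrons in the cell
n_occupied = n_electrons // 2            # doubly occupied bands (insulator)

print(nk_total, nk_per_pool, procs_per_pool, n_electrons, n_occupied)
```

The 108 k-points per pool agree with the 108 calls to wfcinit:atom in the timing report, and 18 occupied bands out of nbnd = 36 means half of the bands are empty.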

!    total energy              =    -269.06624688 Ry
     Harris-Foulkes estimate   =    -277.03747208 Ry
     estimated scf accuracy    <          3.7E-09 Ry

     convergence has been achieved in   1 iterations

!    total energy              =    -269.06624688 Ry
     Harris-Foulkes estimate   =    -269.06624688 Ry
     est. exchange err (dexx)  =       0.00000000 Ry
     - averaged Fock potential =      15.94246020 Ry
     + Fock energy             =      -7.97123500 Ry

     EXX self-consistency reached

     Writing output data file ZnO.save

     init_run     :      1.05s CPU      1.15s WALL (       1 calls)
     electrons    : 150333.27s CPU 152710.02s WALL (       5 calls)

     Called by init_run:
     wfcinit      :      1.01s CPU      1.11s WALL (       1 calls)
     wfcinit:atom :      0.00s CPU      0.00s WALL (     108 calls)
     wfcinit:wfcr :      1.00s CPU      1.03s WALL (     108 calls)
     potinit      :      0.01s CPU      0.02s WALL (       1 calls)

     Called by electrons:
     c_bands      : 150323.40s CPU 152695.15s WALL (      25 calls)
     sum_band     :      7.77s CPU      8.01s WALL (      25 calls)
     v_of_rho     :      0.12s CPU      0.13s WALL (      27 calls)
     v_h          :      0.00s CPU      0.00s WALL (      27 calls)
     v_xc         :      0.12s CPU      0.12s WALL (      27 calls)
     mix_rho      :      0.01s CPU      0.01s WALL (      25 calls)

     Called by c_bands:
     init_us_2    :      0.21s CPU      0.24s WALL (    6480 calls)
     cegterg      : 149815.07s CPU 152046.20s WALL (    2700 calls)

     Called by sum_band:

     Called by *egterg:
     h_psi        : 149780.99s CPU 152011.56s WALL (    9389 calls)
     g_psi        :      0.08s CPU      0.13s WALL (    6581 calls)
     cdiaghg      :     29.34s CPU     29.97s WALL (    8849 calls)
     cegterg:over :      1.80s CPU      1.79s WALL (    6581 calls)
     cegterg:upda :      1.33s CPU      1.31s WALL (    6581 calls)
     cegterg:last :      0.86s CPU      0.84s WALL (    2808 calls)
     cdiaghg:chol :      1.34s CPU      1.48s WALL (    8849 calls)
     cdiaghg:inve :      0.96s CPU      0.98s WALL (    8849 calls)
     cdiaghg:para :      1.74s CPU      1.89s WALL (   17698 calls)

     Called by h_psi:
     h_psi:pot    :     39.93s CPU     40.82s WALL (    9389 calls)
     h_psi:calbec :      1.34s CPU      1.29s WALL (    9389 calls)
     vloc_psi     :     38.33s CPU     39.21s WALL (    9389 calls)
     add_vuspsi   :      0.23s CPU      0.28s WALL (    9389 calls)

     General routines
     calbec       :     30.41s CPU     30.38s WALL (   10361 calls)
     fft          :      0.08s CPU      0.09s WALL (     289 calls)
     fftw         :     44.34s CPU     44.67s WALL (  579174 calls)
     fftc         : 136683.94s CPU 138542.75s WALL (******** calls)
     fftcw        :     26.63s CPU     26.25s WALL (  339050 calls)

     Parallel routines
     fft_scatter  :  82451.93s CPU  77795.40s WALL (******** calls)

     EXX routines
     exx_grid     :      0.10s CPU      0.10s WALL (       1 calls)
     exxinit      :     38.19s CPU    176.06s WALL (       5 calls)
     vexx         : 149740.98s CPU 151970.61s WALL (    5822 calls)
     exxenergy    :   8652.21s CPU   8774.70s WALL (       9 calls)

     PWSCF        :     1d   20h10m CPU        1d   20h56m WALL

   This run was terminated on:   9:57:17  10Jul2017
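
To see where the time goes, here is a small Python sketch that only does arithmetic on the WALL times copied from the report above. It assumes that vexx and exxenergy do not overlap and that fft_scatter is counted inside fftc; both are assumptions about how the timers nest, not something the output states.

```python
# Wall-clock seconds copied from the timing report above
wall = {
    "PWSCF": 1 * 86400 + 20 * 3600 + 56 * 60,  # 1d 20h56m
    "vexx": 151970.61,
    "exxenergy": 8774.70,
    "fftc": 138542.75,
    "fft_scatter": 77795.40,
}

total_hours = wall["PWSCF"] / 3600
# Share of the whole run spent in the exact-exchange routines
exx_share = (wall["vexx"] + wall["exxenergy"]) / wall["PWSCF"]
# Share of the custom-grid FFT time spent in inter-process data exchange
scatter_share = wall["fft_scatter"] / wall["fftc"]

print(f"total run time : {total_hours:.1f} h")
print(f"EXX routines   : {exx_share:.1%} of total")
print(f"fft_scatter    : {scatter_share:.1%} of fftc")
```

Under these assumptions nearly all of the roughly 45 hours is spent in the exact-exchange routines, and more than half of the FFT time is communication between processors rather than computation.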


Department of physics, THU.
Beijing 100084, China
Office: B403,New Science Building
E-mail: tanhx90 at gmail.com
