[QE-users] HSE calculation for ZnO band structure converges slow

Ari P Seitsonen Ari.P.Seitsonen at iki.fi
Wed Aug 15 16:13:05 CEST 2018


Dear Tan,

   Adding to the answer of Stefano Baroni (only the latest versions of QE
have the ACE algorithm implemented, which speeds up the calculations), two
notes. First, you are probably not interested in keeping the natural
symmetries of the lattice (avoiding the symmetrisation might indeed make
sense when calculating a band structure with hybrid functionals), but
giving only three decimals for the lattice parameter a[1,1] = 2.814 (Å)
ensures that the hexagonal symmetry is not found. It would be easier to
provide 'ibrav = 4 / a = 3.249 / c = 5.205' than the explicit lattice
vectors, and you could still avoid the symmetrisation with the keyword
'nosym'. Secondly, the cut-off energy could still be insufficient: at
least with Troullier-Martins-type pseudopotentials, 80 Ry or more is
needed for good convergence, although the band structure may converge at
a lower cut-off.
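
   As a rough check of the first point, here is a minimal Python sketch
(not part of the original input; the variable names are only
illustrative) that compares the second lattice vector as typed in
CELL_PARAMETERS with the ideal hexagonal one, a * (-1/2, sqrt(3)/2), for
a = 3.249 Å:

```python
import math

a = 3.249  # lattice parameter in Å, taken from the posted input

# Second in-plane lattice vector as typed in CELL_PARAMETERS (three decimals)
typed = (-1.625, 2.814)

# The same vector for an ideal hexagonal cell: a * (-1/2, sqrt(3)/2)
ideal = (-0.5 * a, 0.5 * math.sqrt(3.0) * a)

# Length of the typed vector and its angle to a1 = (a, 0)
typed_len = math.hypot(*typed)
typed_angle = math.degrees(math.atan2(typed[1], typed[0]))

print(f"ideal a2     = ({ideal[0]:.6f}, {ideal[1]:.6f}) Å")
print(f"|a2| - |a1|  = {typed_len - a:.2e} Å")
print(f"angle(a1,a2) = {typed_angle:.4f} deg (ideal: 120)")
```

The differences are only of the order of 1e-4 Å, but they are not zero,
so the cell as typed is not exactly hexagonal.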

   About other parameters, 'ecutfock' might be useful too; I do not
know/remember which method is the default for treating the divergence
in the Fourier transform of the Fock operator, but that probably does not
affect the computing time.

     Greetings from Sunny Paris,

        apsi

-=*=-=*=-=*=-=*=-=*=-=*=-=*=-=*=-=*=-=*=-=*=-=*=-=*=-=*=-=*=-=*=-=*=-=*=-=*=-
   Ari Paavo Seitsonen / Ari.P.Seitsonen at iki.fi / http://www.iki.fi/~apsi/
     Ecole Normale Supérieure (ENS), Département de Chimie, Paris
     Mobile (F) : +33 789 37 24 25    (CH) : +41 79 71 90 935


On Wed, 15 Aug 2018, Tan Hengxin wrote:

> Dear Users,
> 
> I am doing HSE calculations on wurtzite ZnO for the band structure.
> However, the run converges slowly:
> I use a 6*6*4 k-mesh. The cutoff energy is 60 Ry. Norm-conserving pseudopotentials are used, and there are 12 and 6 valence electrons for
> Zn and O respectively.
> The job was run with 2 nodes (48 processors) with npool = 2. The run takes 45 hours.
> 
> Does this seem normal? If not, what could be done to reduce the run time? Are there any tricks that need special attention for ZnO?
> Thanks for your help.
> 
> The input parameters and the tail of the output are copied below.
> input:
> &CONTROL
>   calculation   = 'scf',
>   prefix        = 'ZnO',
>   pseudo_dir    = './',
>   verbosity     = 'high',
>   wf_collect    = .true.
>   etot_conv_thr = 1.0D-6,
>   forc_conv_thr = 1.0D-4,
>   restart_mode  = 'from_scratch',
>   outdir        = './temp_out',
> /
> &SYSTEM
>   ibrav         = 0,
>   nat           = 4,
>   ntyp          = 2,
>   ecutwfc       = 60,
>   nbnd          = 36,
>   input_dft     = 'hse',
>   exx_fraction  = 0.25,
>   nqx1 = 6, nqx2 = 6, nqx3 = 6,
> /
> &ELECTRONS
>   mixing_mode   = 'plain',
>   mixing_beta   = 0.7,
>   conv_thr      = 1.D-8,
> /
> 
> ATOMIC_SPECIES
>  Zn 65.38     Zn_fan_nc_pbe_srl.upf
>  O  15.999    O_web_nc_pbe.upf
> 
> CELL_PARAMETERS (angstrom)
>      3.249         0.000         0.000
>     -1.625         2.814         0.000
>      0.000         0.000         5.205
> 
> ATOMIC_POSITIONS (crystal)
> Zn    0.333333343         0.666666687         0.000000000
> Zn    0.666666627         0.333333313         0.500000000
> O     0.333333343         0.666666687         0.382600009
> O     0.666666627         0.333333313         0.882600009
> 
> K_POINTS {crystal}
> 216
>   (The 216 k points with weights (from 6*6*6 k-mesh))
> output:
> ......
> 
> !    total energy              =    -269.06624688 Ry
>      Harris-Foulkes estimate   =    -277.03747208 Ry
>      estimated scf accuracy    <          3.7E-09 Ry
> 
>      convergence has been achieved in   1 iterations
> 
> !    total energy              =    -269.06624688 Ry
>      Harris-Foulkes estimate   =    -269.06624688 Ry
>      est. exchange err (dexx)  =       0.00000000 Ry
>      - averaged Fock potential =      15.94246020 Ry
>      + Fock energy             =      -7.97123500 Ry
> 
>      EXX self-consistency reached
> 
>      Writing output data file ZnO.save
> 
>      init_run     :      1.05s CPU      1.15s WALL (       1 calls)
>      electrons    : 150333.27s CPU 152710.02s WALL (       5 calls)
> 
>      Called by init_run:
>      wfcinit      :      1.01s CPU      1.11s WALL (       1 calls)
>      wfcinit:atom :      0.00s CPU      0.00s WALL (     108 calls)
>      wfcinit:wfcr :      1.00s CPU      1.03s WALL (     108 calls)
>      potinit      :      0.01s CPU      0.02s WALL (       1 calls)
> 
>      Called by electrons:
>      c_bands      : 150323.40s CPU 152695.15s WALL (      25 calls)
>      sum_band     :      7.77s CPU      8.01s WALL (      25 calls)
>      v_of_rho     :      0.12s CPU      0.13s WALL (      27 calls)
>      v_h          :      0.00s CPU      0.00s WALL (      27 calls)
>      v_xc         :      0.12s CPU      0.12s WALL (      27 calls)
>      mix_rho      :      0.01s CPU      0.01s WALL (      25 calls)
> 
>      Called by c_bands:
>      init_us_2    :      0.21s CPU      0.24s WALL (    6480 calls)
>      cegterg      : 149815.07s CPU 152046.20s WALL (    2700 calls)
> 
>      Called by sum_band:
> 
>      Called by *egterg:
>      h_psi        : 149780.99s CPU 152011.56s WALL (    9389 calls)
>      g_psi        :      0.08s CPU      0.13s WALL (    6581 calls)
>      cdiaghg      :     29.34s CPU     29.97s WALL (    8849 calls)
>      cegterg:over :      1.80s CPU      1.79s WALL (    6581 calls)
>      cegterg:upda :      1.33s CPU      1.31s WALL (    6581 calls)
>      cegterg:last :      0.86s CPU      0.84s WALL (    2808 calls)
>      cdiaghg:chol :      1.34s CPU      1.48s WALL (    8849 calls)
>      cdiaghg:inve :      0.96s CPU      0.98s WALL (    8849 calls)
>      cdiaghg:para :      1.74s CPU      1.89s WALL (   17698 calls)
> 
>      Called by h_psi:
>      h_psi:pot    :     39.93s CPU     40.82s WALL (    9389 calls)
>      h_psi:calbec :      1.34s CPU      1.29s WALL (    9389 calls)
>      vloc_psi     :     38.33s CPU     39.21s WALL (    9389 calls)
>      add_vuspsi   :      0.23s CPU      0.28s WALL (    9389 calls)
> 
>      General routines
>      calbec       :     30.41s CPU     30.38s WALL (   10361 calls)
>      fft          :      0.08s CPU      0.09s WALL (     289 calls)
>      fftw         :     44.34s CPU     44.67s WALL (  579174 calls)
>      fftc         : 136683.94s CPU 138542.75s WALL (******** calls)
>      fftcw        :     26.63s CPU     26.25s WALL (  339050 calls)
> 
>      Parallel routines
>      fft_scatter  :  82451.93s CPU  77795.40s WALL (******** calls)
> 
>      EXX routines
>      exx_grid     :      0.10s CPU      0.10s WALL (       1 calls)
>      exxinit      :     38.19s CPU    176.06s WALL (       5 calls)
>      vexx         : 149740.98s CPU 151970.61s WALL (    5822 calls)
>      exxenergy    :   8652.21s CPU   8774.70s WALL (       9 calls)
> 
>      PWSCF        :     1d   20h10m CPU        1d   20h56m WALL
> 
> 
>    This run was terminated on:   9:57:17  10Jul2017
> 
> =------------------------------------------------------------------------------=
>    JOB DONE.
> =------------------------------------------------------------------------------=
> 
> Tan,
>  Hengxin
> Department of physics, THU.
> Beijing 100084, China
> Office: B403,New Science Building
> E-mail:tanhx90 at gmail.com
> 
>

