[Q-e-developers] error in restarting spin-polarized SCF with QE 5.1.1

Marco Govoni mgovoni at uchicago.edu
Wed Nov 26 22:37:45 CET 2014


Hi, 

Same problem on an O2 molecule (nspin = 2). 

The problem shows up when nspin = 2 and the SCF is interrupted after 3 or more iterations. 
For example, if max_seconds is set so that the code terminates (cleanly) before iteration 4 of the SCF loop is done, the restart tries to read kpoint 3 instead of SCF iteration 3. But this is a gamma_only simulation with spin, so only 2 kpoints are present, and the code crashes. 
If I reduce max_seconds so that the code terminates (cleanly) before iteration 3 is done, the restart tries to read kpoint 2, which exists in this case, so there is no crash. I am not sure that what is read is correct, though. 
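
To make the pattern explicit, here is a toy model of the symptom in Python. It is only a sketch of what I observe, not Quantum ESPRESSO code: the function name is made up, and the idea that the number of completed SCF iterations ends up being used as the kpoint index on restart is my assumption based on the output.

```python
def restart_kpoint_is_readable(completed_scf_iterations, n_kpoints=1, nspin=2):
    """Toy model of the reported symptom (hypothetical, not QE source code)."""
    # Gamma-only run with nspin = 2: one kpoint per spin channel, so 2 entries.
    n_stored = n_kpoints * nspin
    # Assumption: the restart uses the completed-iteration count as kpoint index.
    restart_kpoint = completed_scf_iterations
    # The read can only succeed if that index refers to a stored kpoint.
    return restart_kpoint <= n_stored


# Interrupted before iteration 3 is done: kpoint 2 exists, no crash.
print(restart_kpoint_is_readable(2))
# Interrupted before iteration 4 is done: kpoint 3 does not exist, crash.
print(restart_kpoint_is_readable(3))
```

In this model the first call corresponds to the run that restarts without crashing and the second to the run that crashes.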

Here is the input, scaled down to a molecule for fast debugging. Please vary max_seconds and restart_mode so that you start from_scratch, get a clean interruption after 3 completed SCF iterations, and then try to restart from there. 
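
If it helps, the two inputs needed for the test can be generated from the single file below with a small script. This is a minimal sketch: the helper names are hypothetical, and the value passed for max_seconds is a placeholder that has to be tuned on your machine so that exactly 3 SCF iterations complete before the clean stop.

```python
import re


def set_control_value(text, key, value):
    """Return `text` with the single namelist assignment `key = ...` replaced."""
    # Match "key = <old value>" at the start of a line; commented lines
    # (starting with "!") are not matched.
    pattern = re.compile(
        r"^([ \t]*" + re.escape(key) + r"[ \t]*=[ \t]*)[^,\n]*", re.MULTILINE
    )
    new_text, count = pattern.subn(lambda m: m.group(1) + value, text)
    if count != 1:
        raise ValueError(f"expected exactly one '{key}' assignment, found {count}")
    return new_text


def make_run_pair(template, first_run_seconds):
    """Build the from_scratch input (short max_seconds) and the restart input."""
    first = set_control_value(template, "restart_mode", "'from_scratch'")
    first = set_control_value(first, "max_seconds", str(first_run_seconds))
    second = set_control_value(template, "restart_mode", "'restart'")
    return first, second
```

Run pw.x on the first input, wait for the clean stop, then run it on the second input in the same outdir.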

Thanks for your support. 

Marco 


!
!  Input for O2 molecule triplet ground state.   At the experimental geometry 1.212 angstroms
!
!  Either can fix occupations and the spin state through: tot_magnetization = 2
!     or
!  can allow the system the flexibility to find a local minimum spin state by commenting out tot_magnetization and then uncommenting the following:
!   occupations = 'smearing',
!   degauss     = 0.01D0,
!   smearing    = 'gauss',
!   starting_magnetization(1)=0.7,
!   starting_magnetization(2)=0.7,
!
!
&CONTROL
   calculation  = 'scf',
   verbosity    = 'high',
   outdir       = './',
   pseudo_dir   = './'
   prefix       = 'O2-triplet-PBE-SCF-tm80',
   max_seconds  = 60
   restart_mode = 'restart'
/
&SYSTEM
   nosym       = .TRUE.,
   ibrav       = 1,
   celldm(1)   = 50.d0,
   nspin       = 2,
   nat         = 2,
   ntyp        = 2,
   ecutwfc     = 80,
   tot_magnetization = 2,
   nbnd        = 10,
!   occupations = 'smearing',
!   degauss     = 0.01D0,
!   smearing    = 'gauss',
!   starting_magnetization(1)=0.7,
!   starting_magnetization(2)=0.7,
/
&ELECTRONS
   conv_thr    = 1.D-6,
   mixing_beta = 0.5D0,
/
ATOMIC_SPECIES
 O1   15.999   O.pbe-mt.UPF
 O2   15.999   O.pbe-mt.UPF
ATOMIC_POSITIONS { bohr }
 O1        0.000000000   0.000000000   0.000000000
 O2        2.400000000   0.000000000   0.000000000
K_POINTS { gamma }



--
----------------------------
Marco Govoni, Ph.D.
----------------------------
Institute for Molecular Engineering 
The University of Chicago
5747 South Ellis Avenue 
Chicago, IL 60637 
http://galligroup.uchicago.edu/People/mgovoni.php
----------------------------

On Nov 26, 2014, at 12:43 PM, Marco Govoni <mgovoni at uchicago.edu> wrote:

> Hi, 
> 
> I have problems in restarting the SCF simulation. 
> 
> I’m running a spin-polarized (nspin=2) SCF simulation (besides task and diag, I’m not activating other parallelization levels than R&G division). 
> I set max_seconds and from_scratch, which gives a clean interruption of the (unconverged) SCF cycle after only a few iterations. 
> Then, when I restart, the code crashes with the following message. 
> 
>     Calculation restarted from scf iteration #     4
> 
>     total cpu time spent up to now is       14.3 secs
> 
>     per-process dynamical memory:   131.3 Mb
> 
>     Self-consistent Calculation
> 
>     iteration #  4     ecut=   120.00 Ry     beta=0.20
>     Calculation restarted from kpoint #     3
>     Davidson diagonalization with overlap
>     ethr =  1.00E-02,  avg # of iterations = 11.0
> Application 228717 exit codes: 134
> Application 228717 exit signals: Killed
> 
> This is a gamma_only simulation, so there must be a typo in "kpoint # 3"; maybe it should be the scf iteration. 
> In addition, the code exits, and a trace of the errors gives 
> 
> pw.x               0000000001C1CA89  Unknown               Unknown  Unknown
> pw.x               0000000001C1B35E  Unknown               Unknown  Unknown
> pw.x               0000000001BCF642  Unknown               Unknown  Unknown
> pw.x               0000000001B4B998  Unknown               Unknown  Unknown
> pw.x               0000000001B521B2  Unknown               Unknown  Unknown
> pw.x               0000000000CCBED0  Unknown               Unknown  Unknown
> pw.x               0000000000D787FB  Unknown               Unknown  Unknown
> pw.x               0000000001C38131  Unknown               Unknown  Unknown
> pw.x               0000000001A32B32  Unknown               Unknown  Unknown
> pw.x               0000000001A25710  Unknown               Unknown  Unknown
> pw.x               0000000001A258BD  Unknown               Unknown  Unknown
> pw.x               00000000019E3D43  Unknown               Unknown  Unknown
> pw.x               00000000007F4003  reduce_base_real_         223  mp_base.f90
> pw.x               00000000007E1AFC  mp_mp_mp_sum_rt_         1382  mp.f90
> pw.x               00000000005D17AC  sum_band_IP_sum_b         548  sum_band.f90
> pw.x               00000000005C4982  sum_band_                 123  sum_band.f90
> pw.x               00000000004786EF  electrons_scf_            478  electrons.f90
> pw.x               0000000000475C0D  electrons_                133  electrons.f90
> pw.x               00000000004011BC  run_pwscf_                 90  run_pwscf.f90
> pw.x               0000000000401023  MAIN__                     30  pwscf.f90
> pw.x               0000000000400F76  Unknown               Unknown  Unknown
> pw.x               0000000001C31D81  Unknown               Unknown  Unknown
> pw.x               0000000000400E41  Unknown               Unknown  Unknown
> 
> Let me know. 
> 
> Marco




