[Q-e-developers] error in restarting spin-polarized SCF with QE 5.1.1
Paolo Giannozzi
paolo.giannozzi at uniud.it
Thu Nov 27 22:34:39 CET 2014
I can't reproduce the problem you mention (at least not
with a smaller cell size: 50 a.u. doesn't fit into my PC).
There are some "fluctuations" in the number of iterations
that suggest that the restarting algorithm may not be perfect,
but I don't get any error (at least in serial and parallel
execution on 2 processors).
P.
On Wed, 2014-11-26 at 15:37 -0600, Marco Govoni wrote:
> Hi,
>
> Same problem on a O2 molecule (nspin = 2).
>
> The problem shows up when nspin = 2 and the SCF is interrupted after >= 3 iterations.
> For example, if max_seconds is set so that the code terminates (cleanly) before iteration 4 of the SCF loop completes, the restart tries to read kpoint 3 instead of SCF iteration 3. But this is a gamma_only simulation with spin, so only 2 kpoints are present, and the code crashes.
> If I reduce max_seconds so that the code terminates (cleanly) before iteration 3 completes, the restart tries to read kpoint 2, which exists in this case, so there is no crash. I am not sure that what is read is correct, though.
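The index bookkeeping described above can be illustrated with a minimal sketch. It assumes, as stated in the report, that with nspin = 2 each spin channel is stored as its own k-point, so a gamma_only run has 2 k-point slots. The function names are hypothetical and are not taken from the pw.x sources.

```python
def count_kpoint_slots(n_kpoints, nspin):
    """Return the number of k-point slots, assuming that for nspin = 2
    each spin channel is stored as a separate k-point."""
    if nspin not in (1, 2):
        raise ValueError("this sketch only covers nspin = 1 or 2")
    return n_kpoints * nspin


def restart_index_fits(ik_restart, n_kpoints, nspin):
    """True if a 1-based restart k-point index refers to an existing slot."""
    return 1 <= ik_restart <= count_kpoint_slots(n_kpoints, nspin)


if __name__ == "__main__":
    # Gamma-only run with spin: one k-point, two spin channels.
    for ik in (2, 3):
        status = "fits" if restart_index_fits(ik, 1, 2) else "is out of range"
        print(f"restart index {ik} {status}")
```

Under this assumption, a restart index of 2 is accepted and an index of 3 is not, which matches the two cases reported: no crash after an early interruption, a crash after 3 completed iterations.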
>
> Alright, here is the input, scaled down to a molecule for fast debugging. Please try varying max_seconds and restart_mode so that you start from_scratch, get a clean interruption after 3 completed SCF iterations, and then try to restart from there.
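One way to prepare the two runs described above (first from_scratch with a short max_seconds, then restart) is sketched below. It only rewrites the text of the input file and does not run pw.x. The helper names and the short template are hypothetical; the value of max_seconds needed to stop after exactly 3 iterations depends on the machine and must be found by trial.

```python
import re


def set_namelist_value(input_text, key, value):
    """Replace the value of a 'key = value' line in a namelist-style input.

    Raises ValueError unless the key occurs exactly once.
    """
    pattern = re.compile(
        rf"^([ \t]*{re.escape(key)}[ \t]*=[ \t]*)([^,\n]*?)([ \t]*,?[ \t]*)$",
        re.MULTILINE,
    )
    new_text, count = pattern.subn(
        lambda m: f"{m.group(1)}{value}{m.group(3)}", input_text
    )
    if count != 1:
        raise ValueError(f"expected exactly one '{key}' line, found {count}")
    return new_text


def make_run_pair(input_text, first_max_seconds, second_max_seconds):
    """Return (from_scratch input, restart input) built from one template."""
    first = set_namelist_value(input_text, "restart_mode", "'from_scratch'")
    first = set_namelist_value(first, "max_seconds", str(first_max_seconds))
    second = set_namelist_value(input_text, "restart_mode", "'restart'")
    second = set_namelist_value(second, "max_seconds", str(second_max_seconds))
    return first, second


if __name__ == "__main__":
    # Shortened stand-in for the &CONTROL namelist of the full input.
    template = (
        "&CONTROL\n"
        " calculation = 'scf',\n"
        " max_seconds = 60\n"
        " restart_mode = 'restart'\n"
        "/\n"
    )
    scratch_input, restart_input = make_run_pair(template, 10, 600)
    print(scratch_input)
    print(restart_input)
```

The two returned strings can be written to separate files and passed to pw.x one after the other, keeping the same outdir and prefix so that the second run finds the restart data.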
>
> Thanks for your support.
>
> Marco
>
>
> !
> ! Input for O2 molecule triplet ground state. At the experimental geometry 1.212 angstroms
> !
> ! Either fix the occupations and the spin state through: tot_magnetization = 2
> ! or
> ! allow the system the flexibility to find a local-minimum spin state by commenting out tot_magnetization and then uncommenting the following:
> ! occupations = 'smearing',
> ! degauss = 0.01D0,
> ! smearing = 'gauss',
> ! starting_magnetization(1)=0.7,
> ! starting_magnetization(2)=0.7,
> !
> !
> &CONTROL
> calculation = 'scf',
> verbosity = 'high',
> outdir = './',
> pseudo_dir = './'
> prefix = 'O2-triplet-PBE-SCF-tm80',
> max_seconds = 60
> restart_mode = 'restart'
> /
> &SYSTEM
> nosym = .TRUE.,
> ibrav = 1,
> celldm(1) = 50.d0,
> nspin = 2,
> nat = 2,
> ntyp = 2,
> ecutwfc = 80,
> tot_magnetization = 2,
> nbnd = 10,
> ! occupations = 'smearing',
> ! degauss = 0.01D0,
> ! smearing = 'gauss',
> ! starting_magnetization(1)=0.7,
> ! starting_magnetization(2)=0.7,
> /
> &ELECTRONS
> conv_thr = 1.D-6,
> mixing_beta = 0.5D0,
> /
> ATOMIC_SPECIES
> O1 15.999 O.pbe-mt.UPF
> O2 15.999 O.pbe-mt.UPF
> ATOMIC_POSITIONS { bohr }
> O1 0.000000000 0.000000000 0.000000000
> O2 2.400000000 0.000000000 0.000000000
> K_POINTS { gamma }
>
>
>
> --
> ----------------------------
> Marco Govoni, Ph.D.
> ----------------------------
> Institute for Molecular Engineering
> The University of Chicago
> 5747 South Ellis Avenue
> Chicago, IL 60637
> http://galligroup.uchicago.edu/People/mgovoni.php
> ----------------------------
>
> On Nov 26, 2014, at 12:43 PM, Marco Govoni <mgovoni at uchicago.edu> wrote:
>
> > Hi,
> >
> > I have problems in restarting the SCF simulation.
> >
> > I’m running a spin-polarized (nspin=2) SCF simulation (besides task and diag, I’m not activating other parallelization levels than R&G division).
> > I set max_seconds and from_scratch, yielding a clean interruption of the (unconverged) SCF cycle after only a few iterations.
> > Then, when I restart, the code crashes with the following message.
> >
> > Calculation restarted from scf iteration # 4
> >
> > total cpu time spent up to now is 14.3 secs
> >
> > per-process dynamical memory: 131.3 Mb
> >
> > Self-consistent Calculation
> >
> > iteration # 4 ecut= 120.00 Ry beta=0.20
> > Calculation restarted from kpoint # 3
> > Davidson diagonalization with overlap
> > ethr = 1.00E-02, avg # of iterations = 11.0
> > Application 228717 exit codes: 134
> > Application 228717 exit signals: Killed
> >
> > This is a gamma_only simulation, so there must be a typo in “kpoint # 3”; maybe it should be scf iteration.
> > In addition, the code exits, and the error traceback gives
> >
> > pw.x 0000000001C1CA89 Unknown Unknown Unknown
> > pw.x 0000000001C1B35E Unknown Unknown Unknown
> > pw.x 0000000001BCF642 Unknown Unknown Unknown
> > pw.x 0000000001B4B998 Unknown Unknown Unknown
> > pw.x 0000000001B521B2 Unknown Unknown Unknown
> > pw.x 0000000000CCBED0 Unknown Unknown Unknown
> > pw.x 0000000000D787FB Unknown Unknown Unknown
> > pw.x 0000000001C38131 Unknown Unknown Unknown
> > pw.x 0000000001A32B32 Unknown Unknown Unknown
> > pw.x 0000000001A25710 Unknown Unknown Unknown
> > pw.x 0000000001A258BD Unknown Unknown Unknown
> > pw.x 00000000019E3D43 Unknown Unknown Unknown
> > pw.x 00000000007F4003 reduce_base_real_ 223 mp_base.f90
> > pw.x 00000000007E1AFC mp_mp_mp_sum_rt_ 1382 mp.f90
> > pw.x 00000000005D17AC sum_band_IP_sum_b 548 sum_band.f90
> > pw.x 00000000005C4982 sum_band_ 123 sum_band.f90
> > pw.x 00000000004786EF electrons_scf_ 478 electrons.f90
> > pw.x 0000000000475C0D electrons_ 133 electrons.f90
> > pw.x 00000000004011BC run_pwscf_ 90 run_pwscf.f90
> > pw.x 0000000000401023 MAIN__ 30 pwscf.f90
> > pw.x 0000000000400F76 Unknown Unknown Unknown
> > pw.x 0000000001C31D81 Unknown Unknown Unknown
> > pw.x 0000000000400E41 Unknown Unknown Unknown
> >
> > Let me know.
> >
> > Marco
> >
> >
> >
>
>