[Pw_forum] WFC convergence in NEB calculation
Janos Kiss
janos.kiss at theochem.ruhr-uni-bochum.de
Mon Oct 13 21:54:45 CEST 2008
J K wrote:
>> What else could one try to get the wfc converged somehow?
Paolo Giannozzi wrote:
>hard to say (especially without the output). Are you sure that spin
>polarization is not a source of trouble close to the transition state?
I have not tried this so far, because I would suspect that in a reaction
where a proton/hydride transfer is involved most species should be more like
ions than radicals (of course I might be plain wrong here).
I was also concerned about this aspect, but when I did a quick and rough
test calculation, the wfc convergence was not noticeably better, while one
SCF step took a lot longer. Therefore I just killed it impatiently, IIRC.
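For reference, such a quick test amounts to nothing more than switching on
spin polarization in the &SYSTEM namelist, roughly like this (standard PWscf
keywords; the magnetization value is only an illustrative guess, not what I
actually used):

   &SYSTEM
      ! switch on collinear spin polarization
      nspin = 2
      ! break the spin symmetry so the SCF can reach a magnetic solution
      starting_magnetization(1) = 0.5
   /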
Paolo Giannozzi wrote:
>One possible trick could be to go on with the calculation even if not
>converged. I am actually considering adding yet another option
>("sloppy_convergence"?) doing exactly this. It should be easy: in
>PW/electrons.f90, set "conv_elec" to .true. at the last iteration, after
>the call to "mix_rho". No warranty.
I did this right after your first response and recompiled the code with the
modified routine. I set the maximum number of SCF steps to 150, expecting
that either the wfc would converge within those 150 iterations, or that I
would get the forces with whatever wfc is available after the 150th step.
Instead, I just got a crash and a core dump.
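The change was essentially the following (a sketch from memory; the actual
loop structure and variable names in PW/electrons.f90 may differ between
versions):

   ! main SCF loop in PW/electrons.f90 (heavily abbreviated)
   DO idum = 1, niter
      ! ... diagonalization, new density, etc. ...
      CALL mix_rho( ... )
      ! the hack: declare "convergence" on the last allowed iteration,
      ! so that forces are computed with whatever wfc is available
      IF ( idum == niter ) conv_elec = .TRUE.
      IF ( conv_elec ) EXIT
   END DO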
For my case this workaround is not really good anyway, because the wfc is so
bad (the total energy fluctuates in the very first decimal) that the forces
would not make any physical sense.
J K wrote:
>> Another question: when I restart an NEB calculation, how can I restart
>> the wfc for those images which were already converged for a given NEB
>> iteration? [...] Even for those images where the wfc was converged, I
>> still need to spend again like 5-9 SCF cycles/image.
Paolo Giannozzi wrote:
>Are you sure that the code is using the same set of coordinates that were
>used in the previous calculation? Maybe the restart doesn't work as
>expected.
I have looked into this, and I think you are completely right: the code
restarts the nuclear positions from the *.path file, which in the case of a
crash contains the coordinates from the previous NEB iteration. I had
thought that it is updated after convergence is achieved on any movable
image, not just after a successful NEB optimization step.
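For completeness, the restart itself is requested in the standard way (a
minimal sketch; with these keywords the positions are then read back from
the prefix.path file rather than from the input):

   &CONTROL
      calculation  = 'neb'
      restart_mode = 'restart'
   /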
I tried to use the cg scheme for those images where the wfc convergence was
tricky, but it was not successful. I am really afraid that I somehow managed
to miscompile the code, because if I try to crank up the density cutoff
above 5 times the plane-wave cutoff, the code crashes right after the wfc
initialization, in the wfc diagonalization. I think this is a really bad
sign.
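Those runs used settings along these lines (standard PWscf keywords; the
numerical values here are illustrative, not my actual ones):

   &SYSTEM
      ecutwfc =  25.0    ! plane-wave cutoff (Ry)
      ecutrho = 150.0    ! density cutoff; crashes appear above ~5*ecutwfc
   /
   &ELECTRONS
      electron_maxstep = 150
      diagonalization  = 'cg'
   /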
In fact, I completely forgot to mention that I was able to produce a
relatively stable binary with MKL 8.1 only. With newer MKL versions, in a
geometry optimization, after several successful steps, when the nuclei were
close to the minimum geometry, the code crashed very often in chdiag. This
was reproducible for a TiO2 and a Cu slab as well. With MKL 10 it was a pain
to produce a binary at all (I had to modify some flags to get it working),
and the binary then crashed right after the wfc initialization in all my
test calculations.
Then I saw in the forum that MKL 10 does not give any gain compared to older
MKL versions, so I gave up on it. Of course, to clarify this I should attach
all the config files, the machine architecture, the ifc compiler and MKL
versions, and so on.
Apparently I am not alone in having this issue on the new quad-core Intel
Xeon machines. For example, on a dual Opteron I produced a rock-solid binary
with ATLAS and with MKL 9.0 as well. Now someone could say that I am not
really supposed to use MKL on an Opteron, but it worked: it was only around
4% slower than the ATLAS version, and the numbers looked the same (within
the numerical noise). Of course, I would prefer to use the dual quad-core
Xeon machines, because they are faster. I would suspect that this is more
likely an issue with the quad-core Xeon architecture, or with our particular
system environment/installation.
Yours Sincerely,
Janos.
==================================================================
Janos Kiss e-mail: janos.kiss at theochem.ruhr-uni-bochum.de
Lehrstuhl fuer Theoretische Chemie Phone: +49 (0)234/32-26485
NC 03/297 +49 (0)234 32 26754
Ruhr-Universitaet Bochum Fax: +49 (0)234/32-14045
D-44780 Bochum http://www.theochem.ruhr-uni-bochum.de
==================================================================