[Pw_forum] Problem with NEB

Wed Apr 16 20:58:57 CEST 2008

Dear PWSCF mailing list members,

I am trying to run a nudged elastic band (NEB) calculation with PWSCF 3.2.3,
compiled as a parallel 64-bit executable with ifc 10.1, MKL 8.1 and Open MPI
1.2.4 on a Xeon cluster. Each node has 8 CPU cores and 16 GB of physical
memory, and the nodes communicate over Gigabit LAN.
I use the R- and G-space parallelization for a single image (over the 8 CPU
cores of one node), and the parallelization over the NEB images on top of
that.
The problem is that the configurations produced by the linear interpolation
between reactant and product are often 'unchemical', even if I include some
intermediate images. The wavefunction (WF) convergence for these images can
be particularly problematic in the very first scf step. If the WF is not
converged within the given maximum number of steps (set to 250 in my case)
for one of the images, the code stops for that image, complaining
'convergence NOT achieved, stopping'. The executables keep running on all
nodes, apparently doing nothing, so I have to kill the job by hand. Although
I have the write_save and wf_collect flags set to true, when I try to restart
the calculation the code stops immediately, complaining that:
     from  control_checkin     : error #         1
      calculation 'restart' not allowed

I naively thought that it would check the images and reiterate those which
had not converged.
Is the restart failing because of the parallelization over the NEB images,
because the forces are not available for all images, or because the job was
killed and some files were not written out properly?
What is the minimum requirement for a stopped/killed NEB calculation to be
suitable for restart?
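
To make the failure easier to pin down before killing the job, the images
that gave up can be identified from their output. The following is only a
minimal sketch: it assumes that each image writes its own output file and
that the message quoted above appears in it verbatim. The function name
failed_images and the command-line usage are hypothetical and not part of
PWSCF.

```python
import sys
from pathlib import Path

# Core of the message quoted above. Leading whitespace in the real output
# may differ, so only this phrase is matched.
FAILURE_MARKER = "convergence NOT achieved"


def failed_images(outputs):
    """Return 1-based indices of the outputs that contain the failure message.

    `outputs` is a list of strings, one per NEB image, in image order.
    """
    return [
        index
        for index, text in enumerate(outputs, start=1)
        if FAILURE_MARKER in text
    ]


if __name__ == "__main__":
    # Hypothetical usage: pass the per-image output files in image order.
    texts = [Path(name).read_text(errors="replace") for name in sys.argv[1:]]
    print("images that did not converge:", failed_images(texts))
```

The script only reports which images stopped; it does not make the
calculation restartable.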
Most of the time the problematic configurations are far away from the
reactant or product, and they would have large forces acting on them anyway.
Is there a way to relax the tightness of the WF convergence relative to the
number of scf steps?
Is it possible to print out the forces in an NEB calculation at the end of
the WF optimization, no matter how bad the estimated energy error is?
These forces should still be good enough for the first 1-2 NEB relaxation
steps. For the new configurations, convergence could then be achieved within
the maximum number of steps. I know that there are some workarounds (using a
less tight threshold on all images and so on), but those are not so
straightforward. Everything that came to my mind would either affect the
convergence of the low-energy images as well, where more accuracy is really
needed, or would substantially increase the overall execution/setup time.
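
As an illustration of the 'unchemical' starting images described above, the
sketch below linearly interpolates between two end-point geometries and
flags images in which two atoms come closer than a chosen cutoff. It is a
simplified stand-alone example, not the interpolation routine of PWSCF: it
works on plain Cartesian coordinates, ignores periodic boundary conditions,
and the function names and the cutoff d_min are assumptions made for the
illustration.

```python
import math
from itertools import combinations


def interpolate(reactant, product, n_images):
    """Linearly interpolate atomic positions, end points included.

    `reactant` and `product` are lists of (x, y, z) tuples with the same
    atom ordering. Returns a list of `n_images` geometries.
    """
    if n_images < 2:
        raise ValueError("need at least the two end points")
    images = []
    for k in range(n_images):
        t = k / (n_images - 1)  # 0 at the reactant, 1 at the product
        images.append([
            tuple(r + t * (p - r) for r, p in zip(atom_r, atom_p))
            for atom_r, atom_p in zip(reactant, product)
        ])
    return images


def min_distance(image):
    """Smallest interatomic distance in one geometry (no periodic images)."""
    return min(math.dist(a, b) for a, b in combinations(image, 2))


def suspicious_images(images, d_min):
    """Return 1-based indices of images with two atoms closer than d_min."""
    return [
        index
        for index, image in enumerate(images, start=1)
        if min_distance(image) < d_min
    ]


if __name__ == "__main__":
    # Toy example: two atoms exchanging places along x collide midway.
    reactant = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
    product = [(2.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
    path = interpolate(reactant, product, 5)
    print(suspicious_images(path, d_min=0.8))
```

In the toy example the two atoms pass through each other in the middle
image, which is the kind of geometry for which the first scf step is
unlikely to converge. A flagged image would then have to be replaced by a
hand-built intermediate image.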

All the best,
 Janos.

 ==================================================================
   Janos Kiss   e-mail: janos.kiss at theochem.ruhr-uni-bochum.de       
 Lehrstuhl fuer Theoretische Chemie  Phone: +49 (0)234/32-26485 
 NC 03/297                                  +49 (0)234 32 26754
 Ruhr-Universitaet Bochum            Fax:   +49 (0)234/32-14045
 D-44780 Bochum            http://www.theochem.ruhr-uni-bochum.de
 ==================================================================