[Pw_forum] Restarting phonon calculation with images, possibility of changing the number of images
Thomas Brumme
thomas.brumme at mpsd.mpg.de
Fri Sep 2 17:40:44 CEST 2016
OK, I think I found a possibility that does not involve
writing input files as in the GRID example (i.e., finding out
which q points and representations have finished and which
haven't, which can be quite tedious), but maybe someone
can confirm...
I now have, e.g., four _ph folders. For two of the images the
calculations are nearly finished, while the other two haven't
finished a single calculation. Restarting with 4 images would
result in 2 of them just waiting... However, if I could restart
with more or fewer images, the work should be more evenly
distributed.
Let's say the original images 1 and 3 are finished and 0 and 2
are not, so "not / finished / not / finished"... In that case I could
restart with only two images and the work would be evenly
distributed.
On the other hand, if I had something like:
"finished / finished / not / not"
reducing the number of images to 2 would not solve the
problem, but in the more general case with many more
images, doubling the number of images could at least
reduce the total number of CPUs that sit idle.
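
To make the two cases above concrete, here is a small Python sketch.
It ASSUMES that the work is handed to the images in contiguous,
roughly equal blocks; that is a simplification for illustration and
not taken from the ph.x source. The function names are hypothetical.

```python
def split_contiguous(n_tasks, n_images):
    """Split task indices 0..n_tasks-1 into n_images contiguous blocks.

    ASSUMPTION: a simple block distribution, used only for illustration.
    """
    base, extra = divmod(n_tasks, n_images)
    blocks, start = [], 0
    for i in range(n_images):
        size = base + (1 if i < extra else 0)
        blocks.append(range(start, start + size))
        start += size
    return blocks


def remaining_per_image(done, n_images):
    """Count the unfinished tasks each image would receive on restart."""
    return [
        sum(1 for task in block if not done[task])
        for block in split_contiguous(len(done), n_images)
    ]


# Eight tasks, originally two per image on four images.
# "not / finished / not / finished"
alternating = [False, False, True, True, False, False, True, True]
# "finished / finished / not / not"
front_loaded = [True, True, True, True, False, False, False, False]

print(remaining_per_image(alternating, 4))   # [2, 0, 2, 0]
print(remaining_per_image(alternating, 2))   # [2, 2]
print(remaining_per_image(front_loaded, 2))  # [0, 4]
```

Under this assumption, going from 4 to 2 images balances the first
pattern, while for the second pattern one of the two images still
has nothing to do.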
So, I need to create _ph folders for the number of images
I want to use... Then I need to copy the directory
_ph0/$prefix.phsave/
of the original calculation into them in order to have the
patterns. Then I also copy all the dynmat files of all the
original images into those directories. If I now restart,
the phonon code should always recognize whether a calculation
has already been done...
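
The copying steps above could be scripted roughly as follows. This is
only a sketch of the procedure as I described it, not a tested recipe:
the function name, its arguments, and the use of a separate new outdir
are my own assumptions, and it does not check whether ph.x needs
anything else in the new _ph folders. It needs Python 3.8 or newer
(for dirs_exist_ok).

```python
import shutil
from pathlib import Path


def prepare_image_restart(old_outdir, new_outdir, prefix, n_new_images):
    """Create _ph0.._ph{n-1} in new_outdir from an old image run.

    Hypothetical helper: new_outdir must differ from old_outdir.
    Returns the number of dynmat files gathered from the old images.
    """
    old_outdir, new_outdir = Path(old_outdir), Path(new_outdir)
    phsave = f"{prefix}.phsave"

    # Collect dynmat.$iq.$ir.xml from every original _ph* folder.
    dynmats = sorted(old_outdir.glob(f"_ph*/{phsave}/dynmat.*.xml"))

    for i in range(n_new_images):
        target = new_outdir / f"_ph{i}" / phsave
        # Copy the original _ph0/$prefix.phsave/ so the patterns are there.
        shutil.copytree(old_outdir / "_ph0" / phsave, target,
                        dirs_exist_ok=True)
        # Add the dynmat files produced by all original images.
        for f in dynmats:
            shutil.copy2(f, target / f.name)

    return len(dynmats)
```

It would be called as, e.g., prepare_image_restart("tmp_old", "tmp_new",
"mysystem", 2), where the directory names and the prefix are
placeholders.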
Does this sound reasonable?
Kind regards
Thomas
On 09/02/2016 12:01 PM, Thomas Brumme wrote:
> Dear all,
>
> I have a question concerning the restart possibilities with image
> parallelization in a phonon calculation.
> I have the problem that for some of the images the calculation did not
> converge. I know that I can achieve
> convergence by reducing the mixing since I encountered the problem
> before for exactly the same system.
> Yet, now, as some of the images are finished with their task (or close
> to), I have only the possibility of either
> using only one image copying the dynmat.$iq.$ir.xml files to the
> _ph0/*.phsave/ directory, or to restart using
> the same number of images and live with the fact that some images will
> do nothing...
> Or is there a third possibility I don't know? Wouldn't it be better to
> first check what has already been done
> and then distributing the work among the images? Or is this too hard to
> code? (I haven't looked at this part
> of the code yet)
>
> OK, I think I could also use some kind of GRID parallelization and
> create some input files by hand, setting
> the start_irr, start_q, and so on, but this is rather tedious since I
> have a big system and a q-point grid...
> So, again the (maybe stupid) question: Is there another possibility?
>
> Regards
>
> Thomas
>
>
--
Dr. rer. nat. Thomas Brumme
Max Planck Institute for the Structure and Dynamics of Matter
Luruper Chaussee 149
22761 Hamburg
Tel: +49 (0)40 8998 6557
email: Thomas.Brumme at mpsd.mpg.de