[Pw_forum] Restarting PWSCF - usability issues
Konstantin Kudin
konstantin_kudin at yahoo.com
Mon Mar 8 23:07:29 CET 2004
I've been running some PWSCF (v2.0.0 from the web)
jobs that needed restarts because they ran out of
"nstep". So I used restart_mode='restart' to handle
that.
Now, one of the issues that came up in the context of
multi-node jobs is the management of "*.wfc*" files.
It seems like only the master node stores all kinds of
temporary and semi-temporary files, while the slave
nodes only have "*.wfc*" files.
Ideally, I'd like a script that moves these temporary
files from a central location to the master node before
the job starts, and then moves them back after the job
finishes. The presence of "*.wfc*" files on every node
complicates the matter.
First of all, it takes rather non-trivial shell
programming to copy to each node's scratch space only
the *.wfc* files that node needs, and then to copy
them back. If I copy all the *.wfc* files I have
for a given prefix, some of them won't be used on a
given node, and when I move them back, the last node
will overwrite the recent *.wfc* files with the older
ones. So one would need extra code to handle that.
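
One possible shape for that extra code is sketched below in Python rather than shell. It copies files back from a node's scratch directory only when they are newer than the copy already in the central location, so stale, unused *.wfc* files from one node cannot clobber fresh ones written by another. The function name, the directory arguments and the "*.wfc*" glob are illustrative assumptions, not anything PWSCF provides, and comparing modification times only makes sense if the node clocks are reasonably synchronized.

```python
import shutil
from pathlib import Path


def copy_back_if_newer(scratch_dir, central_dir, pattern="*.wfc*"):
    """Copy files matching pattern from scratch_dir to central_dir,
    skipping any file whose central copy is at least as recent.

    Hypothetical helper: the name, arguments and pattern are assumptions.
    Returns the names of the files that were actually copied.
    """
    scratch = Path(scratch_dir)
    central = Path(central_dir)
    central.mkdir(parents=True, exist_ok=True)
    copied = []
    for src in sorted(scratch.glob(pattern)):
        if not src.is_file():
            continue
        dst = central / src.name
        if dst.exists() and dst.stat().st_mtime >= src.stat().st_mtime:
            continue  # central copy is as new or newer: leave it alone
        shutil.copy2(src, dst)  # copy2 preserves the modification time
        copied.append(src.name)
    return copied
```

Each node would call something like copy_back_if_newer("/scratch/job", "/home/user/job") after the run finishes; both paths here are placeholders.
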
More serious, however, is the fact that restarting a
calculation with a mismatched set of *.wfc* files
causes crashes.
These *.wfc* files also get in the way when one
changes the number of processors for the jobs. When I
restart a job on 8 cpus that ran before on 4 cpus,
there are missing *.wfc* files and the job seems to
hang forever right before the message "Starting wfc
from file". I guess some safeguards are missing there
at the moment, so this is a BUG.
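
Until such a safeguard exists in the code, a check could be run from the job script before launching the restart. The Python sketch below only illustrates the idea: it assumes that process i (counting from 1) reads a file named <prefix>.wfc<i>, and that all of these files can be seen in one directory. Both points are assumptions on my part and should be checked against the files an actual run leaves behind.

```python
from pathlib import Path


def missing_wfc_files(wfc_dir, prefix, nproc):
    """Return the expected per-process wavefunction files that are absent.

    ASSUMPTION: process i (1..nproc) uses a file named '<prefix>.wfc<i>'
    and all such files live in wfc_dir. This naming scheme is a guess.
    """
    directory = Path(wfc_dir)
    expected = [f"{prefix}.wfc{i}" for i in range(1, nproc + 1)]
    return [name for name in expected if not (directory / name).is_file()]
```

If the returned list is non-empty, the job script could refuse to start the restart, or fall back to a start that does not need the *.wfc* files, instead of letting the job hang.
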
I modified the code to do a restart with atomic wfc
(a line change in PW/input.f90 to "startingwfc =
'atomic'"), and it appears that as long as the *.rho
file is read in, it takes very little extra time to
converge the wfcs from an atomic guess. Thus keeping
these *.wfc* files around is at best a very minor
help, and at worst a very major headache in a parallel
environment.
So I suggest that either startingwfc=atomic should be
the default for restarts, or the program should
honor the startingwfc=atomic line even when a
restart is requested. Of course, there is always the
option to merge the *wfc* files into one, but why
should one even bother to keep them if it does not
save anything in cpu time?
A second point concerns the "*.rho" file. When the
geometry is changed manually and a "*.rho" file is
read in, it gives a very poor approximation to the
first density if the geometrical change was large. On the other
hand, the optimization process does something that
prints out the line: "NEW-OLD atomic charge density
approx. for the potential". This "NEW-OLD" approach
works quite well, and the first density after it is
quite good. Would it be possible to save some minimal
information in the "*.rho" file so that this "NEW-OLD"
step can be done whenever the "*.rho" file is used?
Alternatively, one could probably read the file
"*.save" to figure out the old coordinates, and do
this "NEW-OLD" projection on the density from the
"*.rho" file.
Kostya