[Pw_forum] Restarting PWSCF - usability issues
giannozz at nest.sns.it
Tue Mar 9 22:37:12 CET 2004
On Monday 08 March 2004 23:07, Konstantin Kudin wrote:
> [...] Now, one of the issues that came up in the context
> of multi-node jobs is the management of "*.wfc*" files.
It's a known issue, and not a simple one. There is a tradeoff
between speed (maximised if each processor writes to its
local scratch file system, as happens now), portability of the
results (maximised if just one file is written, in a format that
can be read independently of the number of processors,
as happens in Car-Parrinello codes), and developers' time
(minimised if things are kept simple, or if they are kept the
way they are).
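To make the "portable" option concrete, here is a minimal sketch of
what reading one processor-independent file buys you: the data can be
re-split for any number of processors. It assumes, purely for
illustration, that the coefficients form a flat list distributed in
contiguous blocks; the real distribution in the code is different, and
the names split_blocks, merge_blocks and redistribute are hypothetical.

```python
def split_blocks(coeffs, nproc):
    """Split a flat list into nproc contiguous blocks (sizes differ by at most 1)."""
    base, extra = divmod(len(coeffs), nproc)
    blocks, start = [], 0
    for rank in range(nproc):
        # The first `extra` ranks each take one additional element.
        size = base + (1 if rank < extra else 0)
        blocks.append(coeffs[start:start + size])
        start += size
    return blocks


def merge_blocks(blocks):
    """Concatenate per-processor blocks back into one portable list."""
    return [c for block in blocks for c in block]


def redistribute(blocks, new_nproc):
    """Go from one processor count to another via the merged representation."""
    return split_blocks(merge_blocks(blocks), new_nproc)
```

With this toy layout, data written by 4 processors can be handed to 3
by calling redistribute(old_blocks, 3); the price is the extra
gather step through a single representation, which is the speed side
of the tradeoff.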
For machines having a parallel file system (e.g. the sp4 at
Cineca, Bologna), the problem is not that serious: all files
are in the same place, and the I/O is still very fast (or at
least, that is what I understood). Trouble arises only if you
want to restart with a different number of processors.
For machines without a parallel file system (e.g. the sp3 at
PMI, Princeton), your *wfc files are scattered across different
file systems on different processors (forget using I/O via NFS!).
Restarting is a mess: you need a copy of all files on all
processors, since there is no way to tell which physical
processor corresponds to which logical one in MPI.
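As a rough illustration of the "copy of all files on all processors"
requirement, the sketch below checks a local scratch directory for
missing per-processor files before a restart. It assumes one file per
logical processor named <prefix>.wfc<N> with N counted from 1; that
naming scheme and the function names are assumptions made for the
example, not a description of what the code does.

```python
import os


def wfc_name(prefix, rank):
    # Hypothetical naming: one file per logical processor, numbered from 1.
    return f"{prefix}.wfc{rank}"


def missing_wfc_files(scratch_dir, prefix, nproc):
    """Return the expected per-processor files not present in scratch_dir."""
    present = set(os.listdir(scratch_dir))
    expected = [wfc_name(prefix, rank) for rank in range(1, nproc + 1)]
    return [name for name in expected if name not in present]
```

Run on every node before restarting, an empty list would mean that
node can serve any logical processor, whichever one MPI assigns to it.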
Car-Parrinello people restart from wavefunctions more often
than not, so the ability to restart from a single file with a
different number of processors is essential (and implemented)
for them. PWscf people tend to restart less often, so restarting
is a rougher process. This will be fixed sooner or later (I hope
sooner: all the needed stuff is already there, and maybe it's
even working: see __NEW_PUNCH).
> the program should understand the startingwfc=atomic
> line even when a restart is requested
Doesn't it? That's not good.
Paolo Giannozzi e-mail: giannozz at nest.sns.it
Scuola Normale Superiore Phone: +39/050-509876, Fax:-563513
Piazza dei Cavalieri 7 I-56126 Pisa, Italy