[Pw_forum] wfc files: heavy I/O, handling for restarts
S. K. S.
sks.jnc at gmail.com
Tue Sep 6 16:22:22 CEST 2011
Dear Prof. Paolo,
Thanks a lot for your reply.
> which QE code and which files are you referring to? <
That is the phonon code (ph.x), and the files I mentioned before are given
below in detail.
With the earlier versions up to 4.1.3, the phonon code runs fine and I got
the following files (items with a trailing "/" are folders) in the tmp
directory of a local disk.
The nodes used by the code are node186, node036, and node139.
node186:/tmpscratch/sksct8/tmp$ ls
pbmno.save/      _phpbmno.com1     _phpbmno.dwf2    _phpbmno.igk3      _phpbmno.prd3
pbmno.wfc1       _phpbmno.com2     _phpbmno.dwf3    _phpbmno.mixd1     _phpbmno.recover
pbmno.wfc2       _phpbmno.com3     _phpbmno.ebar1   _phpbmno.mixd2     _phpbmno.recover2
pbmno.wfc3       _phpbmno.dvkb31   _phpbmno.ebar2   _phpbmno.mixd3     _phpbmno.recover3
_phpbmno.bar1    _phpbmno.dvkb32   _phpbmno.ebar3   _phpbmno.phsave/   _phpbmno.save/
_phpbmno.bar2    _phpbmno.dvkb33   _phpbmno.igk     _phpbmno.prd1
_phpbmno.bar3    _phpbmno.dwf1     _phpbmno.igk2    _phpbmno.prd2
node036:/tmpscratch/sksct8/tmp$ ls
pbmno.wfc4 _phpbmno.dwf4 _phpbmno.mixd4 _phpbmno.recover4
pbmno.wfc5 _phpbmno.dwf5 _phpbmno.mixd5 _phpbmno.recover5
_phpbmno.bar4 _phpbmno.igk4 _phpbmno.prd4 _phpbmno.wfc4
_phpbmno.bar5 _phpbmno.igk5 _phpbmno.prd5 _phpbmno.wfc5
node139:/tmpscratch/sksct8/tmp$ ls
pbmno.wfc6      _phpbmno.bar8   _phpbmno.igk7   _phpbmno.recover6   _phpbmno.wfc8
pbmno.wfc7      _phpbmno.dwf6   _phpbmno.igk8   _phpbmno.recover7
pbmno.wfc8      _phpbmno.dwf7   _phpbmno.prd6   _phpbmno.recover8
_phpbmno.bar6   _phpbmno.dwf8   _phpbmno.prd7   _phpbmno.wfc6
_phpbmno.bar7   _phpbmno.igk6   _phpbmno.prd8   _phpbmno.wfc7
However, with the new version 4.3.1, for exactly the same input files and
job scripts, I only get these files and nothing else:
node045:/tmpscratch/sksct84/tmp$ ls
pbmno.save/   pbmno.wfc1   _ph0/
node111:/tmpscratch/sksct84/tmp$ ls
pbmno.wfc2 pbmno.wfc3
node092:/tmpscratch/sksct84/tmp$ ls
pbmno.wfc4 pbmno.wfc5
node080:/tmpscratch/sksct84/tmp$ ls
pbmno.wfc6 pbmno.wfc7
node072:/tmpscratch/sksct84/tmp$ ls
pbmno.wfc8
Note that the nodes used this time are node045, node111, node092, node080,
and node072.
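For completeness, a quick way to collect these listings in one go (a
minimal sketch, assuming password-less ssh to the compute nodes):

  # list the local scratch directory on every node of this run
  for n in node045 node111 node092 node080 node072; do
      echo "== $n =="
      ssh $n "ls /tmpscratch/sksct84/tmp"
  done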
So it is clear from the above example that, in the new version 4.3.1,
_phpbmno.save and _phpbmno.phsave somehow go inside the directory "_ph0".
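To confirm where the save directories ended up, one can look inside _ph0 on
the node that holds it (node045 here); I have not reproduced the output, so
this is only the command:

  # recursively list the phonon scratch directory _ph0
  ssh node045 "ls -R /tmpscratch/sksct84/tmp/_ph0"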
And the same phonon calculation, which was running fine with the earlier
version, now stops rather abruptly, without much information or any error
message, like this:
Electric field:
Dielectric constant
Born effective charges in two ways
Atomic displacements:
There are 5 irreducible representations
Representation 1 3 modes -T_1u G_15 G_4- To be done
Representation 2 3 modes -T_1u G_15 G_4- To be done
Representation 3 3 modes -T_1u G_15 G_4- To be done
Representation 4 3 modes -T_1u G_15 G_4- To be done
Representation 5 3 modes -T_2u G_25 G_5- To be done
simply with this error:
--------------------------------------------------------------------------
MPI_ABORT was invoked on rank 2 in communicator MPI_COMM_WORLD
with errorcode 0.
NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun has exited due to process rank 2 with PID 8791 on
node node111.cvos.cluster exiting without calling "finalize". This may
have caused other processes in the application to be
terminated by signals sent by mpirun (as reported here).
--------------------------------------------------------------------------
[node045:11580] 6 more processes have sent help message help-mpi-api.txt /
mpi-abort
[node045:11580] Set MCA parameter "orte_base_help_aggregate" to 0 to see all
help / error messages
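Since the subject mentions restarts: the way I would try to recover such a
stopped run is with recover = .true. in the ph.x input (a minimal sketch;
the title line, the fildyn name, the q-point, and the input file name
ph.restart.in are assumptions, since my actual input file is not reproduced
here):

  phonons of pbmno at Gamma
   &inputph
      prefix  = 'pbmno',                    ! same prefix as the run above
      outdir  = '/tmpscratch/sksct84/tmp',  ! same scratch directory as above
      epsil   = .true.,                     ! dielectric constant, as in the output
      recover = .true.,                     ! restart from the recover / _ph0 files
      fildyn  = 'pbmno.dyn',                ! assumed dynamical-matrix file name
   /
  0.0 0.0 0.0

run again on the same 8 processes, e.g.:

  mpirun -np 8 ph.x < ph.restart.in > ph.restart.out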
I hope this email explains the problem much better.
Thanks and Regards,
Saha SK