[Pw_forum] Further reduce disk IO for ph.x

Axel Kohlmeyer akohlmey at gmail.com
Fri Dec 7 07:38:21 CET 2012


On Fri, Dec 7, 2012 at 7:25 AM, Yao Yao <yao.yao at unsw.edu.au> wrote:
> Dear All,

> I have been doing a lot of phonon calculations with ph.x. It works very
> well. However, ph.x does a lot of writing: on a 48-core compute node it
> uses ~35MB/s of disk I/O bandwidth (estimated from the network traffic of
> that node, which has no local disk). I tried the "reduce_dio" option, which
> reduces the rate to ~20MB/s.
>
> However, this is still not enough for me. Our cluster has ~40 such nodes,
> and if I run more than ~7 jobs concurrently my jobs saturate the disk I/O
> bandwidth. Then everything becomes very slow; a simple "ls" command takes
> ~10 seconds to return.

>
> I'm looking for a way to further reduce the disk I/O of ph.x. I don't need
> the recover feature even though each job has a limited run time, because I
> can continue with "start_q" and "last_q", which is very nice.
>
> Is all this disk I/O writing output files, or swapping data out to disk to
> save RAM? Since the total size of the output files is small compared to the
> disk I/O bandwidth, I guess the answer is the latter, right? If so, where
> are these scratch files located? Can I customise the path? I want to put
> them in /dev/shm, which is a 48GB RAM disk (I have far more RAM than QE
> needs). Alternatively, if it's possible to avoid this writing at the expense
> of more RAM consumption, that's also OK for me.
>
> I guess putting the whole folder in /dev/shm may work, but that's probably
> my last option because it requires copying data back and forth.

no. if you have sufficient space in /dev/shm,
this should be your *first* option. it is by
far the fastest and most elegant way to handle
the disk i/o issue on a diskless node. the stage-in
and stage-out of the data is a minor inconvenience
that you'll have to deal with. however, please
be sure to erase all of your files in /dev/shm at the
end of the job, so you don't make your sysadmin
unhappy and prompt him to come up with ways to
prohibit this.
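
the heavy writing of ph.x (wavefunctions and the _ph0 restart data) goes to
the directory given by 'outdir' (or the ESPRESSO_TMPDIR environment variable
if outdir is not set), so it is enough to point outdir at a directory under
/dev/shm. as an untested sketch, such a job script could look like the
following; the prefix, paths, q-point grid and core count are made up and
have to be adapted to your setup:

  #!/bin/bash
  # scratch directory on the RAM disk; removed automatically when the job exits
  SCRATCH=/dev/shm/$USER/phonon_run
  mkdir -p "$SCRATCH"
  trap 'rm -rf "$SCRATCH"' EXIT

  # stage in: copy the scf results (the pw.x outdir) to the RAM disk
  cp -r $HOME/calc/tmp/* "$SCRATCH"/

  cd $HOME/calc
  cat > ph.in <<EOF
  phonons with outdir on the RAM disk
   &inputph
      prefix  = 'silicon'      ! must match the preceding pw.x run
      outdir  = '$SCRATCH'     ! all heavy i/o goes to the RAM disk
      fildyn  = 'si.dyn'
      ldisp   = .true.
      nq1 = 4, nq2 = 4, nq3 = 4
      start_q = 1              ! q-point range handled by this job;
      last_q  = 8              ! the next job continues from start_q = 9
   /
  EOF
  mpirun -np 48 ph.x -in ph.in > ph.out

  # stage out: copy the phonon restart data back to permanent storage
  # (only needed if a later job should continue from it); the dynamical
  # matrix files (si.dyn*) are written to the current directory, which is
  # on the shared filesystem anyway
  cp -r "$SCRATCH"/_ph0 $HOME/calc/tmp/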

as a side note, there are quite a few machines
(particularly high-end supercomputers) where this
kind of procedure is required and part of batch
processing, since the compute nodes have no
access to your home directory at all (for reasons
of efficiency). thus you have to tell the batch system
which files to stage in and stage out as part of
the job.
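
for example, with a torque/pbs-style scheduler the staging can be requested
through job attributes. the syntax is installation-specific (and whether
whole directories can be staged at all depends on the local setup), so the
two lines below are only an illustration with placeholder host and path
names:

  #PBS -W stagein=/dev/shm/phonon_run@fileserver:/home/user/calc/tmp
  #PBS -W stageout=/dev/shm/phonon_run/_ph0@fileserver:/home/user/calc/tmp/_ph0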

axel.


>
> Thank you very much.
>
> Regards,
> Yao
>
> --
> Yao Yao
> PhD. Candidate
> 3rd Generation Strand
> ARC Photovoltaics Centre of Excellence
> University of New South Wales
> Sydney, NSW Australia
> _______________________________________________
> Pw_forum mailing list
> Pw_forum at pwscf.org
> http://pwscf.org/mailman/listinfo/pw_forum



--
Dr. Axel Kohlmeyer  akohlmey at gmail.com  http://goo.gl/1wk0
International Centre for Theoretical Physics, Trieste. Italy.


