[Pw_forum] how to improve the calculation speed ?
Giovanni Cantele
Giovanni.Cantele at na.infn.it
Wed Sep 23 10:45:51 CEST 2009
wangqj1 wrote:
>
> Dear PWSCF users
> When I use R and G parallelization to run a job, it seems to wait for
> input.
What do you mean? Does it print the output header, or the output up to
some point, or does nothing happen at all?
> Following people's advice, I used k-point parallelization and it runs
> well, but it runs too slowly. The information I can offer is as follows:
> (1): CPU usage of one node is:
> Tasks: 143 total, 10 running, 133 sleeping, 0 stopped, 0 zombie
> Cpu0 : 99.7%us, 0.3%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
> Cpu1 :100.0%us, 0.0%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
> Cpu2 :100.0%us, 0.0%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
> Cpu3 :100.0%us, 0.0%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
> Cpu4 :100.0%us, 0.0%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
> Cpu5 :100.0%us, 0.0%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
> Cpu6 :100.0%us, 0.0%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
> Cpu7 :100.0%us, 0.0%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
> Mem: 8044120k total, 6683720k used, 1360400k free, 1632k buffers
> Swap: 4192956k total, 2096476k used, 2096480k free, 1253712k cached
I'm no expert at reading this kind of output, but it seems that
your node is swapping, perhaps because the job requires more memory
than is available. This usually causes a huge performance
degradation.
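
As a rough check of the numbers quoted above, here is a minimal Python
sketch that parses the "Mem:" and "Swap:" lines printed by top and reports
how much swap is in use. The helper name parse_top_line is my own, and the
regular expression assumes the line layout shown in this thread. Note that
a nonzero "used" swap only suggests memory pressure; watching the si/so
columns of vmstat while the job runs would show whether the node is
actively swapping.

```python
import re

def parse_top_line(line):
    """Return {'total': kB, 'used': kB, 'free': kB} from a top 'Mem:' or 'Swap:' line."""
    pairs = re.findall(r"(\d+)k (total|used|free)", line)
    return {name: int(value) for value, name in pairs}

# Lines copied from the top output quoted above.
mem = parse_top_line("Mem: 8044120k total, 6683720k used, 1360400k free, 1632k buffers")
swap = parse_top_line("Swap: 4192956k total, 2096476k used, 2096480k free, 1253712k cached")

mem_fraction = mem["used"] / mem["total"]
swap_fraction = swap["used"] / swap["total"]
swap_used_gib = swap["used"] / 1024**2  # kB -> GiB

print(f"RAM in use:  {mem_fraction:.0%}")
print(f"swap in use: {swap_used_gib:.2f} GiB ({swap_fraction:.0%} of swap)")
```

With about half of the swap space in use on an 8 GB node running 8
processes, reducing the number of processes per node (or the memory
needed by each process) is the first thing I would try.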
When choosing the optimal number of nodes, processes per node, etc.,
several factors should be taken into account: memory requirements,
communication hardware, etc. You might want to have a look at this page
of the user guide: http://www.quantum-espresso.org/user_guide/node33.html
Also, consider that, at least for CPU generations that are not very
recent, using too many cores per CPU (e.g. if your cluster has
quad-core processors) might not improve the performance of the code,
and might even make it worse (this has also been reported in previous
threads on this forum, which you can search).
This may also be of interest to you:
http://www.quantum-espresso.org/wiki/index.php/Frequently_Asked_Questions#Why_is_my_parallel_job_running_in_such_a_lousy_way.3F
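
To make the choice of processes and pools a bit more concrete, here is a
small Python sketch that lists the possible ways of splitting a given
number of MPI processes into k-point pools (the -npool option of pw.x).
It assumes, as far as I understand the user guide, that the number of
pools must divide the number of processes, and that it makes no sense to
use more pools than k-points. The function name pool_layouts and the
example values (8 processes, 10 k-points) are only illustrative and are
not taken from the original job.

```python
def pool_layouts(n_procs, n_kpoints):
    """List (n_pools, procs_per_pool) pairs for which the pools divide the
    processes evenly and every pool gets at least one k-point."""
    max_pools = min(n_procs, n_kpoints)
    return [(n_pools, n_procs // n_pools)
            for n_pools in range(1, max_pools + 1)
            if n_procs % n_pools == 0]

# Hypothetical example: 8 MPI processes and 10 k-points.
layouts = pool_layouts(n_procs=8, n_kpoints=10)
for n_pools, per_pool in layouts:
    print(f"-npool {n_pools}: {per_pool} process(es) per pool share the R and G data")
```

Layouts with fewer pools leave more processes in each pool to share the
R- and G-space data, which should lower the memory needed per process;
layouts with more pools need less communication but more memory. Timing
a small test case with each layout is the safest way to choose.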
> I don't know why it runs so slowly. How can I solve this problem? Any
> advice will be appreciated!
Apart from better suggestions coming from more expert people, it would
be important to see what kind of job you are trying to run. For example:
did you start directly with a "production run" (many k-points and/or
large unit cells and/or large cut-off)? Did pw.x ever run on your
cluster with simple jobs, like bulk silicon or any other (see the
examples directory)?
Another possibility would be to start with the serial executable
(disabling parallel execution at configure time) and then switch to the
parallel one once you have checked that everything is working OK.
Unfortunately, in many cases the computation requires a lot of work to
set up correctly and to optimize compilation, performance, etc. (not to
mention the convergence of the results!).
The only way is to try to isolate the problems and solve them one by
one. Yet I would say that in this respect quantum-espresso is one of the
best choices, since the code is made to work properly in as many cases
as possible, rather than implementing all human knowledge but only for
those who wrote it!
;-)
Good luck,
Giovanni
--
Dr. Giovanni Cantele
Coherentia CNR-INFM and Dipartimento di Scienze Fisiche
Universita' di Napoli "Federico II"
Complesso Universitario di Monte S. Angelo - Ed. 6
Via Cintia, I-80126, Napoli, Italy
Phone: +39 081 676910
Fax: +39 081 676346
E-mail: giovanni.cantele at cnr.it
giovanni.cantele at na.infn.it
Web: http://people.na.infn.it/~cantele
Research Group: http://www.nanomat.unina.it
Skype contact: giocan74