[Pw_forum] Re: Comparison of 3.1.1 and 3.2 (cvs)

Eduardo Ariel Menendez P emenendez at macul.ciencias.uchile.cl
Fri Nov 24 13:22:41 CET 2006

I posted this yesterday, but it seems to have gone to /dev/null.
I insist because I think it is important to benchmark with a "large" job.
I would like to add that the scaling of jobs that take a few minutes of
CPU is totally different from that of jobs that take a few hours. For
instance, for a small calculation with 1 or 2 atoms, linking pw.x to the
Intel MKL library makes no difference compared with compiling BLAS/LAPACK
from source, at least if the Intel compiler is used. However, for a job
of 64 atoms that runs for 2 hours or more, the difference between linking
to MKL and linking to BLAS/LAPACK compiled from source can be a factor
of 5.
This is an example:
<cdte.mdb.in >> cdte.scf10.out
real    149m1.400s
user    147m39.509s
sys     1m21.829s
<cdte.mdb.in >> cdte.scf10.out
real    627m25.528s
user    626m4.072s
sys     1m22.537s
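
As a rough check, the ratio between the two "real" times above can be
worked out with a few lines of Python. This is only a sketch: the helper
name to_seconds is hypothetical (it is not part of pw.x or any other
tool), and it assumes the bash `time` format shown above (minutes, then
seconds, e.g. 149m1.400s).

```python
import re


def to_seconds(stamp: str) -> float:
    """Convert a bash `time` value such as '149m1.400s' to seconds.

    Hypothetical helper: assumes an optional '<minutes>m' part
    followed by '<seconds>s'.
    """
    match = re.fullmatch(r"(?:(\d+)m)?(\d+(?:\.\d+)?)s", stamp.strip())
    if match is None:
        raise ValueError(f"unrecognized time format: {stamp!r}")
    minutes = int(match.group(1) or 0)
    return 60 * minutes + float(match.group(2))


# "real" (wall-clock) times copied from the two runs quoted above.
fast = to_seconds("149m1.400s")
slow = to_seconds("627m25.528s")
ratio = slow / fast

print(f"fast run: {fast:.1f} s, slow run: {slow:.1f} s, ratio: {ratio:.2f}")
```

For these two runs the ratio comes out at about 4.2.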
Using the FFTW of MKL rather than the one compiled from source makes no
difference, or at least I have not found any improvement, but the choice
of BLAS/LAPACK is important. In summary, do benchmark large systems,
because the scaling of a small system is not important by itself and
cannot be extrapolated (I cannot say why).

