[Pw_forum] Re: Comparison of 3.1.1 and 3.2 (cvs)

Giovanni Cantele Giovanni.Cantele at na.infn.it
Fri Nov 24 14:58:59 CET 2006

Eduardo Ariel Menendez P wrote:
> Hello,
> I posted this yesterday, but it seems to have gone to /dev/null.
> I insist because I think it is important to benchmark with a "large" job.
> **********************************
> I would like to add that the scaling of jobs that take a few minutes of
> CPU time is totally different from that of jobs that take a few hours. For
> instance, for a small calculation with 1 or 2 atoms, linking pw.x to the
> Intel MKL library makes no difference compared with compiling BLAS/LAPACK
> from source, at least if the Intel compiler is used. However, for a job of
> 64 atoms that runs for 2 hours or more, the difference between linking to
> MKL and linking to BLAS/LAPACK compiled from source can be a factor of 5.
> This is an example
> /home/eduardo/Chemutils/Espresso/espresso-cvs/espresso/bin-ifort-serial-fftwmkl/pw.x
> <cdte.mdb.in >> cdte.scf10.out
> real    149m1.400s
> user    147m39.509s
> sys     1m21.829s
> /home/eduardo/Chemutils/Espresso/espresso-cvs/espresso/bin-ifort-serial/pw.x
> <cdte.mdb.in >> cdte.scf10.out
> real    627m25.528s
> user    626m4.072s
> sys     1m22.537s
> Using the FFTW of MKL instead of the one built from source makes no
> difference; at least I have not found any improvement. The BLAS/LAPACK
> choice, however, is important. In summary, do benchmark large systems,
> because the scaling of a small system is not important by itself, and it
> cannot be extrapolated (I cannot say why).
> Eduardo
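
For reference, the two wall-clock ("real") times quoted above can be turned into a ratio with a few lines of Python. This is only a sketch: the helper to_seconds is a made-up name, and it assumes that the first run (bin-ifort-serial-fftwmkl) is the MKL-linked build, as its directory name suggests. For these two runs the ratio comes out at about 4.2.

```python
import re


def to_seconds(stamp: str) -> float:
    """Convert a shell `time` stamp such as '149m1.400s' to seconds."""
    match = re.fullmatch(r"(\d+)m([\d.]+)s", stamp)
    if match is None:
        raise ValueError(f"unrecognised time stamp: {stamp!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)


# Wall-clock ("real") times copied from the quoted message.
# Assumption: the first run is the MKL-linked build.
mkl_real = to_seconds("149m1.400s")
source_real = to_seconds("627m25.528s")

ratio = source_real / mkl_real
print(f"source-built / MKL wall-time ratio: {ratio:.2f}")
```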

I was just wondering why my experience was exactly the opposite of the
result posted yesterday: in my case the CVS (3.2) version was faster
than 3.1.1.

My run was on 8 alpha ev7 CPUs, 1150 MHz.
I found CPU time of 2h7m with 3.1.1, 1h23m (1h26m wall time) using cvs. 
Please note that in 3.1.1 parallel diagonalization was used, whereas in 
cvs it wasn't! Does it make sense?
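
To put a number on the difference, here is a small Python sketch that uses only the two CPU times given above (2h7m and 1h23m). The helper name minutes is just for illustration.

```python
def minutes(hours: int, mins: int) -> int:
    """Convert an 'XhYm' duration to whole minutes."""
    return hours * 60 + mins


cpu_311 = minutes(2, 7)    # 3.1.1, parallel diagonalization on
cpu_cvs = minutes(1, 23)   # CVS, parallel diagonalization off

speedup = cpu_311 / cpu_cvs        # how many times faster CVS is
saving = 1 - cpu_cvs / cpu_311     # fraction of CPU time saved

print(f"CVS speedup over 3.1.1: {speedup:.2f}x")
print(f"CPU time saved: {saving:.1%}")
```

So on this system CVS needs roughly a third less CPU time than 3.1.1.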

The run was a relaxation of an Sr-terminated SrTiO3-110 surface, 
involving 19 atoms (152 electrons):
        celldm(1) = 7.37
        celldm(2) = 1.41421356
        celldm(3) = 6.717514421
K_POINTS { automatic }
4  4  1    1  1  0
(due to symmetry the calculation reduces to 4 k-points)

I've tried to make the same test as the one reported in the forum 
(mgal2o4-cf.scf.in) and got the following results (CPU time / wall time):

#CPU    3.1.1               CVS
   1    9m58s / 10m12s      11m02s / 11m23s
   2    5m34s /  5m46s       7m03s /  7m12s
   4    2m54s /  2m60s       3m32s /  3m38s

In this case CVS is indeed slower than 3.1.1.
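
The wall times in the table above can also be used to estimate the parallel speedup and efficiency of each version, and the slowdown of CVS relative to 3.1.1. The following Python sketch does this. The times are transcribed by hand into seconds (2m60s is read as 180 s), and the function name scaling is only illustrative.

```python
# Wall times in seconds, transcribed from the table (2m60s read as 180 s).
wall = {
    "3.1.1": {1: 612, 2: 346, 4: 180},
    "CVS":   {1: 683, 2: 432, 4: 218},
}


def scaling(times):
    """Return {n_cpu: (speedup, efficiency)} relative to the 1-CPU run."""
    base = times[1]
    return {n: (base / t, base / (t * n)) for n, t in times.items()}


results = {version: scaling(times) for version, times in wall.items()}
for version, table in results.items():
    for n, (speedup, eff) in table.items():
        print(f"{version:>5} on {n} CPU: speedup {speedup:.2f}, efficiency {eff:.0%}")

# Wall-time ratio CVS / 3.1.1 at each CPU count (values above 1 mean CVS is slower).
slowdown = {n: wall["CVS"][n] / wall["3.1.1"][n] for n in wall["CVS"]}
for n, r in slowdown.items():
    print(f"{n} CPU: CVS / 3.1.1 = {r:.2f}")
```

With these numbers CVS is between about 12% and 25% slower in wall time, and it also scales somewhat worse from 1 to 4 CPUs.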



Dr. Giovanni Cantele
Coherentia CNR-INFM and Dipartimento di Scienze Fisiche
Universita' di Napoli "Federico II"
Complesso Universitario di Monte S. Angelo - Ed. G
Via Cintia, I-80126, Napoli, Italy
Phone: +39 081 676910
Fax:   +39 081 676346
E-mail: Giovanni.Cantele at na.infn.it
Web: http://people.na.infn.it/~cantele
