[Pw_forum] openmp vs mpich performance with MKL 10.x
Eduardo Ariel Menendez Proupin
eariel99 at gmail.com
Tue May 6 21:44:03 CEST 2008
> there are two issues that need to be considered.
>
> 1) how large are your test jobs? if they are not large enough, timings are
> pointless.
About 15 minutes on an Intel quad-core. 66 atoms: Cd_30Te_30O_6, 576 electrons
in total.
My test may be a very particular case. If you have a balanced benchmark, I would
like to run it.
> 2) it is most likely, that you are still tricked by the
> auto-parallelization of intel MKL. the export OMP_NUM_THREADS
> will usually only work for the _local_ copy, for some
> MPI startup mechanisms not at all. thus your MPI jobs will
> be slowed down.
I am only using SMP. Sorry, I do not have a cluster of quad-cores yet.
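
For reference, one way to make sure every MPI process sees a single thread is to pass the variable through the launcher instead of relying on the local export. This is only a minimal sketch, assuming either an MPICH-style mpiexec (-genv NAME VALUE) or Open MPI's mpirun (-x NAME). The executable name ./pw.x and the process count of 4 are placeholders, and the script only prints the commands rather than running them.

```shell
#!/bin/sh
# Sketch: force one OpenMP/MKL thread per MPI process.
# Assumptions: ./pw.x is a placeholder executable name; 4 = one process per core.

# Set the thread count in the launching shell.
export OMP_NUM_THREADS=1

NPROC=4
EXE="./pw.x"

# The export above only affects the local environment, so hand the
# variable to the launcher explicitly.
# MPICH-style launcher: -genv NAME VALUE sets it for all processes.
MPICH_CMD="mpiexec -genv OMP_NUM_THREADS $OMP_NUM_THREADS -n $NPROC $EXE"
# Open MPI launcher: -x NAME forwards the local value to all processes.
OPENMPI_CMD="mpirun -x OMP_NUM_THREADS -np $NPROC $EXE"

# Print the commands instead of running them, so the sketch works
# on a machine without MPI installed.
echo "$MPICH_CMD"
echo "$OPENMPI_CMD"
```

Which of the two forms applies depends on the MPI installation. On a single SMP node the local export is often inherited anyway, but passing it explicitly removes the doubt.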
>
>
> to make certain that you only link the serial version of
> MKL with your MPI executable, please replace -lmkl_em64t
> in your make.sys file with
> -lmkl_intel_lp64 -lmkl_sequential -lmkl_core
Yes, I also tried that. With those libraries the test runs in 14m2s. With only
-lmkl_em64t it runs in 14m31s. With a serial compilation it ran in 12m20s.
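
The substitution discussed above can be sketched as a small shell script. This is an illustration only: the BLAS_LIBS line and the MKL path are made-up sample content written to a temporary directory, and a real make.sys will look different. The result goes to a new file so the original stays untouched.

```shell
#!/bin/sh
# Sketch: replace the threaded MKL link flag with the sequential layer.
# Assumption: the make.sys content below is invented sample data.

WORK=$(mktemp -d)
cat > "$WORK/make.sys" <<'EOF'
BLAS_LIBS = -L/opt/intel/mkl/lib/em64t -lmkl_em64t
EOF

# Swap -lmkl_em64t for the interface, sequential and core libraries,
# writing the edited copy to a separate file.
sed 's/-lmkl_em64t/-lmkl_intel_lp64 -lmkl_sequential -lmkl_core/' \
    "$WORK/make.sys" > "$WORK/make.sys.sequential"

NEW_LINE=$(cat "$WORK/make.sys.sequential")
echo "$NEW_LINE"
```

After checking the new file, it can be copied over make.sys and the code rebuilt from clean so that the executable is relinked.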
Thanks,
Eduardo