[Pw_forum] openmp vs mpich performance with MKL 10.x

Eduardo Ariel Menendez Proupin eariel99 at gmail.com
Tue May 6 21:44:03 CEST 2008


> there are two issues that need to be considered.
>
> 1) how large are your test jobs? if they are not large enough, timings are
> pointless.


About 15 minutes on an Intel quad-core. 66 atoms: Cd_30Te_30O_6, 576 electrons
in total.
My test may be very particular. If you have a balanced benchmark, I would
like to run it.


> 2) it is most likely, that you are still tricked by the
>   auto-parallelization of intel MKL. the export OMP_NUM_THREADS
>   will usually only work for the _local_ copy, for some
>   MPI startup mechanisms not at all. thus your MPI jobs will
>   be slowed down.

I am using only SMP. Sorry, I don't have a cluster of quad-cores yet.
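
One way to check whether OMP_NUM_THREADS actually reaches a launched process is to print it from inside that process. The sketch below does this for an ordinary child process on a single machine; the helper name child_thread_setting is hypothetical, and an MPI launcher may forward the environment differently, so this only illustrates the idea.

```python
import os
import subprocess
import sys


def child_thread_setting(value):
    """Start a child Python process with OMP_NUM_THREADS set explicitly
    and return the value the child itself reports."""
    env = dict(os.environ)              # copy the current environment
    env["OMP_NUM_THREADS"] = str(value)  # set the variable for the child only
    result = subprocess.run(
        [sys.executable, "-c",
         "import os; print(os.environ.get('OMP_NUM_THREADS', 'unset'))"],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    # Report what the child process sees when one thread is requested.
    print(child_thread_setting(1))
```

If the child reports "unset" or a different number when started through a given launcher, the setting is not being passed along.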

>
>
>   to make certain that you only link the serial version of
>   MKL with your MPI executable, please replace  -lmkl_em64t
>   in your make.sys file with
>   -lmkl_intel_lp64 -lmkl_sequential -lmkl_core


Yes, I also tried that. The test ran in 14m2s. Using only -lmkl_em64t it
ran in 14m31s. Using serial compilations it ran in 12m20s.
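
To put the three timings on a common scale, here is a small sketch that converts them to seconds and expresses each as a ratio to the serial run. The labels and the helper names to_seconds and slowdown_vs are only illustrative; the numbers are the ones quoted above.

```python
def to_seconds(minutes, seconds):
    """Convert a wall time given as minutes and seconds to seconds."""
    return 60 * minutes + seconds


# Wall times as reported in this message.
timings = {
    "-lmkl_intel_lp64 -lmkl_sequential -lmkl_core": to_seconds(14, 2),
    "-lmkl_em64t": to_seconds(14, 31),
    "serial compilation": to_seconds(12, 20),
}


def slowdown_vs(reference, table):
    """Return each timing divided by the reference timing."""
    ref = table[reference]
    return {name: t / ref for name, t in table.items()}


ratios = slowdown_vs("serial compilation", timings)

if __name__ == "__main__":
    for name, ratio in ratios.items():
        print(f"{name}: {timings[name]} s, {ratio:.2f}x the serial time")
```

With these numbers the two MKL link lines take roughly 14% and 18% longer than the serial run.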



Thanks,
Eduardo