[Pw_forum] How to tune cputimes using mpich2-1.0.8

Mahmoud Payami Shabestari mpayami at aeoi.org.ir
Sat Feb 21 20:18:22 CET 2009


>
> how are the timings if you don't use all 8 cores?
> do the jobs get faster again?
>
N_core    total CPU time for one iteration (sec)
------    --------------------------------------
1         398.89
2         200.14
3         134.61
4         101.46
5          85.24
6          71.60
7          63.31
8          57.16

Is it surprising?
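
As a rough check on the table above, the speedup and parallel efficiency relative to the 1-core run can be computed directly from the listed timings. This is only a minimal sketch: the `scaling` helper is a hypothetical name introduced here, and the numbers are the CPU times from the table, not wall-clock times.

```python
# CPU time per iteration in seconds, copied from the table above.
timings = {
    1: 398.89,
    2: 200.14,
    3: 134.61,
    4: 101.46,
    5: 85.24,
    6: 71.60,
    7: 63.31,
    8: 57.16,
}


def scaling(timings):
    """Return (n_cores, speedup, efficiency) rows relative to the 1-core run."""
    t1 = timings[1]
    rows = []
    for n in sorted(timings):
        speedup = t1 / timings[n]
        rows.append((n, speedup, speedup / n))
    return rows


for n, speedup, efficiency in scaling(timings):
    print(f"{n:2d} cores  speedup {speedup:5.2f}  efficiency {efficiency:6.1%}")
```

With these figures the 8-core run comes out at a speedup just under 7, i.e. roughly 87% parallel efficiency in CPU time.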
 

> if you are still running the same "benchmark" that
> you were running before. your comparisons are most
> likely severely flawed.

No, Axel! The earlier benchmark was a two-atom Au cluster with "relax"
calculations on a dual-core Pentium 4, 3.2 GHz. Now it is a slab.

> you never could prove to me,
> that you are running a correctly compiled executable
> and mpi installation. so you may be comparing apples
> and oranges. your openmpi timings are highly suspicious.
>
I do not want to prove anything. I am just reporting my experience.
Anyone interested can verify it for themselves.


> i was showing you, that openmpi _does_ behave properly
> on an example that does specifically test MPI performance
> and not depend on anything else (like NFS i/o).

I agree with you. In that case you were comparing apples with oranges!

>
> MP> > if you see these kinds of differences, then there is something
> MP> > else causing problems.
> MP> >
> MP> > are you using processor and memory affinity with openmpi?
> MP>
> MP> I have no idea about these concepts. I just practice (as a good(?)
> MP> student) what you taught me during the hpc08.
>
> processor affinity is tying a process to a specific CPU.
> in multi-processor/multi-core environments, this has
> severe performance implications, as it improves cpu
> cache utilization. just stick those keywords into google
> and you'll see.
>
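
To illustrate the idea of processor affinity described above, here is a minimal sketch using only the Python standard library. It is not how openmpi or mpich2 apply affinity (MPI launchers have their own binding settings); `pin_current_process` is a hypothetical helper, and `os.sched_setaffinity` exists only on some Unix systems such as Linux, so the sketch returns `None` elsewhere.

```python
import os


def pin_current_process():
    """Pin the calling process to one CPU it is currently allowed to run on.

    Returns the new affinity set, or None on platforms without
    os.sched_setaffinity.
    """
    if not hasattr(os, "sched_setaffinity"):
        return None
    allowed = os.sched_getaffinity(0)  # pid 0 means "the calling process"
    cpu = min(allowed)                 # pick the lowest-numbered allowed CPU
    os.sched_setaffinity(0, {cpu})     # restrict the process to that CPU
    return os.sched_getaffinity(0)


print(pin_current_process())
```

Once pinned, the scheduler no longer migrates the process between cores, which is what lets it keep reusing the same CPU cache.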
> MP> >
> MP> > what kind of processor is this exactly?
> MP> It is 5420.
>
> ok. so that is intel quad-core. i have a bunch of 5430s
> available to me. please redo those tests with the 32-water
> cp.x input from example21 of the Q-E distribution. and
> then we can start discussing seriously. for as long as
> nobody can reproduce your benchmarks, they are useless.

I do not have any experience with CPMD.
>
> also you still have a huge difference between wall
> clock and cpu clock. in short, you are trying to solve
> the least important problem first.
>
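
The difference between CPU time and wall-clock time mentioned above can be seen in a small sketch. It assumes nothing about pw.x or MPI: `cpu_and_wall`, `busy` and `waiting` are hypothetical names, and the `time.sleep` call merely stands in for a process that is blocked (for example on disk or network I/O) rather than computing.

```python
import time


def cpu_and_wall(fn):
    """Run fn() once and return (cpu_seconds, wall_seconds)."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    fn()
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    return cpu, wall


def busy():
    # Pure computation: CPU time and wall time should be close.
    sum(i * i for i in range(200_000))


def waiting():
    # Blocked, not computing: wall time grows, CPU time barely moves.
    time.sleep(0.2)


for name, fn in (("busy", busy), ("waiting", waiting)):
    cpu, wall = cpu_and_wall(fn)
    print(f"{name:8s} cpu={cpu:.3f}s  wall={wall:.3f}s")
```

A large gap between the two numbers means the process spent most of its time waiting, which the CPU-time column of a benchmark does not show.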
> i'd kindly ask you not to make claims about mpi implementations
> being "better" unless you can prove that the difference in
> timings is really due to the mpi implementation and not due
> to improper use of the machine or inadequate hardware.

I just expressed my findings and tried to share them.

regards,mahmoud


>
> cheers,
>    axel.
>
> --
> =======================================================================
> Axel Kohlmeyer   akohlmey at cmm.chem.upenn.edu   http://www.cmm.upenn.edu
>    Center for Molecular Modeling   --   University of Pennsylvania
> Department of Chemistry, 231 S.34th Street, Philadelphia, PA 19104-6323
> tel: 1-215-898-1582,  fax: 1-215-573-6233,  office-tel: 1-215-898-5425
> =======================================================================
> If you make something idiot-proof, the universe creates a better idiot.



