[Pw_forum] LSF problem?

Axel Kohlmeyer akohlmey at cmm.chem.upenn.edu
Thu Apr 24 21:39:58 CEST 2008


On Thu, 24 Apr 2008, Charles wrote:

C> Dear PWSCF users

dear charles,

C> 1. run locally on the 8 CPU cluster, requesting 4 CPUs for the job
C> 
C> the reported time used by PWSCF is 4.38s CPU time and  2.02s wall time.

you should keep in mind that this is a _very_ small job 
and thus does not parallelize very well.

C> It does not look bad to me.
C> 
C> 2. run on 128-CPU cluster, requesting 4 CPUs for the job
C> 
C>     PWSCF reports"
C>     Number of processors in use:       4
C>     R & G space division:  proc/pool =    4
C>     PWSCF        : 35m48.41s CPU time,     3m14.98s wall time

ahhhh. i know that one. you are most likely another victim 
of the automatic multi-threading of intel MKL.

your MKL will try to run as many threads as your machine
has CPU cores, that is why your CPU time is so much higher
than your wall time. since spawning that many threads 
costs so much time (particularly if the batch system confines
you to the number of nodes that you request) your wall time
goes up enormously as well.
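
as a rough sanity check, you can compare CPU time to wall time
for the two runs quoted above. this is only a sketch: the helper
name cpu_wall_ratio is made up for illustration, and reading the
ratio as "threads busy per process" is my assumption about how the
timer counts, not something taken from the PWSCF output.

```shell
# hypothetical helper: ratio of CPU time to wall time, both given in seconds
cpu_wall_ratio() {
    awk -v cpu="$1" -v wall="$2" 'BEGIN { printf "%.1f\n", cpu / wall }'
}

# local 8-CPU machine: 4.38s CPU, 2.02s wall
cpu_wall_ratio 4.38 2.02        # prints 2.2

# 128-CPU cluster: 35m48.41s = 2148.41s CPU, 3m14.98s = 194.98s wall
cpu_wall_ratio 2148.41 194.98   # prints 11.0
```

a ratio far above 1 for a job that is supposed to run one thread
per MPI process is the hint that extra threads are being spawned.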

you should do three things: 
a) you can set up your environment to have the environment
variable OMP_NUM_THREADS set to 1. that should make your
executable behave reasonably, provided there are no other
users on that machine suffering from the same problem and
that LSF actually supports and activates processor sets.
b) you should have a word with your system admins to 
forcibly set OMP_NUM_THREADS by default to 1 for everybody.
c) you can link the QE executables with MKL in a way that 
you are using the non-threaded (sequential) libraries.
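
for a), a minimal sketch of what this could look like in a job
script or shell startup file. it assumes a bourne-type shell
(sh/bash); MKL_NUM_THREADS is MKL's own variable and setting it
as well is an extra precaution, not something required by the
advice above.

```shell
# assumption: sh/bash; in csh/tcsh use "setenv OMP_NUM_THREADS 1" instead
export OMP_NUM_THREADS=1

# MKL-specific variable, takes precedence over OMP_NUM_THREADS inside MKL
export MKL_NUM_THREADS=1

# show what the job will actually see before pw.x is launched
echo "OMP_NUM_THREADS=$OMP_NUM_THREADS MKL_NUM_THREADS=$MKL_NUM_THREADS"
```

the variables have to be set in the environment that the MPI
processes inherit, so put them in the script that LSF runs and
not only in your interactive shell.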

for c), just change your make.sys so that LAPACK_LIBS is empty and BLAS_LIBS reads:

BLAS_LIBS      = -L$(MKL_PATH) -lmkl_intel_lp64 -lmkl_sequential -lmkl_core

you have to set MKL_PATH to the directory of where your MKL is installed.

BTW, the same works for opteron/em64t-xeons.
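
to see whether an executable still pulls in a threaded MKL layer
after relinking, something like the following can help. it is a
sketch only: check_mkl_threading is a hypothetical helper name,
it relies on ldd, and it cannot tell you anything about a
statically linked binary.

```shell
# hypothetical helper: look for a threaded MKL layer or an intel OpenMP
# runtime among the shared libraries of a dynamically linked binary.
# statically linked binaries will always report "no threaded MKL found".
check_mkl_threading() {
    if ldd "$1" 2>/dev/null | grep -E -q 'mkl_(intel|gnu|pgi)_thread|libiomp5|libguide'; then
        echo "threaded MKL or OpenMP runtime found"
    else
        echo "no threaded MKL found"
    fi
}

# usage (the path to pw.x is an assumption, adjust to your build):
#   check_mkl_threading ./bin/pw.x
```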

C> This is too slow I guess and the system is overloaded if I run some big 
C> system.
C> 
C> It looks there is something wrong with the system. But the system 
C> administrator can not provide useful support on this case.

get a new sysadmin then. 

C> Could anybody have the similar system tell me how to tweak the compilation?

yep. you can also look up the docs on the altix of the PSC:

http://www.psc.edu/machines/sgi/altix/pople.php

they _do_ have good sysadmins. ;-)

cheers,
   axel.

C> 
C> Thanks!
C> 
C> Charles
C> 
C> _______________________________________________
C> Pw_forum mailing list
C> Pw_forum at pwscf.org
C> http://www.democritos.it/mailman/listinfo/pw_forum
C> 

-- 
=======================================================================
Axel Kohlmeyer   akohlmey at cmm.chem.upenn.edu   http://www.cmm.upenn.edu
   Center for Molecular Modeling   --   University of Pennsylvania
Department of Chemistry, 231 S.34th Street, Philadelphia, PA 19104-6323
tel: 1-215-898-1582,  fax: 1-215-573-6233,  office-tel: 1-215-898-5425
=======================================================================
If you make something idiot-proof, the universe creates a better idiot.


