[Pw_forum] QE with ESSL on BlueGene

Dr Brent Walker b.walker at irl.cri.nz
Wed Nov 14 10:25:14 CET 2007


Hi Axel,

Thanks very much for your insightful comments.

Just before seeing your email I actually managed to get PWscf going on
our BG/L. It was crashing deep inside a call to the "dgetri" routine
(called in invmat.f90), which is provided by ESSL. Without the source
for the ESSL implementation of that routine it was hard to see what
was happening. However, I managed to get around the problem, at first
by setting a significantly larger "work" array. The vendor-provided
documentation says that for "optimal performance" you should set the
size of the work array to be >= 100 times the size of the array being
inverted. I guess by "optimal performance" they meant "won't cause
your code to crash". I then found that the ESSL implementation of
that routine has an extension to the LAPACK spec whereby it can
automatically determine and allocate the optimal size for its work
array. This setting also works, and it is what I expect to use from
now on, though one could imagine scenarios where it might go wrong
given the fairly severe memory limitations on the BG/L.
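
To put rough numbers on that memory concern, here is a minimal sketch
in Python (not part of QE or ESSL; all function names are hypothetical).
It assumes that "100 times the size of the array" means 100 * n
double-precision elements for an n x n matrix, which is only one
reading of the vendor documentation, and it takes the 512MB/node
figure quoted further down in this thread as the memory budget.

```python
# Rough memory estimate for a dgetri-style work array (illustrative only).
BYTES_PER_DOUBLE = 8
NODE_MEMORY_BYTES = 512 * 1024**2  # 512MB/node, as quoted for BG/L in this thread


def work_array_bytes(n, factor=100):
    """Bytes for a work array of factor * n doubles.

    Assumption: the "size of the array being inverted" is read as the
    matrix order n, so the suggested work length is factor * n elements.
    """
    return factor * n * BYTES_PER_DOUBLE


def matrix_bytes(n):
    """Bytes for the n x n double-precision matrix itself."""
    return n * n * BYTES_PER_DOUBLE


def report(n):
    """Return (work bytes, matrix bytes, fraction of node memory used)."""
    work = work_array_bytes(n)
    mat = matrix_bytes(n)
    share = (work + mat) / NODE_MEMORY_BYTES
    return work, mat, share


if __name__ == "__main__":
    for n in (100, 1000, 4000):
        work, mat, share = report(n)
        print(f"n={n}: work={work} B, matrix={mat} B, "
              f"{share:.2%} of node memory")
```

Under these assumptions the work array stays small next to the matrix
itself once n exceeds 100, so the memory risk would come mainly from
whatever size an automatic allocation picks, which this sketch does
not model.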

So far, I'd say the BG/L isn't the easiest piece of equipment I've
used (especially in comparison with, say, a p5 series machine, where
things like QE tend to work "out of the box"). At the moment, however,
that's what I've got available, so of course I'll use it as best I
can! As a very crude guess (that is, from one test system containing
~250 electrons), I'd say using the vendor-provided libs amounts to a
30% improvement over locally compiled blas/lapack/fftw, though for
larger numbers of CPUs (~512) the gain seems to be much smaller (which
is probably related to the issues you mention regarding the ill fit of
PWscf to BG/L). All that being said, I won't be surprised if I can
break things again ....

Your comments about the performance of LAMMPS are encouraging,
however. Some of my "non-DFT" colleagues are setting themselves up to
use that code, so it's good to know that they should be able to expect
reasonable performance from it. And while I'm getting a bit off-topic,
it seems my troubles getting PWscf/QE running on BG/L are much less
severe than those of another colleague, who is attempting to run a
popular commercial plane-wave DFT code.

Kindest regards,

Brent.

On Wed, 14 Nov 2007, Axel Kohlmeyer wrote:

> On Wed, 14 Nov 2007, Dr Brent Walker wrote:
> 
> BW> Hi all,
> 
> hi brent,
> 
> BW> Does anyone know whether QE has been used successfully with the IBM
> BW> ESSL libraries on the Blue Gene L architecture (running linux)? Google
> BW> unfortunately hasn't provided me with much in relation to this.
> 
> yep. i've managed to compile and run QE on a BG/L.
> 
> BW> I have spent some time trying to get QE (well really PWscf) running on
> BW> such a machine and am at the stage of deciding whether to persevere or
> BW> just give up and use locally compiled versions of fftw and lapack/blas
> BW> (following say the provided "Make.bgl" file, which seems to work fine
> BW> for me).
> 
> due to the (lack of) features in the BG/L cpu, you may actually
> get reasonable performance with regular BLAS/LAPACK. you can
> try the "double hummer" libraries, but then you are limited to
> coprocessor mode. this is probably needed anyways, because the
> limitations to jobs on BG/L are very hard. there is no local
> storage, so pw.x with its default setting of storing wavefunctions
> as files, is not scaling well. you'll have to use the (experimental)
> feature of storing those in memory, but with 512MB/node there is
> not much memory available. on top of that, the cpus on BG/L are
> very slow, so you need to parallelize across a large number of
> cpus to get decent performance. in my view for a code like pw.x
> it is currently not worth the hassle. your chances with cp.x 
> are much better, but then again, you are limited by the supported
> feature set of cp.x. altogether, you have to keep in mind, that
> BG/L is mainly a machine to get a good ranking in the top500
> and thus please administrators, politicians and generally people
> who are not using it. from the user's perspective it is a constant
> struggle and a PITA. if i had the choice, i'd rather skip the
> top500 placement and get a machine that is usable. the majority
> of QE jobs are run on rather small clusters, so to run well on
> those machines is where most of the effort goes.
> 
> BW> Is this worth pursuing or should I just file it in the "too hard"
> BW> basket for the time being? If people think there is some hope that I
> BW> can get this to work, I'll provide more details (make.sys, etc.).
> BW> 
> BW> Thanks very much for any information/thoughts/anecdotes on this!
> 
> well, i've been struggling a lot with finding _any_ project that
> runs well on a BG/L that does not run better on a cray xt3/xt4
> or even a reasonably well laid out PC cluster with DDR infiniband.
> 
> my best results were so far with classical MD using LAMMPS on
> systems that have no coulomb interactions. there i am scaling
> out on the BG/L at half the performance of the scaleout timing
> on a cray xt3. for most codes, particularly plane wave 
> pseudopotential DFT the difference is about a factor of 10. 
> 
> so before putting in more effort, it might be worth to discuss
> what kind of calculations you intend to run and how much 
> cpu time across how many nodes you have at your disposal.
> 
> cheers,
>    axel.
> 
> BW> 
> BW> Brent.
> BW> 
> BW> PS. I have noted AK's comment "good luck (you'll be needing it)"
> BW> regarding compilation of QE on BG/L on 31 Aug, which of course doesn't
> BW> bode well!
> BW> 
> BW> 
> 
> 

-- 

##############################
Dr Brent Walker
Industrial Research Limited,
P.O. Box 31-310,
69 Gracefield Road,
Lower Hutt 5040,
New Zealand.
Email: b.walker at irl.cri.nz
Phone: +64 (0) 4 931 3783 
             (extn: 4783)
Fax:   +64 (0) 4 931 3754
http://www.irl.cri.nz
http://www.vuw.ac.nz/scps/research/compnanotech
##############################

