[Pw_forum] Libraries on Blue Gene/L
giannozz at democritos.it
Wed Nov 5 19:02:03 CET 2008
zhaofeng at Princeton.EDU wrote:
> [...] We look into it and find that there are calls of DGEMM and ZHPEV
> which are quite time consuming (in wf.f90). [...]
> Since ZGEMM and ZHPEV are both the calls in the library of LAPACK, we
> conclude it is because of bad choice of lapack library, is it correct?
Maybe, or maybe this is just part of the story. I don't know what those
routines are used for in wf.f90, but presumably they operate on NxN
matrices, where N is the number of Kohn-Sham states (if not, what
follows may not apply). There used to be many such matrices in both
CP and PW, replicated on all processors, with the related matrix
operations (matrix-matrix multiplication and diagonalization) also
replicated and not parallelized. The conventional wisdom was, "they
are small". When you run on many processors, however, those "small"
matrices gobble up all the memory, while the related matrix operations
gobble up all the CPU. Almost all of them are now distributed across
processors, and the related operations are parallelized. Unfortunately the
Wannier-function CP code is a different branch that has somewhat
fallen behind the rest of the distribution.
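
To put rough numbers on the replicated-versus-distributed point, here is a
minimal back-of-the-envelope sketch in Python. It is not taken from the CP or
PW sources: the function names, the assumption of dense double-precision
complex matrices (16 bytes per element), the N^3 operation count, and the
perfectly even split across processors are all illustrative assumptions.

```python
def matrix_bytes(n, bytes_per_element=16):
    """Bytes for one dense n x n matrix.

    16 bytes per element assumes double-precision complex storage
    (an assumption for illustration, not read from wf.f90).
    """
    return n * n * bytes_per_element


def memory_per_process(n, n_procs, distributed):
    """Bytes each process holds for one n x n matrix.

    Replicated: every process stores the full matrix.
    Distributed: idealized even split (rounded up), ignoring any
    workspace or communication buffers a real code would need.
    """
    total = matrix_bytes(n)
    if distributed:
        return -(-total // n_procs)  # ceiling division
    return total


def work_per_process(n, n_procs, distributed):
    """Rough operation count per process for an O(n^3) matrix operation.

    Replicated: every process repeats the full n^3 work, so adding
    processors does not help. Distributed: idealized n^3 / n_procs,
    with no communication overhead (hypothetical best case).
    """
    total = n ** 3
    if distributed:
        return total / n_procs
    return total
```

With the hypothetical values N = 2000 and 1024 processes, one replicated
matrix costs 64,000,000 bytes on every process, while the idealized
distributed layout needs 62,500 bytes per process. The replicated cost per
process stays the same however many processors are added, which is why
"small" matrices stop being small on a machine with little memory per node.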
Paolo Giannozzi, Democritos and University of Udine, Italy