[Pw_forum] PW taskgroups and a large run on a BG/P
Paolo Giannozzi
giannozz at democritos.it
Thu Feb 12 15:54:35 CET 2009
On Feb 12, 2009, at 15:06, David Farrell wrote:
> I found when I took the 432 atom system I sent you, and ran it on
> 128 cores in smp mode (1 MPI process/node - 2 GB per process) it
> did work (-ntg 32 -ndiag 121
32 task groups? that's a lot
> as well as -ntg 4 -ndiag 121)
4 looks more reasonable in my opinion
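for reference, the full launch line would look something like this
(the exact mpirun syntax depends on the site; on a BG/P, -mode
selects SMP/DUAL/VN):

   mpirun -np 128 -mode SMP pw.x -ntg 4 -ndiag 121 < pw.in > pw.out

(-ndiag 121 = 11x11, the square processor grid used for the parallel
diagonalization)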
> - the system didn't fit into memory in vn mode
> (4 MPI processes/node - 512 MB per process)
that job requires approx. 100 MB of dynamically allocated RAM per
process, plus a few tens of MB of work space. Why it does not fit
into 512 MB is a mystery, unless each process comes with a copy of
all the libraries. If this is the case, the maximum you can fit into
512 MB is a code printing "Hello world" in parallel.
By the way: the default number of bands in metallic calculations can
be trimmed by a significant amount (e.g. 500 instead of 576).
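i.e., set the number of bands explicitly with nbnd in the &system
namelist of the input file, something like:

   &system
      ! ... the rest of the &system variables as before ...
      nbnd = 500
   /

(500 here is just the figure above; anything comfortably above the
number of occupied states should do)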
> I then tried the system in dual mode (2 MPI processes/node - 1 GB
> per process) using -ntg 4 and -ndiag 121. In this case, the
> cholesky error came up:
the code performs exactly the same operations, independently of how
the MPI processes are distributed. It looks like yet another BlueGene
weirdness, like these:
http://www.democritos.it:8888/O-sesame/chngview?cn=5777
http://www.democritos.it:8888/O-sesame/chngview?cn=5932
which, however, affected only the internal parallel diagonalization,
not the new ScaLAPACK algorithm. I do not see any evidence that
there is anything wrong with the code itself.
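for what it's worth, a "cholesky" error of this kind usually means
that the Cholesky factorization of the overlap matrix (LAPACK's
zpotrf, or its ScaLAPACK equivalent) returned info > 0, i.e. the
matrix was found not positive definite - numerically, not
analytically. A minimal standalone illustration (made-up 2x2 matrix;
link with -llapack):

   program chol_fail
     implicit none
     integer, parameter :: n = 2
     complex(kind=kind(1.d0)) :: s(n,n)
     integer :: info
     ! a hermitian matrix with eigenvalues 3 and -1: not positive
     ! definite, so the Cholesky factorization must fail
     s(1,1) = (1.d0,0.d0); s(1,2) = (2.d0,0.d0)
     s(2,1) = (2.d0,0.d0); s(2,2) = (1.d0,0.d0)
     call zpotrf( 'U', n, s, n, info )
     print *, 'zpotrf info =', info  ! 2 here: minor of order 2 <= 0
   end program chol_fail

if the same overlap matrix passes this test with one process layout
and fails with another, the suspect is the platform's libraries, not
the algorithm.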
Paolo
---
Paolo Giannozzi, Democritos and University of Udine, Italy