[Pw_forum] PW taskgroups and a large run on a BG/P
David Farrell
davidfarrell2008 at u.northwestern.edu
Thu Feb 12 15:06:35 CET 2009
Just to add to my last email: I tried to reproduce your results (it
looks like you emailed the list back before I had a chance to finish).
When I took the 432 atom system I sent you and ran it on 128 cores in
smp mode (1 MPI process/node - 2 GB per process), it worked, both with
-ntg 32 -ndiag 121 and with -ntg 4 -ndiag 121. The system didn't fit
into memory in vn mode (4 MPI processes/node - 512 MB per process).
The error in the vn mode case was:
'"fft_parallel.f90", line 104: 1525-108 Error encountered while
attempting to allocate a data object. The program will stop.'
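The per-process memory figures quoted above follow from dividing one node's memory among the MPI processes placed on it. A minimal sketch of that arithmetic, assuming 2 GB per node (inferred from the smp figure); the names NODE_MEMORY_MB and memory_per_process_mb are made up for illustration:

```python
# Memory available to each MPI process in each node mode.
# ASSUMPTION: 2 GB per node, inferred from the smp case above
# (1 MPI process/node, 2 GB per process).
NODE_MEMORY_MB = 2048

# MPI processes per node in each mode, as stated in the email.
PROCESSES_PER_NODE = {"smp": 1, "dual": 2, "vn": 4}


def memory_per_process_mb(mode):
    """Return the memory (in MB) each MPI process gets in the given mode."""
    return NODE_MEMORY_MB // PROCESSES_PER_NODE[mode]


for mode in PROCESSES_PER_NODE:
    print(f"{mode:>4}: {memory_per_process_mb(mode)} MB per process")
```

Going from smp to vn cuts the memory per process by a factor of four, which fits with the allocation failure appearing only in vn mode.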
I then tried the system in dual mode (2 MPI processes/node - 1 GB per
process) using -ntg 4 and -ndiag 121. In this case, the Cholesky error
came up:
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%
task # 12
from pdpotf : error # 1
problems computing cholesky decomposition
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%
I then tried the dual mode case with -ntg 4 and -ndiag 100, and got the
same error on a different task:
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%
task # 11
from pdpotf : error # 1
problems computing cholesky decomposition
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%
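Both -ndiag values tried here are perfect squares (121 = 11 x 11, 100 = 10 x 10). A small sketch for checking a candidate value, assuming the parallel diagonalization lays its processes out on a square grid; that layout is an assumption not stated in this thread, and square_grid_side is a hypothetical helper, not part of PWscf:

```python
import math


def square_grid_side(ndiag):
    """Return the side of the largest square process grid that fits in ndiag.

    ASSUMPTION: the diagonalization group is arranged as a square grid.
    """
    return math.isqrt(ndiag)


for ndiag in (121, 100):
    side = square_grid_side(ndiag)
    used = side * side
    print(f"-ndiag {ndiag}: {side} x {side} grid, {used} processes used")
```

Under that assumption both runs use every process they request, so a non-square -ndiag does not explain the difference between them.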
So now I am left wondering whether I am trying to spread the work out
too much (it sounds like this was the case) and whether that is what
leads to these Cholesky errors. But the dual mode error seems to point
to something else going on.
Dave
On Feb 12, 2009, at 6:53 AM, Paolo Giannozzi wrote:
> On Feb 11, 2009, at 19:41 , David Farrell wrote:
>
>> Let me know if the attachment doesn't make it through
>
> it didn't to the mailing list (max attachment size 40kB), but
> I received it. Attached is what I got (for a scf calculation) on
> a cray xt5. Apart from the bogus values of planes printed in
> the output, everything else seems ok, including parallel
> subspace diagonalization (scalapack) on 121 processors.
>
> Paolo
>
> <test432.out.gz>
> ---
> Paolo Giannozzi, Democritos and University of Udine, Italy
>
>
David E. Farrell
Post-Doctoral Fellow
Department of Materials Science and Engineering
Northwestern University
email: d-farrell2 at northwestern.edu