[Pw_forum] parallel FFTs - important MPI considerations

Konstantin Kudin konstantin_kudin at yahoo.com
Mon Feb 6 18:08:42 CET 2006


 Hi all,

 I have investigated the FFT issue during parallel runs a bit. The key
MPI communication routines are fft_transpose and fft_scatter in the
module Modules/fft_base.f90.

 First, it appears that by default "fft_transpose" uses a bunch of
"mpi_isend" and "mpi_irecv" calls instead of a single "mpi_alltoall".
Most MPI implementations handle such swarms of isend/irecv much worse
than alltoall, which is far easier to optimize as one collective. The
routine using mpi_alltoall can be readily enabled in fft_base.f90 by a
preprocessing directive, and it appears to work. So why is it not the
default?

 Another potentially slow spot is the "mpi_alltoallv" call in
fft_scatter. Its performance depends significantly on the cleverness
of the MPI implementation, since the library has to schedule an
irregular exchange with per-task counts and displacements.
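
 For reference, here is the general shape of that call in a
self-contained toy (the counts and displacements below are invented
for illustration, not the ones fft_scatter computes):

program scatter_sketch
   implicit none
   include 'mpif.h'
   integer :: nproc, me, ip, ierr
   integer, allocatable :: sendcount(:), recvcount(:), sdispls(:), rdispls(:)
   complex(kind=kind(1.d0)), allocatable :: sendbuf(:), recvbuf(:)

   call mpi_init( ierr )
   call mpi_comm_size( MPI_COMM_WORLD, nproc, ierr )
   call mpi_comm_rank( MPI_COMM_WORLD, me, ierr )
   allocate( sendcount(nproc), recvcount(nproc), sdispls(nproc), rdispls(nproc) )
   ! uneven slabs: every task sends (ip+1) elements to task ip,
   ! so task "me" receives (me+1) elements from everybody
   do ip = 1, nproc
      sendcount(ip) = ip
      recvcount(ip) = me + 1
   end do
   sdispls(1) = 0
   rdispls(1) = 0
   do ip = 2, nproc
      sdispls(ip) = sdispls(ip-1) + sendcount(ip-1)
      rdispls(ip) = rdispls(ip-1) + recvcount(ip-1)
   end do
   allocate( sendbuf(sum(sendcount)), recvbuf(sum(recvcount)) )
   sendbuf = cmplx( me, 0, kind=kind(1.d0) )

   call mpi_alltoallv( sendbuf, sendcount, sdispls, MPI_DOUBLE_COMPLEX, &
                       recvbuf, recvcount, rdispls, MPI_DOUBLE_COMPLEX, &
                       MPI_COMM_WORLD, ierr )
   call mpi_finalize( ierr )
end program scatter_sketch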

 There is some hope that the Open-MPI people will pay attention to
these performance issues, in which case Open-MPI could become the MPI
of choice under typical circumstances.

 Finally, has there been any activity on fftw3? Since the parallel
transforms from fftw2 are apparently not used, fftw3 should be just as
usable ...
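
 If only serial transforms are needed, the port would mostly amount to
switching to fftw3's plan/execute style. A minimal serial sketch using
the fftw3 legacy Fortran interface (n and the buffers are made up; note
that fftw3 plans need a 64-bit handle):

program fftw3_sketch
   implicit none
   include 'fftw3.f'            ! defines FFTW_FORWARD, FFTW_ESTIMATE, ...
   integer, parameter :: n = 16
   complex(kind=kind(1.d0)) :: in(n), out(n)
   integer(kind=8) :: plan      ! plan handle must be integer*8

   in = (1.d0, 0.d0)
   call dfftw_plan_dft_1d( plan, n, in, out, FFTW_FORWARD, FFTW_ESTIMATE )
   call dfftw_execute( plan )   ! out now holds the transform of in
   call dfftw_destroy_plan( plan )
end program fftw3_sketch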

 Kostya