[Pw_forum] LAM 7.1.1 and PW2.1.2

Axel Kohlmeyer axel.kohlmeyer at theochem.ruhr-uni-bochum.de
Fri Apr 8 08:43:18 CEST 2005


On Thu, 7 Apr 2005, Matteo Cococcioni wrote:
 
MC> Dear all,

dear matteo and everybody else on the list.
please let me comment a little on that issue.
i had experienced a very similar problem with 
some specific version of MPI quite some time ago
and did investigate a little further.

the problem seems to be caused by a bad interaction 
of a compiler problem with some MPI implementation 
specific issues and the way the espresso package 
interfaces to MPI.

as you might know, the MPI API defines some constants
(i.e. PARAMETERs in fortran lingo) in the header file 
'mpif.h' for signaling the type of data to be transferred
and where and how it should be sent or received. usually
this file is included explicitly in each file where the
constants are used. espresso, however, takes a more elegant
approach: it puts this include into a fortran 90 module and
uses that module instead, which allows offloading 
platform and preprocessor related stuff into the include 
file and gives the rest of the code a consistent interface.
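
to make the pattern concrete, here is a minimal sketch of such a
wrapper module and of a routine that uses it. the names 'mpi_wrapper'
and 'sum_over_procs' are made up for illustration and are not taken
from the espresso sources; only the MPI_* names come from mpif.h.
the sketch assumes that your mpif.h can be included in free-form source.

```fortran
! sketch only: 'mpi_wrapper' is a hypothetical module name.
module mpi_wrapper
  implicit none
  ! all MPI constants (MPI_COMM_WORLD, MPI_SUM, ...) enter the module here
  include 'mpif.h'
end module mpi_wrapper

! hypothetical helper: global sum of a double precision array
subroutine sum_over_procs(n, buf_in, buf_out)
  use mpi_wrapper          ! replaces a local "include 'mpif.h'"
  implicit none
  integer, intent(in)           :: n
  double precision, intent(in)  :: buf_in(n)
  double precision, intent(out) :: buf_out(n)
  integer :: ierr

  ! the three MPI_* constants below are seen only through the module
  call MPI_ALLREDUCE(buf_in, buf_out, n, MPI_DOUBLE_PRECISION, &
                     MPI_SUM, MPI_COMM_WORLD, ierr)
end subroutine sum_over_procs
```

if the compiler mishandles the constants coming in through the module,
a call like the one above receives wrong handles even though the source
looks perfectly fine.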

unfortunately, not all compilers pick that up correctly. sometimes
the parameters are defined, but not properly initialized. 
in my case the value was 0, which would not be a problem for
LAM, since it defines MPI_COMM_WORLD as 0, but for MPICH
based MPI implementations, this is not the case.
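
one way to check whether a given compiler/MPI combination is affected
is to print a few constants once through a module and once through a
direct include and compare the two lines. this is only a sketch:
'mpi_wrapper' and the two subroutine names are hypothetical, and it
assumes that mpif.h compiles as free-form source.

```fortran
! diagnostic sketch; all names except the MPI_* constants are hypothetical.
module mpi_wrapper
  implicit none
  include 'mpif.h'
end module mpi_wrapper

! constants as seen through the module
subroutine show_via_module()
  use mpi_wrapper
  implicit none
  print *, 'via module : ', MPI_COMM_WORLD, MPI_SUM, MPI_DOUBLE_PRECISION
end subroutine show_via_module

! constants as seen through a direct include
subroutine show_via_include()
  implicit none
  include 'mpif.h'
  print *, 'via include: ', MPI_COMM_WORLD, MPI_SUM, MPI_DOUBLE_PRECISION
end subroutine show_via_include

program check_mpi_constants
  implicit none
  ! no MPI calls are made, so MPI_INIT is not needed here
  call show_via_module()
  call show_via_include()
end program check_mpi_constants
```

the actual numbers depend on the MPI implementation. what matters is
that both lines agree; if the 'via module' line shows different values
(zeros, for example), the compiler is not picking up the parameters
correctly.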

ok, now what to do about this. from my point of view there are
two 'lines of attack':

1.) if you are a user, you have to double and triple check 
that you are including the correct mpif.h file under all circumstances
(ideally, make sure that there is only one) and that you don't 
accidentally mix different implementations. if you have trouble with
one implementation, you may want to try a different one; sometimes
even changing the version seems to help.

2.) as far as the espresso code is concerned, the obvious workaround
would be to have a module where all the parameters are explicitly
set, as is already done with the results of MPI_COMM_SPLIT, so 
you avoid using the implicit include altogether. but of course
this is a matter of opinion, as it is clearly a bug in the compiler.
nevertheless, i still think it would be a good idea, since it would 
encapsulate platform- or library-specific stuff even more and since 
i personally prefer a defensive style of programming as long as the
readability of the resulting source code is preserved. 
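
the workaround from point 2 could look roughly like the sketch below.
this is just one possible reading of the idea, not code from espresso:
the module 'mp_handles', its variables and the routine 'mp_handles_init'
are hypothetical names, and -1 is an arbitrary placeholder for "not yet
set". the values are copied at run time in a routine that includes
mpif.h directly, so nothing depends on parameters inherited through
a module.

```fortran
! sketch only; all names except the MPI_* identifiers are hypothetical.
module mp_handles
  implicit none
  ! ordinary variables, not PARAMETERs; set once by mp_handles_init
  integer :: my_comm_world = -1
  integer :: my_op_sum     = -1
  integer :: my_type_dp    = -1
end module mp_handles

subroutine mp_handles_init()
  use mp_handles
  implicit none
  include 'mpif.h'         ! direct include, only in this one routine
  my_comm_world = MPI_COMM_WORLD
  my_op_sum     = MPI_SUM
  my_type_dp    = MPI_DOUBLE_PRECISION
end subroutine mp_handles_init

program demo_handles
  use mp_handles
  implicit none
  double precision :: x_in(1), x_out(1)
  integer :: ierr

  call MPI_INIT(ierr)
  call mp_handles_init()

  ! every process contributes 1.0, so the result is the process count
  x_in(1) = 1.0d0
  call MPI_ALLREDUCE(x_in, x_out, 1, my_type_dp, my_op_sum, &
                     my_comm_world, ierr)
  print *, 'sum over all processes: ', x_out(1)

  call MPI_FINALIZE(ierr)
end program demo_handles
```

the rest of the code would then use only the variables from the module,
in the same way it already uses the communicators returned by
MPI_COMM_SPLIT.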

i hope i have not bored you too much with this quite 
technical and implementation specific stuff. i thought
some of you might be interested to know about it, though.

with best regards,
	axel.

MC> 
MC> I have to apologize again. In the message below I said that LAM 7.2.1 has been
MC> released but it's not true. I got confused.
MC> 
MC> So Lam 7.0.6 with ifort 8.1 work well with PW 2.1.2. I'm not sure whether the
MC> latest 8.1 compiler can work with lam 7.1.1; I haven't tried.
MC> 
MC> That's all I know. Sorry again for the confusion.
MC> 
MC> Matteo
MC> 
MC> 
MC> Quoting Matteo Cococcioni <matteoc at MIT.EDU>:
MC> 
MC> > 
MC> > Dear all,
MC> > 
MC> > I apologize for the message in italian. I was just telling Marco that we have
MC> > experienced some problems with LAM 7.1.1 and PW 2.1.2 but, as far as I
MC> > remember, the problem also appeared with older version of the codes (so that we
MC> > decided to use lam 7.0.6).
MC> > Now lam 7.2.1 is out and also 8.1 intel compilers but I haven't tried yet to use
MC> > the combination of them. Maybe they work.
MC> > 
MC> > sorry again,
MC> > 
MC> > Matteo 
MC> > 
MC> > 
MC> > Quoting Marco Fornari <fornari at phy.cmich.edu>:
MC> > 
MC> > > Dear all,
MC> > > 
MC> > > I'm having this problem running PW 2.1.2 on a Pentium cluster
MC> > > with LAM 7.1.1.1 and IFORT 8.0:
MC> > > 
MC> > >     G cutoff =  189.7462  (   2733 G-vectors)     FFT grid: ( 20, 20, 20)
MC> > > 
MC> > >     nbndx  =     4  nbnd   =     4  natomwfc =     8  npwx   =     181
MC> > >     nelec  =    8.00 nkb   =     8  ngl    =      65
MC> > > MPI_Allreduce: invalid operation: Invalid argument (rank 0, MPI_COMM_WORLD)
MC> > > Rank (0, MPI_COMM_WORLD): Call stack within LAM:
MC> > > Rank (0, MPI_COMM_WORLD):  - MPI_Allreduce()
MC> > > Rank (0, MPI_COMM_WORLD):  - main()
MC> > > -----------------------------------------------------------------------------
MC> > > 
MC> > > 
MC> > > One of the processes started by mpirun has exited with a nonzero exit
MC> > > code.  This typically indicates that the process finished in error.
MC> > > If your process did not finish in error, be sure to include a "return
MC> > > 0" or "exit(0)" in your C code before exiting the application.
MC> > > 
MC> > > PID 10516 failed on node n0 (10.0.0.1) with exit status 22.
MC> > > -----------------------------------------------------------------------------
MC> > > 
MC> > > 
MC> > > 
MC> > > PW 2.1.1 works well on the same cluster. Did you encountered this
MC> > > issue already?
MC> > > 
MC> > > Thanks,
MC> > > Marco
MC> > > 
MC> > > _______________________________________________
MC> > > Pw_forum mailing list
MC> > > Pw_forum at pwscf.org
MC> > > http://www.democritos.it/mailman/listinfo/pw_forum
MC> > > 

-- 

=======================================================================
Dr. Axel Kohlmeyer   e-mail: axel.kohlmeyer at theochem.ruhr-uni-bochum.de
Lehrstuhl fuer Theoretische Chemie          Phone: ++49 (0)234/32-26673
Ruhr-Universitaet Bochum - NC 03/53         Fax:   ++49 (0)234/32-14045
D-44780 Bochum  http://www.theochem.ruhr-uni-bochum.de/~axel.kohlmeyer/
=======================================================================
If you make something idiot-proof, the universe creates a better idiot.



