[Pw_forum] error on XD1 platform

Sergey Lisenkov proffess at yandex.ru
Thu Mar 15 20:13:30 CET 2007

 Dear All,

  I have a problem running PW-3.2 on a Cray XD1 platform. Sometimes it works, but very often it crashes, regardless of which version of the PGI compiler is used (6.1.1, 6.1.4 or 7.0-2). The code was linked with the ACML library and the FFTW from the QE distribution.

 The problem always happens after several ionic steps or during SCF cycles. For example, I see the following in the output file:

Writing output data file XXX.save
Process 0 lost connection: exiting
mpiexec: Error: read_rai_startup_ports: Failed to read barrier entry token from rank 1 process on c645n2.

Process 38 lost connection: exiting
 ask 128 got 56  at line 863 in file /var/tmp/mpich-1.2.6/mpid/rai/raifma.cPProcess 16 lost connection: exiting

I tried several PW versions, including the CVS one, but the same problem happens. The Cray people don't think it is an MPICH problem.

Has anybody seen this before? Googling didn't help much.


