[Pw_forum] about one "fatal error"

L.F.Huang lfhuang at theory.issp.ac.cn
Fri Aug 29 08:13:40 CEST 2008

Dear Prof. Giannozzi,
  Thank you very much for your kind attention. I am afraid that I only
have access to one machine, an SGI Origin 3900. However, as I mentioned
before, the problem seems to be solved when I switch to 13 nodes, a number
that evenly divides the 52 reduced k points. My calculation is now running
well on 13 nodes. This suggests that the error belongs to the machine, or
to "communication", as Derek mentioned before.

Best Wishes!

Yours Sincerely!
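
P.S. The remark above about 13 nodes dividing the 52 reduced k points can
be checked with a few lines of Python. This is only an arithmetic sketch:
the function name even_splits and the upper limit of 16 nodes are
assumptions made for illustration, and it says nothing about how ph.x
actually distributes k points.

```python
def even_splits(n_kpoints, max_nodes):
    """Return the node counts up to max_nodes that divide n_kpoints evenly."""
    return [n for n in range(1, max_nodes + 1) if n_kpoints % n == 0]

# 52 reduced k points, considering up to 16 nodes (hypothetical limit)
candidates = even_splits(52, 16)
print(candidates)

# With 13 nodes, each node would hold the same number of k points
print(52 // 13)
```

With 13 nodes, each node gets the same share of the 52 k points, which is
the configuration that ran without the error.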

> On Aug 26, 2008, at 4:03 , L.F.Huang wrote:
>>   I am calculating a 4*4*1 graphene supercell with one vacancy impurity.
>> It is a magnetic system with a magnetization of 0.8 Bohr magneton. There
>> are 31 atoms and 93 representations, with one mode each. At first, ph.x
>> runs well; however, when the 23rd representation is being computed,
>> something strange happens:
>> **********************************************************************
>> FATAL ERROR on MPI node 3 (ganode054): GM send to MPI node 10
>> (??? [00:60:dd:49:08:2a]) failed: status 18 (target node was unreachable)
>> check the target host, mapping or cables
>> Small/Ctrl message completion error!
>> forrtl: error (76): IOT trap signal
>> [...]
>> Does anyone know the reason for this error?
> Of course, nobody knows, but somebody might argue that it looks like a
> problem in node 10 rather than in q-e. Is there any evidence that the
> problem is really on the q-e side (i.e. that the problem, or some problem,
> is reproducible on a different machine)?
> Paolo
> ---
> Paolo Giannozzi, Dept of Physics, University of Udine
> via delle Scienze 208, 33100 Udine, Italy
> Phone +39-0432-558216, fax +39-0432-558222

