<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML><HEAD>
<META http-equiv=Content-Type content=text/html;charset=gb2312>
<META content="MSHTML 6.00.6000.16735" name=GENERATOR></HEAD>
<BODY id=MailContainerBody
style="PADDING-RIGHT: 10px; PADDING-LEFT: 10px; PADDING-TOP: 15px"
bgColor=#ffffff leftMargin=0 topMargin=0 CanvasTabStop="true"
name="Compose message area">
<DIV><FONT face=Arial size=2>Dear all,</FONT></DIV>
<DIV><FONT face=Arial size=2></FONT> </DIV>
<DIV><FONT face=Arial size=2>I keep running into the following error
message:</FONT></DIV>
<DIV><FONT face=Arial size=2></FONT> </DIV>
<DIV><FONT face=Arial
size=2>[node1][0,1,12][btl_tcp_frag.c:202:mca_btl_tcp_frag_recv]
mca_btl_tcp_frag_recv: readv failed with errno=104<BR>forrtl: error (78):
process killed
(SIGTERM)<BR>Image
PC
Routine
Line
Source
<BR>libblas.so.3 55A8D51C
Unknown
Unknown
Unknown<BR>pw.x
081CBD7B
Unknown
Unknown
Unknown<BR>pw.x
0823A95E
Unknown
Unknown
Unknown<BR>pw.x
08239C4A
Unknown
Unknown
Unknown<BR>pw.x
081DEDC9
Unknown
Unknown
Unknown<BR>pw.x
081D4E9C
Unknown
Unknown
Unknown<BR>Unknown
FFFFD060
Unknown
Unknown Unknown<BR> <BR>Stack trace terminated abnormally.<BR>forrtl:
error (78): process killed
(SIGTERM)<BR>Image
PC
Routine
Line
Source
<BR>.
FFFFE410
Unknown
Unknown Unknown<BR>mca_oob_tcp.so 55F911B4
Unknown
Unknown
Unknown<BR>Unknown
00000001
Unknown
Unknown Unknown<BR>forrtl: error (78): process killed
(SIGTERM)<BR>Image
PC
Routine
Line
Source
<BR>pw.x
0813EE72
Unknown
Unknown
Unknown<BR>pw.x
0813E577
Unknown
Unknown Unknown<BR> <BR>Stack trace terminated abnormally.<BR>forrtl:
error (78): process killed
(SIGTERM)<BR>Image
PC
Routine
Line
Source
<BR>.
FFFFE410
Unknown
Unknown Unknown<BR>forrtl: error (78): process killed
(SIGTERM)<BR>Image
PC
Routine
Line
Source
<BR>.
FFFFE40E
Unknown
Unknown Unknown<BR>forrtl: error (78): process killed
(SIGTERM)<BR>Image
PC
Routine
Line
Source
<BR>.
FFFFE410
Unknown
Unknown Unknown<BR>forrtl: error (78): process killed
(SIGTERM)<BR>Image
PC
Routine
Line
Source
<BR>.
FFFFE40E
Unknown
Unknown Unknown<BR>forrtl: error (78): process killed
(SIGTERM)<BR>Image
PC
Routine
Line
Source
<BR>libblas.so.3 55A8C50F
Unknown
Unknown
Unknown<BR>pw.x
081CBD7B
Unknown
Unknown
Unknown<BR>pw.x
0823A95E
Unknown
Unknown
Unknown<BR>pw.x
08239C4A
Unknown
Unknown
Unknown<BR>pw.x
081DEDC9
Unknown
Unknown
Unknown<BR>pw.x
081D4E9C
Unknown
Unknown
Unknown<BR>Unknown
FFFFD060
Unknown
Unknown Unknown<BR> <BR>Stack trace terminated abnormally.<BR>forrtl:
error (78): process killed
(SIGTERM)<BR>Image
PC
Routine
Line
Source
<BR>.
FFFFE410
Unknown
Unknown Unknown<BR>forrtl: error (78): process killed
(SIGTERM)<BR>Image
PC
Routine
Line
Source
<BR>.
FFFFE40E
Unknown
Unknown Unknown<BR>forrtl: error (78): process killed
(SIGTERM)<BR>Image
PC
Routine
Line
Source
<BR>.
FFFFE410
Unknown
Unknown Unknown<BR>forrtl: error (78): process killed
(SIGTERM)<BR>Image
PC
Routine
Line
Source
<BR>libblas.so.3 55A8C50F
Unknown
Unknown
Unknown<BR>pw.x
081CBD7B
Unknown
Unknown
Unknown<BR>pw.x
0823A95E
Unknown
Unknown
Unknown<BR>pw.x
08239C4A
Unknown
Unknown
Unknown<BR>pw.x
081DEDC9
Unknown
Unknown
Unknown<BR>pw.x
081D4E9C
Unknown
Unknown
Unknown<BR>Unknown
FFFFD060
Unknown
Unknown Unknown<BR> <BR>Stack trace terminated abnormally.<BR>forrtl:
error (78): process killed
(SIGTERM)<BR>Image
PC
Routine
Line
Source
<BR>.
FFFFE410
Unknown
Unknown Unknown<BR>forrtl: error (78): process killed
(SIGTERM)<BR>Image
PC
Routine
Line
Source
<BR>.
FFFFE410
Unknown
Unknown
Unknown<BR>Unknown
00000003
Unknown
Unknown Unknown<BR>forrtl: error (78): process killed
(SIGTERM)<BR>Image
PC
Routine
Line
Source
<BR>.
FFFFE410
Unknown
Unknown Unknown<BR>forrtl: error (78): process killed
(SIGTERM)<BR>Image
PC
Routine
Line
Source
<BR>libblas.so.3 55A8C50B
Unknown
Unknown
Unknown<BR>pw.x
081CBD7B
Unknown
Unknown
Unknown<BR>pw.x
0823A95E
Unknown
Unknown
Unknown<BR>pw.x
08239C4A
Unknown
Unknown
Unknown<BR>pw.x
081DEDC9
Unknown
Unknown
Unknown<BR>pw.x
081D4E9C
Unknown
Unknown
Unknown<BR>Unknown
FFFFCDC0
Unknown
Unknown Unknown<BR> <BR>Stack trace terminated abnormally.<BR>forrtl:
error (78): process killed
(SIGTERM)<BR>Image
PC
Routine
Line
Source
<BR>.
FFFFE410
Unknown
Unknown
Unknown<BR>[node1][0,1,12][btl_tcp_frag.c:202:mca_btl_tcp_frag_recv]
mca_btl_tcp_frag_recv: readv failed with errno=104<BR>forrtl: error (78):
process killed
(SIGTERM)<BR>Image
PC
Routine
Line
Source
<BR>.
FFFFE410
Unknown
Unknown Unknown<BR>forrtl: error (78): process killed
(SIGTERM)<BR>Image
PC
Routine
Line
Source
<BR>.
FFFFE410
Unknown
Unknown Unknown<BR>forrtl: error (78): process killed
(SIGTERM)<BR>Image
PC
Routine
Line
Source
<BR>libblas.so.3 55A8BF47
Unknown
Unknown
Unknown<BR>pw.x
080EA567
Unknown
Unknown Unknown<BR> <BR>Stack trace terminated abnormally.<BR>forrtl:
error (78): process killed
(SIGTERM)<BR>Image
PC
Routine
Line
Source
<BR>libblas.so.3 55A8BF3B
Unknown
Unknown
Unknown<BR>pw.x
081E3C7B
Unknown
Unknown Unknown<BR> <BR>Stack trace terminated abnormally.<BR>forrtl:
error (78): process killed
(SIGTERM)<BR>Image
PC
Routine
Line
Source
<BR>.
FFFFE410
Unknown
Unknown
Unknown<BR>[node8][0,1,23][btl_tcp_frag.c:202:mca_btl_tcp_frag_recv]
mca_btl_tcp_frag_recv: readv failed with errno=104<BR>forrtl: error (78):
process killed
(SIGTERM)<BR>Image
PC
Routine
Line
Source
<BR>.
FFFFE410
Unknown
Unknown Unknown<BR>mpirun noticed that job rank 14 with PID 3519 on node
node3 exited on signal 11 (Segmentation fault). <BR></FONT></DIV>
<DIV><FONT face=Arial size=2></FONT> </DIV>
<DIV><FONT face=Arial size=2>Using openmpi on my system, I could relax 72 atoms
successfully. But when I tried to relax 84 atoms, the error message stopped my
calculation. I then tried mpich2 on the same system. With mpich2 I could relax
up to 120 atoms, but the error message appeared again when I tried to relax
132 atoms. I have been stuck on this troublesome problem for quite a long time.
Could someone give me some suggestions on how to deal with it?</FONT></DIV>
<DIV><FONT face=Arial size=2>To make my question easier to understand, here are
the details of my system:</FONT></DIV>
<DIV><FONT face=Arial size=2></FONT> </DIV>
<DIV><FONT face=Arial size=2>My cluster has 8 nodes connected by
Ethernet.</FONT></DIV>
<DIV><FONT face=Arial size=2></FONT> </DIV>
<DIV><FONT face=Arial size=2>CPU: Intel Q6600</FONT></DIV>
<DIV><FONT face=Arial size=2>Memory: 8 GB per node</FONT></DIV>
<DIV><FONT face=Arial size=2>Main board: Intel S3000AH</FONT></DIV>
<DIV><FONT face=Arial size=2>Hard disk: Seagate 750 GB (7200 rpm)</FONT></DIV>
<DIV><FONT face=Arial size=2>OS: Red Hat Enterprise Linux AS 4 Update 4</FONT></DIV>
<DIV><FONT face=Arial size=2>Fortran: Intel ifort 10.1.015</FONT></DIV>
<DIV><FONT face=Arial size=2>C: Intel icc 10.1.015</FONT></DIV>
<DIV><FONT face=Arial size=2>MPI: mpich2/openmpi</FONT></DIV>
<DIV><FONT face=Arial size=2>FFTW: fftw 2.1.5</FONT></DIV>
<DIV><FONT face=Arial size=2>MKL: 10.0.1.014</FONT></DIV>
<DIV><FONT face=Arial size=2></FONT> </DIV>
<DIV>
<DIV><FONT face=Arial size=2>Thank you for reading. </FONT><FONT face=Arial
size=2>Any hints will be deeply appreciated.</FONT></DIV>
<DIV><FONT face=Arial size=2></FONT> </DIV>
<DIV><FONT face=Arial size=2>vega</FONT></DIV>
<DIV> </DIV></DIV>
<DIV><FONT face=Arial
size=2>=================================================================================<BR>Vega
Lew (weijia liu)<BR>Ph.D. Candidate in Chemical Engineering<BR>State Key
Laboratory of Materials-oriented Chemical Engineering<BR>College of Chemistry
and Chemical Engineering<BR>Nanjing University of Technology, 210009, Nanjing,
Jiangsu, China</FONT></DIV></BODY></HTML>