[Pw_forum] MPI & disk unavailability

Konstantin Kudin konstantin_kudin at yahoo.com
Mon Nov 28 19:38:12 CET 2005


 Axel and Gerardo,

 Thanks for the replies! 

> I suspect this is an operating system level thing. When the operating
> system "talks" to the disk and doesn't get a reply, it can either
> wait, or give up and report failure. Probably there is a configurable
> timeout. If it's a networked filesystem, most likely the filesystem
> daemons are responsible for that.
>
> Actually, if this hasn't changed recently, Espresso doesn't use the
> MPI I/O functions: all I/O is handled by cpu 0 that reads and writes
> locally. The "local" disk may be (and most often is) a networked
> filesystem; but again, this is handled by the operating system, and
> completely transparent to Espresso.

 OK, but here is my interpretation. Suppose cpu0 wants to write
everything to the NFS disk, which is not there. Shouldn't this write
then hang and wait until the disk becomes available? This is not really
far-fetched; such behaviour easily happens with serial executables.

 Now, if cpu0 indeed hangs and the other MPI cpus do not get any answer
from it within their timeout, they start to quit because cpu0 is not
responding.
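
 To make the scenario above concrete, here is a toy simulation using
plain Python threads. It is not real MPI and not Espresso code: the
function name simulate and the parameters disk_stall and peer_timeout
are made up for illustration, and a sleep stands in for a write that
blocks on an unreachable NFS mount. Whether real MPI processes give up
like this depends on the MPI implementation and its transport, so treat
this only as a sketch of the hypothesis.

```python
import queue
import threading
import time


def simulate(disk_stall, peer_timeout, n_peers=3):
    """Toy model: rank 0 blocks in a 'write', then answers its peers.

    disk_stall   -- seconds rank 0 is stuck in the write (hypothetical)
    peer_timeout -- seconds each peer waits for rank 0 (hypothetical)
    """
    replies = [queue.Queue() for _ in range(n_peers)]
    outcome = [None] * n_peers

    def rank0():
        # Stand-in for a blocking write to a disk that is not there.
        time.sleep(disk_stall)
        for q in replies:
            q.put("done")

    def peer(i):
        try:
            # Each peer waits for rank 0, but only for a limited time.
            replies[i].get(timeout=peer_timeout)
            outcome[i] = "ok"
        except queue.Empty:
            outcome[i] = "gave up"

    threads = [threading.Thread(target=rank0)]
    threads += [threading.Thread(target=peer, args=(i,)) for i in range(n_peers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return outcome


if __name__ == "__main__":
    print(simulate(disk_stall=0.5, peer_timeout=0.05))  # stall longer than timeout
    print(simulate(disk_stall=0.05, peer_timeout=2.0))  # stall shorter than timeout
```

 In this model the peers quit whenever the stall outlasts their timeout,
even though rank 0 itself never reports an error; if the disk comes back
in time, everybody carries on.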

 Any ideas?

 Kostya

  



		


