[Pw_forum] Problem with parallel execution
Elie M
elie.moujaes at hotmail.co.uk
Sun Feb 5 02:37:56 CET 2012
Dear all,
I am running pw.x in parallel on a Linux cluster managed by SLURM. After I submit the job with the command sbatch filename.srm, the calculation starts and then stops with the following error:
"mpiexec_veredas5: cannot connect to local mpd (/tmp/mpd2.console_sushil); possible causes:
  1. no mpd is running on this host
  2. an mpd is running but was started without a "console" (-n option)
In case 1, you can start an mpd on this host with:
    mpd &
and you will be able to run jobs just on this host.
For more details on starting mpds on a set of hosts, see the MPICH2 Installation Guide."
I saw a previous message about this error, posted in 2009, and followed what Prof. Andrea did: I created a file elie.mpd.hosts containing a single line (localhosts), then ran the command mpdboot -f ~/elie.mpd.hosts and ran the sbatch command again, but within 3 seconds the job stops with the same error. Can anyone help?
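For reference, the steps described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the host entry is written as `localhost` (assuming that is the name meant above), the ring has a single node, and the MPICH2 MPD tools (`mpdboot`, `mpdtrace`) are on the PATH. The `HOSTS_FILE` variable is a hypothetical convenience, not something from the original setup.

```shell
#!/bin/sh
# Sketch only: write an MPD hosts file and try to start a one-node ring.
# HOSTS_FILE is a hypothetical variable; the default path is the file name from the post.
HOSTS_FILE="${HOSTS_FILE:-$HOME/elie.mpd.hosts}"

# One host name per line; "localhost" is an assumption about the intended entry.
echo "localhost" > "$HOSTS_FILE"

if command -v mpdboot >/dev/null 2>&1; then
    # MPD also expects a ~/.mpd.conf file readable only by its owner.
    mpdboot -n 1 -f "$HOSTS_FILE"   # start an mpd ring with one node
    mpdtrace                        # list the hosts currently in the ring
else
    echo "mpdboot not found: the MPICH2 MPD tools are not on the PATH"
fi
```

The error message refers to a console socket under /tmp on the local host, so a check with `mpdtrace` is only meaningful when it is run on the same node where mpiexec is launched.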
N.B.: veredas5 is the node on which I am executing the command, but whichever node I try, I get the same error.
Elie Moujaes
University of Nottingham
NG7 2RD
UK