[Pw_forum] HELP: How to submit job on multi-node cluster?
rollyng at gmail.com
Wed May 25 17:53:03 CEST 2016
May I ask what kind of connection you have between the two nodes when
you run in parallel? Is it Gigabit Ethernet or something else?
I would like to do the same thing, but your experience gives me pause.
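If you are unsure what the link is, commands along these lines usually
reveal it on a Linux node (a rough sketch; the interface name eth0 is an
assumption, adjust it to your machine):

  ip link show                # list the network interfaces
  ethtool eth0 | grep Speed   # link speed of an Ethernet interface
  ibstat                      # shows InfiniBand ports, if any are present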
On 05/25/2016 06:50 PM, PRATIK DAS wrote:
> Dear QE users,
> I am not a new user of quantum espresso. Previously I worked on a
> single machine with 4 cores. I simply ran with MPI via 'mpirun -np 4',
> and the job was faster than a serial run.
> Now I have a 2-node cluster. Each node has 2 processors with 12
> cores each. Hence I made a hostfile (for a single node) specifying
> "node1 slots=24".
> Now my command is
> "mpirun --hostfile my_hostfile -np 24 ph.x < input.in > output.out".
> This run takes 9 hours and 39 minutes.
> But when I want to use both nodes via
> "node1 slots=24
> node2 slots=24"
> and the command
> "mpirun --hostfile my_hostfile -np 48 ph.x < input.in > output.out -ni 6 -nk 2",
> it takes far too long to complete. Almost 26 hours have passed and
> the 2nd q-point calculation is still running.
> NB: The job has only 4 q points.
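> (A hedged aside, assuming Open MPI from the --hostfile syntax: a quick
> way to confirm that all 48 ranks really land on both nodes is
>
>   mpirun --hostfile my_hostfile -np 48 hostname | sort | uniq -c
>
> which should report 24 processes on node1 and 24 on node2.)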
> I have also seen this in pw.x runs. On a single node it is much
> faster.
> Command: mpirun --hostfile my_hostfile -np 24 pw.x < input.in > output.out
> But with both nodes, it takes almost half an hour.
> Command: mpirun --hostfile my_hostfile -np 48 pw.x < input.in > output.out
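> (Another hedged diagnostic, again assuming Open MPI: asking the
> launcher to report its bindings shows whether ranks are placed and
> pinned as expected, which can explain cross-node slowdowns:
>
>   mpirun --hostfile my_hostfile -np 48 --report-bindings pw.x < input.in > output.out
>
> The binding report is printed to standard error, one line per rank.)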
> Can anyone suggest why the jobs running on 2 nodes are slower
> than the jobs on a single node?
> Thanks in advance.
> Pratik Kumar Das
> Research Fellow
> High Pressure and Temperature Lab
> Faculty of Science
> Jadavpur University
> Kolkata 700032
PhD Research Fellow,
Dept. of Physics & Materials Science,
City University of Hong Kong
Tel: +852 3442 4000
Fax: +852 3442 0538