[Pw_forum] HELP: How to submit job on multi-node cluster?

PRATIK DAS pratikdas63 at gmail.com
Wed May 25 12:50:57 CEST 2016


Dear QE users,
I am not a new user of Quantum ESPRESSO. Previously I worked on a single
4-core machine, where I simply ran under MPI with 'mpirun -np 4', and the
job was faster than a serial run.
Now I have a 2-node cluster. Each node has 2 processors with 12 cores
each. For a single-node run I therefore made a hostfile containing
"node1 slots=24".
My command is
"mpirun --hostfile my_hostfile -np 24 ph.x < input.in > output.out". This
run takes 9 hours and 39 minutes.
But when I try to use both nodes, with the hostfile
"node1 slots=24
node2 slots=24"
and the command
"mpirun --hostfile my_hostfile -np 48 ph.x < input.in > output.out -ni 6 -nk 2",
the job takes far too long. Almost 26 hours have passed and it is still
working on the 2nd q-point.

NB: The job has only 4 q points.
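
To make the two-node ph.x setup above explicit, here is a minimal shell
sketch. It recreates the hostfile quoted above (note that it overwrites
my_hostfile in the current directory), sums the slots, and works out how the
48 processes are divided by "-ni 6 -nk 2". It assumes the usual Quantum
ESPRESSO hierarchy in which processes are first split into images and each
image is then split into k-point pools; the variable names are only for
illustration.

```shell
#!/bin/sh
# Sketch only: recreates the two-node hostfile quoted above and works out
# how 48 MPI processes are split by "-ni 6 -nk 2".

# Hostfile in "slots=" syntax, exactly as given in the post.
cat > my_hostfile <<'EOF'
node1 slots=24
node2 slots=24
EOF

# Total number of slots across both nodes.
np=$(awk -F'slots=' '{ n += $2 } END { print n }' my_hostfile)

ni=6   # number of images (-ni)
nk=2   # number of k-point pools per image (-nk)

# Assumption: processes are divided into images first, then each image
# is divided into pools.
per_image=$((np / ni))
per_pool=$((per_image / nk))

echo "total processes:     $np"
echo "processes per image: $per_image"
echo "processes per pool:  $per_pool"
```

Under that assumption each image gets 8 processes and each pool gets 4.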

I have also seen this with pw.x. On a single node the run takes only
2 minutes.
Command: mpirun --hostfile my_hostfile -np 24 pw.x < input.in > output.out

But on both nodes it takes almost half an hour.
Command: mpirun --hostfile my_hostfile -np 48 pw.x < input.in > output.out
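
To put numbers on the slowdown, the timings quoted above can be compared
with a few lines of shell arithmetic. This is only a rough sketch: "almost
half an hour" is taken as 30 minutes and "almost 26 hours" as 26 hours, and
the ph.x run on two nodes had not finished at that point.

```shell
#!/bin/sh
# Run times quoted in the post, converted to minutes.
ph_one_node=$((9 * 60 + 39))   # ph.x, 24 processes: 9 h 39 min
ph_two_nodes=$((26 * 60))      # ph.x, 48 processes: ~26 h so far, unfinished
pw_one_node=2                  # pw.x, 24 processes: 2 min
pw_two_nodes=30                # pw.x, 48 processes: "almost half an hour"

# Integer ratio of the two pw.x run times.
pw_slowdown=$((pw_two_nodes / pw_one_node))

echo "ph.x: $ph_one_node min on one node, more than $ph_two_nodes min on two"
echo "pw.x: roughly ${pw_slowdown}x slower on two nodes"
```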

Can anyone suggest why the jobs running on 2 nodes are slower than
the same jobs on a single node?

Thanks in advance.


Regards
Pratik Kumar Das
Research Fellow
High Pressure and Temperature Lab
Faculty of Science
Jadavpur University
Kolkata 700032