Dear Axel and Paolo,

My situation is that the nscf calculation cannot run at all with a very dense k-point set.

I used a small cluster that belongs to our group, and I am its administrator, so I am sure the job did not overload the computing resources.

I tested several situations, such as running the calculation with and without MPI. When the number of k-points is smaller, it runs in both cases.

When the number of k-points is very large, pw.x still runs without MPI. But when I use MPI, the job does not run at all, no matter how many processes I use, and it immediately fails with the errors quoted below.

I am using MPICH3. Maybe I should reconfigure it, or use Open MPI instead?
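
Looking at the error messages quoted below, the complaint seems to come from mpiexec forwarding stdin ("process reading stdin too slowly"). So I also wonder, just as a guess, whether passing the input file directly to pw.x with its -inp option, instead of redirecting it through the launcher's stdin, would avoid the problem. Something along these lines (the file names and the process count are only placeholders, not my actual setup):

    mpiexec -np 8 pw.x -inp nscf.in > nscf.out    # placeholder names and -np value

rather than "mpiexec -np 8 pw.x < nscf.in". Does that sound reasonable, or is the problem elsewhere?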

Thanks a lot.

Yours,
Dingfu

--
Dingfu Shao, Ph.D
Institute of Solid State Physics
Chinese Academy of Sciences
P. O. Box 1129
Hefei 230031
Anhui Province
P. R. China

Email: dingfu.shao@gmail.com

From: Axel Kohlmeyer <akohlmey@gmail.com>
Date: 2014-04-21 16:02
To: PWSCF Forum <pw_forum@pwscf.org>
Subject: Re: [Pw_forum] error in the NSCF calculation with a large number of kpoints.

On Mon, Apr 21, 2014 at 12:08 AM, Dingfu Shao <dingfu.shao@gmail.com> wrote:
> Dear QE users,
>
> I want to plot a Fermi surface with a very dense k-point set, so that I
> can do some calculations such as the Fermi surface nesting function. A
> smaller number of k-points, such as 1200, works fine. But when I run the
> nscf calculation with a large number of k-points, such as 2000, the
> following error occurs:
>
> [proxy:0:0@node6] HYD_pmcd_pmip_control_cmd_cb (./pm/pmiserv/pmip_cb.c:939): process reading stdin too slowly; can't keep up
> [proxy:0:0@node6] HYDT_dmxu_poll_wait_for_event (./tools/demux/demux_poll.c:77): callback returned error status
> [proxy:0:0@node6] main (./pm/pmiserv/pmip.c:206): demux engine error waiting for event
> [mpiexec@node0] control_cb (./pm/pmiserv/pmiserv_cb.c:202): assert (!closed) failed
> [mpiexec@node0] HYDT_dmxu_poll_wait_for_event (./tools/demux/demux_poll.c:77): callback returned error status
> [mpiexec@node0] HYD_pmci_wait_for_completion (./pm/pmiserv/pmiserv_pmci.c:197): error waiting for event
> [mpiexec@node0] main (./ui/mpich/mpiexec.c:331): process manager error waiting for completion
>
> Does anybody know what the problem is?

First of all, these are messages from your MPI library. My guess is that you are overloading your compute node with I/O, probably caused by excessive swapping. Do you use a sufficient number of nodes and k-point parallelization? You should most likely talk to your system administrators first, since overloading nodes can disrupt the overall service of the cluster, and then work with them to run your calculation more efficiently.

axel.
>
> Best regards,
>
> Yours Dingfu Shao
>
> --
> Dingfu Shao, Ph.D
> Institute of Solid State Physics
> Chinese Academy of Sciences
> P. O. Box 1129
> Hefei 230031
> Anhui Province
> P. R. China
>
> Email: dingfu.shao@gmail.com

--
Dr. Axel Kohlmeyer  akohlmey@gmail.com  http://goo.gl/1wk0
College of Science & Technology, Temple University, Philadelphia PA, USA
International Centre for Theoretical Physics, Trieste, Italy.
_______________________________________________
Pw_forum mailing list
Pw_forum@pwscf.org
http://pwscf.org/mailman/listinfo/pw_forum