<DIV>On 2009-09-23 18:46:48, pw_forum-request@pwscf.org wrote:<BR>>Send Pw_forum mailing list submissions to<BR>> pw_forum@pwscf.org<BR>><BR>>To subscribe or unsubscribe via the World Wide Web, visit<BR>> http://www.democritos.it/mailman/listinfo/pw_forum<BR>>or, via email, send a message with subject or body 'help' to<BR>> pw_forum-request@pwscf.org<BR>><BR>>You can reach the person managing the list at<BR>> pw_forum-owner@pwscf.org<BR>><BR>>When replying, please edit your Subject line so it is more specific<BR>>than "Re: Contents of Pw_forum digest..."<BR>><BR>><BR>>Today's Topics:<BR>><BR>> 1. how to improve the calculation speed ? (wangqj1)<BR>> 2. Re: how to improve the calculation speed ? (Giovanni Cantele)<BR>> 3. Re: how to improve the calculation speed ? (Lorenzo Paulatto)<BR>> 4. write occupancy (ali kazempour)<BR>> 5. Re: write occupancy (Prasenjit Ghosh)<BR>><BR>><BR>>----------------------------------------------------------------------<BR>><BR>>Message: 1<BR>>Date: Wed, 23 Sep 2009 16:05:46 +0800 (CST)<BR>>From: wangqj1 <wangqj1@126.com><BR>>Subject: [Pw_forum] how to improve the calculation speed ?<BR>>To: pw_forum <pw_forum@pwscf.org><BR>>Message-ID:<BR>> <21870763.369701253693146938.JavaMail.coremail@bj126app103.126.com><BR>>Content-Type: text/plain; charset="gbk"<BR>><BR>><BR>>Dear PWSCF users,<BR>> When I use R and G parallelization to run a job, it seems to wait for input indefinitely. Following people's advice, I switched to k-point parallelization, and it runs well.
But it runs too slowly. The information I can offer is as follows:<BR>>(1): CPU usage of one node:<BR>>Tasks: 143 total, 10 running, 133 sleeping, 0 stopped, 0 zombie<BR>>Cpu0 : 99.7%us, 0.3%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st<BR>>Cpu1 :100.0%us, 0.0%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st<BR>>Cpu2 :100.0%us, 0.0%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st<BR>>Cpu3 :100.0%us, 0.0%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st<BR>>Cpu4 :100.0%us, 0.0%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st<BR>>Cpu5 :100.0%us, 0.0%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st<BR>>Cpu6 :100.0%us, 0.0%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st<BR>>Cpu7 :100.0%us, 0.0%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st<BR>>Mem: 8044120k total, 6683720k used, 1360400k free, 1632k buffers<BR>>Swap: 4192956k total, 2096476k used, 2096480k free, 1253712k cached<BR>><BR>>(2) The PBS input file:<BR>>#!/bin/sh<BR>>#PBS -j oe<BR>>#PBS -N pw<BR>>#PBS -l nodes=1:ppn=8<BR>>#PBS -q small <BR>>cd $PBS_O_WORKDIR<BR>>hostname<BR>>/usr/local/bin/mpirun -np 8 -machinefile $PBS_NODEFILE /home/wang/bin/pw.x -npool 8 -in ZnO.pw.inp>ZnO.pw.out<BR>>(3)<BR>>wang@node22:~> netstat -s<BR>>Ip:<BR>> 1894215181 total packets received<BR>> 0 forwarded<BR>> 0 incoming packets discarded<BR>> 1894215181 incoming packets delivered<BR>> 979205769 requests sent out<BR>> 30 fragments received ok<BR>> 60 fragments created<BR>>Icmp:<BR>> 2 ICMP messages received<BR>> 1 input ICMP message failed.<BR>> ICMP input histogram:<BR>> destination unreachable: 2<BR>> 2 ICMP messages sent<BR>> 0 ICMP messages failed<BR>> ICMP output histogram:<BR>> destination unreachable: 2<BR>>IcmpMsg:<BR>> InType3: 2<BR>> OutType3: 2<BR>>Tcp:<BR>> 5662 active connections openings<BR>> 9037 passive connection openings<BR>> 68 failed connection attempts<BR>> 1 connection resets received<BR>> 18 connections established<BR>> 1894049565 segments received<BR>> 979043182 segments send 
out<BR>> 284 segments retransmited<BR>> 0 bad segments received.<BR>> 55 resets sent<BR>>Udp:<BR>> 165614 packets received<BR>> 0 packets to unknown port received.<BR>> 0 packet receive errors<BR>> 162301 packets sent<BR>> RcvbufErrors: 0<BR>> SndbufErrors: 0<BR>>UdpLite:<BR>> InDatagrams: 0<BR>> NoPorts: 0<BR>> InErrors: 0<BR>> OutDatagrams: 0<BR>> RcvbufErrors: 0<BR>> SndbufErrors: 0<BR>>TcpExt:<BR>> 10 resets received for embryonic SYN_RECV sockets<BR>> ArpFilter: 0<BR>> 5691 TCP sockets finished time wait in fast timer<BR>> 25 time wait sockets recycled by time stamp<BR>> 17369935 delayed acks sent<BR>> 1700 delayed acks further delayed because of locked socket<BR>> 18 packets directly queued to recvmsg prequeue.<BR>> 8140 packets directly received from backlog<BR>> 1422037027 packets header predicted<BR>> 7 packets header predicted and directly queued to user<BR>> TCPPureAcks: 2794058<BR>> TCPHPAcks: 517887764<BR>> TCPRenoRecovery: 0<BR>> TCPSackRecovery: 56<BR>> TCPSACKReneging: 0<BR>> TCPFACKReorder: 0<BR>> TCPSACKReorder: 0<BR>> TCPRenoReorder: 0<BR>> TCPTSReorder: 0<BR>> TCPFullUndo: 0<BR>> TCPPartialUndo: 0<BR>> TCPDSACKUndo: 0<BR>> TCPLossUndo: 1<BR>> TCPLoss: 357<BR>> TCPLostRetransmit: 6<BR>> TCPRenoFailures: 0<BR>> TCPSackFailures: 0<BR>> TCPLossFailures: 0<BR>> TCPFastRetrans: 235<BR>> TCPForwardRetrans: 46<BR>> TCPSlowStartRetrans: 0<BR>> TCPTimeouts: 3<BR>> TCPRenoRecoveryFail: 0<BR>> TCPSackRecoveryFail: 0<BR>> TCPSchedulerFailed: 0<BR>> TCPRcvCollapsed: 0<BR>> TCPDSACKOldSent: 0<BR>> TCPDSACKOfoSent: 0<BR>> TCPDSACKRecv: 2<BR>> TCPDSACKOfoRecv: 0<BR>> TCPAbortOnSyn: 0<BR>> TCPAbortOnData: 0<BR>> TCPAbortOnClose: 0<BR>> TCPAbortOnMemory: 0<BR>> TCPAbortOnTimeout: 0<BR>> TCPAbortOnLinger: 0<BR>> TCPAbortFailed: 0<BR>> TCPMemoryPressures: 0<BR>> TCPSACKDiscard: 0<BR>> TCPDSACKIgnoredOld: 1<BR>> TCPDSACKIgnoredNoUndo: 0<BR>> TCPSpuriousRTOs: 0<BR>> TCPMD5NotFound: 0<BR>> TCPMD5Unexpected: 0<BR>>IpExt:<BR>> InNoRoutes: 0<BR>> InTruncatedPkts: 0<BR>> 
InMcastPkts: 0<BR>> OutMcastPkts: 0<BR>> InBcastPkts: 0<BR>> OutBcastPkts: 0<BR>>When I installed PWSCF, I only used the commands:<BR>>./configure <BR>>make all<BR>>and the installation succeeded.<BR>> <BR>>I don't know why it runs so slowly or how to solve this problem. Any advice will be appreciated!<BR>> <BR>>Best regards,<BR>>Q . J. Wang <BR>>XiangTan University <BR>><BR>><BR>> <BR>>-------------- next part --------------<BR>>An HTML attachment was scrubbed...<BR>>URL: http://www.democritos.it/pipermail/pw_forum/attachments/20090923/f82797a3/attachment-0001.htm <BR>><BR>>------------------------------<BR>><BR>>Message: 2<BR>>Date: Wed, 23 Sep 2009 10:45:51 +0200<BR>>From: Giovanni Cantele <Giovanni.Cantele@na.infn.it><BR>>Subject: Re: [Pw_forum] how to improve the calculation speed ?<BR>>To: PWSCF Forum <pw_forum@pwscf.org><BR>>Message-ID: <4AB9E03F.7080600@na.infn.it><BR>>Content-Type: text/plain; charset=x-gbk; format=flowed<BR>><BR>>wangqj1 wrote:<BR>>><BR>>> Dear PWSCF users,<BR>>> When I use R and G parallelization to run a job, it seems to wait <BR>>> for input indefinitely.<BR>><BR>>What does that mean? Does it print the output header, the output up to <BR>>some point, or nothing at all?</DIV><PRE>It only prints the output header; no iterations appear.</PRE><PRE>
>
>> Following people's advice, I switched to k-point parallelization, and
>> it runs well. But it runs too slowly. The information I can offer is as follows:
>> (1): CPU usage of one node:
>> Tasks: 143 total, 10 running, 133 sleeping, 0 stopped, 0 zombie
>> Cpu0 : 99.7%us, 0.3%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
>> Cpu1 :100.0%us, 0.0%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
>> Cpu2 :100.0%us, 0.0%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
>> Cpu3 :100.0%us, 0.0%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
>> Cpu4 :100.0%us, 0.0%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
>> Cpu5 :100.0%us, 0.0%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
>> Cpu6 :100.0%us, 0.0%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
>> Cpu7 :100.0%us, 0.0%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
>> Mem: 8044120k total, 6683720k used, 1360400k free, 1632k buffers
>> Swap: 4192956k total, 2096476k used, 2096480k free, 1253712k cached
>I'm not an expert at reading such information, but it seems that
>your node is swapping, maybe because the job requires more
>memory than is available. This usually causes a huge
>performance degradation.
>
>In choosing the optimal number of nodes, processes per node, etc.,
>several factors should be taken into account: memory requirements,
>communication hardware, etc. You might want to have a look at this page
>from the user guide: <A href="http://www.quantum-espresso.org/user_guide/node33.html">http://www.quantum-espresso.org/user_guide/node33.html</A></PRE><PRE>We have 8 processes per node in our cluster,</PRE><PRE>model name : Intel(R) Xeon(R) CPU E5410 @ 2.33GHz<BR>stepping : 10<BR>cpu MHz : 2327.489<BR>cache size : 6144 KB</PRE><PRE><BR>>Also, consider that, at least for CPU generations that are not very recent, </PRE>
>using too many cores per CPU (e.g. if your cluster is configured with
>quad-core processors) might not improve (and may even worsen) the
>code's performance (this has also been reported in previous threads in this
>forum; you can search for them).
>This may also be of interest to you:
>http://www.quantum-espresso.org/wiki/index.php/Frequently_Asked_Questions#Why_is_my_parallel_job_running_in_such_a_lousy_way.3F<PRE>I am not the superuser, and I don't know how to set the environment variable OPEN_MP_THREADS to 1; I can't find where OPEN_MP_THREADS is.</PRE><PRE>
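For what it's worth, no superuser rights are needed to set an environment variable: it can be exported inside the PBS job script itself, and every MPI process launched from that script inherits it. A minimal sketch, assuming the variable the FAQ refers to is the standard OpenMP one, OMP_NUM_THREADS, and reusing the job script posted above:

```shell
#!/bin/sh
#PBS -j oe
#PBS -N pw
#PBS -l nodes=1:ppn=8
#PBS -q small
cd $PBS_O_WORKDIR
# Exported here, in the job's own shell, so no superuser rights are needed.
# OMP_NUM_THREADS=1 disables OpenMP threading, which otherwise
# oversubscribes the cores when combined with 8 MPI processes.
export OMP_NUM_THREADS=1
/usr/local/bin/mpirun -np 8 -machinefile $PBS_NODEFILE \
    /home/wang/bin/pw.x -npool 8 -in ZnO.pw.inp > ZnO.pw.out
```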
>
>> I don't know why it runs so slowly or how to solve this problem. Any
>> advice will be appreciated!
>Pending better suggestions from more expert people, it would
>help to see what kind of job you are trying to run. For example:
>did you start directly with a "production run" (many k-points and/or
>large unit cells and/or large cut-off)? Did pw.x ever run on your
>cluster with simple jobs, like bulk silicon or any other (see the
>examples directory)?</PRE><PRE>I had already run this input file on my own computer (4 CPUs); it ran well there.</PRE><PRE>
>
>Another possibility would be to start with the serial executable
>(disabling parallel at configure time) and then switch to parallel once
>you have checked that everything works.
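As a sketch of that suggestion, assuming a standard Quantum ESPRESSO source tree of that era (check ./configure --help for the exact flag name in your version):

```shell
# Configure and build a serial (non-MPI) pw.x for sanity checks.
./configure --disable-parallel
make pw
# Run a small test case serially, without mpirun:
./bin/pw.x -in ZnO.pw.inp > ZnO.pw.out
```

If the serial run completes normally, the problem lies in the parallel setup (MPI, network, pools) rather than in the build or the input.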
>
>
>
>Unfortunately, in many cases the computation requires a lot of work to
>correctly set up and optimize the compilation, performance, etc. (not to
>speak of convergence issues!!!!).
>The only way is to isolate the problems and solve them one by one. Still, I
>would say that in this respect quantum-espresso is one of the best
>choices, since the code is made to work properly in as many cases as
>possible, rather than implementing all of human knowledge but only for
>those who wrote it!!!
>;-)
>
>Good luck,
>
>Giovanni
>
>
>--
>
>
>
>Dr. Giovanni Cantele
>Coherentia CNR-INFM and Dipartimento di Scienze Fisiche
>Universita' di Napoli "Federico II"
>Complesso Universitario di Monte S. Angelo - Ed. 6
>Via Cintia, I-80126, Napoli, Italy
>Phone: +39 081 676910
>Fax: +39 081 676346
>E-mail: giovanni.cantele@cnr.it
> giovanni.cantele@na.infn.it
>Web: http://people.na.infn.it/~cantele
>Research Group: http://www.nanomat.unina.it
>Skype contact: giocan74
>
>
>
>------------------------------
>
>Message: 3
>Date: Wed, 23 Sep 2009 10:50:48 +0200
>From: "Lorenzo Paulatto" <paulatto@sissa.it>
>Subject: Re: [Pw_forum] how to improve the calculation speed ?
>To: Giovanni.Cantele@na.infn.it, "PWSCF Forum" <pw_forum@pwscf.org>
>Message-ID: <op.u0pb6yqfa8x26q@paulax>
>Content-Type: text/plain; charset=utf-8; format=flowed; delsp=yes
>
>In data 23 settembre 2009 alle ore 10:45:51, Giovanni Cantele
><Giovanni.Cantele@na.infn.it> ha scritto:
>> I'm not an expert at reading such information, but it seems that
>> your node is swapping, maybe because the job requires more
>> memory than is available. This usually causes a huge
>> performance degradation.
>
>If this is the case, reducing the number of pools will reduce the amount
>of memory required per node.
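For instance, adapting the mpirun line from the original post (the pool count here is only an illustration): with 8 MPI processes, -npool 2 creates 2 pools of 4 processes each, so each pool's plane-wave data is distributed over 4 processes instead of every process holding a full copy, as happens with -npool 8:

```shell
# Same 8 MPI processes, but 2 pools of 4 processes instead of 8 pools of 1.
# R and G vectors within each pool are distributed over its 4 processes,
# so the per-process memory footprint drops at the cost of extra
# communication inside each pool.
/usr/local/bin/mpirun -np 8 -machinefile $PBS_NODEFILE \
    /home/wang/bin/pw.x -npool 2 -in ZnO.pw.inp > ZnO.pw.out
```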
>
>cheers
>
>
>--
>Lorenzo Paulatto
>SISSA & DEMOCRITOS (Trieste)
>phone: +39 040 3787 511
>skype: paulatz
>www: http://people.sissa.it/~paulatto/
>
> *** save italian brains ***
> http://saveitalianbrains.wordpress.com/
>
>
>------------------------------
>
>Message: 4
>Date: Wed, 23 Sep 2009 03:13:18 -0700 (PDT)
>From: ali kazempour <kazempoor2000@yahoo.com>
>Subject: [Pw_forum] write occupancy
>To: pw <pw_forum@pwscf.org>
>Message-ID: <432077.46189.qm@web112513.mail.gq1.yahoo.com>
>Content-Type: text/plain; charset="us-ascii"
>
>Hi
>How can we force the code to print the occupancies in a simple scf run?
>I know about the partial DOS calculation, but I don't know whether another way exists.
>thanks a lot
>
> Ali Kazempour
>Physics department, Isfahan University of Technology
>84156 Isfahan, Iran. Tel-1: +98 311 391 3733
>Fax: +98 311 391 2376 Tel-2: +98 311 391 2375
>
>
>
>
>-------------- next part --------------
>An HTML attachment was scrubbed...
>URL: http://www.democritos.it/pipermail/pw_forum/attachments/20090923/97a9ee58/attachment-0001.htm
>
>------------------------------
>
>Message: 5
>Date: Wed, 23 Sep 2009 12:46:45 +0200
>From: Prasenjit Ghosh <prasenjit.jnc@gmail.com>
>Subject: Re: [Pw_forum] write occupancy
>To: PWSCF Forum <pw_forum@pwscf.org>
>Message-ID:
> <627e0ffa0909230346wbdf3399i1b2a48f4edfa9c65@mail.gmail.com>
>Content-Type: text/plain; charset="iso-8859-1"
>
>use verbosity='high'
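For context, verbosity is a flag in the &CONTROL namelist of the pw.x input file; a minimal fragment (all other settings are placeholders, not from the original posts):

```fortran
&CONTROL
  calculation = 'scf'
  verbosity   = 'high'   ! print extra detail, including band occupations
  prefix      = 'mysystem'
/
```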
>
>Prasenjit.
>
>2009/9/23 ali kazempour <kazempoor2000@yahoo.com>
>
>> Hi
>> How can we force the code to print the occupancies in a simple scf run?
>> I know about the partial DOS calculation, but I don't know whether another
>> way exists.
>> thanks a lot
>>
>> Ali Kazempour
>> Physics department, Isfahan University of Technology
>> 84156 Isfahan, Iran. Tel-1: +98 311 391 3733
>> Fax: +98 311 391 2376 Tel-2: +98 311 391 2375
>>
>>
>> _______________________________________________
>> Pw_forum mailing list
>> Pw_forum@pwscf.org
>> http://www.democritos.it/mailman/listinfo/pw_forum
>>
>>
>
>
>--
>PRASENJIT GHOSH,
>POST-DOC,
>ROOM NO: 265, MAIN BUILDING,
>CM SECTION, ICTP,
>STRADA COSTERIA 11,
>TRIESTE, 34104,
>ITALY
>PHONE: +39 040 2240 369 (O)
> +39 3807528672 (M)
>-------------- next part --------------
>An HTML attachment was scrubbed...
>URL: http://www.democritos.it/pipermail/pw_forum/attachments/20090923/bc129707/attachment.htm
>
>------------------------------
>
>_______________________________________________
>Pw_forum mailing list
>Pw_forum@pwscf.org
>http://www.democritos.it/mailman/listinfo/pw_forum
>
>
>End of Pw_forum Digest, Vol 27, Issue 74
>****************************************
</PRE>