<html><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space; "><div>I set up an 864-atom run (so this has 1920 electrons, 960 states) and varied the number of nodes (the nodes were operating in smp mode, so 1 node == 1 core).</div><div><br></div><div>I ran the following cases successfully (with the current CVS version of pw, ScaLAPACK enabled):</div><div><br></div><div>512 cores, -ntg 8 -ndiag 256</div><div>256 cores, -ntg 8 -ndiag 256</div><div>128 cores, -ntg 4 -ndiag 121</div><div>64 cores, -ntg 4 -ndiag 64</div><div>32 cores, -ntg 4 -ndiag 25 </div><div><br></div><div>It seemed to crash at 16 cores (-ntg 4 -ndiag 16); the error was:</div><div><br></div><div>'pw.x: /bglhome/bgbuild/V1R2M0_200_2008-080513P/ppc/bgp/comm/sys/build-dcmf/include/devices/dma/DMAMulticast.h:568: static int DCMF::DMA::DMAMulticast&lt;TDesc&gt;::Registration::McastLongPacketHandler(void*, DCMF::DMA::DMAMulticast&lt;TDesc&gt;::PacketHeader*, void*, char*, int) [with TDesc = DMA_MemoryFifoDescriptor]: Assertion `recv != __null' failed.'</div><div><br></div><div>Comparing this to the 432-atom run at 128 cores, I am not convinced that the difference in behavior between modes is due to insufficient memory per core. This system is twice as large, so it shouldn't have worked on 32 cores if we assume linear scaling in memory usage.</div><div><br></div><div>To put it another way:</div><div><br></div><div>If we assume that the 432-atom case required more than 1 GB RAM/core to run at 128 cores (despite the output saying otherwise), then the 864-atom case would require more than 2 GB RAM/core at 128 cores. If we assume that memory consumption scales linearly, then at 64 cores I should have seen a problem (since it would presumably need more than 4 GB/core), and certainly by 32 cores (where it would need more than 8 GB/core). 
But I don't.</div><div><br></div><div>On the up side, this likely means that I *can* run a big system (I'll have to play around to find out how big, especially once I start adding a vacuum region). The down side is that I have to do it in smp mode and have 3 idle cores/node.</div><div><br></div><div>Dave</div><div><br></div><div><br></div><br><div><div>On Feb 12, 2009, at 1:16 PM, Nichols A. Romero wrote:</div><br class="Apple-interchange-newline"><blockquote type="cite"><div>David,<br><br>Just to clarify, you mean:<br><br>128 nodes in vn mode = 512 cores<br>128 nodes in dual mode = 256 cores<br>128 nodes in smp mode = 128 cores<br><br>Nichols A. Romero, Ph.D.<br>Argonne Leadership Computing Facility<br>Argonne National Laboratory<br>Building 360 Room L-146<br>9700 South Cass Avenue<br>Argonne, IL 60490<br>(630) 252-3441<br><br><br>----- Original Message -----<br>From: "David Farrell" &lt;<a href="mailto:davidfarrell2008@u.northwestern.edu">davidfarrell2008@u.northwestern.edu</a>&gt;<br>To: "PWSCF Forum" &lt;<a href="mailto:pw_forum@pwscf.org">pw_forum@pwscf.org</a>&gt;<br>Cc: "Nichols A. Romero" &lt;<a href="mailto:naromero@alcf.anl.gov">naromero@alcf.anl.gov</a>&gt;<br>Sent: Thursday, February 12, 2009 12:24:38 PM GMT -06:00 US/Canada Central<br>Subject: Re: [Pw_forum] PW taskgroups and a large run on a BG/P<br><br><br>I pulled down the current CVS version, compiled as I did with the previous snapshot and got the same behavior: <br><br><br>When I ran on 128 cores in vn mode with -ntg 4 -ndiag 121, I got a Cholesky error: <br><br><br>When I ran on 128 cores in dual mode with -ntg 4 -ndiag 121, I got the Cholesky error: <br><br><br><br>When I ran on 128 cores in smp mode with -ntg 4 -ndiag 121, it ran fine. <br><br><br>I guess I have 2 options: <br><br><br>1) try larger systems in SMP mode with the CVS version, see how big I can get before things blow up. I'll just have to deal with the extra cost of the idle CPUs. 
<br><br><br>2) climb into the code with a debugger to see if I can see anything going on (things I am interested in now are how much memory is actually available to the code, how much it is using, and whether something funny is going on in the different modes). I'll probably have to construct a smaller system that does the same thing first. <br><br><br>I don't want to abandon PW/CP just yet because this code has demonstrated decent physics, and other codes would require me either to develop PPs that give me results I can be confident in or to do way too much work to get them scalable. Unfortunately, I also need to get it running on the BG/P, as I have a big allocation on that machine that is otherwise wasted. <br><br><br>Dave <br><br><br><br><br><br><br><br>David E. Farrell <br>Post-Doctoral Fellow <br>Department of Materials Science and Engineering <br>Northwestern University <br>email: <a href="mailto:d-farrell2@northwestern.edu">d-farrell2@northwestern.edu</a> <br></div></blockquote></div><br><div apple-content-edited="true"> <span class="Apple-style-span" style="border-collapse: separate; color: rgb(0, 0, 0); font-family: Helvetica; font-size: 14px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-align: auto; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-border-horizontal-spacing: 0px; -webkit-border-vertical-spacing: 0px; -webkit-text-decorations-in-effect: none; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0; "><div style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space; "><div><div style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; "><font face="Helvetica" size="3" style="font: normal normal normal 12px/normal Helvetica; font-size: 12px; ">David E. 
Farrell</font></div><div style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; "><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 12px; ">Post-Doctoral Fellow</span></font></div><div style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; "><font class="Apple-style-span" size="3"><span class="Apple-style-span" style="font-size: 12px; ">Department of Materials Science and Engineering</span></font></div><div style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; "><font face="Helvetica" size="3" style="font: normal normal normal 12px/normal Helvetica; font-size: 12px; ">Northwestern University</font></div><div style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; "><font face="Helvetica" size="3" style="font: normal normal normal 12px/normal Helvetica; font-size: 12px; ">email: <a href="mailto:d-farrell2@northwestern.edu">d-farrell2@northwestern.edu</a></font></div></div></div></span> </div><br></body></html>