<div dir="ltr">And I still couldn't find where this large array comes from.  </div><div class="gmail_extra"><br><br><div class="gmail_quote">On Sat, Mar 1, 2014 at 3:49 PM, Paolo Giannozzi <span dir="ltr"><<a href="mailto:paolo.giannozzi@uniud.it" target="_blank">paolo.giannozzi@uniud.it</a>></span> wrote:<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="">On Sat, 2014-03-01 at 14:27 -0600, Peng Chen wrote:<br>

><br>

> And the system is not that large (32 atoms, 400 bands, 8*8*8 k-points);<br>

> it runs on 128 cores. I think you are probably right that QE is<br>

> trying to allocate a large array somehow.<br>

<br>

</div>... and ?<br>

<div class="HOEnZb"><div class="h5"><br>

> On Fri, Feb 28, 2014 at 10:35 AM, Paolo Giannozzi<br>

> <<a href="mailto:paolo.giannozzi@uniud.it">paolo.giannozzi@uniud.it</a>> wrote:<br>

>         On Fri, 2014-02-28 at 09:12 -0600, Peng Chen wrote:<br>

><br>

>         > I think it is memory, because the error message looks like this:<br>

>         > : 02/27/2014 14:06:20|  main|zeta27|W|job 221982 exceeds job hard limit "h_vmem" of queue (2871259136.00000 > limit:2147483648.00000) - sending SIGKILL<br>

><br>

><br>

>         There are a few hints on how to reduce memory usage to the strict<br>

>         minimum here:<br>

>         <a href="http://www.quantum-espresso.org/wp-content/uploads/Doc/pw_user_guide/node19.html#SECTION000600100000000000000" target="_blank">http://www.quantum-espresso.org/wp-content/uploads/Doc/pw_user_guide/node19.html#SECTION000600100000000000000</a><br>


>         If the FFT grid is large, reduce mixing_ndim from its default value (8)<br>

>         to 4 or so. If the number of bands is large, distribute nbnd*nbnd<br>

>         matrices using "-ndiag". If you have many k-points, save to disk with<br>

>         disk_io='medium'. The message you get: "2871259136 > limit:2147483648"<br>

>         makes me think that you crash when trying to allocate an array whose<br>

>         size is at least 2871259136-2147483648=a lot. It shouldn't be difficult<br>

>         to figure out where such a large array comes from.<br>

><br>

>         Paolo<br>

><br>

><br>

>         ><br>

>         > I normally use h_stack=128M, and it works fine.<br>

>         ><br>

>         > On Fri, Feb 28, 2014 at 7:30 AM, Paolo Giannozzi<br>

>         > <<a href="mailto:paolo.giannozzi@uniud.it">paolo.giannozzi@uniud.it</a>> wrote:<br>

>         >         On Thu, 2014-02-27 at 17:30 -0600, Peng Chen wrote:<br>

>         >         > P.S. Most of the jobs failed at the beginning of the scf calculation,<br>

>         >         > and the scf output file is empty.<br>

>         ><br>

>         ><br>

>         >         Are you sure the problem is the size of the RAM and not the size of<br>

>         >         the stack?<br>

>         ><br>

>         >         P.<br>

>         ><br>

>         ><br>

>         >         ><br>

>         >         ><br>

>         >         > On Thu, Feb 27, 2014 at 5:09 PM, Peng Chen<br>

>         >         <<a href="mailto:pchen229@illinois.edu">pchen229@illinois.edu</a>><br>

>         >         > wrote:<br>

>         >         >         Dear QE users,<br>

>         >         ><br>

>         >         ><br>

>         >         >         Our workstation was recently updated, and there is now a hard limit<br>

>         >         >         on memory (2 GB per core). Some QE jobs fail repeatedly (though not<br>

>         >         >         always) because one of the MPI processes exceeds the RAM limit and is<br>

>         >         >         killed. Is there a way to distribute memory usage more evenly across<br>

>         >         >         the cores?<br>

>         >         ><br>

>         >         ><br>

>         ><br>

>         >         > _______________________________________________<br>

>         >         > Pw_forum mailing list<br>

>         >         > <a href="mailto:Pw_forum@pwscf.org">Pw_forum@pwscf.org</a><br>

>         >         > <a href="http://pwscf.org/mailman/listinfo/pw_forum" target="_blank">http://pwscf.org/mailman/listinfo/pw_forum</a><br>

>         ><br>

>         ><br>

>         >         --<br>

>         >          Paolo Giannozzi, Dept. Chemistry&Physics&Environment,<br>

>         >          Univ. Udine, via delle Scienze 208, 33100 Udine, Italy<br>

>         >          Phone +39-0432-558216, fax +39-0432-558222<br>

>         ><br>


>         ><br>

>         ><br>

>         ><br>


><br>


><br>

><br>

><br>


<br>


</div></div></blockquote></div><br></div>
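<div>The byte counts in the scheduler message quoted in this thread can be checked with a few lines of Python. This is only a sketch of the arithmetic: the helper names are hypothetical, and the 16 bytes per matrix entry is an assumption (double-precision complex). It does not model how pw.x actually allocates memory.</div>

```python
# Figures copied from the scheduler message quoted in the thread.
USED_BYTES = 2871259136   # virtual memory used by the killed process
LIMIT_BYTES = 2147483648  # h_vmem hard limit of the queue

MIB = 1024 ** 2
COMPLEX16_BYTES = 16  # assumption: double-precision complex entries


def overshoot_bytes(used, limit):
    """Return how far the process went past the limit, in bytes."""
    return used - limit


def square_matrix_bytes(n, bytes_per_entry=COMPLEX16_BYTES):
    """Return the storage needed for one dense n x n matrix, in bytes."""
    return n * n * bytes_per_entry


excess = overshoot_bytes(USED_BYTES, LIMIT_BYTES)
nbnd = 400  # number of bands reported in the thread

print(f"limit: {LIMIT_BYTES / MIB:.0f} MiB")
print(f"excess over limit: {excess} bytes ({excess / MIB:.1f} MiB)")
print(f"one {nbnd}x{nbnd} complex matrix: {square_matrix_bytes(nbnd) / MIB:.2f} MiB")
```

<div>Under these assumptions the process was about 690 MiB over the 2048 MiB limit, while a single nbnd*nbnd matrix for 400 bands takes roughly 2.4 MiB, so one such matrix alone would not account for the excess.</div>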