[Pw_forum] Memory distribution problem

Peng Chen pchen229 at illinois.edu
Sun Mar 2 04:13:45 CET 2014


And I still couldn't find where this large array comes from.
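The size gap Paolo points to below can be put into numbers; a quick sketch of the arithmetic (the byte values are copied from the queue's kill message):

```python
# Arithmetic on the scheduler's kill message: the job's virtual memory
# (2871259136 bytes) exceeded the per-process h_vmem limit of 2 GiB.
requested = 2871259136      # vmem reported when the job was killed
hard_limit = 2 * 1024**3    # h_vmem limit: 2147483648 bytes (2 GiB)

overshoot = requested - hard_limit
print(overshoot)                   # 723775488 bytes
print(round(overshoot / 1024**2))  # roughly 690 MiB over the limit
```

So the offending allocation (or sum of allocations) is at least ~0.7 GB on top of an already full 2 GiB process.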


On Sat, Mar 1, 2014 at 3:49 PM, Paolo Giannozzi <paolo.giannozzi at uniud.it> wrote:

> On Sat, 2014-03-01 at 14:27 -0600, Peng Chen wrote:
> >
> > And the system is not that large (32 atoms, 400 bands, an 8*8*8 k-point
> > grid), which is run on 128 cores. I think you are probably right that
> > QE is trying to allocate a large array somehow.
>
> ... and ?
>
> > On Fri, Feb 28, 2014 at 10:35 AM, Paolo Giannozzi
> > <paolo.giannozzi at uniud.it> wrote:
> >         On Fri, 2014-02-28 at 09:12 -0600, Peng Chen wrote:
> >
> >         > I think it is memory, because the error message is like:
> >         > : 02/27/2014 14:06:20|  main|zeta27|W|job 221982 exceeds job
> >         > hard limit "h_vmem" of queue (2871259136.00000 >
> >         > limit:2147483648.00000) - sending SIGKILL
> >
> >
> >         there are a few hints on how to reduce memory usage to the
> >         strict minimum here:
> >
> > http://www.quantum-espresso.org/wp-content/uploads/Doc/pw_user_guide/node19.html#SECTION000600100000000000000
> >         If the FFT grid is large, reduce mixing_ndim from its default
> >         value (8) to 4 or so. If the number of bands is large,
> >         distribute the nbnd*nbnd matrices using "-ndiag". If you have
> >         many k-points, save to disk with disk_io='medium'. The message
> >         you get, "2871259136 > limit:2147483648", makes me think that
> >         you crash when trying to allocate an array whose size is at
> >         least 2871259136-2147483648 = a lot. It shouldn't be difficult
> >         to figure out where such a large array comes from.
> >
> >         Paolo
> >
> >
> >         >
> >         > I normally use h_stack=128M, and it works fine.
> >         >
> >         > On Fri, Feb 28, 2014 at 7:30 AM, Paolo Giannozzi
> >         > <paolo.giannozzi at uniud.it> wrote:
> >         >         On Thu, 2014-02-27 at 17:30 -0600, Peng Chen wrote:
> >         >         > P.S. Most of the jobs failed at the beginning of
> >         >         > the scf calculation, and the length of the output
> >         >         > scf file is zero.
> >         >
> >         >
> >         >         are you sure the problem is the size of the RAM and
> >         >         not the size of the stack?
> >         >
> >         >         P.
> >         >
> >         >
> >         >         >
> >         >         >
> >         >         > On Thu, Feb 27, 2014 at 5:09 PM, Peng Chen
> >         >         <pchen229 at illinois.edu>
> >         >         > wrote:
> >         >         >         Dear QE users,
> >         >         >
> >         >         >
> >         >         >         Recently, our workstation was updated and
> >         >         >         there is now a hard limit on memory (2 GB
> >         >         >         per core). Some QE jobs keep failing
> >         >         >         (though not always) because one of the MPI
> >         >         >         processes exceeded the RAM limit and was
> >         >         >         killed. I am wondering if there is a way
> >         >         >         to distribute memory usage more evenly
> >         >         >         across the cores.
> >         >         >
> >         >         >
> >         >
> >         >         > _______________________________________________
> >         >         > Pw_forum mailing list
> >         >         > Pw_forum at pwscf.org
> >         >         > http://pwscf.org/mailman/listinfo/pw_forum
> >         >
> >         >
> >         >         --
> >         >          Paolo Giannozzi, Dept. Chemistry&Physics&Environment,
> >         >          Univ. Udine, via delle Scienze 208, 33100 Udine, Italy
> >         >          Phone +39-0432-558216, fax +39-0432-558222
> >         >
> >
> >
> >
> >
>
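For readers landing on this thread: Paolo's three suggestions map onto pw.x input and command-line settings roughly as follows. This is a sketch, not a tested input; the values (mixing_ndim = 4, -ndiag 16) are illustrative, and the variable names are as documented in the PW user guide:

```fortran
! pw.x input fragments (illustrative values)
&CONTROL
   disk_io = 'medium'   ! keep more wavefunction data on disk, less in RAM
/
&ELECTRONS
   mixing_ndim = 4      ! default is 8; shrinks the charge-mixing history
/
```

and on the command line something like `mpirun -np 128 pw.x -ndiag 16 -in scf.in`, which distributes the nbnd*nbnd matrices over a 4*4 grid of 16 processes (the -ndiag value should be a square number, since QE arranges the linear-algebra group as a square grid).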