<div dir="ltr"><div>Hi All,</div><div><br></div><div>I work for the Centre for High
Performance Computing in Cape Town, South Africa. We have many users of
QE, and it is not uncommon for jobs to crash. Most codes permit
checkpointing, but as I understand it, QE no longer really has this
facility. One can use max_seconds, but this mainly helps with
jobs that exceed the permitted walltime on an HPC system. However,
the ability to restart from crashed jobs is important.</div><div><br></div><div>What options are available? Any advice is most welcome.</div><div><br></div><div>Much appreciated,</div><div>Anton</div><br clear="all"><br>-- <br><div dir="ltr" class="gmail_signature" data-smartmail="gmail_signature"><div dir="ltr"><div>Anton Lopis<br>CHPC<br>+27 21 658 2746 (W)<br>+27 72 461 3794 (Cell)<br>+27 21 658 2744 (Fax)<br></div></div></div></div>