<div dir="ltr"><div>I see.<br></div><div>You are right. I just checked on 6.1: no more scratch files during the run.<br><br></div><div>The large file issue happened in QE 5.3. Some offsets used to access a record exceeded the 32-bit integer range and went negative.<br></div><div>6.1 still seems to rely on iotk, but the svn repo eliminates iotk for writing WF. Applause!<br></div><div><br></div><div>Best,<br></div><div>Ye<br></div><div><br></div></div><div class="gmail_extra"><br clear="all"><div><div class="gmail_signature" data-smartmail="gmail_signature"><div dir="ltr">===================<br>

Ye Luo, Ph.D.<br>

Leadership Computing Facility<br>

Argonne National Laboratory</div></div></div>

<br><div class="gmail_quote">2017-05-25 10:24 GMT-05:00 Paolo Giannozzi <span dir="ltr"><<a href="mailto:paolo.giannozzi@uniud.it" target="_blank">paolo.giannozzi@uniud.it</a>></span>:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Dear Ye<br>

<span class=""><br>

> I noticed yesterday that wf_collect is now set to true by default. You probably<br>

> already had a lot of discussion on your side<br>

<br>

</span>not really: having realized that I seldom get useful answers when I<br>

ask opinions about this or that, I no longer ask, just do this or that<br>

and wait for complaints.<br>

<span class=""><br>

> Are we confident that wf_collect will not add a significant amount of time to<br>

> large simulations?<br>

<br>

</span>At this stage, performance is not my major concern: correctness and<br>

maintainability are. The I/O of QE has become a mess beyond control,<br>

it needs to be simplified and documented, the old stuff must disappear<br>

ASAP. Note that the possibility to write "distributed" files is still<br>

there (at least for PW; for CP it can be easily re-added).<br>

<span class=""><br>

> Is the performance good on both Lustre and GPFS file systems?<br>

<br>

</span>No idea.<br>

<span class=""><br>

> I don't have much experience with the recently added hdf5 feature.<br>

<br>

</span>Nor have I (by the way, it is not yet working with the new I/O)<br>

<br>

> Does the WF collect use parallel collective I/O? Or does it, in the old fashion,<br>

<span class="">> collect the WF on the master and write from there?<br>

<br>

</span>For the time being: old fashion. Of course, everybody is welcome to<br>

implement with "the latest and the greatest", but first of all, we<br>

need something working and that can be modified without hours of<br>

reverse engineering.<br>

<span class=""><br>

> Is the performance good? Measured bandwidth?<br>

<br>

</span>I hope it will be, but I leave measuring bandwidths to other people.<br>

<span class=""><br>

> On the machines I use, GPFS has 8 aggregators by default and PIO<br>

> performance is better than creating individual files. Lustre does the<br>

> opposite: it has only 1 OST by default and thus writes sequentially with PIO,<br>

> so writing individual files becomes faster. Of course you can tune both of<br>

> them, but it is very tricky.<br>

><br>

> Does QE still create a file per MPI rank from the beginning? 4k empty files<br>

> are a bit slow to create and a pain to 'ls'. When I do DFT+U, the<br>

> number of files basically doubles or triples; I don't remember exactly.<br>

<br>

</span>This story of QE opening too many files is becoming an urban legend<br>

like "QE is 10 times slower than VASP". For some time now, the default<br>

setting has been that QE doesn't open anything while running.<br>

<span class=""><br>

> PS: In the past, I had the experience that QE was not able to read its own<br>

> collected WF when the record (using IOTK) was very large (>100GB).<br>

<br>

</span>this doesn't look like a QE problem, though, unless it is an iotk<br>

problem. IOTK is no longer used to write binaries, by the way.<br>

<br>

Paolo<br>

<span class=""><br>

> Not collecting the WF was the preferred way for me. It should not be a problem<br>

> with hdf5 since the dataset is per band and much smaller.<br>

><br>

> Thanks,<br>

> Ye<br>

> ===================<br>

> Ye Luo, Ph.D.<br>

> Leadership Computing Facility<br>

> Argonne National Laboratory<br>

><br>

</span>

<span class="HOEnZb"><font color="#888888"><br>

<br>

<br>

--<br>

Paolo Giannozzi, Dip. Scienze Matematiche Informatiche e Fisiche,<br>

Univ. Udine, via delle Scienze 208, 33100 Udine, Italy<br>

Phone <a href="tel:%2B39-0432-558216" value="+390432558216">+39-0432-558216</a>, fax <a href="tel:%2B39-0432-558222" value="+390432558222">+39-0432-558222</a><br>

</font></span></blockquote></div><br></div>