<div dir="ltr">On Sun, Jan 8, 2017 at 3:43 AM, <a href="mailto:jqli14@fudan.edu.cn">jqli14@fudan.edu.cn</a> <span dir="ltr"><<a target="_blank" href="mailto:jqli14@fudan.edu.cn">jqli14@fudan.edu.cn</a>></span> wrote:<br><br><div class="gmail_extra"><div class="gmail_quote"><blockquote style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex" class="gmail_quote"><div>

<div>When the number of points of the FFT mesh is divisible by the number of CPU cores, every CPU stores an equal number of wavefunctions.</div></div></blockquote>
<div><br></div>
<div>No: every CPU stores an equal number of grid points for all wavefunctions, at least with the default parallelization.<br><br></div>
<blockquote style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex" class="gmail_quote"><div><div>If it is not divisible, there are many zero values in the variable <u>psic</u> after <u>call invfft ('Wave', psic, dffts)</u> in subroutine <u>sum_band</u>, i.e., in the wavefunctions in real space.<br></div></div></blockquote>
<blockquote style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex" class="gmail_quote"><div><div>So if I calculate something like &lt;i|j&gt;, I just evaluate it as sum( psi_i*psi_j ) /(nr1*nr2*nr3) and not as sum( psi_i*psi_j ) /(nr1x*nr2x*nr3x), where nr1, nr2, nr3 are the values shown as "Dense grid: XXXX G-vectors FFT dimensions: (nr1, nr2, nr3)" in the standard output of QE.</div></div></blockquote>
<div><br></div>
<div>You are mixing up two different aspects. The array dimensions nr1x, nr2x, nr3x may exceed the FFT transform dimensions nr1, nr2, nr3 for some values of the latter. Typically, it is convenient to set nr1x = nr1+1 if nr1 is a power of 2, for obscure reasons related to how memory is organized and accessed. This holds for both serial and parallel execution. In parallel execution, the size of the "slices" of the real-space FFT grid may differ from one processor to another, but the code takes this into account. In all cases, the correct volume element for integration is Omega/(nr1*nr2*nr3), not Omega/(nr1x*nr2x*nr3x), so &lt;\psi_i|\psi_j&gt; = \sum_k \psi^*_i(k) \psi_j(k) / (nr1*nr2*nr3) (in parallel: summed over processors), where k runs over the FFT grid points (in parallel: over the local FFT grid).<br><br></div>
<div>Paolo<br></div>
<div><br><div><div><br></div><div>Best regards!</div><div>Jiqiang Li</div><div>Fudan University</div><hr style="width:210px;height:1px;display:none" align="left" color="#b5c4df" size="1">


</div><br>______________________________<wbr>_________________<br>

Pw_forum mailing list<br>

<a href="mailto:Pw_forum@pwscf.org">Pw_forum@pwscf.org</a><br>

<a target="_blank" rel="noreferrer" href="http://pwscf.org/mailman/listinfo/pw_forum">http://pwscf.org/mailman/<wbr>listinfo/pw_forum</a><br></div></div><br><br clear="all"><br>-- <br><div class="gmail_signature"><div dir="ltr"><div><div dir="ltr"><div>Paolo Giannozzi, Dip. Scienze Matematiche Informatiche e Fisiche,<br>Univ. Udine, via delle Scienze 208, 33100 Udine, Italy<br>Phone +39-0432-558216, fax +39-0432-558222<br><br></div></div></div></div></div>

</div></div>
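<div dir="ltr">The normalization rule in the reply above can be illustrated with a small sketch in plain Python. This is not Quantum ESPRESSO code: the grid sizes, the helper names <u>plane_wave</u>, <u>braket</u> and <u>braket_sliced</u>, and the way the grid is split into slices are all assumptions made for illustration. It stores plane waves in an array whose first dimension is padded (nr1x = nr1+1), leaves the padding at zero, and divides the sum by nr1*nr2*nr3. It also accumulates the same sum over unequal slices, mimicking the sum over processors.</div>

```python
import cmath

# Hypothetical toy sizes: FFT dimensions (nr1, nr2, nr3) and a padded
# leading array dimension nr1x = nr1 + 1, as the reply describes for a
# power-of-2 nr1. These values are assumptions, not read from QE.
nr1, nr2, nr3 = 4, 3, 5
nr1x = nr1 + 1


def plane_wave(m):
    """Return exp(2*pi*i*m*i1/nr1) on the grid, stored with padded rows."""
    data = []
    for i3 in range(nr3):
        for i2 in range(nr2):
            for i1 in range(nr1x):
                # Entries with i1 == nr1 are padding and stay zero.
                inside = i1 != nr1
                value = cmath.exp(2j * cmath.pi * m * i1 / nr1) if inside else 0j
                data.append(value)
    return data


def braket(psi_i, psi_j):
    """Sum conj(psi_i)*psi_j over the storage, divided by nr1*nr2*nr3."""
    total = sum(a.conjugate() * b for a, b in zip(psi_i, psi_j))
    return total / (nr1 * nr2 * nr3)


def braket_sliced(psi_i, psi_j, planes_per_proc):
    """Same sum, accumulated over unequal z-slices (one per mock processor)."""
    plane = nr1x * nr2  # storage size of one z-plane, padding included
    total, start = 0j, 0
    for nplanes in planes_per_proc:
        stop = start + nplanes * plane
        total += sum(a.conjugate() * b
                     for a, b in zip(psi_i[start:stop], psi_j[start:stop]))
        start = stop
    return total / (nr1 * nr2 * nr3)


psi0, psi1 = plane_wave(0), plane_wave(1)
norm = braket(psi1, psi1)                           # close to 1
overlap = braket(psi0, psi1)                        # close to 0
norm_sliced = braket_sliced(psi1, psi1, [2, 2, 1])  # same as norm
# Dividing by the padded array size instead gives the wrong norm.
wrong_norm = norm * (nr1 * nr2 * nr3) / (nr1x * nr2 * nr3)
print(norm.real, abs(overlap), norm_sliced.real, wrong_norm.real)
```

<div dir="ltr">In this toy case the padded entries are zero, so they add nothing to the sum. Only the divisor matters: using the array dimensions would scale the norm by nr1/nr1x.</div>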