<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<div class="moz-cite-prefix">Hi Chris <br>
it might be exactly the opposite. <br>
If you don't specify anything, configure tries all the options
from best to worst, and MKL is tested first, if I am not wrong.
If you pass it a specific path, it tries only that one and
treats it as an ordinary FFTW library, so it may fail to find a
working FFT there and fall back to the internal one. <br>
<br>
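A note on reading the timings quoted below: with OpenMP the CPU column
adds up the time of all threads, so the WALL column is the one that
tells how long the run really took. A minimal Python sketch of that
comparison, using the numbers from the two reports (the dictionary
layout and the name cpu_wall_ratio are only mine, for illustration): <br>
<pre>
```python
# Timings copied from the two reports quoted below, in seconds.
# The layout and the names are illustrative only, nothing QE produces.
timings = {
    "internal": {
        "fftw": {"cpu": 49.97, "wall": 50.12},
        "PWSCF": {"cpu": 105.03, "wall": 106.59},  # 1m45.03s, 1m46.59s
    },
    "intel fftw": {
        "fftw": {"cpu": 109.63, "wall": 55.23},
        "PWSCF": {"cpu": 198.32, "wall": 101.01},  # 3m18.32s, 1m41.01s
    },
}


def cpu_wall_ratio(entry):
    """CPU/WALL: about 1 for one busy thread, about N for N busy threads."""
    return entry["cpu"] / entry["wall"]


for build, clocks in timings.items():
    for name, entry in clocks.items():
        print(f"{build:10s} {name:6s} WALL = {entry['wall']:7.2f} s  "
              f"CPU/WALL = {cpu_wall_ratio(entry):.2f}")
```
</pre>
If I read the numbers correctly, the ratio is close to 1 in the
"internal" report and close to 2 in the "intel fftw" one, while the
WALL times of the two runs are within a few seconds of each other. So
the second build may simply be counting two threads, not running
twice as slow. <br>
<br>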
Could you send the make.inc files for the two cases, or the
configure log? <br>
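<br>
If you want to have a look yourself first, something like the
following could pull the relevant lines out of each make.inc. It is
only a sketch: the helper name library_settings is hypothetical, and
I am assuming that DFLAGS, together with the four *_LIBS variables
configure reports, is where the FFT choice shows up. <br>
<pre>
```python
import re

# Variables assumed to show which FFT and linear-algebra libraries
# configure selected. This list is my guess at what is relevant.
KEYS = ("DFLAGS", "FFT_LIBS", "BLAS_LIBS", "LAPACK_LIBS", "SCALAPACK_LIBS")


def library_settings(make_inc_text):
    """Return {variable: value} for the assignments of interest."""
    found = {}
    for line in make_inc_text.splitlines():
        # Match lines of the form "NAME = value"; comment lines do not match.
        match = re.match(r"\s*(\w+)\s*=\s*(.*)", line)
        if match and match.group(1) in KEYS:
            found[match.group(1)] = match.group(2).strip()
    return found
```
</pre>
You would call it as library_settings(open("make.inc").read()) in
each build directory and compare the two results. As far as I
remember, -D__DFTI in DFLAGS means the MKL FFT interface and -D__FFTW
the internal copy, but please double check against the log. <br>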
Pietro <br>
<br>
On 03/01/2019 11:13 AM, Christoph Wolf wrote:<br>
</div>
<blockquote type="cite"
cite="mid:CAMC_G_44XRJtUKinXWm-k-p8bq=Lc2P1Ls+TArX2v_GOp8CDhA@mail.gmail.com">
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<div dir="ltr">
<div dir="ltr">
<div dir="ltr">Dear all,
<div><br>
</div>
<div>please forgive this "beginner" question, but I am facing
a weird problem. When compiling qe-6.4 (Intel compiler,
Intel MPI + OpenMP) with or without Intel's FFTW libs, I find
that with OpenMP and 2 threads per core the Intel FFTW
version is roughly "twice as slow" as the internal one:</div>
<div><br>
</div>
<div>"internal"</div>
<div>
<div> General routines</div>
<div> calbec : 2.69s CPU 2.70s WALL (
382 calls)</div>
<div> fft : 0.47s CPU 0.47s WALL (
122 calls)</div>
<div> ffts : 0.05s CPU 0.05s WALL (
12 calls)</div>
<div> fftw : 49.97s CPU 50.12s WALL (
14648 calls)</div>
<div> </div>
<div> Parallel routines</div>
<div> </div>
<div> PWSCF : 1m45.03s CPU 1m46.59s WALL</div>
<div><br>
</div>
<div>"intel fftw"</div>
<div>
<div> General routines</div>
<div> calbec : 6.36s CPU 3.20s WALL
( 382 calls)</div>
<div> fft : 0.93s CPU 0.47s WALL
( 121 calls)</div>
<div> ffts : 0.10s CPU 0.05s WALL
( 12 calls)</div>
<div> fftw : 109.63s CPU 55.23s WALL
( 14648 calls)</div>
<div> </div>
<div> Parallel routines</div>
<div> </div>
<div> PWSCF : 3m18.32s CPU 1m41.01s WALL</div>
</div>
<div><br>
</div>
<div>As a benchmark I am running a perovskite with 120
k-points on 30 processors (one node). There is no
(noticeable) difference if I export OMP_NUM_THREADS=1
(only MPI), so I guess I made some mistake during the
build with regard to the libraries.</div>
<div><br>
</div>
<div>Build process is as below</div>
<div><br>
</div>
<div>
<div>
<pre>module load intel19/compiler-19
module load intel19/impi-19

export FFT_LIBS="-L$MKLROOT/intel64"
export LAPACK_LIBS="-lmkl_blacs_intelmpi_lp64"
export CC=icc FC=ifort F77=ifort MPIF90=mpiifort MPICC=mpiicc

./configure --enable-parallel --with-scalapack=intel --enable-openmp</pre>
This detects BLAS_LIBS, LAPACK_LIBS, SCALAPACK_LIBS
and FFT_LIBS.</div>
</div>
<div><br>
</div>
<div>I am not experienced with benchmarking so if my
benchmark is garbage please suggest a suitable system!</div>
<div><br>
</div>
<div>Thanks in advance!</div>
<div>Chris </div>
<div><br>
</div>
-- <br>
<div dir="ltr" class="gmail_signature">
<div dir="ltr">Postdoctoral Researcher<br>
Center for Quantum Nanoscience, Institute for Basic
Science<br>
Ewha Womans University, Seoul, South Korea</div>
</div>
</div>
</div>
</div>
</div>
<br>
<pre wrap="">_______________________________________________
users mailing list
<a class="moz-txt-link-abbreviated" href="mailto:users@lists.quantum-espresso.org">users@lists.quantum-espresso.org</a>
<a class="moz-txt-link-freetext" href="https://lists.quantum-espresso.org/mailman/listinfo/users">https://lists.quantum-espresso.org/mailman/listinfo/users</a></pre>
</blockquote>
</body>
</html>