[Pw_forum] Energy variations in noncolin-constrain_total.in with OpenMP & MKL
p.giannozzi at gmail.com
Tue Jan 26 17:55:45 CET 2016
Very interesting, and excellent questions, for which unfortunately I have
no clear answer (nor has anybody else, I am afraid).
One should obtain the same numbers - within the errors due to roundoff,
though - in serial, OpenMP, MPI execution, and on different machines, and
with different compilers and mathematical libraries. In practice, there are
invariably small differences that sometimes do not completely disappear
even if one pushes convergence thresholds to very strict limits. In
addition to noncolin-constrain_total.in, another notable offender is
This could signal a small bug, but in my experience, most of those cases
can be linked to specific optimized mathematical libraries or compiler
versions. As long as we can blame somebody else, not a big problem :-)
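To illustrate the roundoff point only (this is not a diagnosis of the test in question): floating-point addition is not associative, so a sum split into a different number of partial sums, as a threaded reduction might do, can give a slightly different result. The sketch below is a toy model in plain Python. The helper names naive_sum and chunked_sum are made up for this example, and the chunking only mimics the idea of per-thread partial sums; it does not reproduce what MKL or OpenMP actually do.

```python
import random


def naive_sum(values):
    """Left-to-right accumulation in double precision, with no compensation."""
    total = 0.0
    for v in values:
        total += v
    return total


def chunked_sum(values, n_chunks):
    """Split values into n_chunks contiguous blocks, sum each block, then
    add the partial sums (a toy stand-in for a threaded reduction)."""
    size = -(-len(values) // n_chunks)  # ceiling division
    partials = [naive_sum(values[i:i + size])
                for i in range(0, len(values), size)]
    return naive_sum(partials)


# Small hand-checkable case: the exact sum is 2, but each 1.0 can be lost
# when it is added to a number of magnitude 1e16.
tiny = [1e16, 1.0, -1e16, 1.0]
one_chunk = chunked_sum(tiny, 1)
two_chunks = chunked_sum(tiny, 2)
print(one_chunk, two_chunks)

# Larger case with terms of very different magnitude (hypothetical data).
random.seed(0)
values = [random.uniform(-1.0, 1.0) * 10.0 ** random.randint(-8, 8)
          for _ in range(100_000)]
for n in (1, 2, 4, 8, 16):
    print(f"{n:2d} chunks: {chunked_sum(values, n):.17g}")
```

In an iterative calculation, differences of this kind in the last digits may be amplified or damped by later steps, which is one possible reason why they do not always disappear when thresholds are tightened.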
On Tue, Jan 26, 2016 at 12:23 PM, Nick Wilson <
nw.qeforge.5211 at family-wilson.me.uk> wrote:
> I’ve been testing the OpenMP build of Quantum Espresso 5.3.0 on our
> system using the Intel compiler and MKL and have a question about variation
> of energy with the number of OpenMP threads used.
> I ran all the plane wave tests in the test-suite directory using between 1
> and 16 OpenMP threads and they all gave consistent results apart from
> pw_noncolin/noncolin-constrain_total.in which showed variation in the
> total energy between -55.54478325 Ry and -55.54478414 Ry.
> I ran the test through the Intel Inspector tool but that didn’t show up
> any threading deadlocks or data races.
> I dropped the compiler optimisation to -O0 and added the “-fp-model
> strict” and “-fp-model source” compiler flags but that had no effect.
> I tried using some of the relevant environment variables
> (KMP_DETERMINISTIC_REDUCTION=1 and MKL_CBWR=COMPATIBLE) which also had no
> effect.
> Changing to use the internal BLAS library resolved the issue so it looks
> to be MKL-related. It’s present with both the GNU and Intel compilers.
> I dropped back to an earlier version of MKL but the effect was still
> present.
> As it was thread-related I tried linking against the sequential version of
> MKL but that didn’t help.
> So, I guess my questions are:
> Should the results always be invariant of the number of OpenMP threads?
> Is there anything unique about the noncolin-constrain_total.in test case
> which would cause it to behave differently to the rest of the tests?
> Best regards,
> Nick Wilson
> System details:
> Intel Sandy Bridge E5-2650 CPU
> CentOS Linux release 7.2.1511
> MKL from Intel compiler 16.0.0
> GNU compiler version 4.8.5
Paolo Giannozzi, Dip. Scienze Matematiche Informatiche e Fisiche,
Univ. Udine, via delle Scienze 208, 33100 Udine, Italy
Phone +39-0432-558216, fax +39-0432-558222