[Pw_forum] Looking for some guidance with failing tests

Barry Moore moore0557 at gmail.com
Mon May 15 18:52:04 CEST 2017


Hello All,

I recently compiled QE 6.1 and 5.4 with GCC 4.8.5, OpenMPI 2.0.2, and
MKL 2017, but some tests fail when run in parallel (they pass when run
serially). The problem persists if I remove MKL, switch to GCC 5.4.0, or
build with Intel + IntelMPI (2017).

Here is the output from `make run-custom-test-parallel testdir=pw_langevin`:

Using executable: /ihome/crc/build/quantumespresso/qe-gcc-openmpi-6.1/test-suite/..//test-suite/run-pw.sh.
Test id: 110517.
Benchmark: 6.1.

pw_langevin - langevin.in: **FAILED**.
el1
    ERROR: absolute error 5.21e-02 greater than 1.00e-02. (Test: 0.7628.
Benchmark: 0.8149.)
el1
    ERROR: absolute error 1.04e-02 greater than 1.00e-02. (Test: 0.7855.
Benchmark: 0.7959.)
el1
    ERROR: absolute error 2.95e-02 greater than 1.00e-02. (Test: 0.7629.
Benchmark: 0.7924.)
el1
    ERROR: absolute error 8.92e-02 greater than 1.00e-02. (Test: 0.7354.
Benchmark: 0.6462.)
el1
    ERROR: absolute error 3.95e-02 greater than 1.00e-02. (Test: 0.7338.
Benchmark: 0.6943.)
el1
    ERROR: absolute error 5.16e-02 greater than 1.00e-02. (Test: 0.7795.
Benchmark: 0.7279.)
el1
    ERROR: absolute error 3.84e-01 greater than 1.00e-02. (Test: 0.7775.
Benchmark: 0.3932.)
el1
    ERROR: absolute error 3.56e-01 greater than 1.00e-02. (Test: 0.7699.
Benchmark: 0.4142.)
el1
    ERROR: absolute error 3.93e-02 greater than 1.00e-02. (Test: 0.7495.
Benchmark: 0.7888.)
e1
    ERROR: absolute error 7.13e-04 greater than 1.00e-06. (Test:
-2.414157.  Benchmark: -2.413444.)
eh1
    ERROR: absolute error 1.72e-01 greater than 1.00e-02. (Test: -10.6793.
Benchmark: -10.507.)
eh1
    ERROR: absolute error 3.52e-02 greater than 1.00e-02. (Test: -10.604.
Benchmark: -10.5688.)
eh1
    ERROR: absolute error 9.85e-02 greater than 1.00e-02. (Test: -10.6789.
Benchmark: -10.5804.)
eh1
    ERROR: absolute error 3.14e-01 greater than 1.00e-02. (Test: -10.7728.
Benchmark: -11.0869.)
eh1
    ERROR: absolute error 3.84e-01 greater than 1.00e-02. (Test: -10.7792.
Benchmark: -10.3947.)
eh1
    ERROR: absolute error 1.72e-01 greater than 1.00e-02. (Test: -10.6241.
Benchmark: -10.7964.)
eh1
    ERROR: absolute error 4.06e-01 greater than 1.00e-02. (Test: -10.6304.
Benchmark: -10.2247.)
eh1
    ERROR: absolute error 4.19e-01 greater than 1.00e-02. (Test: -10.6557.
Benchmark: -10.2366.)
eh1
    ERROR: absolute error 2.73e-01 greater than 1.00e-02. (Test: -10.7245.
Benchmark: -10.4515.)

pw_langevin - langevin_smc.in: **FAILED**.
el1
    ERROR: absolute error 5.21e-02 greater than 1.00e-02. (Test: 0.7628.
Benchmark: 0.8149.)
el1
    ERROR: absolute error 2.81e-02 greater than 1.00e-02. (Test: 0.6296.
Benchmark: 0.6577.)
el1
    ERROR: absolute error 5.01e-02 greater than 1.00e-02. (Test: 0.8071.
Benchmark: 0.757.)
el1
    ERROR: absolute error 1.89e-01 greater than 1.00e-02. (Test: 0.6871.
Benchmark: 0.4977.)
el1
    ERROR: absolute error 1.45e-01 greater than 1.00e-02. (Test: 0.7734.
Benchmark: 0.6287.)
el1
    ERROR: absolute error 1.96e-02 greater than 1.00e-02. (Test: 0.762.
Benchmark: 0.7816.)
el1
    ERROR: absolute error 5.69e-02 greater than 1.00e-02. (Test: 0.7798.
Benchmark: 0.7229.)
el1
    ERROR: absolute error 1.45e-01 greater than 1.00e-02. (Test: 0.6516.
Benchmark: 0.7962.)
el1
    ERROR: absolute error 3.92e-02 greater than 1.00e-02. (Test: 0.7488.
Benchmark: 0.788.)
e1
    ERROR: absolute error 6.76e-04 greater than 1.00e-06. (Test:
-2.414139.  Benchmark: -2.414815.)
eh1
    ERROR: absolute error 1.72e-01 greater than 1.00e-02. (Test: -10.6793.
Benchmark: -10.507.)
eh1
    ERROR: absolute error 1.58e-02 greater than 1.00e-02. (Test: -10.3584.
Benchmark: -10.3742.)
eh1
    ERROR: absolute error 1.66e-01 greater than 1.00e-02. (Test: -10.5316.
Benchmark: -10.6974.)
eh1
    ERROR: absolute error 6.57e-01 greater than 1.00e-02. (Test: -10.9399.
Benchmark: -10.2825.)
eh1
    ERROR: absolute error 2.87e-01 greater than 1.00e-02. (Test: -10.6448.
Benchmark: -10.3576.)
eh1
    ERROR: absolute error 6.69e-02 greater than 1.00e-02. (Test: -10.6823.
Benchmark: -10.6154.)
eh1
    ERROR: absolute error 1.93e-01 greater than 1.00e-02. (Test: -10.6229.
Benchmark: -10.8154.)
eh1
    ERROR: absolute error 1.98e-01 greater than 1.00e-02. (Test: -10.371.
Benchmark: -10.5688.)
eh1
    ERROR: absolute error 1.30e-01 greater than 1.00e-02. (Test: -10.725.
Benchmark: -10.595.)

All done. ERROR: only 0 out of 2 tests passed.
Failed tests in:
/ihome/crc/build/quantumespresso/qe-gcc-openmpi-6.1/test-suite/pw_langevin/

And here is the output from `make run-custom-test-serial testdir=pw_langevin`:

Using executable: /ihome/crc/build/quantumespresso/qe-gcc-openmpi-6.1/test-suite/..//test-suite/run-pw.sh.
Test id: 110517-1.
Benchmark: 6.1.

pw_langevin - langevin.in: Passed.

pw_langevin - langevin_smc.in: Passed.

All done. 2 out of 2 tests passed.
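
To summarize the parallel failures per quantity, a minimal Python sketch
could scan a saved copy of the output. This is my own hypothetical helper
(`worst_errors`), not part of the QE test-suite, and it assumes each label
such as el1, e1, or eh1 sits on its own line before its ERROR line, as in
the parallel output above:

```python
import re
from collections import defaultdict

# Matches the "absolute error X greater than Y" part of an ERROR line.
ERR = re.compile(
    r"absolute error ([0-9.]+e[+-][0-9]+) greater than ([0-9.]+e[+-][0-9]+)"
)

def worst_errors(log_text):
    """Return the largest reported absolute error for each label."""
    worst = defaultdict(float)
    label = None
    for line in log_text.splitlines():
        stripped = line.strip()
        match = ERR.search(stripped)
        if match:
            if label is not None:
                worst[label] = max(worst[label], float(match.group(1)))
        elif stripped and " " not in stripped:
            # Assumption: a single token on its own line is a label.
            label = stripped
    return dict(worst)

# A few entries copied from the parallel output above.
sample = """el1
    ERROR: absolute error 5.21e-02 greater than 1.00e-02. (Test: 0.7628.
Benchmark: 0.8149.)
el1
    ERROR: absolute error 3.84e-01 greater than 1.00e-02. (Test: 0.7775.
Benchmark: 0.3932.)
e1
    ERROR: absolute error 7.13e-04 greater than 1.00e-06. (Test:
-2.414157.  Benchmark: -2.413444.)
"""

for name, err in worst_errors(sample).items():
    print(f"{name}: worst absolute error {err:.2e}")
```

The wrapped "Benchmark: ..." continuation lines contain spaces, so the
sketch does not mistake them for labels.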

Is this expected? Any suggestions? There are about 30 failing tests in total.
All of them are "very close" to the benchmark values, but outside the
tolerances that the test-suite allows.
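
As a sanity check on the reported numbers, here is a minimal Python sketch
that recomputes the absolute error for a few Test/Benchmark pairs copied
from the parallel output above. It assumes the test-suite simply compares
|test - benchmark| against the tolerance, which is what the messages
suggest; the helper names are mine and hypothetical, not from QE:

```python
# (label, test value, benchmark value, tolerance), copied from the output above.
samples = [
    ("el1", 0.7628, 0.8149, 1.00e-02),
    ("e1", -2.414157, -2.413444, 1.00e-06),
    ("eh1", -10.6793, -10.507, 1.00e-02),
]

def abs_error(test, benchmark):
    """Absolute difference between a test value and its benchmark."""
    return abs(test - benchmark)

def check(entries):
    """Return (label, absolute error, within_tolerance) for each entry."""
    rows = []
    for name, test, bench, tol in entries:
        err = abs_error(test, bench)
        rows.append((name, err, err <= tol))
    return rows

for name, err, ok in check(samples):
    status = "within" if ok else "outside"
    print(f"{name}: absolute error {err:.2e} ({status} tolerance)")
```

For these three entries the recomputed errors agree with the values in the
ERROR lines, and each one exceeds its tolerance.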

Thanks,

Barry

P.S.

I have never run QE in my academic career. I am installing it at our center
for another user and I don't want to release a code with failing tests.

-- 
Barry E Moore II, PhD
E-mail: bmooreii at pitt.edu

Assistant Research Professor
Center for Simulation and Modeling
University of Pittsburgh
Pittsburgh, PA 15260