[QE-users] [SPAM] Re: GPU version gives different result from CPU version

Shimin ZHANG szhang943 at wisc.edu
Thu Aug 1 19:11:01 CEST 2024


Dear Omar and Ivan,

I tried several of the setups you suggested. Setting startingwfc='atomic' made no difference, so the starting wavefunctions are not the problem. However, running the CPU calculation with the same parallel settings as the GPU calculation gives the same result as the GPU.

In all the other CPU calculations I used 128 cores with -nk 1 -nb 8 -nd 1. In the GPU calculation I used 8 GPUs with -nk 1 -nb 8. After I changed the CPU run to 8 cores with -nk 1 -nb 8 -nd 1, the result became the same as the GPU result.
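
For reference, the three parallel setups described above can be written out as command lines. This is only a sketch: the launcher (mpirun), the executable name (pw.x), the input file name (pw.in) and the choice of one MPI rank per GPU are assumptions and are not stated in this message. Only the -nk, -nb and -nd values come from the runs themselves.

```python
import shlex

def pw_command(n_ranks, nk, nb, nd=None, executable="pw.x", input_file="pw.in"):
    """Build one launch command. Launcher, executable and input file are placeholders."""
    cmd = ["mpirun", "-np", str(n_ranks), executable, "-nk", str(nk), "-nb", str(nb)]
    if nd is not None:
        # -nd is only added when it was given explicitly for that run
        cmd += ["-nd", str(nd)]
    cmd += ["-i", input_file]
    return shlex.join(cmd)

runs = {
    # original CPU runs: 128 cores
    "cpu_128": pw_command(128, nk=1, nb=8, nd=1),
    # CPU run matched to the GPU layout: 8 cores
    "cpu_sameparallel": pw_command(8, nk=1, nb=8, nd=1),
    # GPU run: assumed 8 MPI ranks, one per GPU; no -nd was given
    "gpu": pw_command(8, nk=1, nb=8),
}

for name, cmd in runs.items():
    print(f"{name:18s} {cmd}")
```

Written this way, the only difference between cpu_128 and cpu_sameparallel is the number of MPI ranks, which is the variable that changed the result.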

My question is: if the parallel settings make such a difference, is this a problem with the compilation?

Best
Shimin

# Run                              VBM       CBM      GAP
_________________CPU_____________________
60Ry_cpu                        8.4785   11.5730   3.0945
60Ry_cpu_k111                   8.4870   11.5774   3.0904
60Ry_cpu_sameparallel           8.6273   11.5295   2.9022
60Ry_cpu_startingwfc_atomic     8.4785   11.5730   3.0945
60Ry_NERSC7.0cpu                8.4298   11.5605   3.1307
_________________GPU_____________________
60Ry_gpu                        8.6273   11.5295   2.9022
60Ry_gpu_k111                   8.5993   11.5521   2.9528
60Ry_gpu_smallermixing          8.6273   11.5295   2.9022
60Ry_gpu_startingwfc_atomic     8.6273   11.5295   2.9022
60Ry_NERSC7.0gpu               10.2121   10.7435   0.5314  : broken module
60Ry_othergpu                   8.6273   11.5295   2.9022

70Ry_gpu                        8.6350   11.5408   2.9058
80Ry_gpu                        8.6301   11.5394   2.9093
________________GPU Unit cell__________
80Ry_gpu_uc                     8.4537   11.5622   3.1085
60Ry_gpu_uc                     8.4585   11.5537   3.0952
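
The numbers in the table above can be cross-checked with a short script. The sketch below copies the 60 Ry supercell values by hand (the dictionary layout and variable names are illustrative only), checks that GAP = CBM - VBM for every row, and groups the runs that give identical band edges.

```python
# (VBM, CBM, GAP) copied from the table above; 60 Ry supercell runs only
results = {
    "60Ry_cpu":                    (8.4785, 11.5730, 3.0945),
    "60Ry_cpu_k111":               (8.4870, 11.5774, 3.0904),
    "60Ry_cpu_sameparallel":       (8.6273, 11.5295, 2.9022),
    "60Ry_cpu_startingwfc_atomic": (8.4785, 11.5730, 3.0945),
    "60Ry_NERSC7.0cpu":            (8.4298, 11.5605, 3.1307),
    "60Ry_gpu":                    (8.6273, 11.5295, 2.9022),
    "60Ry_gpu_k111":               (8.5993, 11.5521, 2.9528),
    "60Ry_gpu_smallermixing":      (8.6273, 11.5295, 2.9022),
    "60Ry_gpu_startingwfc_atomic": (8.6273, 11.5295, 2.9022),
    "60Ry_NERSC7.0gpu":            (10.2121, 10.7435, 0.5314),  # flagged "broken module"
    "60Ry_othergpu":               (8.6273, 11.5295, 2.9022),
}

# Consistency check: the tabulated gap should equal CBM - VBM
for name, (vbm, cbm, gap) in results.items():
    assert abs((cbm - vbm) - gap) < 5e-4, name

# Group runs that report exactly the same band edges
groups = {}
for name, (vbm, cbm, _gap) in results.items():
    groups.setdefault((vbm, cbm), []).append(name)

for edges, names in groups.items():
    if len(names) > 1:
        print(edges, names)

# Gap difference between the 128-core CPU run and the GPU run
gap_shift = results["60Ry_cpu"][2] - results["60Ry_gpu"][2]
print(f"gap(60Ry_cpu) - gap(60Ry_gpu) = {gap_shift:.4f}")
```

The grouping shows that 60Ry_cpu_sameparallel falls in the same group as the GPU runs, while 60Ry_cpu and 60Ry_cpu_startingwfc_atomic form a separate group.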



On Jul 30, 2024, at 6:06 AM, Ivan Carnimeo <icarnimeo at sissa.it> wrote:

Dear Shimin,
Unfortunately, the test is too large for me to run as well.

Small differences in band gap and total energy can also be due to differences in the MPI parallelism used in the CPU and GPU calculations.

If you want to understand more, you can launch the CPU calculation with exactly the same parallelization options as the GPU one (number of MPI ranks, OMP threads, pools, bands, etc.). This will tell you how much the error depends on the different CPU/GPU libraries or on the different MPI distribution of the data.

You can also reduce the randomness of the wavefunction initialization with startingwfc='atomic'.

Regards,
IC

