[QE-users] Sub optimal performance on 32 core AMD machine

Pamela Whitfield whitfieldps1 at gmail.com
Tue Nov 17 22:01:53 CET 2020


Paolo

Thanks. That makes sense from what I've seen.

DFTD3 runs with -np 20 don't actually drop down to a single core at
any point, so I assume some other processes are keeping the CPU busy.

Sitting and watching a single core chug along in the GPU version after
blazing through everything else is a little depressing! :-(

A good thing that DFTD2 is good enough for many problems :-)

Best regards

Pam


On Tue, Nov 17, 2020 at 9:20 PM Pamela Whitfield <whitfieldps1 at
gmail.com> wrote:

> Hyperthreading is necessary for the GPU version which works fine for DFTD2
> but not for DFTD3 - hence I use a CPU-only MPI-only version for the more
> complex dispersion corrections.
>
If I remember correctly, there is no parallelization at all in DFT-D3
(while DFT-D2 is parallelized with both MPI and OpenMP)

Paolo