[QE-users] Sub optimal performance on 32 core AMD machine
Pamela Whitfield
whitfieldps1 at gmail.com
Tue Nov 17 22:01:53 CET 2020
Paolo
Thanks. That makes sense from what I've seen.
DFTD3 runs with -np 20 don't actually drop down to a single core at
any point, so I assume some other processes are keeping the CPU busy.
Sitting and watching a single core chug along in the GPU version after
blazing through everything else is a little depressing! :-(
A good thing that DFTD2 is good enough for many problems :-)
Best regards
Pam
On Tue, Nov 17, 2020 at 9:20 PM Pamela Whitfield <whitfieldps1 at gmail.com> wrote:
> Hyperthreading is necessary for the GPU version which works fine for DFTD2
> but not for DFTD3 - hence I use a CPU-only MPI-only version for the more
> complex dispersion corrections.
>
If I remember correctly, there is no parallelization at all in DFT-D3
(while DFT-D2 is parallelized with both MPI and OpenMP)
Paolo
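
As a rough illustration of why an unparallelized DFT-D3 step can dominate the wall time once everything else is fast, here is a minimal sketch of Amdahl's law in Python. The serial fractions below are hypothetical values chosen for illustration, not measurements from Quantum ESPRESSO, and the function name is an assumption of this sketch.

```python
def amdahl_speedup(serial_fraction: float, workers: int) -> float:
    """Upper bound on speedup when a fixed fraction of the runtime stays serial.

    serial_fraction: share of the single-core runtime that cannot be
                     parallelized (hypothetical here, e.g. a serial
                     dispersion-correction step).
    workers:         number of MPI processes or threads applied to the rest.
    """
    if not 0.0 <= serial_fraction <= 1.0:
        raise ValueError("serial_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    parallel_fraction = 1.0 - serial_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


if __name__ == "__main__":
    # Hypothetical serial fractions; 20 workers mirrors the -np 20 runs above.
    for serial_fraction in (0.05, 0.20, 0.50):
        speedup = amdahl_speedup(serial_fraction, workers=20)
        print(f"serial fraction {serial_fraction:.2f}: "
              f"speedup on 20 workers <= {speedup:.2f}x")
```

The bound approaches 1 / serial_fraction no matter how many cores or GPUs are added, which is consistent with the fast parts finishing quickly and a single core then working alone on the serial part.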