[QE-users] Sub optimal performance on 32 core AMD machine

Michal Husak Michal.Husak at vscht.cz
Mon Nov 16 15:19:04 CET 2020


I have purchased a new PC with 2x 16-core AMD EPYC processors, i.e. 64
logical cores with hyper-threading ...
I was hoping my QM programs (Quantum Espresso, CASTEP) would run faster on the
new system than on my old 4-core Intel i7 machine (8 years old) ...

To my great surprise, the opposite is almost true :-(.
My main task is SCF and geometry optimization of medium-sized organic
molecular crystals (about 100 C, H, N atoms per unit cell) ...

I have tried changing the OpenMPI/OpenMP setup ...
I have tried the secret MKL_DEBUG_CPU_TYPE=5 parameter
(meant to work around the slow run of Intel MKL compiled code on AMD) ...
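
A minimal sketch of the kind of run configuration described above, so that the MPI rank count, the OpenMP thread count and the MKL variable are set explicitly for each test. The helper name build_run and the input file name scf.in are hypothetical and only for illustration; the function just builds the environment and the command line and does not start anything.

```python
import os


def build_run(n_mpi, n_omp, input_file="scf.in", mkl_amd_workaround=True):
    """Return (env, argv) for a hybrid MPI/OpenMP pw.x run.

    build_run and the default input file name are hypothetical; nothing is
    executed here, the caller decides how to launch the command.
    """
    env = dict(os.environ)
    # Number of OpenMP threads started by each MPI rank.
    env["OMP_NUM_THREADS"] = str(n_omp)
    if mkl_amd_workaround:
        # The MKL setting mentioned in the post.
        env["MKL_DEBUG_CPU_TYPE"] = "5"
    else:
        env.pop("MKL_DEBUG_CPU_TYPE", None)
    # n_mpi ranks, each reading the same pw.x input file.
    argv = ["mpirun", "-np", str(n_mpi), "pw.x", "-in", input_file]
    return env, argv
```

The pair can then be passed to subprocess.run(argv, env=env, check=True). Keeping n_mpi * n_omp at or below the number of physical cores (32 here) avoids oversubscribing the machine while comparing setups.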

Nothing helps; the best speed is obtained when I use only 4 cores
(OpenMPI or OpenMP, the results are similar) ...
Using 16 or 32 cores gives almost no benefit ...
The CPU load for runs on 1/4/8/16/32 cores corresponds to the number of CPUs
set, so they do try to do something ...
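
To put numbers on a scaling test like the one above, the wall time of each run can be turned into a speedup and a parallel efficiency. This is only a sketch: the function name scaling_table is hypothetical, and the wall times have to come from the actual runs.

```python
def scaling_table(wall_times):
    """Compute speedup and parallel efficiency from measured wall times.

    wall_times maps a core count to the wall time in seconds of the same
    calculation run on that many cores. The run with the fewest cores is
    used as the reference. Returns a list of
    (cores, speedup, efficiency) tuples sorted by core count.
    """
    base_cores = min(wall_times)
    base_time = wall_times[base_cores]
    rows = []
    for cores in sorted(wall_times):
        # How many times faster than the reference run.
        speedup = base_time / wall_times[cores]
        # 1.0 means ideal scaling relative to the reference run.
        efficiency = speedup * base_cores / cores
        rows.append((cores, speedup, efficiency))
    return rows
```

An efficiency that drops sharply between 4 and 8 cores while the CPU load stays at 100% would suggest the extra cores are waiting on something other than arithmetic.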

Any idea what I should check or try to optimize?

Maybe the bottleneck is memory access, not CPU power (I have 128
GB of RAM, almost unused)?
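
One rough way to test the memory access hypothesis is to check whether the total copy bandwidth still grows when more processes copy large buffers at the same time. The sketch below uses only the Python standard library; the function names are hypothetical, the workers are not synchronized, and the numbers are a crude indication only, not a substitute for a proper benchmark such as STREAM.

```python
import time
from multiprocessing import Pool


def copy_bandwidth(size_mb=256, repeats=5):
    """Best observed copy bandwidth of one process, in GB/s."""
    n = size_mb * 1024 * 1024
    src = bytearray(n)
    dst = bytearray(n)
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        dst[:] = src  # bulk copy done in C, so Python overhead is small
        best = min(best, time.perf_counter() - start)
    # Each copy reads n bytes and writes n bytes.
    return 2 * n / best / 1e9


def aggregate_bandwidth(nproc, size_mb=256):
    """Approximate total bandwidth with nproc processes copying at once."""
    with Pool(nproc) as pool:
        rates = pool.map(copy_bandwidth, [size_mb] * nproc)
    # Workers overlap only roughly in time, so the sum is an estimate.
    return sum(rates)
```

Calling aggregate_bandwidth(n) for n = 1, 4, 8, 16, 32 (inside an if __name__ == "__main__": guard) and seeing the total level off after a few processes would be consistent with a memory bandwidth limit rather than a lack of CPU power.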

Michal Husak

UCT Prague





