[QE-users] Sub optimal performance on 32 core AMD machine
Michal Husak
Michal.Husak at vscht.cz
Mon Nov 16 15:19:04 CET 2020
I have purchased a new PC with 2x 16-core AMD EPYC processors, i.e. 32
physical cores, 64 logical cores with hyper-threading ...
I was hoping my QM programs (Quantum Espresso, CASTEP) would run faster on
the new system than on my old 4-core Intel i7 machine (8 years old) ....
To my great surprise, the opposite is almost true :-(.
My main task is SCF and geometry optimization of medium-sized organic
molecular crystals (about 100 C, H, N atoms per unit cell) ...
I have tried various OpenMPI/OpenMP setups ...
I have tried the undocumented MKL_DEBUG_CPU_TYPE=5 setting
(a workaround for Intel MKL compiled code running slowly on AMD) ...
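
A minimal sketch of the kind of MPI/OpenMP combinations in question, assuming pw.x and mpirun are on the PATH. The input file name scf.in, the helper name build_run and the rank/thread counts are placeholders for illustration, not the exact commands used here. The helper only assembles the environment and command line; nothing is executed.

```python
import os


def build_run(n_mpi, n_omp, input_file="scf.in", base_env=None):
    """Return (env, cmd) for one hybrid MPI/OpenMP run of pw.x.

    input_file is a placeholder name; this function does not start anything.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["OMP_NUM_THREADS"] = str(n_omp)  # OpenMP threads per MPI rank
    env["MKL_DEBUG_CPU_TYPE"] = "5"      # the undocumented MKL setting mentioned above
    cmd = ["mpirun", "-np", str(n_mpi), "pw.x", "-in", input_file]
    return env, cmd


if __name__ == "__main__":
    # Illustrative combinations that each occupy 32 cores in total.
    for n_mpi, n_omp in [(32, 1), (16, 2), (8, 4), (4, 8)]:
        env, cmd = build_run(n_mpi, n_omp)
        print(f"OMP_NUM_THREADS={env['OMP_NUM_THREADS']}", " ".join(cmd))
```

The returned pair can be passed to subprocess.run(cmd, env=env) to actually start a run.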
Nothing helps; the best speed is obtained when I use only 4 cores
(OpenMPI or OpenMP, the results are similar) ...
Using 16 or 32 cores gives almost no benefit ...
The CPU load for runs on 1/4/8/16/32 cores corresponds to the number of
CPUs set, so the cores are at least busy doing something ...
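
One way to put numbers on the scaling described above is to compare wall times across core counts. The sketch below is only an illustration: the function name scaling_table is made up, and the timings in the example are HYPOTHETICAL placeholders, not measurements from this machine.

```python
def scaling_table(wall_times):
    """Map core count -> (speedup, efficiency), relative to the smallest core count.

    wall_times: dict of {cores: wall time in seconds}.
    """
    base_cores = min(wall_times)
    base_time = wall_times[base_cores]
    table = {}
    for cores in sorted(wall_times):
        speedup = base_time / wall_times[cores]
        # Efficiency 1.0 means perfect scaling from the baseline core count.
        efficiency = speedup * base_cores / cores
        table[cores] = (speedup, efficiency)
    return table


if __name__ == "__main__":
    # HYPOTHETICAL wall times in seconds, for illustration only.
    example = {1: 1000.0, 4: 300.0, 8: 290.0, 16: 280.0, 32: 285.0}
    for cores, (speedup, eff) in scaling_table(example).items():
        print(f"{cores:>3} cores: speedup {speedup:5.2f}, efficiency {eff:6.1%}")
```

An efficiency that collapses beyond a few cores while all cores stay fully loaded would be consistent with a bottleneck outside raw CPU power.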
Any idea what I should check or try to optimize?
Maybe the bottleneck is memory access, not CPU power (I have 128 GB
of RAM, almost all of it unused)?
Michal Husak
UCT Prague