[QE-users] [QE-GPU] Performance of the NGC Container

Tue Jul 6 19:44:12 CEST 2021

Hello (@Louis Stuber),

The QE container on NGC (https://ngc.nvidia.com/catalog/containers/hpc:quantum_espresso) appears to be running very well for us on a node with two A100 GPUs for the "AUSURF112, Gold surface (112 atoms), DEISA pw" benchmark. We see a speed-up of 8x compared with running a built-from-source executable on 80 Skylake CPU-cores (no GPUs).

The procedure we used for the above is here:
https://researchcomputing.princeton.edu/support/knowledge-base/quantum-espresso

However, for one system we see a slowdown (i.e., the code runs faster using only CPU-cores). Can you tell whether the system below should perform well using the container?

"My system is basically just two carbon dioxide molecules and doing a single point calculation on them using the PBE-D3 functional and basically just altering the distance between the two molecules in the atomic coordinates."

Can someone comment in general on when one would expect the container running on GPUs to outperform a built-from-source executable running on CPU-cores?

CUDA-aware MPI is nice. It appears that the container is configured to use the MPI libraries in the container instead of those installed for the local cluster. Is this true? Can users take advantage of their local CUDA-aware MPI libraries?

Jon