[QE-users] [QE-GPU] High GPU oversubscription detected

Paolo Giannozzi paolo.giannozzi at uniud.it
Wed Nov 29 15:53:12 CET 2023


On 11/27/23 11:32, Yin-Ying Ting wrote:

> Based on the *environment.f90* file, this message is triggered when 
> /nproc > ndev * nnode * 2/. If I understand correctly, I have nproc 
> (Number of parallel processes) = 4, ndev (Number of GPU Devices per Node) = 4 
> and nnode (Number of Nodes) = 1. This condition seems to be false (4 > 8). 
> Despite this, the message still appears. All 4 GPUs were active during 
> the run.

Funny. Even funnier, the number of GPUs actually used does not seem to 
be written anywhere in the output.

Add a line printing nproc, ndev, nnode just before the warning is 
issued, recompile and re-run. At least one of those numbers is not 
what you expect. Computers are not among the most reliable machines, but 
they should be able to work out which is larger, 4 or 8.
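
As a rough illustration of the check quoted above, here is a minimal Python sketch of the condition nproc > ndev * nnode * 2. It is not the code in environment.f90: the function name is hypothetical, and the second set of values (ndev = 1) is an assumed example of a number "not being what you expect", not a diagnosis of this run.

```python
def oversubscription_warning(nproc, ndev, nnode):
    """Evaluate the quoted condition nproc > ndev * nnode * 2.

    Hypothetical helper for illustration only; it prints the three
    values first, as suggested above, then returns the test result.
    """
    limit = ndev * nnode * 2
    print(f"nproc={nproc} ndev={ndev} nnode={nnode} limit={limit}")
    return nproc > limit


# Values the user expects: 4 > 8 is false, so no warning.
expected = oversubscription_warning(nproc=4, ndev=4, nnode=1)

# Assumed example: if ndev were actually 1, then 4 > 2 is true
# and the warning would be issued.
hypothetical = oversubscription_warning(nproc=4, ndev=1, nnode=1)
```

With the expected values the condition cannot hold, so printing the actual values at run time is the quickest way to see which one differs.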

Paolo
-- 
Paolo Giannozzi, DMIF, Univ. Udine, Italy
*** AVAILABLE POST-DOC POSITION:
*** https://physicslab.uniud.it/persone/paolo-giannozzi/advert

