[QE-users] Large time lag post software upgradation in HPC system

Niharika Joshi nh.joshi at ncl.res.in
Tue Oct 15 13:20:20 CEST 2024


Dear Pietro, 
Thanks a lot for your reply. 
I will try out your suggestions and check. 

With regards, 
Niharika 


From: pdelugas at sissa.it 
To: users at lists.quantum-espresso.org 
Sent: Tuesday, October 15, 2024 2:02:33 PM 
Subject: Re: [QE-users] Large time lag post software upgradation in HPC system 

Hello,
This is strange, because QE v7.3 is much faster than 6.7, especially on GPUs. The slowdown most likely has to do with some fine-tuning in how the cluster is used.
You should ask the system managers of your cluster for help.
Just trying to guess:


    1. The problem might be hyperthreading, so make sure that OMP_NUM_THREADS is set to 1.

    2. Try to see whether the GPU-aware MPI communications are working; to test this, compile with --with-cuda-mpi=no.
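
As an illustration of suggestion 1, a launch script could look like the sketch below. It assumes one MPI rank per GPU on a single 8-GPU node, which is a common setup for GPU builds but is not confirmed anywhere in this thread. The input file name pw.in and the run helper are hypothetical. By default the script only prints the command, and it executes it when DRY_RUN=0.

```shell
#!/bin/sh
# Sketch: single-node pw.x launch with one OpenMP thread per MPI rank.
# Assumptions (not from the thread): one MPI rank per GPU, input file pw.in.
NGPU=8                     # the node described in the thread has 8 A100 GPUs
INPUT=pw.in                # hypothetical input file name
export OMP_NUM_THREADS=1   # suggestion 1: do not oversubscribe hyper-threaded cores

# Print each command; execute it only when DRY_RUN=0.
run() {
    echo "+ $*"
    [ "${DRY_RUN:-1}" = "1" ] || "$@"
}

run mpirun -np "$NGPU" pw.x -input "$INPUT"
```

On a cluster with a batch scheduler, the same export line would go into the job script before the launch command.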


Hope it helps.
Best regards,
Pietro 

From: users <users-bounces at lists.quantum-espresso.org> on behalf of Niharika Joshi <nh.joshi at ncl.res.in> 
Sent: Tuesday, October 15, 2024 09:34 
To: Quantum ESPRESSO users Forum <users at lists.quantum-espresso.org> 
Subject: [QE-users] Large time lag post software upgradation in HPC system 
Dear QE users, 
I have been using an HPC resource for more than a year with QE (6.7Max GPU) without any issue. My present research problem focuses on studying methane and carbon dioxide adsorption on spinel surfaces. The system is large, with more than 380 atoms and ~3500 electrons. Normally, 2-3 ionic cycles (with 60-70 iterations) complete within a day. However, there was recently a software upgrade on the computing system, after which I have observed a huge slowdown in my calculations. Currently, only a few iterations are performed in 24 hours.

Please find below two tables listing the hardware specifications and the software upgrade details of the computing system.

Component                Specification
-----------------------  ----------------------------------------------------------------
CPU                      AMD EPYC 7742 64C 2.25GHz
CPU cores                128 cores (dual socket, 64 cores each); 256 with hyper-threading
L3 cache                 256 MB
RAM                      1 TB
GPU                      NVIDIA A100-SXM4
GPU memory               40 GB
GPUs per node            8
Storage                  10.5 PiB PFS-based storage
Networking               Mellanox ConnectX-6 VPI (InfiniBand HDR)


Software                 Upgrade
-----------------------  ----------------------------------------------------------------
OS                       Ubuntu 20.04.02 (DGX OS 5.0.5) to Ubuntu 22.04.04 (DGX OS 6.3.0)
Kernel                   5.4.0-80-generic to 5.15.0-1062-nvidia
CUDA                     10.1 to 12.4 (lower versions are also available)
NVIDIA driver version    450.142.00 to 550.90.07

After the software upgrade, QE 7.3 was installed in the following manner:
Step 1. Source the HPC SDK environment:
    source /opt/hpc-sdk-23.9/env.sh
Step 2. Set up the environment:
    ./configure --prefix=installation-location --with-cuda=$CUDA_ROOT --with-cuda-runtime=12.2 --with-cuda-cc=80 --enable-openmp --with-scalapack=no --with-cuda-mpi=yes
Step 3. Compile the source code:
    make all -j8
Step 4. Install the compiled binaries:
    make install
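
For the test suggested in the reply (a build without GPU-aware MPI), a rebuild could be sketched as below. It reuses the configure options listed in the steps above and changes only --with-cuda-mpi to no. The install prefix and the run helper are hypothetical, and CUDA_ROOT is assumed to be set by the HPC SDK environment from Step 1. By default the script only prints the commands, and it executes them when DRY_RUN=0.

```shell
#!/bin/sh
# Dry-run sketch of a rebuild with GPU-aware MPI disabled.
# Assumptions: PREFIX is a hypothetical install location, and the HPC SDK
# environment (Step 1) has already been sourced so that CUDA_ROOT is set.
PREFIX="${PREFIX:-$HOME/qe-7.3-no-cuda-mpi}"
CUDA_MPI=no   # the only option changed with respect to the original build

# Print each command; execute it only when DRY_RUN=0.
run() {
    echo "+ $*"
    [ "${DRY_RUN:-1}" = "1" ] || "$@"
}

run ./configure --prefix="$PREFIX" --with-cuda="${CUDA_ROOT:-}" \
    --with-cuda-runtime=12.2 --with-cuda-cc=80 --enable-openmp \
    --with-scalapack=no --with-cuda-mpi="$CUDA_MPI"
run make all -j8
run make install
```

Installing into a separate prefix keeps the original build available, so the timings of the two builds can be compared on the same input.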

Kindly suggest a solution to this problem. Any advice or suggestion at this point would be very helpful to me.

With best regards, 
Niharika Joshi, 
National Post Doctoral Fellow, 
CSIR National Chemical Laboratory, Pune, 
Maharashtra-411008, India. 




_______________________________________________ 
The Quantum ESPRESSO community stands by the Ukrainian 
people and expresses its concerns about the devastating 
effects that the Russian military offensive has on their 
country and on the free and peaceful scientific, cultural, 
and economic cooperation amongst peoples 
_______________________________________________ 
Quantum ESPRESSO is supported by MaX (www.max-centre.eu) 
users mailing list users at lists.quantum-espresso.org 
https://lists.quantum-espresso.org/mailman/listinfo/users 