<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=Windows-1252">
<style type="text/css" style="display:none;"> P {margin-top:0;margin-bottom:0;} </style>
</head>
<body dir="ltr">
<div class="elementToProof" style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
Dear Julien,</div>
<div class="elementToProof" style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
<br>
</div>
<div class="elementToProof" style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
To solve both problems, I would define a new atomic type for the pair of atoms of interest and apply the inter-site V between these two "Hubbard atoms". All other atoms will then be treated as non-Hubbard, so the bottleneck in alloc_neigh should disappear. </div>
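<div class="elementToProof" style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
<br>
</div>
<div class="elementToProof" style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
As a minimal sketch of what this could look like with the HUBBARD card syntax of QE 7.1 and later: the species label O1, the pseudopotential file name, the 2p manifold, the ortho-atomic projectors, the atom indices 17 and 42, and the value of 1.0 eV are all placeholders assumed for illustration and are not taken from your input.</div>
<pre style="font-family: Consolas, 'Courier New', monospace; font-size: 11pt; color: rgb(0, 0, 0);">
! Assumption: the dimer is formed by two atoms of a species labelled O.
! ntyp in the SYSTEM namelist must be increased by one for the extra species.
ATOMIC_SPECIES
! ordinary atoms, no Hubbard correction (file name is a placeholder)
O    15.999  O.upf
! the two dimer atoms only, same pseudopotential
O1   15.999  O.upf

! In ATOMIC_POSITIONS, relabel only the two dimer atoms from O to O1.

HUBBARD (ortho-atomic)
! V  species-manifold  species-manifold  atom index I  atom index J  value (eV)
V O1-2p O1-2p 17 42 1.0
</pre>
<div class="elementToProof" style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
Since only O1 appears in the HUBBARD card, the remaining atoms of the original species should no longer be treated as Hubbard atoms.</div>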
<div class="elementToProof" style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
<br>
</div>
<div class="elementToProof" style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
Greetings,</div>
<div class="elementToProof" style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
Iurii</div>
<div class="elementToProof" style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
<br>
</div>
<div id="Signature" class="elementToProof">
<div style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(102, 102, 102);">
----------------------------------------------------------</div>
<div style="font-family: Cambria, Georgia, serif; font-size: 12pt; color: rgb(102, 102, 102);">
Dr. Iurii TIMROV</div>
<div style="font-family: Cambria, Georgia, serif; font-size: 12pt; color: rgb(102, 102, 102);">
Tenure-track scientist</div>
<div style="font-family: Cambria, Georgia, serif; font-size: 12pt; color: rgb(102, 102, 102);">
Laboratory for Materials Simulations (LMS)</div>
<div style="font-family: Cambria, Georgia, serif; font-size: 12pt; color: rgb(102, 102, 102);">
Paul Scherrer Institut (PSI)</div>
<div style="font-family: Cambria, Georgia, serif; font-size: 12pt; color: rgb(102, 102, 102);">
CH-5232 Villigen, Switzerland</div>
<div style="font-family: Cambria, Georgia, serif; font-size: 12pt; color: rgb(102, 102, 102);">
+41 56 310 62 14</div>
<div style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
<a href="https://www.psi.ch/en/lms/people/iurii-timrov" target="_blank" rel="noopener noreferrer" data-auth="NotApplicable" data-linkindex="0">https://www.psi.ch/en/lms/people/iurii-timrov</a></div>
</div>
<div id="appendonsend"></div>
<hr style="display:inline-block;width:98%" tabindex="-1">
<div id="divRplyFwdMsg" dir="ltr"><font face="Calibri, sans-serif" style="font-size:11pt" color="#000000"><b>From:</b> users &lt;users-bounces@lists.quantum-espresso.org&gt; on behalf of julien_barbaud@sjtu.edu.cn &lt;julien_barbaud@sjtu.edu.cn&gt;<br>
<b>Sent:</b> Monday, May 5, 2025 01:10<br>
<b>To:</b> users@lists.quantum-espresso.org &lt;users@lists.quantum-espresso.org&gt;<br>
<b>Subject:</b> [QE-users] DFT+U+V neighbour allocation extremely slow and not parallelized</font>
<div> </div>
</div>
<style>
<!--
@font-face
{font-family:"Cambria Math"}
@font-face
{font-family:DengXian}
@font-face
{font-family:Calibri}
@font-face
{font-family:DengXian}
p.x_MsoNormal, li.x_MsoNormal, div.x_MsoNormal
{margin-top:0in;
margin-right:0in;
margin-bottom:8.0pt;
margin-left:0in;
line-height:106%;
font-size:11.0pt;
font-family:"Calibri",sans-serif}
span.x_EmailStyle17
{font-family:"Calibri",sans-serif;
color:windowtext}
.x_MsoChpDefault
{font-family:"Calibri",sans-serif}
@page WordSection1
{margin:1.0in 1.25in 1.0in 1.25in}
div.x_WordSection1
{}
-->
</style>
<div lang="EN-US" link="#0563C1" vlink="#954F72" style="word-wrap:break-word">
<div class="x_WordSection1">
<p class="x_MsoNormal">Dear QE users,</p>
<p class="x_MsoNormal"> </p>
<p class="x_MsoNormal">I am currently trying to run calculations on a 324-atom crystal system using DFT+U+V (I’m running a recompiled version of QE 7.2 with an increased natx parameter). More specifically, I am trying to use only the V parameter on the combination
of two specific atomic orbitals which are supposed to form a polaronic dimer (the Hubbard potential is helping to localize the polaronic charge on the dimer after tuning). This worked well on smaller systems.</p>
<p class="x_MsoNormal">The issue that I’m facing with the large system is that the calculation takes very long to initialize (~20 000 seconds), while the iterations themselves are relatively fast (~3000 seconds). It seems strange that the preliminary calculations
take so much longer than the actual iterations. On smaller systems (96 atoms), I did not observe that trend: the bottleneck was the iterations, as expected, and there was barely any initialization time.</p>
<p class="x_MsoNormal">The most concerning aspect is that this bottleneck does not seem to be effectively distributed in parallel: no matter how many nodes I use, the pre-iteration calculations always take about 20 000 seconds (meanwhile, the
iteration time does decrease when using more nodes).</p>
<p class="x_MsoNormal">The routine taking up all that time is the Hubbard routine “alloc_neigh”. I thought it might be due to memory issues, but looking at the memory usage of the nodes during the job, it seems to have used only about 30% of the memory
according to the Slurm “seff” utility. I have also experimented with different IO settings to try to reduce memory usage, without success.</p>
<p class="x_MsoNormal"> </p>
<p class="x_MsoNormal">Is there a way to speed up that part of the calculation, or at least to distribute it over several nodes? Am I doing something wrong in my input?
</p>
<p class="x_MsoNormal">Additionally, it seems that the program is calculating Hubbard projectors for every single atom of the species, even though I am only applying a V parameter to two hand-picked atoms, which seems very wasteful. Is there a way to
force the program to drop the Hubbard calculations on the atoms of the same species which do not receive a V value?</p>
<p class="x_MsoNormal"> </p>
<p class="x_MsoNormal">I have attached the input and output files of an example job (the charge has been set to 0 in that particular one, but the same problem happens when adding the polaronic charge).</p>
<p class="x_MsoNormal"> </p>
<p class="x_MsoNormal">Thanks in advance!</p>
<p class="x_MsoNormal">Julien</p>
<p class="x_MsoNormal"> </p>
</div>
</div>
</body>
</html>