<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40"><head><meta http-equiv=Content-Type content="text/html; charset=us-ascii"><meta name=Generator content="Microsoft Word 15 (filtered medium)"><style><!--
/* Font Definitions */
@font-face
{font-family:Wingdings;
panose-1:5 0 0 0 0 0 0 0 0 0;}
@font-face
{font-family:SimSun;
panose-1:2 1 6 0 3 1 1 1 1 1;}
@font-face
{font-family:"Cambria Math";
panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
{font-family:Calibri;
panose-1:2 15 5 2 2 2 4 3 2 4;}
@font-face
{font-family:SimSun;
panose-1:2 1 6 0 3 1 1 1 1 1;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0cm;
margin-bottom:.0001pt;
font-size:11.0pt;
font-family:"Calibri","sans-serif";}
a:link, span.MsoHyperlink
{mso-style-priority:99;
color:#0563C1;
text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
{mso-style-priority:99;
color:#954F72;
text-decoration:underline;}
p.MsoListParagraph, li.MsoListParagraph, div.MsoListParagraph
{mso-style-priority:34;
margin-top:0cm;
margin-right:0cm;
margin-bottom:0cm;
margin-left:36.0pt;
margin-bottom:.0001pt;
font-size:11.0pt;
font-family:"Calibri","sans-serif";}
span.EmailStyle18
{mso-style-type:personal-compose;
font-family:"Calibri","sans-serif";
color:windowtext;}
.MsoChpDefault
{mso-style-type:export-only;
font-size:10.0pt;
font-family:"Calibri","sans-serif";}
@page WordSection1
{size:612.0pt 792.0pt;
margin:72.0pt 90.0pt 72.0pt 90.0pt;}
div.WordSection1
{page:WordSection1;}
/* List Definitions */
@list l0
{mso-list-id:1219393627;
mso-list-type:hybrid;
mso-list-template-ids:198210034 67698689 67698691 67698693 67698689 67698691 67698693 67698689 67698691 67698693;}
@list l0:level1
{mso-level-number-format:bullet;
mso-level-text:\F0B7;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-18.0pt;
font-family:Symbol;}
@list l0:level2
{mso-level-number-format:bullet;
mso-level-text:o;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-18.0pt;
font-family:"Courier New";}
@list l0:level3
{mso-level-number-format:bullet;
mso-level-text:\F0A7;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-18.0pt;
font-family:Wingdings;}
@list l0:level4
{mso-level-number-format:bullet;
mso-level-text:\F0B7;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-18.0pt;
font-family:Symbol;}
@list l0:level5
{mso-level-number-format:bullet;
mso-level-text:o;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-18.0pt;
font-family:"Courier New";}
@list l0:level6
{mso-level-number-format:bullet;
mso-level-text:\F0A7;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-18.0pt;
font-family:Wingdings;}
@list l0:level7
{mso-level-number-format:bullet;
mso-level-text:\F0B7;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-18.0pt;
font-family:Symbol;}
@list l0:level8
{mso-level-number-format:bullet;
mso-level-text:o;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-18.0pt;
font-family:"Courier New";}
@list l0:level9
{mso-level-number-format:bullet;
mso-level-text:\F0A7;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-18.0pt;
font-family:Wingdings;}
ol
{margin-bottom:0cm;}
ul
{margin-bottom:0cm;}
--></style><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]--></head><body lang=EN-US link="#0563C1" vlink="#954F72"><div class=WordSection1><p class=MsoNormal>Dear users,<o:p></o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal>I am starting to use a hpc cluster of my university, but I am very green on parallel computation.<o:p></o:p></p><p class=MsoNormal>I have made a first test (test #1) on a very small-scale simulation (relaxation of a GO sheet with 19 atoms, with respect to the gamma point). The calculation took 3m20s to run on 1 proc on my personal computer. On the cluster with 4 proc and default parallel options, it took 1m5s, and on 8 proc it took 44s. This seems like a reasonable behavior, and at least shows that raising the number of procs does reduce computation time in this case (with obvious limitations if too many procs for the job).<o:p></o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal>However I tried with another test, a bit bigger (test #2). This example is a scf calculation with 120 atoms (still with respect to the gamma point). In this case, the parallelization brings absolutely no improvement. In fact, although the <i>outfile</i> confirms that the code is running on N procs, it has similar performances as if it was running on 1 proc (sometimes even worse actually, but probably not in a significant manner, as the times are fluctuating a bit from 1 run to another)<o:p></o:p></p><p class=MsoNormal>I tried to run this same input file on my personal computer both on 1 and 2 cores. Turns out that it takes 10376s to run 10 iterations on 1 core, while it takes 6777s on two cores, so it seems that the parallelization is doing ok on my computer.<o:p></o:p></p><p class=MsoNormal>I have tried to run with different number of cores on the hpc, and different parallelization options (like for instance –nb 4), but nothing seems to improve the time<o:p></o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal>Basically, I am stuck with those 2 seemingly conflicting facts:<o:p></o:p></p><p class=MsoListParagraph style='text-indent:-18.0pt;mso-list:l0 level1 lfo2'><![if !supportLists]><span style='font-family:Symbol'><span style='mso-list:Ignore'>·<span style='font:7.0pt "Times New Roman"'> </span></span></span><![endif]>Parallelization seems to have no particular problem on the hpc cluster because test #1 gives good results<o:p></o:p></p><p class=MsoListParagraph style='text-indent:-18.0pt;mso-list:l0 level1 lfo2'><![if !supportLists]><span style='font-family:Symbol'><span style='mso-list:Ignore'>·<span style='font:7.0pt "Times New Roman"'> </span></span></span><![endif]>Parallelization seems to have no particular problem with the particular input file #2 because it seems to scale reasonably with proc number on my individual computer<o:p></o:p></p><p class=MsoNormal>However, combining both and running this file in parallel on the hpc cluster ends up not working correctly…<o:p></o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal>I attached the input file and output file of test #2 (this output only ran 1 iteration because I interrupted it). I also included as well as the slurm script that I use to submit the calculation to the job manager, in case it helps (bonding.scf.slurm)<o:p></o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal>Any suggestion on what is going wrong would be very welcome.<o:p></o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal>Thanks in advance, <o:p></o:p></p><p class=MsoNormal>Julien<o:p></o:p></p><p class=MsoNormal><o:p> </o:p></p></div></body></html>