Thank you Fabio, I will try ELPA and let you know.<br><br>Cheers<br><br>Layla<br><br><div class="gmail_quote">2012/10/1 Layla Martin-Samos <span dir="ltr">&lt;<a href="mailto:lmartinsamos@gmail.com" target="_blank">lmartinsamos@gmail.com</a>&gt;</span><br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Ciao Ivan, I explained myself very badly. First, look at this:<br><br>computer   mpi process     threads    ndiag   complex/gamma_only   time for diaghg    version   Libs<div class="im">

<br><div><br> jade         32             1          1      complex (cdiaghg)                                  27.44 s      4.3.2     sequential<br>


 jade         32             1          1      complex (cdiaghg)                                 > 10 min     4.3.2     threads<br> jade         32             1          1      complex (cdiaghg)                                 > 10 min     5.0.1     threads<br>


</div><br></div>It is exactly the same job; only the libs have changed.<br><br>On BGQ I run with bg_size=128 and 4 threads. My concern is that diaghg on BGQ is about 2.5 times slower than on jade (69.28 s vs. 27.44 s), despite using 4 times more MPI processes plus threads. I was wondering whether it is worthwhile to use threads in the diagonalization, or whether the threaded libs are simply less efficient? Boh! <br>
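The comparison above can be checked with a few lines of arithmetic. This is only a minimal sketch: the timings and process counts are copied from the table in this thread (cdiaghg, ndiag = 1), while the dictionary layout and the helper names `slowdown` and `workers` are hypothetical, introduced just for illustration.

```python
# Timings copied from the table in this thread (complex case, cdiaghg, ndiag = 1).
# The dictionary layout and helper names are illustrative assumptions.
runs = {
    "bgq_threads": {"mpi": 128, "threads": 4, "seconds": 69.28},
    "jade_sequential": {"mpi": 32, "threads": 1, "seconds": 27.44},
}


def slowdown(slow, fast):
    """Ratio of the diaghg wall times of two runs."""
    return slow["seconds"] / fast["seconds"]


def workers(run):
    """Total number of threads in a run: MPI processes times threads per process."""
    return run["mpi"] * run["threads"]


bgq = runs["bgq_threads"]
jade = runs["jade_sequential"]

time_ratio = slowdown(bgq, jade)
mpi_ratio = bgq["mpi"] / jade["mpi"]
worker_ratio = workers(bgq) / workers(jade)

print(f"BGQ diaghg time / jade diaghg time: {time_ratio:.2f}")
print(f"BGQ MPI processes / jade MPI processes: {mpi_ratio:.0f}")
print(f"BGQ total threads / jade total threads: {worker_ratio:.0f}")
```

The ratios only restate the table; they do not say whether the gap comes from the threaded libs, the processor architecture, or the way the jobs are distributed over the nodes.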

<div class="HOEnZb"><div class="h5">

<br><div class="gmail_quote">2012/10/1 Ivan Girotto <span dir="ltr"><<a href="mailto:igirotto@ictp.it" target="_blank">igirotto@ictp.it</a>></span><br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">


  <div bgcolor="#FFFFFF" text="#000000">

    Hi Layla,<br>

    <br>

    I have never tried with 1 thread, as it's not recommended on BG/Q: use at

    least 2 threads per MPI process. <br>

    On the other hand, I have some doubts about the way you are running

    the jobs. How big is the BG size? How many processes are you running

    per node in each of the two cases?<br>

    <br>

    I have some doubts about the question itself, too. You say that

    you see a slowdown when comparing 4 threads vs. 1 thread, but the

    table below reports only data with 4 threads for BG/Q.<br>

    Have you perhaps swapped the headers of the two columns, Threads and

    Ndiag? <br>

    <br>

    It's expected that the SGI architecture based on Intel processors is

    faster than BG/Q. <br>

    <br>

    Ivan<div><div><br>

    <br>

    On 01/10/2012 13:17, Layla Martin-Samos wrote:

    </div></div><blockquote type="cite"><div><div>Dear all, I have made some test calculations on Fermi

      and Jade for a 107-atom system with a 70 Ry cutoff for wfc, 285

      occupied bands and 1 k-point. The results seem to show that

      the multithreaded libs considerably slow down the

      diagonalization (diaghg is called

      33 times in all the jobs and the final results are identical). The

      version compiled by cineca gives the same times and results as

      5.0.1. Note that jade in sequential is faster than BGQ. I am

      continuing with some other tests on jade; unfortunately the runs stay

      a long time in the queue: the machine is full, and even for a 10

      min job with 32 cores you wait more than 3 hours. I attach

      the two make.sys files for jade. <br>

      <br>

      <br>

      computer   mpi process     threads    ndiag   complex/gamma_only   time for diaghg    version   Libs<br>

      <br>

      bgq          128             4          1      complex (cdiaghg)      69.28 s        5.0.1     threads<br>

      bgq          128             4          1      complex (cdiaghg)      69.14 s        4.3.2     threads<br>

      <br>

      jade          32             1          1      complex (cdiaghg)      27.44 s        4.3.2     sequential<br>

      jade          32             1          1      complex (cdiaghg)      > 10 min       4.3.2     threads<br>

      jade          32             1          1      complex (cdiaghg)      > 10 min       5.0.1     threads<br>

      <br>

      <br>

      bgq          128             4          4      complex (cdiaghg)      310.52 s       5.0.1     threads<br>

      <br>

      bgq          128             4          4      gamma (rdiaghg)        73.87 s        5.0.1     threads<br>

      bgq          128             4          4      gamma (rdiaghg)        73.71 s        4.3.2     threads<br>

      <br>

      bgq          128             4          1      gamma (rdiaghg)        CRASH 2 it     5.0.1     threads<br>

      bgq          128             4          1      gamma (rdiaghg)        CRASH 2 it     4.3.2     threads<br>

      <br>

      <br>

      Has anyone observed similar behavior? <br>

      <br>

      cheers <br>

      <br>

      Layla<br>


      </div></div><pre>_______________________________________________

Q-e-developers mailing list

<a href="mailto:Q-e-developers@qe-forge.org" target="_blank">Q-e-developers@qe-forge.org</a>

<a href="http://qe-forge.org/mailman/listinfo/q-e-developers" target="_blank">http://qe-forge.org/mailman/listinfo/q-e-developers</a><span><font color="#888888">

</font></span></pre><span><font color="#888888">

    </font></span></blockquote><span><font color="#888888">

    <br>

    <pre cols="72">-- 


Ivan Girotto - <a href="mailto:igirotto@ictp.it" target="_blank">igirotto@ictp.it</a>

High Performance Computing Specialist

Information & Communication Technology Section

The Abdus Salam - <a href="http://www.ictp.it" target="_blank">www.ictp.it</a>

International Centre for Theoretical Physics

Strada Costiera, 11 - 34151 Trieste - IT

Tel <a href="tel:%2B39.040.2240.484" value="+390402240484" target="_blank">+39.040.2240.484</a>

Fax <a href="tel:%2B39.040.2240.249" value="+390402240249" target="_blank">+39.040.2240.249</a>

</pre>

  </font></span></div>



<br></blockquote></div><br>

</div></div></blockquote></div><br>