[Q-e-developers] lelfield performance

Fri Apr 20 17:10:37 CEST 2012

Dear developers,
there is a serious scaling problem with the lelfield code when it is used
for isolated systems (i.e. with only one k-point): the mp_sum
       call mp_sum(aux_g(:))
around line 450 of c_phase_field.f90 becomes *slower* as the number of
processors increases, until it takes ~90% of the *total* wall time at
about 16 cores.

I cannot really understand what is going on in that part of the code.
Is anybody familiar with it available to give a few tips?

Best regards,

P.S. I can no longer tell where to use intra_bgrp_comm and where to use
intra_pool_comm.
My impression is that the pool communicator should be used when a sum is
spread over k-points, and the bgrp communicator in the very few cases
where a sum is spread over bands. Yet most occurrences of intra_pool_comm
have become intra_bgrp_comm, so I am confused: is the bgrp contained in
the pool? Or the other way round? Or are they independent?
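
To make the question concrete, here is a minimal sketch of one possible
layout. It assumes that band groups are nested inside pools, which is
exactly the point I am unsure about. The contiguous block assignment and
the names layout, pool_members and bgrp_members are hypothetical and are
not taken from the code. It only models which ranks would take part in a
sum over each communicator.

```python
def layout(nproc, npool, nbgrp):
    """Map each rank to (pool, band group, rank within band group).

    ASSUMPTION: ranks are split into npool contiguous pools, and each
    pool is split again into nbgrp contiguous band groups.
    """
    if nproc % npool or (nproc // npool) % nbgrp:
        raise ValueError("nproc must divide evenly into pools and band groups")
    pool_size = nproc // npool
    bgrp_size = pool_size // nbgrp
    table = []
    for rank in range(nproc):
        pool = rank // pool_size
        rank_in_pool = rank % pool_size
        bgrp = rank_in_pool // bgrp_size
        table.append((pool, bgrp, rank_in_pool % bgrp_size))
    return table


def pool_members(table, rank):
    """Ranks that would share a pool-level communicator with `rank`."""
    pool = table[rank][0]
    return [r for r, t in enumerate(table) if t[0] == pool]


def bgrp_members(table, rank):
    """Ranks that would share a band-group-level communicator with `rank`."""
    pool, bgrp = table[rank][:2]
    return [r for r, t in enumerate(table) if t[:2] == (pool, bgrp)]


# 16 ranks, one pool (one k-point), one band group: both sets coincide.
one_group = layout(16, 1, 1)
same = pool_members(one_group, 0) == bgrp_members(one_group, 0)

# 16 ranks, one pool, two band groups: the band group is half the pool.
two_groups = layout(16, 1, 2)
pool0 = pool_members(two_groups, 0)
bgrp0 = bgrp_members(two_groups, 0)
```

Under this assumption, the two communicators span the same ranks whenever
there is a single band group, and a sum over the bgrp communicator covers
only part of the pool otherwise.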

-- 
Lorenzo Paulatto IdR @ IMPMC/CNRS & Université Paris 6
phone: +33 (0)1 44275 084 / skype: paulatz
www:   http://www-int.impmc.upmc.fr/~paulatto/
mail:  23-24/4é16 Boîte courrier 115, 4 place Jussieu 75252 Paris Cédex 05