[Q-e-developers] Behavior of band-parallelization in QE 6.1

Paolo Giannozzi paolo.giannozzi at uniud.it
Wed Jul 12 15:09:28 CEST 2017


Dear Ryan

the band parallelization for exact exchange has undergone some reshuffling.
See this paper: Taylor A. Barnes et al., "Improved treatment of exact
exchange in Quantum ESPRESSO", Computer Physics Communications (2017). From
what I understand, the starting calculation (no exact exchange) is performed
as usual, with no band parallelization, while parallelization over pairs of
bands is applied to the exact-exchange part. I am forwarding this to Taylor
and Thorsten, who may know better than me (Thorsten: I haven't forgotten
your patch of two months ago! It is in the pipeline of things to be done).

Paolo

On Mon, Jul 10, 2017 at 9:55 PM, Ryan McAvoy <mcavor11 at gmail.com> wrote:

> Hello,
>
> I am Ryan L. McAvoy, a PhD student in Giulia Galli's group. I am trying to
> use the band parallelization for hybrids in QE 6.1 and I am finding
> unexpected behavior. I have created a test case on a small system (the zinc
> dimer) to illustrate the behavior described below; the corresponding files
> are attached.
>
>
>    1. The output informing the user that band parallelization is active for
>    a hybrid functional appears to be broken: changing the number of band
>    groups with -nbgrp does not trigger the report that should come from
>    subroutine parallel_info() in environment.f90, which suggests the code
>    believes nbgrp to be 1. This may be caused by the call
>    "mp_start_bands(1 ,...." at line 94 of mp_global.f90, since I have
>    checked that "nband_" has the correct value after "CALL
>    get_command_line()" in mp_global.f90.
>    2. The plane waves appear to be distributed over all of the processors
>    even for large values of "nband_". I have checked that this is more than
>    an output error by printing lda (npw) on each call of h_psi, and it
>    matches exactly what one would expect from dividing the total number of
>    plane waves by the number of processors (plus a factor of 1/2 for gamma
>    tricks); see the arithmetic sketch after this list.
>    3. Behavior 2 prevents me from scaling to as many processors as I could
>    with QE 6.0. Using QE 6.0 hybrids on C60, I could run on 8000+ processors
>    on the BG/Q machine Cetus at Argonne National Lab, but with QE 6.1 the
>    output says that it has run out of plane waves even with a large number
>    of band groups (I have demonstrated this behavior below on the zinc dimer
>    on 640 Intel processors to aid reproducibility).
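>
> As a small sketch of the arithmetic I am comparing against (the total
> plane-wave count below is a made-up number for illustration; with the
> QE 6.0-style band groups I would have expected the distribution to be
> within one band group rather than over all tasks):
>
>    npw_total=20000    # made-up total number of plane waves
>    nproc=640
>    nbgrp=10
>    # what I see from printing lda(npw) in h_psi: spread over all tasks
>    echo $(( npw_total / 2 / nproc ))           # the /2 is for gamma tricks
>    # what I expected: spread only within a band group of nproc/nbgrp tasks
>    echo $(( npw_total / 2 / (nproc / nbgrp) ))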
>
> Is #2 the intended behavior for this new parallelization method?
>
> Thank you for your time and attention to this matter,
> Ryan L. McAvoy
>
> ............................................................
>
>
> My run scripts are of the form
>
> module load mkl/11.2
> module load intelmpi/5.0+intel-15.0
>
> QE_BIN_DIR=PUTPATHHERE/qe-6.1/bin
>
> export MPI_TASKS=$SLURM_NTASKS
>
> exe=${QE_BIN_DIR}/pw.x
>
> export OMP_NUM_THREADS=1
>
> nband=10
> mpirun -n $MPI_TASKS ${exe} -nb $nband < ${fileVal}.in > \
>     ${fileVal}_nband${nband}_nproc${MPI_TASKS}.out
>
>
>


-- 
Paolo Giannozzi, Dip. Scienze Matematiche Informatiche e Fisiche,
Univ. Udine, via delle Scienze 208, 33100 Udine, Italy
Phone +39-0432-558216, fax +39-0432-558222

