[Pw_forum] default parallelization and parallelization of bands.x
stefano de gironcoli
degironc at sissa.it
Sat Dec 5 21:03:08 CET 2015
The only parallelization that I see in bands.x is the basic one over R &
G. If it is different from the parallelization used in the previous pw.x
run, you should use wf_collect.
The code computes the overlap between the orbitals at k and k+dk in order
to decide how to connect them. It is an nbnd^2 operation done band by
band: not very efficient, evidently, but it should not take hours.
You can use wf_collect=.true. and increase the number of processors.
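Just as a sketch (prefix and outdir are taken from your job.bands;
calculation='bands' is assumed from your description of the run, and the
file names below are only placeholders): set wf_collect in the &control
namelist of the pw.x run that generates the wavefunctions,

 &control
    calculation = 'bands'
    prefix      = 'st1'
    outdir      = './tmp'
    wf_collect  = .true.   ! write wavefunctions in a processor-independent format
    ...
 /

then rerun pw.x followed by bands.x, possibly on different numbers of
processors, e.g.

mpirun -np 128 pw.x < job.pw_bands
mpirun -np 64 bands.x < job.bands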
stefano
On 05/12/2015 12:57, Maxim Skripnik wrote:
> Thank you for the information. Yes, at the beginning of the pw.x
> output it says:
> Parallel version (MPI), running on 64 processors
> R & G space division: proc/nbgrp/npool/nimage = 64
>
> Is bands.x parallelized at all? If so, where can I find information on
> that? There's nothing mentioned in the documentation:
> http://www.quantum-espresso.org/wp-content/uploads/Doc/pp_user_guide.pdf
> http://www.quantum-espresso.org/wp-content/uploads/Doc/INPUT_BANDS.html
>
> What could be the reason for bands.x taking many hours to calculate
> the bands? The preceding pw.x calculation has already determined the
> energies for each k-point along a path (Gamma -> K -> M -> Gamma). There
> are 61 k-points and 129 bands. So what is bands.x actually doing
> besides reformatting that data? The input file job.bands looks like this:
> &bands
> prefix = 'st1'
> outdir = './tmp'
> /
> The calculation is initiated by
> mpirun -np 64 bands.x < job.bands
>
> Maxim Skripnik
> Department of Physics
> University of Konstanz
>
> On Saturday, 05 December 2015 02:37 CET, stefano de gironcoli
> <degironc at sissa.it> wrote:
> On 04/12/2015 22:53, Maxim Skripnik wrote:
>> Hello,
>>
>> I'm a bit confused by the parallelization scheme of QE. First of all,
>> I run calculations on a cluster, usually with 1 to 8 nodes, each of
>> which has 16 cores. pw.x scales very well, e.g. for structural
>> relaxation jobs. I do not specify any particular
>> parallelization scheme as mentioned in the documentation, i.e. I
>> start the calculations with
>> mpirun -np 128 pw.x < job.pw
>> on 8 nodes, 16 cores each. According to the documentation, this means
>> ni=1, nk=1 and nt=1. So in what way are the calculations parallelized by
>> default? Why do the calculations scale so well without specifying ni,
>> nk, nt, nd?
> R and G parallelization is performed.
> The wavefunctions' plane waves, the density's plane waves, and slices of
> real-space objects are distributed across the 128 processors. A report of
> how this is done is given at the beginning of the output.
> Did you have a look at it?
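>
> If you want to set the other parallelization levels explicitly, they are
> chosen on the pw.x command line. Just as an illustration (the numbers are
> arbitrary examples, not a tuned recommendation; -nk must divide the number
> of MPI processes):
>
> # -nk 4: four k-point pools of 32 processors each; within each pool the
> # R & G distribution is over those 32 processors
> # -nd 16: 16 processors (a 4x4 grid) do the subspace diagonalization
> mpirun -np 128 pw.x -nk 4 -nd 16 < job.pw
>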
>> The second question is whether one can speed up bands.x calculations. So
>> far I have started them like this:
>> mpirun -np 64 bands.x < job.bands
>> on 4 nodes, 16 cores each. Does it make sense to define nb for
>> bands.x? If yes, what would be reasonable values?
> Expect no gain: band parallelization is not implemented in bands.x.
>
> stefano
>
>> The systems of interest typically consist of ~50 atoms with periodic
>> boundaries.
>>
>> Maxim Skripnik
>> Department of Physics
>> University of Konstanz
>>
>
> _______________________________________________
> Pw_forum mailing list
> Pw_forum at pwscf.org
> http://pwscf.org/mailman/listinfo/pw_forum