[Pw_forum] default parallelization and parallelization of bands.x
stefano de gironcoli
degironc at sissa.it
Sat Dec 5 21:03:08 CET 2015
The only parallelization that I see in bands.x is the basic one over R &
G. If it is different from the parallelization used in the previous pw.x
run, you should use wf_collect.
The code computes the overlap between the orbitals at k and k+dk in order
to decide how to connect them. It is an nbnd^2 operation done band by
band: not very efficient, evidently, but it should not take hours.
You can use wf_collect=.true. and increase the number of processors.
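Just as a sketch (prefix and outdir are taken from your job.bands;
calculation='bands' is assumed from your description of the run, and the
file names below are only placeholders): set wf_collect in the &control
namelist of the pw.x run that generates the wavefunctions,

 &control
    calculation = 'bands'
    prefix      = 'st1'
    outdir      = './tmp'
    wf_collect  = .true.   ! write wavefunctions in a processor-independent format
    ...
 /

then rerun pw.x followed by bands.x, possibly on different numbers of
processors, e.g.

mpirun -np 128 pw.x < job.pw_bands
mpirun -np 64 bands.x < job.bands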
stefano
On 05/12/2015 12:57, Maxim Skripnik wrote:
> Thank you for the information. Yes, at the beginning of the pw.x
> output it says:
> Parallel version (MPI), running on 64 processors
> R & G space division: proc/nbgrp/npool/nimage = 64
>
> Is bands.x parallelized at all? If so, where can I find information on
> that? There's nothing mentioned in the documentation:
> http://www.quantum-espresso.org/wp-content/uploads/Doc/pp_user_guide.pdf
> http://www.quantum-espresso.org/wp-content/uploads/Doc/INPUT_BANDS.html
>
> What could be the reason for bands.x taking many hours to calculate
> the bands? The preceding pw.x calculation has already determined the
> energies for each k-point along a path (Gamma -> K -> M -> Gamma). There
> are 61 k-points and 129 bands. So what is bands.x actually doing
> besides reformatting that data? The input file job.bands looks like this:
> &bands
> prefix = 'st1'
> outdir = './tmp'
> /
> The calculation is initiated by
> mpirun -np 64 bands.x < job.bands
>
> Maxim Skripnik
> Department of Physics
> University of Konstanz
>
> On Saturday, 05 December 2015 02:37 CET, stefano de gironcoli
> <degironc at sissa.it> wrote:
> On 04/12/2015 22:53, Maxim Skripnik wrote:
>> Hello,
>>
>> I'm a bit confused by the parallelization scheme of QE. First of all,
>> I run calculations on a cluster, usually with 1 to 8 nodes, each of
>> which has 16 cores. pw.x scales very well, e.g. for structural
>> relaxation jobs. I do not specify any particular
>> parallelization scheme as mentioned in the documentation, i.e. I
>> start the calculations with
>> mpirun -np 128 pw.x < job.pw
>> on 8 nodes, 16 cores each. According to the documentation, this means
>> ni=1, nk=1 and nt=1. So in what way are the calculations parallelized by
>> default? Why do the calculations scale so well without specifying ni,
>> nk, nt, nd?
> R and G parallelization is performed.
> The wavefunctions' plane waves, the density's plane waves, and slices of
> real-space objects are distributed across the 128 processors. A report of
> how this is done is given at the beginning of the output.
> Did you have a look at it?
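>
> If you want to set the other parallelization levels explicitly, they are
> chosen on the pw.x command line. Just as an illustration (the numbers are
> arbitrary examples, not a tuned recommendation; -nk must divide the number
> of MPI processes):
>
> # -nk 4: four k-point pools of 32 processors each; within each pool the
> # R & G distribution is over those 32 processors
> # -nd 16: 16 processors (a 4x4 grid) do the subspace diagonalization
> mpirun -np 128 pw.x -nk 4 -nd 16 < job.pw
>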
>> The second question is whether one can speed up bands.x calculations. So
>> far I have started them like this:
>> mpirun -np 64 bands.x < job.bands
>> on 4 nodes, 16 cores each. Does it make sense to define nb for
>> bands.x? If yes, what would be reasonable values?
> Expect no gain: band parallelization is not implemented in bands.x.
>
> stefano
>
>> The systems of interest typically consist of ~50 atoms with periodic
>> boundaries.
>>
>> Maxim Skripnik
>> Department of Physics
>> University of Konstanz
>>
>
> _______________________________________________
> Pw_forum mailing list
> Pw_forum at pwscf.org
> http://pwscf.org/mailman/listinfo/pw_forum