[Q-e-developers] nb less than the number of proc
Carlo Cavazzoni
c.cavazzoni at cineca.it
Thu Jun 13 12:15:35 CEST 2013
Dear Giovanni,
I agree that the message is misleading (and wrong), but nb is not the
number of bands:
nb is the block size used to distribute the Hamiltonian (a matrix whose
size is number of bands x number of bands),
whereas desc%n is indeed the number of bands.
The subroutine blk2cyc_redist is called when you are using the internal
parallel diagonalization,
which is not block parallelized but parallelized row by row. So
if the number of bands is less than the number of processors (the
desc%n < nproc check), this internal parallel subroutine does not work.
To avoid the problem it should be enough to
use just one processor in the parallelization matrix: -northo 1
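
As a rough illustration of the explanation above, here is a minimal Python sketch (not Quantum ESPRESSO code). It assumes that "row by row" means row i of the n x n matrix goes to process i modulo nproc; that mapping, and the helper names rows_per_proc and check_redist, are hypothetical and chosen only for this example. The guard mirrors the desc%n < nproc test quoted further down in the thread.

```python
# Illustrative model only; this is not Quantum ESPRESSO code.

def rows_per_proc(n, nproc):
    """Count how many of the n matrix rows each of nproc processes owns,
    assuming rows are dealt out cyclically (row i -> process i % nproc)."""
    counts = [0] * nproc
    for row in range(n):
        counts[row % nproc] += 1
    return counts


def check_redist(n, nproc):
    """Mimic the guard quoted in the thread: IF( desc%n < nproc ) CALL errore(...)."""
    if n < nproc:
        raise ValueError("nb less than the number of proc")


# 6 bands spread over 32 processes: most processes would own no row at all.
counts = rows_per_proc(6, 32)
idle = counts.count(0)  # number of processes left without any row

# With a single process doing the diagonalization the guard passes.
check_redist(6, 1)
```

In this toy model a 6 x 6 matrix cannot give every one of 32 processes a row, which is the situation the guard rejects; with one process (the -northo 1 case) the condition can never be triggered.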
carlo
On 11/06/2013 17:47, xiaochuan Ge wrote:
> Dear all,
>
> I hope this is the right place to ask this question.
>
> Since some point this year (I don't know exactly when), when one
> tries to run a pw calculation with many cores, the code stops with
> this error message:
>
> %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
> Error in routine blk2cyc_redist (1):
> nb less than the number of proc
> %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
>
> The error usually happens when the number of cores is larger than the
> number of bands, while AN OLD VERSION OF THE CODE DOES NOT HAVE THIS
> PROBLEM.
>
> If one looks into the code, the meaning of nb seems misleading, because
> the error is triggered by this statement:
>
> IF( desc%n < nproc ) &
>    CALL errore( ' cyc2blk_redist ', ' nb less than the number of proc ', 1 )
>
> whereas nb in this part of the code is defined as:
> nb = desc%nrcx ! leading dimension of the local matrix block
>
> On the other hand, this error happens when the number of processors is
> larger than the number of bands.
>
> So what does NB actually mean here? After all, I really don't understand
> why the code should stop when I am using 32 cores to calculate a
> system with 6 bands. The plane wave parallelization should not care
> about how many bands there are in total.
>
> Could anyone comment on this? Thank you very much. I don't attach an
> input here, because one could easily reproduce this error by
> calculating any small molecule with many cores on a parallel machine.
>
> ===================
> Ge Xiaochuan(Giovanni)
> 4th year PHD Student
> Condensed Matter
> SISSA,Italy
> ===================
--
Ph.D. Carlo Cavazzoni
SuperComputing Applications and Innovation Department
CINECA - Via Magnanelli 6/3, 40033 Casalecchio di Reno (Bologna)
Tel: +39 051 6171411 Fax: +39 051 6132198
www.cineca.it