[Q-e-developers] calbec() hangs if # of MPI processes > 1

Paolo Giannozzi p.giannozzi at gmail.com
Wed Nov 4 08:00:54 CET 2015


Sec. 8.3 of the developer manual explains why some parts of the code are run
on a single processor. Parallel routines must be called outside the regions
of code that run on a single processor.
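
As a minimal sketch of that rule (plain MPI in Fortran, not actual Quantum
ESPRESSO code; the program name, the variable names and the dummy data are
hypothetical), the collective call is made by every process, and only the
serial step is restricted to the root process, which plays the role of
"ionode" here. The result is then broadcast to the other processes:

```fortran
! Sketch only: plain MPI, assumed names, dummy data.
PROGRAM collective_outside_root_block
  USE mpi
  IMPLICIT NONE
  INTEGER, PARAMETER :: root = 0      ! plays the role of "ionode"
  INTEGER :: ierr, rank, nproc
  DOUBLE PRECISION :: local_sum, total, step

  CALL MPI_Init( ierr )
  CALL MPI_Comm_rank( MPI_COMM_WORLD, rank, ierr )
  CALL MPI_Comm_size( MPI_COMM_WORLD, nproc, ierr )

  ! Each rank holds its own share of the data (a dummy value here).
  local_sum = DBLE( rank + 1 )

  ! Collective reduction: every rank must reach this call, so it is
  ! placed OUTSIDE any "root only" branch.
  CALL MPI_Allreduce( local_sum, total, 1, MPI_DOUBLE_PRECISION, &
                      MPI_SUM, MPI_COMM_WORLD, ierr )

  ! Serial part: done by the root process only, no communication inside.
  step = 0.0d0
  IF ( rank == root ) step = 0.5d0 * total

  ! The result of the serial part is broadcast to all other ranks.
  CALL MPI_Bcast( step, 1, MPI_DOUBLE_PRECISION, root, &
                  MPI_COMM_WORLD, ierr )

  IF ( rank == root ) WRITE(*,*) 'nproc, total, step = ', nproc, total, step

  CALL MPI_Finalize( ierr )
END PROGRAM collective_outside_root_block
```

If the MPI_Allreduce call were moved inside the IF ( rank == root ) branch,
the root process would wait indefinitely for the other ranks, which is the
kind of hang discussed in this thread.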

Paolo

On Tue, Nov 3, 2015 at 11:15 PM, Ilya Ryabinkin <i.ryabinkin at utoronto.ca>
wrote:

> Not sure if I understood how to proceed from there...
> I read Sec 8.3 but didn't find answers to my issue.
>
> Could you give me more direction?
>
> --
> I.
>
> On Tue, Nov 3, 2015 at 4:48 PM, Paolo Giannozzi <p.giannozzi at gmail.com>
> wrote:
> >
> >
> > On Tue, Nov 3, 2015 at 10:16 PM, Ilya Ryabinkin <igryabinkin at gmail.com>
> > wrote:
> >>
> >>
> >> I call the get_nacs() subroutine from subroutine verlet(), specifically
> >> after line 229 of dynamics_module.f90, if that matters.
> >
> >
> > It matters. Subroutine verlet is called only by processor 0 ("ionode"),
> > so if you try to do anything in parallel inside it, it will hang for
> > sure. What is not parallel is often done on a single processor; the
> > results are then broadcast to the other processors. The reason is
> > explained in Sec. 8.3 of the developer manual.
> >
> > Paolo
> >>
> >>
> >> > There isn't anything obviously wrong in your piece of code. Is it
> >> > inside a modified version of pw.x, or is it stand-alone? In the
> >> > latter case, there might be some missing initialization.
> >>
> >> It sounds relevant. Indeed, in extrapolate_wfcs() the very first (as
> >> well as the very last) line is
> >>
> >>   CALL mp_barrier( intra_image_comm ) ! debug
> >>
> >> I tried to add these lines to my get_nacs() as well, but it doesn't do
> >> any good: the code hangs immediately, without even going past
> >> mp_barrier().
> >>
> >> --
> >> Ilya
> >>
> >>
> >> > In the former case... it shouldn't happen. Anyway: first of all,
> >> > try to figure out where your code hangs. "calbec" is an interface to
> >> > several different routines. Maybe it goes into the wrong one. It
> >> > should just do a call to DGEMM or ZGEMM and a call to mp_sum (a
> >> > wrapper to mpi_reduce).
> >> >
> >> > Paolo
> >> >
> >> > On Tue, Nov 3, 2015 at 7:47 PM, Ilya Ryabinkin
> >> > <igryabinkin at gmail.com> wrote:
> >> >>
> >> >> Dear colleagues:
> >> >> I'm developing a code to calculate the non-adiabatic couplings using
> >> >> a finite-difference scheme. The problem I have stumbled upon is the
> >> >> following:
> >> >>
> >> >> I need to calculate the overlap matrix between MOs at t-dt and t,
> >> >> gamma_only. What I'm doing:
> >> >>
> >> >> The code is borrowed from update_pot.f90, subroutine
> >> >> extrapolate_wfcs(), lines 595-600:
> >> >>
> >> >>       ALLOCATE( evcold( npwx, nbnd ), aux( npwx, nbnd ) )
> >> >>       !! Retrieve old WF at t-dt from *.oldwfc file
> >> >>       CALL davcio( evcold, 2*nwordwfc, iunoldwfc, 1, -1 )
> >> >>       !! Get current WF at t
> >> >>       aux = evc
> >> >>       !
> >> >>       ALLOCATE( S( nbnd, nbnd ), O( nbnd, nbnd ) )
> >> >>       CALL calbec( npw, aux, evcold, S ) ! Get overlap S(t-dt, t)
> >> >>       !
> >> >>       ! O is a rounded S matrix, used for phase and ordering matching
> >> >>       O = ANINT( S )
> >> >>       IF ( ANY( diag( O ) == 0.0_dp ) ) THEN
> >> >>          ...
> >> >>
> >> >> Essentially, my code hangs at the calbec() call if I run it in
> >> >> parallel with # of MPI processes > 1. gdb shows that the processes
> >> >> are in a polling loop, which means that they are waiting for
> >> >> something. I'm baffled at this point, as there is nothing they
> >> >> should wait for.
> >> >>
> >> >> What am I missing?
> >> >>
> >> >> Thanks for the help,
> >> >> Ilya
> >> >> _______________________________________________
> >> >> Q-e-developers mailing list
> >> >> Q-e-developers at qe-forge.org
> >> >> http://qe-forge.org/mailman/listinfo/q-e-developers
> >> >
> >> >
> >> >
> >> >
> >> > --
> >> > Paolo Giannozzi, Dept. Chemistry&Physics&Environment,
> >> > Univ. Udine, via delle Scienze 208, 33100 Udine, Italy
> >> > Phone +39-0432-558216, fax +39-0432-558222



-- 
Paolo Giannozzi, Dept. Chemistry&Physics&Environment,
Univ. Udine, via delle Scienze 208, 33100 Udine, Italy
Phone +39-0432-558216, fax +39-0432-558222