[Q-e-developers] calbec() hangs if # of MPI processes > 1
Ilya Ryabinkin
igryabinkin at gmail.com
Thu Nov 5 22:28:28 CET 2015
Paolo:
I have moved the get_nacs() call to the beginning of the 'move_ions'
module. This is the right place, I believe, since the NACs are to be
computed after the new evc is available but before any ionic motion has
been performed.
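Schematically, the placement now looks like this (just a sketch, with the
surrounding code abridged; get_nacs() is my own routine, not part of QE):

  SUBROUTINE move_ions()
    ! ...
    ! at this point the new wavefunctions (evc) for the current ionic
    ! configuration are already available, and *all* MPI processes
    ! execute this region, so parallel routines such as calbec()
    ! called inside get_nacs() are safe here
    CALL get_nacs()
    !
    ! ... the ionode-only block that actually moves the ions
    ! (verlet() etc.) comes only after this point ...
  END SUBROUTINE move_ions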
get_nacs() works now without a hitch on any number of MPI processes.
The results are the same, up to the accuracy of the orbitals (bands).
I think the issue is closed now, thanks for the help!
--
Ilya
On Wed, Nov 4, 2015 at 2:00 AM, Paolo Giannozzi <p.giannozzi at gmail.com> wrote:
> Sec. 8.3 explains why some parts of the code are run on a single processor.
> You must call parallel code outside regions that run on a single processor.
>
> Paolo
>
> On Tue, Nov 3, 2015 at 11:15 PM, Ilya Ryabinkin <i.ryabinkin at utoronto.ca>
> wrote:
>>
>> I'm not sure I understand how to proceed from there...
>> I read Sec. 8.3 but didn't find an answer to my issue.
>>
>> Could you give me some more direction?
>>
>> --
>> I.
>>
>> On Tue, Nov 3, 2015 at 4:48 PM, Paolo Giannozzi <p.giannozzi at gmail.com>
>> wrote:
>> >
>> >
>> > On Tue, Nov 3, 2015 at 10:16 PM, Ilya Ryabinkin <igryabinkin at gmail.com>
>> > wrote:
>> >>
>> >>
>> >> I call the get_nacs() subroutine from subroutine verlet(), in particular,
>> >> just past line 229 of dynamics_module.f90, if that matters.
>> >
>> >
>> > It matters. Subroutine verlet() is called only by processor 0 ("ionode"),
>> > so if you try to do anything in parallel inside it, it will hang for sure.
>> > What is not parallel is often done on a single processor, and the results
>> > are then broadcast to the other processors. Why this is so is explained in
>> > Sec. 8.3 of the developer manual.
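>> > Schematically, the pattern is (a minimal sketch; "result" and
>> > do_serial_work() are placeholders, not actual QE routines):
>> >
>> >   USE io_global, ONLY : ionode, ionode_id
>> >   USE mp,        ONLY : mp_bcast
>> >   USE mp_images, ONLY : intra_image_comm
>> >   !
>> >   IF ( ionode ) THEN
>> >      ! serial work, executed by processor 0 only; nothing in here
>> >      ! may contain collective MPI calls, or the code will hang
>> >      CALL do_serial_work( result )
>> >   END IF
>> >   ! make the result available to all processors of the image
>> >   CALL mp_bcast( result, ionode_id, intra_image_comm )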
>> >
>> > Paolo
>> >>
>> >>
>> >> > There isn't anything obviously wrong in your piece of code. Is it inside
>> >> > a modified version of pw.x, or is it stand-alone? In the latter case,
>> >> > there might be some missing initialization.
>> >>
>> >> That sounds relevant. Indeed, in extrapolate_wfcs() the very first (as
>> >> well as the very last) line is
>> >>
>> >> CALL mp_barrier( intra_image_comm ) ! debug
>> >>
>> >> I tried to add these lines to my get_nacs() as well, but it doesn't do
>> >> any good: the code hangs immediately, without even getting past mp_barrier().
>> >>
>> >> --
>> >> Ilya
>> >>
>> >>
>> >> > In the former case... it shouldn't happen. Anyway: first of all, try to
>> >> > figure out where your code hangs. "calbec" is an interface to several
>> >> > different routines; maybe it goes into the wrong one. It should just do a
>> >> > call to DGEMM or ZGEMM and a call to mp_sum (wrapper to mpi_reduce)
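>> >> > For the gamma-only case the core is essentially (a sketch of the idea,
>> >> > not the actual source):
>> >> >
>> >> >   ! local contribution to the overlap from the plane-wave
>> >> >   ! coefficients stored on this processor (gamma trick:
>> >> >   ! complex arrays treated as real arrays of twice the length)
>> >> >   CALL DGEMM( 'T', 'N', nbnd, nbnd, 2*npw, 2.0_dp, aux, 2*npwx, &
>> >> >               evcold, 2*npwx, 0.0_dp, S, nbnd )
>> >> >   ! ... minus a correction for the double-counted G=0 term ...
>> >> >   !
>> >> >   ! collective sum of the partial results over the processors
>> >> >   ! that share the plane waves: every processor of the group
>> >> >   ! must reach this call, otherwise the others wait forever
>> >> >   CALL mp_sum( S, intra_bgrp_comm )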
>> >> >
>> >> > Paolo
>> >> >
>> >> > On Tue, Nov 3, 2015 at 7:47 PM, Ilya Ryabinkin
>> >> > <igryabinkin at gmail.com>
>> >> > wrote:
>> >> >>
>> >> >> Dear colleagues:
>> >> >> I'm developing a code to calculate the non-adiabatic couplings using a
>> >> >> finite-difference scheme. The problem I have stumbled upon is the
>> >> >> following:
>> >> >>
>> >> >> I need to calculate the overlap matrix between the MOs at t-dt and t,
>> >> >> in the gamma_only case. Here is what I'm doing; the code is borrowed from
>> >> >> update_pot.f90, subroutine extrapolate_wfcs(), lines 595-600:
>> >> >>
>> >> >> ALLOCATE( evcold( npwx, nbnd ), aux( npwx, nbnd ) )
>> >> >> !! Retrieve the old WF at t-dt from the *.oldwfc file
>> >> >> CALL davcio( evcold, 2*nwordwfc, iunoldwfc, 1, -1 )
>> >> >> !! Get the current WF at t
>> >> >> aux = evc
>> >> >> !
>> >> >> ALLOCATE( S( nbnd, nbnd ), O( nbnd, nbnd ) )
>> >> >> CALL calbec( npw, aux, evcold, S ) ! Get the overlap S(t-dt, t)
>> >> >> !
>> >> >> O = ANINT( S )  ! O is a rounded S matrix, used to perform phase
>> >> >>                 ! and ordering matching
>> >> >> IF ( ANY( diag( O ) == 0.0_dp ) ) THEN
>> >> >> ...
>> >> >>
>> >> >> Essentially, my code hangs at the calbec() call if I run it in parallel
>> >> >> with more than one MPI process. gdb shows that the processes are in a
>> >> >> polling loop, which means they are waiting for something. I'm baffled at
>> >> >> this point, as there is nothing they should be waiting for.
>> >> >>
>> >> >> What am I missing?
>> >> >>
>> >> >> Thanks for the help,
>> >> >> Ilya
>> >> >
>> >> >
>> >> >
>> >> >
>> >> > --
>> >> > Paolo Giannozzi, Dept. Chemistry&Physics&Environment,
>> >> > Univ. Udine, via delle Scienze 208, 33100 Udine, Italy
>> >> > Phone +39-0432-558216, fax +39-0432-558222
>> >
>> >
>> >
>> > --
>> > Paolo Giannozzi, Dept. Chemistry&Physics&Environment,
>> > Univ. Udine, via delle Scienze 208, 33100 Udine, Italy
>> > Phone +39-0432-558216, fax +39-0432-558222
>
>
>
>
> --
> Paolo Giannozzi, Dept. Chemistry&Physics&Environment,
> Univ. Udine, via delle Scienze 208, 33100 Udine, Italy
> Phone +39-0432-558216, fax +39-0432-558222