[Pw_forum] Gathering 3-d arrays across pools using QE's mp_sum
Vahid Askarpour
vh261281 at dal.ca
Mon Oct 31 00:59:09 CET 2016
I allocate an array that needs to be calculated among pools using:
ALLOCATE(A(3,nbnd,nkf)), where nkf is the number of k points per pool and initialized as:
A(:,:,:)=0
Once array A is calculated, I allocate array A_tot as:
ALLOCATE(A_tot(3,nbnd,nkf_tot)), where nkf_tot is the total number of k points. and initialized as:
A_tot(:,:,:)=0
Then I call the poolgather1 as below:
#if defined(__MPI)
CALL poolgather1(3,nbnd,nkf_tot,nkf,A,A_tot)
#else
A_tot=A
#endif
I have tried both allocating A_tot outside and inside the #if defined section.
Poolgather1 is similar to other EPW poolgather and QE poolcollect routines and is given below:
subroutine poolgather1 (nsize1,nsize2,nkstot,nks, f_in, f_out)
!--------------------------------------------------------------------
USE kinds, ONLY : DP
USE mp_global, ONLY : my_pool_id, inter_pool_comm, npool,kunit
USE mp, ONLY : mp_barrier, mp_bcast,mp_sum
USE mp_world, ONLY : mpime
implicit none
!
INTEGER, INTENT (in) :: nsize1
!! first dimension of vectors f_in and f_out
INTEGER, INTENT (in) :: nsize2
!! second dimension of vectors f_in and f_out
INTEGER, INTENT (in) :: nks
!! number of k-points per pool
INTEGER, INTENT (in) :: nkstot
!! total number of k-points
REAL (KIND=DP), INTENT (in) :: f_in(nsize1,nsize2,nks)
! input ( only for k-points of mypool )
REAL (KIND=DP), INTENT (out) :: f_out(nsize1,nsize2,nkstot)
! output ( contains values for all k-point )
!
#if defined(__MPI)
INTEGER :: rest, nbase
! the rest of the integer division nkstot / npool
! the position in the original list
rest = nkstot / kunit - ( nkstot / kunit / npool ) * npool
!
nbase = nks * my_pool_id
!
IF ( ( my_pool_id + 1 ) > rest ) nbase = nbase + rest * kunit
!
f_out = 0.d0
f_out(1:nsize1,1:nsize2,(nbase+1):(nbase+nks)) = f_in(1:nsize1,1:nsize2,1:nks)
!
! ... reduce across the pools
!
CALL mp_sum(f_out,inter_pool_comm)
!
#else
f_out(:,:,:) = f_in(:,:,:)
!
#endif
!
end subroutine poolgather1
For my test case, nsize1=3, nsize2=4, nkstot=47.
Thanks,
Vahid
On Oct 30, 2016, at 8:19 PM, Ye Luo <xw111luoye at gmail.com<mailto:xw111luoye at gmail.com>> wrote:
mp_sum_rt is interfaced with mp_sum for 3d array. When you mp_sum a 3d array, it will be automatically called.
USE mp, ONLY : mp_sum_rt is not necessary.
If you want to check if mp_sum_rt is implicitly called via interface, you can print something before and after your mp_sum call and also print something in mp_sum_rt. You should find the expected printing behaviour.
if you compare the mp_sum_rt with the 2d and 4d implementation mp_sum_rm and mp_sum_r4d, you should find they flat the dimensionality and call the same 1-d routine reduce_base_real.
As I asked earlier, how did you allocate your 3-D arrays input and output? Are you allocating sufficient size for the output, did you deallocate it by mistake? How large is output array size, if it exceeds 2^32, you might hit the 32bit integer bug of the variable msglen.
Ye
===================
Ye Luo, Ph.D.
Leadership Computing Facility
Argonne National Laboratory
2016-10-30 14:25 GMT-05:00 Vahid Askarpour <vh261281 at dal.ca<mailto:vh261281 at dal.ca>>:
Hi Ye,
The changes I am making are part of the EPW/QE code and so the parallelization depends on the QE MPI routines. I have no problem using mp_sum for a 2-D array such as
energy(iband,ik). I can use mp_sum to add the contributions from various nodes and print the total in agreement with the output from the serial run. As far as the allocation goes, I use the same 2-D approach to my 3-D array.
The problem arises when I use the 3-D array with mp_sum. I noticed that in the Modules/mp.f90 file, there is a subroutine as follows:
SUBROUTINE mp_sum_rt( msg, gid )
IMPLICIT NONE
REAL (DP), INTENT (INOUT) :: msg(:,:,:)
INTEGER, INTENT(IN) :: gid
#if defined(__MPI)
INTEGER :: msglen
msglen = size(msg)
CALL reduce_base_real( msglen, msg, gid, -1 )
#endif
END SUBROUTINE mp_sum_rt
Here msg has three dimensions. There are other subroutines for 1, 2, and 4 dimensions. Somehow, if I can get EPW to use the above subroutine, it might work. Adding a statement like:
USE mp, ONLY : mp_sum_rt
at the beginning of the file does not work as the compilation fails because EPW does not see the subroutine. Interestingly, it does see mp_sum which is in the same mp.f90 file.
So is there a way that I can get EPW to see mp_sum_rt? This might solve my problem.
Since I know that mp_sum works with 2-D arrays, one option is to rewrite my files in 2-D, one array for each x,y,z direction. I am trying to avoid this.
Thanks,
Vahid
On Oct 30, 2016, at 3:59 PM, Ye Luo <xw111luoye at gmail.com<mailto:xw111luoye at gmail.com>> wrote:
Hi Vahid,
segfault in mp_sum doesn't necessarily mean a problem there. Probably you wrote something to output array but not in a valid place before mp_sum.
Try to check your allocation of output and the copy make sure they are correct.
Ye
===================
Ye Luo, Ph.D.
Leadership Computing Facility
Argonne National Laboratory
2016-10-28 14:33 GMT-05:00 Vahid Askarpour <vh261281 at dal.ca<mailto:vh261281 at dal.ca>>:
Hi Ye,
Thank you for your suggestion. I tried it and when I ran the code, it seg-faulted. I put flags in the code to see where the segmentation faults occurs. It happens as the code calls mp_sum. It seems that mp_sum may not
be able to handle this reduction.
Cheers,
Vahid
On Oct 28, 2016, at 2:51 PM, Ye Luo <xw111luoye at gmail.com<mailto:xw111luoye at gmail.com>> wrote:
In Fortran, whatever-D array is 1-D array. mp_sum should be fine.
I saw something strange in your code that you were not copying the right things as you expected.
How about the following?
output(1:3,1:nbnds,(k_pool*pool_id+1:k_pool*pool_id+k_pool))=input(1:3,1:nbnds,1:k_pool)
Ye
===================
Ye Luo, Ph.D.
Leadership Computing Facility
Argonne National Laboratory
2016-10-28 12:29 GMT-05:00 Vahid Askarpour <vh261281 at dal.ca<mailto:vh261281 at dal.ca>>:
Dear QE Users,
I am working on some modifications to the QE-6.0 code using symmetry. When I try to combine a 3-D array scattered across nodes, I use the following:
output(3,nbnds,(k_pool*pool_id+1:k_pool*pool_id+k_pool))=input(3,nbnds,1:k_pool)
Here, nbnds is the number of bands, k_pool is the number of k points/pool, and pool_id is the id of the pool. Here I am assuming the the number of k points is divisible by the number of pools.
Then I call mp_sum(output,inter_pool_comm) to put all the segments of input across the nodes into one output file.
When I run the modified QE code in parallel, the output file is different from the serial run.
Does the QE's mp_sum allow the above operation for a three-D array?
Any hints or suggestions would be greatly appreciated.
Vahid
Vahid Askarpour
Department of Physics and Atmospheric Science
Dalhousie University,
Halifax, NS, Canada
_______________________________________________
Pw_forum mailing list
Pw_forum at pwscf.org<mailto:Pw_forum at pwscf.org>
http://pwscf.org/mailman/listinfo/pw_forum
_______________________________________________
Pw_forum mailing list
Pw_forum at pwscf.org<mailto:Pw_forum at pwscf.org>
http://pwscf.org/mailman/listinfo/pw_forum
_______________________________________________
Pw_forum mailing list
Pw_forum at pwscf.org<mailto:Pw_forum at pwscf.org>
http://pwscf.org/mailman/listinfo/pw_forum
_______________________________________________
Pw_forum mailing list
Pw_forum at pwscf.org<mailto:Pw_forum at pwscf.org>
http://pwscf.org/mailman/listinfo/pw_forum
_______________________________________________
Pw_forum mailing list
Pw_forum at pwscf.org<mailto:Pw_forum at pwscf.org>
http://pwscf.org/mailman/listinfo/pw_forum
_______________________________________________
Pw_forum mailing list
Pw_forum at pwscf.org<mailto:Pw_forum at pwscf.org>
http://pwscf.org/mailman/listinfo/pw_forum
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.quantum-espresso.org/pipermail/users/attachments/20161030/9ff207a7/attachment.html>
More information about the users
mailing list