[Q-e-developers] MPI problems with ELPA

Gabriele Sclauzero gabriele.sclauzero at epfl.ch
Wed Oct 23 11:08:13 CEST 2013


In the header of the get_elpa_row_col_comms subroutine (in ELPA/src/elpa1.f90) it says:

...
mpi_comm_rows/mpi_comm_cols can be free'd with MPI_Comm_free if not used any more.
...

so I guess that ELPA does not provide such a driver for freeing the communicators. However, that subroutine looks quite simple: it just calls
mpi_comm_split twice and creates two new communicators (mpi_comm_rows/mpi_comm_cols).
From what I understand, all processes in the diagonalization group should hold one communicator of each type, so I believe that two simple calls to MPI_Comm_free might work.
I don't think those communicators are used again after exiting subroutine pdsyevd_drv.
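
Concretely, the change I have in mind in Modules/dspev_drv.f90 would look more or less like this (untested sketch; I'm assuming an INTEGER :: ierr is already declared in pdsyevd_drv, or can be added to the local declarations):

#ifdef __ELPA
     CALL BLACS_Gridinfo( ortho_cntx, nprow, npcol, my_prow, my_pcol )
     CALL GET_ELPA_ROW_COL_COMMS( ortho_comm, my_prow, my_pcol, mpi_comm_rows, mpi_comm_cols )
     CALL SOLVE_EVP_REAL( n, n, s, lds, w, vv, lds, nb, mpi_comm_rows, mpi_comm_cols )
     IF( tv )  s = vv
     IF( ALLOCATED( vv ) ) DEALLOCATE( vv )
     ! free the row/column communicators created by GET_ELPA_ROW_COL_COMMS above,
     ! otherwise every call to pdsyevd_drv leaks two communicators
     CALL MPI_Comm_free( mpi_comm_rows, ierr )
     CALL MPI_Comm_free( mpi_comm_cols, ierr )
#else

If ELPA later provides a driver for this, we could of course switch to that instead of calling MPI directly.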


Ciao,

Gabriele

> Ciao Gabriele,
> 
> ok, this is a memory leak bug,
> ELPA communicators must be destroyed after the diagonalization has been performed.
> I'm wondering if there is an ELPA driver to do this (similar to GET_ELPA_ROW_COL_COMMS)
> instead of calling the MPI subroutines directly,
> 
> carlo
> 
> Il 23/10/2013 10:39, Gabriele Sclauzero ha scritto:
>> 
>> Ciao Carlo,
>> 
>>     thanks for the hint. As far as I can see, the only point where this kind of problem could arise is here
>> 
>> >>> grep -A6  __ELPA  ./Modules/dspev_drv.f90
>> 
>> --
>> #ifdef __ELPA
>>      INTEGER     :: nprow,npcol,my_prow, my_pcol,mpi_comm_rows, mpi_comm_cols
>> #endif 
>> 
>>      IF( SIZE( s, 1 ) /= lds ) &
>>         CALL errore( ' pdsyevd_drv ', ' wrong matrix leading dimension ', 1 )
>>      !
>> --
>> #ifdef __ELPA
>>      CALL BLACS_Gridinfo(ortho_cntx,nprow, npcol, my_prow,my_pcol)
>>      CALL GET_ELPA_ROW_COL_COMMS(ortho_comm, my_prow, my_pcol,mpi_comm_rows, mpi_comm_cols)
>>      CALL SOLVE_EVP_REAL(n,  n,   s, lds,    w,  vv, lds     ,nb  ,mpi_comm_rows, mpi_comm_cols)
>>      IF( tv )  s = vv
>>      IF( ALLOCATED( vv ) ) DEALLOCATE( vv )
>> #else
>> --
>> 
>> 
>> because the subroutine GET_ELPA_ROW_COL_COMMS internally calls MPI_Comm_split, which seems to be the MPI subroutine that crashed (according to the error message below).
>> Since the communicators mpi_comm_rows and mpi_comm_cols are used by ELPA only within this subroutine, would it be safe to call
>> MPI_Comm_free ( mpi_comm_rows, ierr )
>> MPI_Comm_free ( mpi_comm_cols, ierr )
>> just before #else ?
>> 
>> If you don't see any potential problem, I would try this solution.
>> 
>> 
>> Best,
>> 
>> Gabriele
>> 
>> 
>> 
>> On 10/23/2013 10:06 AM, Carlo Cavazzoni wrote:
>>>> Ciao Gabriele,
>>>> 
>>>> as far as I understand, with ELPA (which is built on a much deeper communicator hierarchy than SCALAPACK)
>>>> you hit an MPI environment limit (the maximum number of communicators).
>>>> Even if the limit could somehow be increased, it sounds like
>>>> somewhere a communicator is created and not destroyed in the relax work-flow.
>>>> I guess it could be something related to a temporary communicator created to distribute atoms to processors.
>>>> I don't remember exactly, but it can easily be checked by searching for communicator initialization calls
>>>> 
>>>> carlo
>>>> 
>>>> Il 22/10/2013 17:35, Gabriele Sclauzero ha scritto:
>>>>> Dear all,
>>>>> 
>>>>>     I've recently started using ELPA in place of Scalapack for large-scale calculations, and I indeed see a very good improvement in performance.
>>>>>  
>>>>>     Unfortunately, I've found a recurring problem when running relax calculations (though I believe it is not due to the relaxation itself). The program crashes because of an MPI-related issue. The error message from the system looks as follows:
>>>>> 
>>>>> Abort(1) on node 1732 (rank 1732 in comm 1140850688): Fatal error in PMPI_Comm_split: Other MPI error, error stack:
>>>>> PMPI_Comm_split(474).................: MPI_Comm_split(comm=0xc4000004, color=2, key=27, new_comm=0x1fffff7478) failed
>>>>> PMPI_Comm_split(456).................: 
>>>>> MPIR_Comm_split_impl(228)............: 
>>>>> MPIR_Get_contextid_sparse_group(1071): Too many communicators
>>>>> 
>>>>> 
>>>>> The crash happens during the Davidson diagonalization after a few ionic cycles of the relaxation (after roughly 200 Davidson diagonalizations). It happens both with v.5.0.3 and with the latest SVN revision. If I compile with Scalapack in place of ELPA, both versions work fine (but are slower...).
>>>>> 
>>>>> Compilation details (see also attached make.sys):
>>>>> BG/Q machine, XLF 14.1, XLC 12.1, ESSL 5.1, Scalapack 2.0.2
>>>>> ./configure --enable-openmp --with-elpa
>>>>> 
>>>>> The calculation was run on 256 nodes with the following command line:
>>>>> runjob -n 2048 -p 8 --envs OMP_NUM_THREADS=4 --cwd [...] : /home/sclauzer/Codes/espresso/5.0.3_ELPA/bin/pw.x -nband 1 -npool 1 -ndiag 1024 -ntg 4 -in [...]
>>>>> 
>>>>> The system is quite large: a slab with ~1400 atoms and ~3000 bands. I don't know if the problem would show up for a smaller system or on fewer nodes, but I can try to provide a smaller example so that the problem can be investigated more easily. Hopefully, this is something simple that the ELPA/Scalapack and BG/Q experts among you can spot at a glance.
>>>>> 
>>>>> 
>>>>> Best,
>>>>> 
>>>>> Gabriele
>>> 
>> 
>> 
>> § Gabriele Sclauzero, EPFL SB ITP CSEA
>>    PH H2 462, Station 3, CH-1015 Lausanne
>> 
>> 
>> 
> 
> 
> -- 
> Ph.D. Carlo Cavazzoni
> SuperComputing Applications and Innovation Department
> CINECA - Via Magnanelli 6/3, 40033 Casalecchio di Reno (Bologna)
> Tel: +39 051 6171411  Fax: +39 051 6132198
> www.cineca.it


§ Gabriele Sclauzero, EPFL SB ITP CSEA
   PH H2 462, Station 3, CH-1015 Lausanne


