[QE-users] Parallel computing of QE7.1 vc-relax crashes when using a large number of processors

Xin Jin xin.tlg.jin at outlook.com
Mon Oct 31 07:58:24 CET 2022


Hello Krishnendu,

Thank you for the message.
Yes, we can start a separate thread. I didn't realize that there could
be such a problem.
Actually, the reason I use ibrav=0 for BCC W is that I want to put
some interstitials inside the cell once the calculation works.
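Just to illustrate what I mean (this is only a sketch, not a tested
input): with ibrav=0 I would simply raise nat from 54 to 55 and append
one extra line to the ATOMIC_POSITIONS card for the interstitial, for
example

    W  0.25000  0.25000  0.25000

where the position is purely a placeholder, in the same alat units as
the rest of the card.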

Best regards,
Xin


On 30/10/2022 05:35, KRISHNENDU MUKHERJEE wrote:
>
> Dear Xin Jin,
>
> Sorry that I have another matter to discuss. If you wish, we may start
> a separate thread on the subject, which I am eager to discuss. I have
> some concerns about your script. If you want to do a calculation for
> BCC W, you may need to use ibrav=3. Note that for ibrav=3 the
> (primitive) lattice vectors are of the form:
>
> ibrav=3          cubic I (bcc)
>       v1 = (a/2)(1,1,1),  v2 = (a/2)(-1,1,1),  v3 = (a/2)(-1,-1,1)
> This is built into QE (so you need not input CELL_PARAMETERS), and you
> only need to put one atom at the position 0.00 0.00 0.00.
> With that input, QE determines the k-points for the BCC Bravais lattice.
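>
> For concreteness, a minimal sketch of such a primitive-cell input could
> look like the following (I have simply carried over the cutoffs, smearing,
> lattice parameter and pseudopotential name from your script and have not
> tested these lines myself; the prefix is arbitrary):
>
>   &control
>      calculation='vc-relax'
>      prefix='W_bcc'
>      pseudo_dir='../../PP_files'
>      outdir='./'
>   /
>   &system
>      ibrav=3, celldm(1)=5.972,
>      nat=1, ntyp=1,
>      ecutwfc=50, ecutrho=500,
>      occupations='smearing', smearing='mp', degauss=0.06
>   /
>   &electrons
>      conv_thr=1.0d-8
>   /
>   &ions
>   /
>   &cell
>      press=0.0
>   /
>   ATOMIC_SPECIES
>    W  183.84  W.pbe-spn-kjpaw_psl.1.0.0.UPF
>   ATOMIC_POSITIONS {alat}
>    W 0.00 0.00 0.00
>   K_POINTS {automatic}
>    4 4 4 0 0 0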
>
> You have input the atomic positions in terms of a supercell built from
> the BCC conventional unit cell, and you have set ibrav=0. With that
> input, I am afraid the k-points generated would most probably be those
> of a simple cubic structure.
> What we can do is this: next week I will generate the k-points for BCC
> with a 4 4 4 0 0 0 mesh and post them, so that you can check whether
> they match the k-points used in your calculation, and we can discuss
> further.
>
> Thank you,
> Best regards,
> Krishnendu
>
> ------------------------------------------------------------------------------------------------------------------------------------------
>
> Xin Jin wrote on 28/Oct/2022
>
>
> Dear Quantum Espresso Forum,
>
> I encountered a problem related to parallel computing with QE 7.1
> for vc-relax.
>
> I was trying to perform a vc-relax for a 3*3*3 BCC tungsten supercell.
> The code works fine in serial, and also works fine in parallel as long
> as the number of processors is smaller than 10.
>
> However, if the number of processors is larger than 10, I get the
> following MPI error:
> *** An error occurred in MPI_Comm_free
> *** reported by process [3585895498,2]
> *** on communicator MPI_COMM_WORLD
> *** MPI_ERR_COMM: invalid communicator
> *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
> ***    and potentially your MPI job)
>
> For parallel computing, I am using OpenMPI/3.1.4-gcccuda. (In
> addition, it seems that if I use OpenMPI v4, the simulation is much
> slower than with v3.)
>
> Another thing is that if I decrease the size of the supercell, for
> example to 2*2*2, then there is no problem with parallel computing
> even if I use more than 30 processors.
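>
> For reference, the way such a run is typically launched on our cluster
> is along these lines (the processor count and file names below are
> placeholders rather than my exact job script):
>
>   module load OpenMPI/3.1.4-gcccuda
>   mpirun -np 16 pw.x -input W_relax.in > W_relax.out
>
> Would it be worth experimenting with the -nk or -ndiag command-line
> options of pw.x?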
>
> Could you help me look at this problem, please?
>
> The input for QE can be found below.
>
> Thank you in advance!
>
> Xin Jin
>
> &control
>    calculation='vc-relax'
>    restart_mode='from_scratch',
>    prefix='W_relax',
>    pseudo_dir="../../PP_files",
>    outdir='./'
> /
>
> &system
>    ibrav= 0,
>    celldm(1)=5.972,
>    nat= 54,
>    ntyp= 1,
>    ecutwfc = 50,
>    ecutrho = 500,
>    occupations='smearing', smearing='mp', degauss=0.06
> /
>
> &electrons
>    diagonalization='david',
>    conv_thr = 1.0d-8,
>    mixing_beta = 0.5,
> /
>
> &ions
> /
>
> &cell
>    press = 0.0,
> /
>
> ATOMIC_SPECIES
>  W  183.84 W.pbe-spn-kjpaw_psl.1.0.0.UPF
>
> CELL_PARAMETERS {alat}
>    3.0  0.0  0.0
>    0.0  3.0  0.0
>    0.0  0.0  3.0
>
> ATOMIC_POSITIONS {alat}
> W 0.00000 0.00000 0.00000
> W 0.50000 0.50000 0.50000
> W 1.00000 0.00000 0.00000
> W 1.50000 0.50000 0.50000
> W 2.00000 0.00000 0.00000
> W 2.50000 0.50000 0.50000
> W 0.00000 1.00000 0.00000
> W 0.50000 1.50000 0.50000
> W 1.00000 1.00000 0.00000
> W 1.50000 1.50000 0.50000
> W 2.00000 1.00000 0.00000
> W 2.50000 1.50000 0.50000
> W 0.00000 2.00000 0.00000
> W 0.50000 2.50000 0.50000
> W 1.00000 2.00000 0.00000
> W 1.50000 2.50000 0.50000
> W 2.00000 2.00000 0.00000
> W 2.50000 2.50000 0.50000
> W 0.00000 0.00000 1.00000
> W 0.50000 0.50000 1.50000
> W 1.00000 0.00000 1.00000
> W 1.50000 0.50000 1.50000
> W 2.00000 0.00000 1.00000
> W 2.50000 0.50000 1.50000
> W 0.00000 1.00000 1.00000
> W 0.50000 1.50000 1.50000
> W 1.00000 1.00000 1.00000
> W 1.50000 1.50000 1.50000
> W 2.00000 1.00000 1.00000
> W 2.50000 1.50000 1.50000
> W 0.00000 2.00000 1.00000
> W 0.50000 2.50000 1.50000
> W 1.00000 2.00000 1.00000
> W 1.50000 2.50000 1.50000
> W 2.00000 2.00000 1.00000
> W 2.50000 2.50000 1.50000
> W 0.00000 0.00000 2.00000
> W 0.50000 0.50000 2.50000
> W 1.00000 0.00000 2.00000
> W 1.50000 0.50000 2.50000
> W 2.00000 0.00000 2.00000
> W 2.50000 0.50000 2.50000
> W 0.00000 1.00000 2.00000
> W 0.50000 1.50000 2.50000
> W 1.00000 1.00000 2.00000
> W 1.50000 1.50000 2.50000
> W 2.00000 1.00000 2.00000
> W 2.50000 1.50000 2.50000
> W 0.00000 2.00000 2.00000
> W 0.50000 2.50000 2.50000
> W 1.00000 2.00000 2.00000
> W 1.50000 2.50000 2.50000
> W 2.00000 2.00000 2.00000
> W 2.50000 2.50000 2.50000
>
> K_POINTS {automatic}
> 4 4 4 0 0 0

