[QE-users] Parallel computing of QE7.1 vc-relax crashes when using a large number of processors

Paolo Giannozzi paolo.giannozzi at uniud.it
Mon Oct 31 07:55:57 CET 2022


Doesn't happen to me

Paolo

On 31/10/2022 07:46, Xin Jin wrote:
> Hello Paolo,
> 
> I think the problem happens before the first iteration of the 
> self-consistent calculation.
> 
> The last output in the .out file before the crash is:
> 
> "Smooth grid:   274793 G-vectors     FFT dimensions: (  81,  81, 81)"
> "Estimated max dynamical RAM per process >     594.72 MB"
> "Estimated total dynamical RAM >       9.29 GB"
> "Initial potential from superposition of free atoms"
> "starting charge     755.9699, renormalised to     756.0000"
> "Starting wfcs are  702 randomized atomic wfcs"
> 
> Thank you.
> 
> Best regards,
> Xin
> 
> On 30/10/2022 09:23, Paolo Giannozzi wrote:
>> Do you get the message when the calculation starts, after 
>> initialization, after a few scf steps, after a few optimization steps, ... ?
>>
>> Paolo
>>
>> On 28/10/2022 14:45, Xin Jin wrote:
>>>
>>> Dear Quantum ESPRESSO Forum,
>>>
>>> I have encountered a problem with parallel execution of QE 7.1 for a 
>>> vc-relax calculation.
>>>
>>> I am trying to perform a vc-relax of a 3x3x3 BCC tungsten supercell. 
>>> The code works fine in serial, and also in parallel as long as the 
>>> number of processors is smaller than 10.
>>>
>>> However, with more than 10 processors I get the following MPI error:
>>> *** An error occurred in MPI_Comm_free
>>> *** reported by process [3585895498,2]
>>> *** on communicator MPI_COMM_WORLD
>>> *** MPI_ERR_COMM: invalid communicator
>>> *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
>>> ***    and potentially your MPI job)
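>>>
>>> For reference, a sketch of the launch command (the core count and 
>>> file names here are placeholders, not the exact job script):
>>>
>>>   module load OpenMPI/3.1.4-gcccuda
>>>   mpirun -np 16 pw.x -inp W_relax.in > W_relax.out
>>>
>>>   # If the crash comes from the linear-algebra communicators, forcing
>>>   # serial diagonalization with pw.x's -ndiag option is one way to
>>>   # test for it:
>>>   mpirun -np 16 pw.x -ndiag 1 -inp W_relax.in > W_relax.out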
>>>
>>> For parallel computing, I am using OpenMPI/3.1.4-gcccuda. (In 
>>> addition, it seems that if I use OpenMPI v4, the simulation runs much 
>>> more slowly than with v3.)
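>>>
>>> (If that v3/v4 speed difference comes from different default 
>>> transport settings in OpenMPI 4, comparing the output of 
>>> "ompi_info | grep btl" for the two installations might reveal it, 
>>> though this is only a guess.)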
>>>
>>> Another observation: if I decrease the size of the supercell, for 
>>> example to 2x2x2, parallel computing works without problems even with 
>>> more than 30 processors.
>>>
>>> Could you help me look at this problem, please?
>>>
>>> The input for QE can be found below.
>>>
>>> Thank you in advance!
>>>
>>> Xin Jin
>>>
>>> &control
>>>     calculation='vc-relax'
>>>     restart_mode='from_scratch',
>>>     prefix='W_relax',
>>>     pseudo_dir="../../PP_files",
>>>     outdir='./'
>>> /
>>>
>>> &system
>>>     ibrav= 0,
>>>     celldm(1)=5.972,
>>>     nat= 54,
>>>     ntyp= 1,
>>>     ecutwfc = 50,
>>>     ecutrho = 500,
>>>     occupations='smearing', smearing='mp', degauss=0.06
>>> /
>>>
>>> &electrons
>>>     diagonalization='david',
>>>     conv_thr = 1.0d-8,
>>>     mixing_beta = 0.5,
>>> /
>>>
>>> &ions
>>> /
>>>
>>> &cell
>>>     press = 0.0,
>>> /
>>>
>>> ATOMIC_SPECIES
>>>  W  183.84 W.pbe-spn-kjpaw_psl.1.0.0.UPF
>>>
>>> CELL_PARAMETERS {alat}
>>>    3.0  0.0  0.0
>>>    0.0  3.0  0.0
>>>    0.0  0.0  3.0
>>>
>>> ATOMIC_POSITIONS {alat}
>>> W 0.00000 0.00000 0.00000
>>> W 0.50000 0.50000 0.50000
>>> W 1.00000 0.00000 0.00000
>>> W 1.50000 0.50000 0.50000
>>> W 2.00000 0.00000 0.00000
>>> W 2.50000 0.50000 0.50000
>>> W 0.00000 1.00000 0.00000
>>> W 0.50000 1.50000 0.50000
>>> W 1.00000 1.00000 0.00000
>>> W 1.50000 1.50000 0.50000
>>> W 2.00000 1.00000 0.00000
>>> W 2.50000 1.50000 0.50000
>>> W 0.00000 2.00000 0.00000
>>> W 0.50000 2.50000 0.50000
>>> W 1.00000 2.00000 0.00000
>>> W 1.50000 2.50000 0.50000
>>> W 2.00000 2.00000 0.00000
>>> W 2.50000 2.50000 0.50000
>>> W 0.00000 0.00000 1.00000
>>> W 0.50000 0.50000 1.50000
>>> W 1.00000 0.00000 1.00000
>>> W 1.50000 0.50000 1.50000
>>> W 2.00000 0.00000 1.00000
>>> W 2.50000 0.50000 1.50000
>>> W 0.00000 1.00000 1.00000
>>> W 0.50000 1.50000 1.50000
>>> W 1.00000 1.00000 1.00000
>>> W 1.50000 1.50000 1.50000
>>> W 2.00000 1.00000 1.00000
>>> W 2.50000 1.50000 1.50000
>>> W 0.00000 2.00000 1.00000
>>> W 0.50000 2.50000 1.50000
>>> W 1.00000 2.00000 1.00000
>>> W 1.50000 2.50000 1.50000
>>> W 2.00000 2.00000 1.00000
>>> W 2.50000 2.50000 1.50000
>>> W 0.00000 0.00000 2.00000
>>> W 0.50000 0.50000 2.50000
>>> W 1.00000 0.00000 2.00000
>>> W 1.50000 0.50000 2.50000
>>> W 2.00000 0.00000 2.00000
>>> W 2.50000 0.50000 2.50000
>>> W 0.00000 1.00000 2.00000
>>> W 0.50000 1.50000 2.50000
>>> W 1.00000 1.00000 2.00000
>>> W 1.50000 1.50000 2.50000
>>> W 2.00000 1.00000 2.00000
>>> W 2.50000 1.50000 2.50000
>>> W 0.00000 2.00000 2.00000
>>> W 0.50000 2.50000 2.50000
>>> W 1.00000 2.00000 2.00000
>>> W 1.50000 2.50000 2.50000
>>> W 2.00000 2.00000 2.00000
>>> W 2.50000 2.50000 2.50000
>>>
>>> K_POINTS {automatic}
>>> 4 4 4 0 0 0
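>>>
>>> (The 54 positions above are just the two-atom BCC basis tiled over 
>>> the 3x3x3 supercell; in case it helps reproduce the setup, a 
>>> one-liner like this regenerates them, with the loop bounds set to 
>>> the supercell repetitions:)
>>>
>>>   awk 'BEGIN { for (z=0; z<3; z++) for (y=0; y<3; y++) for (x=0; x<3; x++) {
>>>     printf "W %.5f %.5f %.5f\n", x, y, z;          # cell corner
>>>     printf "W %.5f %.5f %.5f\n", x+.5, y+.5, z+.5  # body center
>>>   } }'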
>>>

-- 
Paolo Giannozzi, Dip. Scienze Matematiche Informatiche e Fisiche,
Univ. Udine, via delle Scienze 206, 33100 Udine Italy, +39-0432-558216

