[QE-users] Parallel computing of QE7.1 vc-relax crashes when using large number of processors
Paolo Giannozzi
paolo.giannozzi at uniud.it
Mon Oct 31 07:55:57 CET 2022
Doesn't happen to me
Paolo
On 31/10/2022 07:46, Xin Jin wrote:
> Hello Paolo,
>
> I think the problem happens before the self-consistent iterations
> start.
>
> The last output from .out file before the crash is like this:
>
> "Smooth grid: 274793 G-vectors FFT dimensions: ( 81, 81, 81)"
> "Estimated max dynamical RAM per process > 594.72 MB"
> "Estimated total dynamical RAM > 9.29 GB"
> "Initial potential from superposition of free atoms"
> "starting charge 755.9699, renormalised to 756.0000"
> "Starting wfcs are 702 randomized atomic wfcs"
>
> Thank you.
>
> Best regards,
> Xin
>
> On 30/10/2022 09:23, Paolo Giannozzi wrote:
>> You get the message when the calculation starts, after initialization,
>> after a few scf steps, after a few optimization steps, ... ?
>>
>> Paolo
>>
>> On 28/10/2022 14:45, Xin Jin wrote:
>>>
>>> Dear Quantum Espresso Forum,
>>>
>>> I encountered a problem with running a vc-relax calculation in
>>> parallel using QE 7.1.
>>>
>>> I was trying to perform a vc-relax for a 3*3*3 BCC tungsten
>>> supercell. The code works fine in serial, and also works fine in
>>> parallel if the number of processors is smaller than 10.
>>>
>>> However, if the number of processors is larger than 10, I get the
>>> following MPI error:
>>>
>>> *** An error occurred in MPI_Comm_free
>>> *** reported by process [3585895498,2]
>>> *** on communicator MPI_COMM_WORLD
>>> *** MPI_ERR_COMM: invalid communicator
>>> *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
>>> *** and potentially your MPI job)
>>>
>>> For parallel computing, I am using OpenMPI/3.1.4-gcccuda. (In
>>> addition, it seems that if I use OpenMPI V4, the simulation is
>>> much slower than with V3.)
>>>
>>> Another thing is that, if I decrease the size of the supercell, for
>>> example to 2*2*2, then there is no problem with parallel computing,
>>> even if I use more than 30 processors.
>>>
>>> Could you help me look at this problem, please?
>>>
>>> The input for QE can be found below.
>>>
>>> Thank you in advance!
>>>
>>> Xin Jin
>>>
>>> &control
>>>   calculation='vc-relax'
>>>   restart_mode='from_scratch',
>>>   prefix='W_relax',
>>>   pseudo_dir="../../PP_files",
>>>   outdir='./'
>>> /
>>>
>>> &system
>>>   ibrav= 0,
>>>   celldm(1)=5.972,
>>>   nat= 54,
>>>   ntyp= 1,
>>>   ecutwfc = 50,
>>>   ecutrho = 500,
>>>   occupations='smearing', smearing='mp', degauss=0.06
>>> /
>>>
>>> &electrons
>>>   diagonalization='david',
>>>   conv_thr = 1.0d-8,
>>>   mixing_beta = 0.5,
>>> /
>>>
>>> &ions
>>> /
>>>
>>> &cell
>>>   press = 0.0,
>>> /
>>>
>>> ATOMIC_SPECIES
>>> W 183.84 W.pbe-spn-kjpaw_psl.1.0.0.UPF
>>>
>>> CELL_PARAMETERS {alat}
>>> 3.0 0.0 0.0
>>> 0.0 3.0 0.0
>>> 0.0 0.0 3.0
>>>
>>> ATOMIC_POSITIONS {alat}
>>> W 0.00000 0.00000 0.00000
>>> W 0.50000 0.50000 0.50000
>>> W 1.00000 0.00000 0.00000
>>> W 1.50000 0.50000 0.50000
>>> W 2.00000 0.00000 0.00000
>>> W 2.50000 0.50000 0.50000
>>> W 0.00000 1.00000 0.00000
>>> W 0.50000 1.50000 0.50000
>>> W 1.00000 1.00000 0.00000
>>> W 1.50000 1.50000 0.50000
>>> W 2.00000 1.00000 0.00000
>>> W 2.50000 1.50000 0.50000
>>> W 0.00000 2.00000 0.00000
>>> W 0.50000 2.50000 0.50000
>>> W 1.00000 2.00000 0.00000
>>> W 1.50000 2.50000 0.50000
>>> W 2.00000 2.00000 0.00000
>>> W 2.50000 2.50000 0.50000
>>> W 0.00000 0.00000 1.00000
>>> W 0.50000 0.50000 1.50000
>>> W 1.00000 0.00000 1.00000
>>> W 1.50000 0.50000 1.50000
>>> W 2.00000 0.00000 1.00000
>>> W 2.50000 0.50000 1.50000
>>> W 0.00000 1.00000 1.00000
>>> W 0.50000 1.50000 1.50000
>>> W 1.00000 1.00000 1.00000
>>> W 1.50000 1.50000 1.50000
>>> W 2.00000 1.00000 1.00000
>>> W 2.50000 1.50000 1.50000
>>> W 0.00000 2.00000 1.00000
>>> W 0.50000 2.50000 1.50000
>>> W 1.00000 2.00000 1.00000
>>> W 1.50000 2.50000 1.50000
>>> W 2.00000 2.00000 1.00000
>>> W 2.50000 2.50000 1.50000
>>> W 0.00000 0.00000 2.00000
>>> W 0.50000 0.50000 2.50000
>>> W 1.00000 0.00000 2.00000
>>> W 1.50000 0.50000 2.50000
>>> W 2.00000 0.00000 2.00000
>>> W 2.50000 0.50000 2.50000
>>> W 0.00000 1.00000 2.00000
>>> W 0.50000 1.50000 2.50000
>>> W 1.00000 1.00000 2.00000
>>> W 1.50000 1.50000 2.50000
>>> W 2.00000 1.00000 2.00000
>>> W 2.50000 1.50000 2.50000
>>> W 0.00000 2.00000 2.00000
>>> W 0.50000 2.50000 2.50000
>>> W 1.00000 2.00000 2.00000
>>> W 1.50000 2.50000 2.50000
>>> W 2.00000 2.00000 2.00000
>>> W 2.50000 2.50000 2.50000
>>>
>>> K_POINTS {automatic}
>>> 4 4 4 0 0 0
>>>
>>> _______________________________________________
>>> The Quantum ESPRESSO community stands by the Ukrainian
>>> people and expresses its concerns about the devastating
>>> effects that the Russian military offensive has on their
>>> country and on the free and peaceful scientific, cultural,
>>> and economic cooperation amongst peoples
>>> _______________________________________________
>>> Quantum ESPRESSO is supported by MaX (http://www.max-centre.eu/)
>>> users mailing list users at lists.quantum-espresso.org
>>> https://lists.quantum-espresso.org/mailman/listinfo/users
>>
>
--
Paolo Giannozzi, Dip. Scienze Matematiche Informatiche e Fisiche,
Univ. Udine, via delle Scienze 206, 33100 Udine Italy, +39-0432-558216
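
The ATOMIC_POSITIONS block in the quoted input lists the 54 sites of a 3*3*3 BCC supercell in alat units (two atoms per conventional cell, at (0,0,0) and (0.5,0.5,0.5)). A minimal Python sketch of how such a list could be generated, for example to build the 2*2*2 case mentioned in the thread, is given below. The function names bcc_supercell_positions and format_positions are hypothetical helpers introduced here for illustration; they are not part of Quantum ESPRESSO or of the original post.

```python
from itertools import product


def bcc_supercell_positions(n=3):
    """Return BCC site coordinates, in units of the conventional lattice
    parameter (alat), for an n x n x n supercell."""
    # Two-atom basis of the conventional BCC cell.
    basis = [(0.0, 0.0, 0.0), (0.5, 0.5, 0.5)]
    positions = []
    # z is the slowest index and x the fastest, which reproduces the
    # ordering of the positions in the quoted input.
    for k, j, i in product(range(n), repeat=3):
        for bx, by, bz in basis:
            positions.append((i + bx, j + by, k + bz))
    return positions


def format_positions(positions, species="W"):
    """Format coordinates as lines for an ATOMIC_POSITIONS {alat} card."""
    return [f"{species} {x:.5f} {y:.5f} {z:.5f}" for x, y, z in positions]


if __name__ == "__main__":
    lines = format_positions(bcc_supercell_positions(3))
    print(f"nat = {len(lines)}")
    print("ATOMIC_POSITIONS {alat}")
    print("\n".join(lines))
```

For n=3 this gives 2 * 3**3 = 54 sites, consistent with nat= 54 in the input; for a different n, nat and the diagonal of CELL_PARAMETERS would have to be changed to match.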