[QE-users] Parallel computing of QE7.1 vc-relax crashes when using large number of processors

KRISHNENDU MUKHERJEE krishnendu at nmlindia.org
Sun Oct 30 04:35:08 CET 2022


Dear Xin Jin, 

Sorry, I have another matter to discuss; if you prefer, we can start a separate thread on it. I have a concern about your input script. If you want to do a calculation for BCC W, you may need to use ibrav=3. Note that for ibrav=3 the primitive lattice vectors are:

ibrav=3  cubic I (bcc)
v1 = (a/2)(1,1,1),  v2 = (a/2)(-1,1,1),  v3 = (a/2)(-1,-1,1)

These vectors are built into QE (so you need not supply CELL_PARAMETERS), and you only need to place one atom at position 0.00 0.00 0.00. With that input, QE generates the k-points for the BCC Bravais lattice.

You have instead input the atomic positions in terms of a supercell built from the conventional BCC unit cell, with ibrav=0. With that input, I am afraid the k-points generated will most probably be those of a simple cubic structure.
What we can do is this: next week I will generate the k-points for BCC with a 4 4 4 0 0 0 mesh and post them, so you can check whether they match the k-points used in your calculation, and we can discuss further.
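
For illustration, a minimal primitive-cell input along those lines would look roughly like the sketch below. I have simply copied celldm(1), the cutoffs and the smearing from your script as placeholders, so please adjust them as appropriate:

   &system
      ibrav = 3,
      celldm(1) = 5.972,
      nat = 1,
      ntyp = 1,
      ecutwfc = 50,
      ecutrho = 500,
      occupations = 'smearing', smearing = 'mp', degauss = 0.06
   /
   ATOMIC_SPECIES
    W 183.84 W.pbe-spn-kjpaw_psl.1.0.0.UPF
   ATOMIC_POSITIONS {alat}
    W 0.00000 0.00000 0.00000
   K_POINTS {automatic}
    4 4 4 0 0 0

With ibrav=3 the primitive lattice vectors quoted above are generated internally, so no CELL_PARAMETERS card is needed.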

Thank you, 
Best regards, 
Krishnendu 

------------------------------------------------------------------------------------------------------------------------------------------ 

Xin Jin wrote on 28/Oct/2022 


Dear Quantum Espresso Forum, 

I encountered a problem related to parallel computing with QE 7.1
for a vc-relax calculation.

I was trying to perform a vc-relax for a 3*3*3 BCC tungsten supercell.
The code works fine in serial, and also works fine in parallel
if the number of processors is smaller than 10.

However, if the number of processors is larger than 10, I get the
following MPI error:
*** An error occurred in MPI_Comm_free
*** reported by process [3585895498,2]
*** on communicator MPI_COMM_WORLD
*** MPI_ERR_COMM: invalid communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
*** and potentially your MPI job)

For parallel computing, I am using OpenMPI/3.1.4-gcccuda. (In
addition, it seems that if I use OpenMPI v4, the simulation is
much slower than with v3.)
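
For reference, I launch the run roughly as follows; the module name corresponds to the toolchain above, while the input/output file names and the processor count are only examples:

   module load OpenMPI/3.1.4-gcccuda
   mpirun -np 16 pw.x -in W_relax.in > W_relax.out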

Another thing: if I decrease the size of the supercell, for
example to 2*2*2, there is no problem with parallel computing
even if I use more than 30 processors.

Could you help me look at this problem, please? 

The input for QE can be found below. 

Thank you in advance! 

Xin Jin 

 &control
    calculation='vc-relax'
    restart_mode='from_scratch',
    prefix='W_relax',
    pseudo_dir="../../PP_files",
    outdir='./'
 /

 &system
    ibrav= 0,
    celldm(1)=5.972,
    nat= 54,
    ntyp= 1,
    ecutwfc = 50,
    ecutrho = 500,
    occupations='smearing', smearing='mp', degauss=0.06
 /

 &electrons
    diagonalization='david',
    conv_thr = 1.0d-8,
    mixing_beta = 0.5,
 /

 &ions
 /

 &cell
    press = 0.0,
 /

ATOMIC_SPECIES
 W 183.84 W.pbe-spn-kjpaw_psl.1.0.0.UPF

CELL_PARAMETERS {alat}
 3.0 0.0 0.0
 0.0 3.0 0.0
 0.0 0.0 3.0

ATOMIC_POSITIONS {alat}
W 0.00000 0.00000 0.00000
W 0.50000 0.50000 0.50000
W 1.00000 0.00000 0.00000
W 1.50000 0.50000 0.50000
W 2.00000 0.00000 0.00000
W 2.50000 0.50000 0.50000
W 0.00000 1.00000 0.00000
W 0.50000 1.50000 0.50000
W 1.00000 1.00000 0.00000
W 1.50000 1.50000 0.50000
W 2.00000 1.00000 0.00000
W 2.50000 1.50000 0.50000
W 0.00000 2.00000 0.00000
W 0.50000 2.50000 0.50000
W 1.00000 2.00000 0.00000
W 1.50000 2.50000 0.50000
W 2.00000 2.00000 0.00000
W 2.50000 2.50000 0.50000
W 0.00000 0.00000 1.00000
W 0.50000 0.50000 1.50000
W 1.00000 0.00000 1.00000
W 1.50000 0.50000 1.50000
W 2.00000 0.00000 1.00000
W 2.50000 0.50000 1.50000
W 0.00000 1.00000 1.00000
W 0.50000 1.50000 1.50000
W 1.00000 1.00000 1.00000
W 1.50000 1.50000 1.50000
W 2.00000 1.00000 1.00000
W 2.50000 1.50000 1.50000
W 0.00000 2.00000 1.00000
W 0.50000 2.50000 1.50000
W 1.00000 2.00000 1.00000
W 1.50000 2.50000 1.50000
W 2.00000 2.00000 1.00000
W 2.50000 2.50000 1.50000
W 0.00000 0.00000 2.00000
W 0.50000 0.50000 2.50000
W 1.00000 0.00000 2.00000
W 1.50000 0.50000 2.50000
W 2.00000 0.00000 2.00000
W 2.50000 0.50000 2.50000
W 0.00000 1.00000 2.00000
W 0.50000 1.50000 2.50000
W 1.00000 1.00000 2.00000
W 1.50000 1.50000 2.50000
W 2.00000 1.00000 2.00000
W 2.50000 1.50000 2.50000
W 0.00000 2.00000 2.00000
W 0.50000 2.50000 2.50000
W 1.00000 2.00000 2.00000
W 1.50000 2.50000 2.50000
W 2.00000 2.00000 2.00000
W 2.50000 2.50000 2.50000

K_POINTS {automatic}
4 4 4 0 0 0