[QE-users] Fatal error in PMPI_Comm_free: Invalid communicator

Md. Jahid Hasan Sagor md.sagor at maine.edu
Mon Jul 22 19:11:29 CEST 2024


Dear QE experts,

I ran a job for a silicon 4x4x4 supercell (128 atoms in total) and it ended with the following error.

Abort(336148997) on node 40 (rank 40 in comm 0): Fatal error in PMPI_Comm_free: Invalid communicator, error stack:
PMPI_Comm_free(137): MPI_Comm_free(comm=0x7ffc4cef5528) failed
PMPI_Comm_free(85).: Null communicator
Abort(940128773) on node 44 (rank 44 in comm 0): Fatal error in PMPI_Comm_free: Invalid communicator, error stack:
PMPI_Comm_free(137): MPI_Comm_free(comm=0x7ffd12d17fa8) failed
PMPI_Comm_free(85).: Null communicator
Abort(940128773) on node 56 (rank 56 in comm 0): Fatal error in PMPI_Comm_free: Invalid communicator, error stack:
PMPI_Comm_free(137): MPI_Comm_free(comm=0x7fff2e304428) failed
PMPI_Comm_free(85).: Null communicator
slurmstepd: error: *** STEP 1982493.0 ON node-146 CANCELLED AT 2024-07-22T12:49:17 ***
srun: Job step aborted: Waiting up to 32 seconds for job step to finish.
srun: error: node-149: tasks 48-71: Killed
srun: error: node-146: tasks 0-23: Killed
srun: error: node-147: tasks 24-47: Killed

Here is my job file; I have also attached the input file si.scf.in for your kind perusal.


#!/usr/bin/env bash
#SBATCH --job-name=hasanjob
#SBATCH --nodes=3                       # node count
#SBATCH --ntasks-per-node=24         # number of tasks per node
#SBATCH --cpus-per-task=1           # cpu-cores per task (>1 if multi-threaded tasks)
#SBATCH --mem-per-cpu=5gb                    # Job memory request
#SBATCH --time=120:00:00               # Time limit hrs:min:sec
#SBATCH --output=sdc_save.txt              # Standard output and error log
#SBATCH --partition=skylake           # MOAB/Torque called these queues

module unload mvapich2
module load mvapich2-intel/2.3.5
module load quantum-espresso
srun pw.x < si.scf.in > si.scf.out
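
(For reference, pw.x also accepts the input file through its -inp, or equivalently -i, command-line option instead of shell stdin redirection; with some MPI launchers stdin is delivered only to rank 0, so this variant of the launch line is sometimes suggested. This is a sketch of the alternative, not a verified fix for this particular abort:

srun pw.x -inp si.scf.in > si.scf.out
)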


Would you please help me solve this problem? I have opened the structure in XCrySDen and it looks fine to me.


Best
Md Jahid Hasan
PhD student
Mechanical Engineering
University of Maine
-------------- next part --------------
A non-text attachment was scrubbed...
Name: si.scf.in
Type: application/octet-stream
Size: 8894 bytes
Desc: not available
URL: <http://lists.quantum-espresso.org/pipermail/users/attachments/20240722/a4290691/attachment.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: siv1.slurm
Type: application/octet-stream
Size: 652 bytes
Desc: not available
URL: <http://lists.quantum-espresso.org/pipermail/users/attachments/20240722/a4290691/attachment-0001.obj>

