[QE-users] Fatal error in PMPI_Comm_free: Invalid communicator
Md. Jahid Hasan Sagor
md.sagor at maine.edu
Mon Jul 22 19:11:29 CEST 2024
Dear QE experts,

I ran a job for a 4x4x4 silicon supercell (128 atoms in total) and it ended with the following error.
Abort(336148997) on node 40 (rank 40 in comm 0): Fatal error in
PMPI_Comm_free: Invalid communicator, error stack:
PMPI_Comm_free(137): MPI_Comm_free(comm=0x7ffc4cef5528) failed
PMPI_Comm_free(85).: Null communicator
Abort(940128773) on node 44 (rank 44 in comm 0): Fatal error in
PMPI_Comm_free: Invalid communicator, error stack:
PMPI_Comm_free(137): MPI_Comm_free(comm=0x7ffd12d17fa8) failed
PMPI_Comm_free(85).: Null communicator
Abort(940128773) on node 56 (rank 56 in comm 0): Fatal error in
PMPI_Comm_free: Invalid communicator, error stack:
PMPI_Comm_free(137): MPI_Comm_free(comm=0x7fff2e304428) failed
PMPI_Comm_free(85).: Null communicator
slurmstepd: error: *** STEP 1982493.0 ON node-146 CANCELLED AT 2024-07-22T12:49:17 ***
srun: Job step aborted: Waiting up to 32 seconds for job step to finish.
srun: error: node-149: tasks 48-71: Killed
srun: error: node-146: tasks 0-23: Killed
srun: error: node-147: tasks 24-47: Killed
Here is my job file. I have also attached the input file si.scf.in for your kind perusal.
#!/usr/bin/env bash
#SBATCH --job-name=hasanjob
#SBATCH --nodes=3 # node count
#SBATCH --ntasks-per-node=24 # number of tasks per node
#SBATCH --cpus-per-task=1        # cpu-cores per task (>1 if multi-threaded tasks)
#SBATCH --mem-per-cpu=5gb # Job memory request
#SBATCH --time=120:00:00 # Time limit hrs:min:sec
#SBATCH --output=sdc_save.txt # Standard output and error log
#SBATCH --partition=skylake # MOAB/Torque called these queues
module unload mvapich2
module load mvapich2-intel/2.3.5
module load quantum-espresso
srun pw.x < si.scf.in > si.scf.out
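For reference, the PW user guide also documents passing the input file to pw.x with a command-line flag rather than stdin redirection; a minimal sketch of that variant of the last line is below (not tested on this cluster, and it assumes the pw.x provided by the quantum-espresso module accepts the -i option):

# Variant of the run line using pw.x's documented -i/-input option instead of
# shell redirection, so the launcher does not need to forward stdin to every rank.
srun pw.x -i si.scf.in > si.scf.out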
Would you please help me solve this problem? I have opened the structure in XCrySDen and it looks fine to me.
Best
Md Jahid Hasan
PhD student
Mechanical Engineering
University of Maine
-------------- next part --------------
A non-text attachment was scrubbed...
Name: si.scf.in
Type: application/octet-stream
Size: 8894 bytes
Desc: not available
URL: <http://lists.quantum-espresso.org/pipermail/users/attachments/20240722/a4290691/attachment.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: siv1.slurm
Type: application/octet-stream
Size: 652 bytes
Desc: not available
URL: <http://lists.quantum-espresso.org/pipermail/users/attachments/20240722/a4290691/attachment-0001.obj>