[QE-users] QE GPU ORTE_ERROR problem
Sitangshu Bhattacharya
sitangshu at iiita.ac.in
Thu Apr 11 20:45:57 CEST 2024
Hi,
I am getting an MPI error when running the GPU version of QE 7.3.1. I
used the following commands to build it:
module purge
module load nvhpc_23.5/nvhpc/23.5
./configure --with-cuda=$PATH --with-cuda-cc=70 --with-cuda-runtime=12.1 \
    --enable-parallel --enable-openmp --with-cuda-mpi=yes MPIF90=mpif90 \
    FC=nvfortran CC=nvc CXX=nvc++
Running nvcc -V reports CUDA 12.2. The build went smoothly and all the
binaries were generated. I then went to the bin directory and ran ./pw.x
directly. Unfortunately, this prints:
[login02:158963] [[INVALID],INVALID] ORTE_ERROR_LOG: A system-required
executable either could not be found or was not executable by this user in
file ../../../../../orte/mca/ess/singleton/ess_singleton_module.c at line
388
[login02:158963] [[INVALID],INVALID] ORTE_ERROR_LOG: A system-required
executable either could not be found or was not executable by this user in
file ../../../../../orte/mca/ess/singleton/ess_singleton_module.c at line
166
--------------------------------------------------------------------------
Sorry! You were supposed to get help about:
orte_init:startup:internal-failure
But I couldn't open the help file:
/proj/nv/libraries/Linux_x86_64/23.5/openmpi/227312-rel-2/share/openmpi/help-orte-runtime:
No such file or directory. Sorry!
--------------------------------------------------------------------------
--------------------------------------------------------------------------
Sorry! You were supposed to get help about:
mpi_init:startup:internal-failure
But I couldn't open the help file:
/proj/nv/libraries/Linux_x86_64/23.5/openmpi/227312-rel-2/share/openmpi/help-mpi-runtime.txt:
No such file or directory. Sorry!
--------------------------------------------------------------------------
*** An error occurred in MPI_Init_thread
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
*** and potentially your MPI job)
[login02:158963] Local abort before MPI_INIT completed completed
successfully, but am not able to aggregate error messages, and not able to
guarantee that all other processes were killed!
Any solutions?
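
A possible first check, offered as a sketch rather than a confirmed diagnosis: when an Open MPI program is started directly instead of through mpirun, the ORTE layer typically needs to locate its helper daemon, orted, and the log above complains that a required executable could not be found. The snippet below only reports whether mpirun and orted resolve on the current PATH and whether OPAL_PREFIX is set. The function name check_tool is hypothetical, and the assumption that the loaded nvhpc module provides these tools is not confirmed by the post.

```shell
# Sketch: report whether the Open MPI launcher and helper daemon are visible.
# check_tool is a hypothetical helper name, not part of QE or Open MPI.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found at $(command -v "$1")"
    else
        echo "$1: NOT found on PATH"
        return 1
    fi
}

# Assumption: mpirun and orted come from the same Open MPI installation
# that provided the mpif90 wrapper used at configure time.
for tool in mpirun orted; do
    check_tool "$tool" || true
done

# OPAL_PREFIX can point a relocated Open MPI installation at its real location.
echo "OPAL_PREFIX=${OPAL_PREFIX:-<unset>}"
```

If orted is reported as missing, or the help-file path in the log does not exist on this machine, the Open MPI installation may have been relocated after it was built. Launching through mpirun (for example, mpirun -np 1 ./pw.x) from a compute node rather than the login node is another thing worth trying.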
Regards,
**********************************************
Sitangshu Bhattacharya (সিতাংশু ভট্টাচার্য), Ph.D
Assistant Professor,
Room No. 2221, CC-1,
Electronic Structure Theory Group,
Department of Electronics and Communication Engineering,
Indian Institute of Information Technology-Allahabad
Uttar Pradesh 211 012
India
Telephone: 91-532-2922000 Extn.: 2131
Web-page: http://profile.iiita.ac.in/sitangshu/
Institute: http://www.iiita.ac.in/