[QE-users] Issue with running parallel version of QE 6.3 on more than 16 CPUs

Martina Lessio ml4132 at columbia.edu
Thu Aug 30 18:49:18 CEST 2018


Dear Paolo,

Thanks for testing my input file. I guess this means there is something
wrong with how I compiled the code, although it's hard to understand why
the error occurs only when I run my job on more than 16 processors.
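
To narrow down the process count at which the failure starts, something
like the following might help. It is only a sketch: it assumes mpirun and
pw.x are on the PATH, reuses the input file name MoTe2_opt.in from my
earlier command, and passes the input with pw.x's -i option instead of
stdin redirection. The helper names (build_pw_command, scan_commands,
run_one) are made up for illustration.

```python
import shlex
import subprocess


def build_pw_command(nprocs, input_file="MoTe2_opt.in", npools=None):
    """Return the argument list for one pw.x run under mpirun."""
    cmd = ["mpirun", "-np", str(nprocs), "pw.x"]
    if npools is not None:
        # -nk sets the number of k-point pools (optional).
        cmd += ["-nk", str(npools)]
    # Assumption: read the input via -i rather than "< file".
    cmd += ["-i", input_file]
    return cmd


def scan_commands(proc_counts, input_file="MoTe2_opt.in"):
    """Return one printable command line per process count."""
    return [shlex.join(build_pw_command(n, input_file)) for n in proc_counts]


def run_one(nprocs, input_file="MoTe2_opt.in"):
    """Run a single case, save its output, and return the exit code."""
    with open(f"pw_np{nprocs}.out", "w") as out:
        result = subprocess.run(
            build_pw_command(nprocs, input_file),
            stdout=out,
            stderr=subprocess.STDOUT,
        )
    return result.returncode


if __name__ == "__main__":
    # Print the commands first; call run_one(n) to actually launch them.
    for line in scan_commands([16, 17, 18, 24]):
        print(line)
```

Comparing the exit codes and output files for 16, 17, and 18 processes
would show whether the crash really begins right above 16.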

Thanks again for your time.

All the best,
Martina

On Thu, Aug 30, 2018 at 12:43 PM Paolo Giannozzi <p.giannozzi at gmail.com>
wrote:

> It works for me on 18, 24, and 32 processes, at least with the development
> version. I ran it on a 16-processor machine, but the number of physical
> cores doesn't matter: the code knows nothing about the actual number of
> cores, only about the number of MPI processes.
>
> Paolo
>
> On Thu, Aug 30, 2018 at 6:23 PM, Martina Lessio <ml4132 at columbia.edu>
> wrote:
>
>> Dear Paolo,
>>
>> Apologies for not including those details. The sample error message
>> reported in my previous email came from a calculation run on 18 CPUs
>> (I get similar messages for any number of CPUs larger than 16), using
>> the following submission command:
>> mpirun -np 18 pw.x < MoTe2_opt.in
>>
>> Thank you in advance for your help.
>>
>> All the best,
>> Martina
>>
>> On Thu, Aug 30, 2018 at 12:09 PM Paolo Giannozzi <p.giannozzi at gmail.com>
>> wrote:
>>
>>> Please report the exact conditions under which you are running the
>>> 24-processor case: something like
>>>   mpirun -np 24 pw.x -nk .. -nd .. -whatever_option
>>>
>>> Paolo
>>>
>>> On Wed, Aug 29, 2018 at 11:49 PM, Martina Lessio <ml4132 at columbia.edu>
>>> wrote:
>>>
>>>> Dear all,
>>>>
>>>> I have been successfully using QE 5.4 for a while now but recently
>>>> decided to install the newest version hoping that some issues I have been
>>>> experiencing with 5.4 would be resolved. However, I now have some issues
>>>> when running version 6.3 in parallel. In particular, if I run a sample
>>>> calculation (input file provided below) on more than 16 processors, the
>>>> calculation crashes after printing the line "Starting wfcs are random",
>>>> and the following error message is printed in the output file:
>>>> [compute-0-5.local:5241] *** An error occurred in MPI_Bcast
>>>> [compute-0-5.local:5241] *** on communicator MPI COMMUNICATOR 20 SPLIT FROM 18
>>>> [compute-0-5.local:5241] *** MPI_ERR_TRUNCATE: message truncated
>>>> [compute-0-5.local:5241] *** MPI_ERRORS_ARE_FATAL: your MPI job will now abort
>>>>
>>>> --------------------------------------------------------------------------
>>>> mpirun has exited due to process rank 16 with PID 5243 on
>>>> node compute-0-5.local exiting improperly. There are two reasons this
>>>> could occur:
>>>>
>>>> 1. this process did not call "init" before exiting, but others in
>>>> the job did. This can cause a job to hang indefinitely while it waits
>>>> for all processes to call "init". By rule, if one process calls "init",
>>>> then ALL processes must call "init" prior to termination.
>>>>
>>>> 2. this process called "init", but exited without calling "finalize".
>>>> By rule, all processes that call "init" MUST call "finalize" prior to
>>>> exiting or it will be considered an "abnormal termination"
>>>>
>>>> This may have caused other processes in the application to be
>>>> terminated by signals sent by mpirun (as reported here).
>>>>
>>>> --------------------------------------------------------------------------
>>>> [compute-0-5.local:05226] 1 more process has sent help message
>>>> help-mpi-errors.txt / mpi_errors_are_fatal
>>>> [compute-0-5.local:05226] Set MCA parameter "orte_base_help_aggregate"
>>>> to 0 to see all help / error messages
>>>>
>>>>
>>>> Note that I have been running QE 5.4 on 24 CPUs on this same computer
>>>> cluster without any issues. I am copying my input file at the end of
>>>> this email.
>>>>
>>>> Any help with this would be greatly appreciated.
>>>> Thank you in advance.
>>>>
>>>> All the best,
>>>> Martina
>>>>
>>>> Martina Lessio
>>>> Department of Chemistry
>>>> Columbia University
>>>>
>>>> *Input file:*
>>>> &control
>>>>     calculation = 'relax'
>>>>     restart_mode='from_scratch',
>>>>     prefix='MoTe2_bulk_opt_1',
>>>>     pseudo_dir = '/home/mlessio/espresso-5.4.0/pseudo/',
>>>>     outdir='/home/mlessio/espresso-5.4.0/tempdir/'
>>>>  /
>>>>  &system
>>>>     ibrav= 4, A=3.530, B=3.530, C=13.882, cosAB=-0.5, cosAC=0, cosBC=0,
>>>>     nat= 6, ntyp= 2,
>>>>     ecutwfc =60.
>>>>     occupations='smearing', smearing='gaussian', degauss=0.01
>>>>     nspin =1
>>>>  /
>>>>  &electrons
>>>>     mixing_mode = 'plain'
>>>>     mixing_beta = 0.7
>>>>     conv_thr =  1.0d-10
>>>>  /
>>>>  &ions
>>>>  /
>>>> ATOMIC_SPECIES
>>>>  Mo  95.96 Mo_ONCV_PBE_FR-1.0.upf
>>>>  Te  127.6 Te_ONCV_PBE_FR-1.1.upf
>>>> ATOMIC_POSITIONS {crystal}
>>>> Te     0.333333334         0.666666643         0.625000034
>>>> Te     0.666666641         0.333333282         0.375000000
>>>> Te     0.666666641         0.333333282         0.125000000
>>>> Te     0.333333334         0.666666643         0.874999966
>>>> Mo     0.333333334         0.666666643         0.250000000
>>>> Mo     0.666666641         0.333333282         0.750000000
>>>>
>>>> K_POINTS {automatic}
>>>>   8 8 2 0 0 0
>>>>
>>>>
>>>> _______________________________________________
>>>> users mailing list
>>>> users at lists.quantum-espresso.org
>>>> https://lists.quantum-espresso.org/mailman/listinfo/users
>>>>
>>>
>>>
>>>
>>> --
>>> Paolo Giannozzi, Dip. Scienze Matematiche Informatiche e Fisiche,
>>> Univ. Udine, via delle Scienze 208, 33100 Udine, Italy
>>> Phone +39-0432-558216, fax +39-0432-558222
>>
>>
>>
>> --
>> Martina Lessio, Ph.D.
>> Frontiers of Science Lecturer in Discipline
>> Postdoctoral Research Scientist
>> Department of Chemistry
>> Columbia University
>>
>
>
>
> --
> Paolo Giannozzi, Dip. Scienze Matematiche Informatiche e Fisiche,
> Univ. Udine, via delle Scienze 208, 33100 Udine, Italy
> Phone +39-0432-558216, fax +39-0432-558222



-- 
Martina Lessio, Ph.D.
Frontiers of Science Lecturer in Discipline
Postdoctoral Research Scientist
Department of Chemistry
Columbia University