[Pw_forum] problem with phonon in parallel

Eduardo Ariel Menendez P emenendez at macul.ciencias.uchile.cl
Fri Mar 11 21:09:45 CET 2005


Hello,

I have a problem running the phonon code in parallel. I can run pw.x with
no apparent problem, but ph.x fails.
For example, I ran example 06 step by step:

mdo_mpi_fast /home/gonzalo/eariel/Compilaciones/Espresso/bin/pw.x < alas.scf.in >alas.scf.out

mdo_mpi_fast /home/gonzalo/eariel/Compilaciones/Espresso/bin/ph.x < alas.phG.in  > alas.phG.out

and this is what the output of ph.x looks like:

MPI_Recv: message truncated (rank 1, comm 4) %Really early problem

     Program PHONON    v.2.1.2  starts ...
     Today is 11Mar2005 at 13:35: 9

     Parallel version (MPI)

     Number of processors in use:       2
     R & G space division:  nprocp =    2

     Ultrasoft (Vanderbilt) Pseudopotentials

     Reading file alas.save ...
     read complete

     Reading file alas.save ...
     read complete

     Planes per process (thick) : nr3 = 20 npp =  10 ncplane =  400

.......................................

     Atomic displacements:
     There are   2 irreducible representations

     Representation     1      3 modes - To be done

     Representation     2      3 modes - To be done
Rank (1, MPI_COMM_WORLD): Call stack within LAM:
Rank (1, MPI_COMM_WORLD):  - MPI_Recv()
Rank (1, MPI_COMM_WORLD):  - MPI_Bcast()
Rank (1, MPI_COMM_WORLD):  - MPI_Allreduce()
Rank (1, MPI_COMM_WORLD):  - main()

and I receive this error message:

-----------------------------------------------------------------------------
One of the processes started by mpirun has exited with a nonzero exit
code.  This typically indicates that the process finished in error.
If your process did not finish in error, be sure to include a "return
0" or "exit(0)" in your C code before exiting the application.

PID 27484 failed on node n0 (1.1.2.3) with exit status 1.
-----------------------------------------------------------------------------
mpirun failed with exit status 1

Well, this MPI problem is not always so evident.
I also ran with the option -npool 2 (there are two k-points in example
06).

In this case the run begins OK:
     Program PHONON    v.2.1.2  starts ...
     Today is 11Mar2005 at 14:12:47

     Parallel version (MPI)

     Number of processors in use:       2
     K-points division:     npool  =    2

but it stops near the end, while printing the dielectric constant:
     Electric Fields Calculation

      iter #   1 total cpu time :     0.6 secs   av.it.:   6.3
      thresh= 0.100E-01 alpha_mix =  0.700 |ddv_scf|^2 =  0.585E-05

      iter #   2 total cpu time :     0.9 secs   av.it.:   9.3
      thresh= 0.242E-03 alpha_mix =  0.700 |ddv_scf|^2 =  0.742E-06

      iter #   3 total cpu time :     1.2 secs   av.it.:   9.3
      thresh= 0.861E-04 alpha_mix =  0.700 |ddv_scf|^2 =  0.188E-08

      iter #   4 total cpu time :     1.5 secs   av.it.:  10.0
      thresh= 0.433E-05 alpha_mix =  0.700 |ddv_scf|^2 =  0.120E-10

      iter #   5 total cpu time :     1.8 secs   av.it.:   9.3
      thresh= 0.347E-06 alpha_mix =  0.700 |ddv_scf|^2 =  0.573E-12

     End of electric fields calculation

          Dielectric constant in cartesian axis

          (      27.811916498       0.
-----------------------------------------------------------------------------
One of the processes started by mpirun has exited with a nonzero exit
code.  This typically indicates that the process finished in error.
If your process did not finish in error, be sure to include a "return
0" or "exit(0)" in your C code before exiting the application.

PID 27533 failed on node n1 (1.1.2.3) due to signal 11.
-----------------------------------------------------------------------------
mpirun failed with exit status 11


I would appreciate any suggestions for locating the cause.

Best regards
Eduardo

Eduardo A. Menendez Proupin
Department of Physics
Faculty of Science
University of Chile
Las Palmeras 3425
Ñuñoa, Santiago
Chile
Phone: 56+2+678 74 11
http://fisica.ciencias.uchile.cl/~emenendez/


