[Pw_forum] PHONON errors varies when i use 6 or 2 cpu?

Xunlei Ding ding at sissa.it
Fri Jun 8 09:59:10 CEST 2007


Dear Xu,
Yes, you are right: ph.x needs to read the wavefunction (wfc) files
written by the scf calculation, and by default each process writes its
own file, so the number of CPUs should be the same for both runs.
If you want to change the number of CPUs, you can try setting
wf_collect=.true. in the scf calculation.
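
As a minimal sketch of that suggestion (an untested assumption about how
it would look, not a verified input): wf_collect goes in the &control
namelist of the pw.x scf input. The prefix and outdir are copied from the
ph.x input quoted below, calculation='scf' is assumed, and the rest of
the scf input is unchanged and not shown.

```
 &control
   calculation='scf',
   prefix='fxx_specify_ibra_500_12+force',
   outdir='/raid/xx/pwscf/tmp/',
   wf_collect=.true.,   ! collect the wavefunctions from all processes when writing to outdir
 /
```

The scf run has to be repeated with this setting before ph.x is started
on a different number of CPUs.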

I also suggest running a few small tests to check these points.

Best wishes,
Ding

xu yuehua wrote:

> Dear Ding:
> If I understand your idea correctly, it means that if I use 6 CPUs for
> the scf calculation, I must use the same number of CPUs for the
> subsequent phonon calculation. Have I understood you correctly?
> I need your help. Thanks a lot.
>
>  
> 2007/6/8, Xunlei Ding <ding at sissa.it>:
>
>     Dear Xu,
>     I think the error in the 6-CPU calculation occurred simply because
>     one of the six nodes went down, and the error in the 4-CPU
>     calculation occurred because you changed from 6 CPUs to 4.
>     So my suggestion is to run the ph calculation with 6 CPUs again.
>
>     I hope it works.
>
>     Yours,
>     ding
>
>
>
>     xu yuehua wrote:
>
>     > Hi everyone,
>     > Today I ran into a problem while computing phonons. First I ran
>     > the scf calculation on 6 CPUs, then I also used 6 CPUs for the
>     > phonon calculation at Gamma, but the following error appeared in
>     > the output file:
>     >
>     >
>     >
>     >  Proc/  planes cols    G   planes cols    G    columns  G
>     >  Pool       (dense grid)      (smooth grid)   (wavefct grid)
>     >   1      5   3284  53988    4   2408  34052  719   5577
>     >   2      4   3283  53987    4   2407  34051  719   5577
>     >   3      4   3283  53987    4   2407  34049  719   5577
>     >   4      4   3283  53987    4   2407  34051  719   5577
>     >   5      4   3283  53987    4   2407  34049  719   5577
>     >   6      4   3283  53987    4   2407  34051  720   5576
>     >   0     25  19699 323923   24  14443 204303 4315  33461
>     >
>     >
>     >      nbndx  =    20  nbnd   =    20  natomwfc =    30  npwx  =    4282
>     >      nelec  =  40.00  nkb   =    50  ngl    =   10269
>     > p0_9381:  p4_error: net_recv read:  probable EOF on socket: 1
>     > Killed by signal 2.
>     > forrtl: error (69): process interrupted (SIGINT)
>     > Killed by signal 2.
>     > Killed by signal 2.
>     > Killed by signal 2.
>     > Killed by signal 2.
>     > p0_9381: (12.363281) net_send: could not write to fd=4, errno = 32
>     > Fri Jun  8 09:41:35 CST 2007
>     >
>     > Since I did not know the reason, I then tried to run the phonon
>     > calculation on 4 CPUs. This time the error was:
>     >
>     >
>     >
>     >
>     > Representation    44      1 modes - To be done
>     >
>     >      Representation    45      1 modes - To be done
>     >  IOS = 36
>     >
>     >  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
>     >      from davcio : error #        20
>     >      i/o error in davcio
>     >  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
>     >
>     >
>     >      stopping ...
>     >
>     >  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
>     >
>     >      from davcio : error #        20
>     >      i/o error in davcio
>     >  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
>     >
>     >
>     >      stopping ...
>     > [0] MPI Abort by user Aborting program !
>     > [0] Aborting program!
>     > p0_11006:  p4_error: : 0
>     > Killed by signal 2.
>     > forrtl: error (69): process interrupted (SIGINT)
>     > p0_11006: (18.296875 ) net_send: could not write to fd=4, errno = 32
>     > Fri Jun  8 09:57:22 CST 2007
>     >
>     > Both cases above used the same input:
>     > phonons of fiveringwater at Gamma
>     >  &inputph
>     >   tr2_ph=1.0d-14,
>     >   prefix='fxx_specify_ibra_500_12+force',
>     >   epsil=.true.,
>     >   amass(1)=1.0,
>     >   amass(2)=15.999,
>     >   outdir='/raid/xx/pwscf/tmp/',
>     >   fildyn='fxx.dynG',
>     >  /
>     > 0.0 0.0 0.0
>     >
>     >
>     >
>     >
>     >
>     > So my question is: why does the error change with the number of
>     > CPUs? A few days ago I used 2 CPUs to run relax, scf and phonon
>     > calculations for another system and everything went well, but
>     > now...?
>     > I need your help. Thanks.
>     >
>     > --
>     > Xu Yuehua
>     > physics Department of Nanjing university
>     > China
>
> -- 
> Xu Yuehua
> physics Department of Nanjing university
> China 



