[Pw_forum] PHONON errors varies when i use 6 or 2 cpu?
Xunlei Ding
ding at sissa.it
Fri Jun 8 09:59:10 CEST 2007
Dear Xu,
Yes, you are right: ph.x needs to read the wavefunction (wfc) files written
by the scf calculation, and each process writes its own file. So the number
of CPUs should be the same in both runs.
If you want to change the number of CPUs, you can try setting
wf_collect=.true. in the scf calculation.
I also suggest running some small tests to check these points.
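
The wf_collect suggestion above might look like this in the scf input. This is only a minimal sketch of the &control namelist: the prefix and outdir are copied from the ph.x input quoted later in this thread, and every other setting of the scf input is omitted and has to come from your own file.

```
&control
   calculation = 'scf',
   prefix      = 'fxx_specify_ibra_500_12+force',  ! same prefix as in the ph.x input
   outdir      = '/raid/xx/pwscf/tmp/',            ! same outdir as in the ph.x input
   wf_collect  = .true.                            ! collect the wavefunctions instead of leaving one file per process
/
```

After changing this flag, the scf run has to be repeated before ph.x is started, since ph.x reads whatever the scf run left in outdir.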
Best wishes,
Ding
xu yuehua wrote:
> Dear Ding:
> Let me check that I understand your idea. If it is correct, it means that
> if I use 6 CPUs for the scf run, then I must use the same number of CPUs
> for the subsequent phonon calculation. Have I understood you correctly?
> I need your help. Thanks a lot.
>
>
> 2007/6/8, Xunlei Ding <ding at sissa.it>:
>
> Dear Xu,
> I think the error in the 6-CPU calculation occurred simply because one of
> the six nodes is down, and the error in the 4-CPU calculation occurred
> because you changed from 6 CPUs to 4.
> So my suggestion is to run the ph calculation with 6 CPUs again.
>
> I hope it works.
>
> Yours,
> ding
>
>
>
> xu yuehua wrote:
>
> > Hi everyone,
> > Today I ran into a problem when computing phonons. First I ran scf using
> > 6 CPUs, then I also used 6 CPUs to run the phonon calculation at Gamma,
> > but an error appeared in the output file:
> >
> >
> >
> >  Proc/  planes   cols       G   planes   cols       G   columns      G
> >   Pool      (dense grid)           (smooth grid)        (wavefct grid)
> >      1       5   3284   53988        4   2408   34052       719   5577
> >      2       4   3283   53987        4   2407   34051       719   5577
> >      3       4   3283   53987        4   2407   34049       719   5577
> >      4       4   3283   53987        4   2407   34051       719   5577
> >      5       4   3283   53987        4   2407   34049       719   5577
> >      6       4   3283   53987        4   2407   34051       720   5576
> >      0      25  19699  323923       24  14443  204303      4315  33461
> >
> >
> > nbndx = 20 nbnd = 20 natomwfc = 30 npwx = 4282
> > nelec = 40.00 nkb = 50 ngl = 10269
> > p0_9381: p4_error: net_recv read: probable EOF on socket: 1
> > Killed by signal 2.
> > forrtl: error (69): process interrupted (SIGINT)
> > Killed by signal 2.
> > Killed by signal 2.
> > Killed by signal 2.
> > Killed by signal 2.
> > p0_9381: (12.363281) net_send: could not write to fd=4, errno = 32
> > Fri Jun 8 09:41:35 CST 2007
> >
> > Since I did not know the reason, I then tried using 4 CPUs to
> > compute the phonons. This time the error was:
> >
> >
> >
> >
> > Representation 44 1 modes - To be done
> >
> > Representation 45 1 modes - To be done
> > IOS = 36
> >
> > %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
> > from davcio : error # 20
> > i/o error in davcio
> > %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
> >
> >
> > stopping ...
> >
> > %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
> > from davcio : error # 20
> > i/o error in davcio
> > %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
> >
> >
> > stopping ...
> > [0] MPI Abort by user Aborting program !
> > [0] Aborting program!
> > p0_11006: p4_error: : 0
> > Killed by signal 2.
> > forrtl: error (69): process interrupted (SIGINT)
> > p0_11006: (18.296875 ) net_send: could not write to fd=4, errno = 32
> > Fri Jun 8 09:57:22 CST 2007
> >
> > Both cases above used the same input:
> > phonons of fiveringwater at Gamma
> > &inputph
> > tr2_ph=1.0d-14,
> > prefix='fxx_specify_ibra_500_12+force',
> > epsil=.true.,
> > amass(1)=1.0,
> > amass(2)=15.999,
> > outdir='/raid/xx/pwscf/tmp/',
> > fildyn='fxx.dynG',
> > /
> > 0.0 0.0 0.0
> >
> >
> >
> >
> >
> > So my question is: why does a different number of CPUs change the
> > error? A few days ago I used 2 CPUs to run relax, scf, and phonon
> > calculations for another system, and everything went well, but now...?
> > I need your help. Thanks.
> >
> > --
> > Xu Yuehua
> > physics Department of Nanjing university
> > China
>
> _______________________________________________
> Pw_forum mailing list
> Pw_forum at pwscf.org
> http://www.democritos.it/mailman/listinfo/pw_forum
>
>
>
>
> --
> Xu Yuehua
> physics Department of Nanjing university
> China