[Pw_forum] Distributing phonon calculations to different machines

Dal Corso Andrea dalcorso at sissa.it
Thu Jul 8 08:40:53 CEST 2010


On Thu, 2010-07-08 at 11:11 +0800, Huiqun Zhou wrote:
> Andrea,
> 
> I checked the directory, all files are there. See below,
> 
> ls /gpfsTMP/hqzhou/tmp/sill_1.04v0/q1/_ph0sill_1.04v0.phsave/
> data-file.xml      data-file.xml.2  data-file.xml.5  data-file.xml.8
> data-file.xml.1    data-file.xml.3  data-file.xml.6
> data-file.xml.1.0  data-file.xml.4  data-file.xml.7
> 

What about the directory /gpfsTMP/hqzhou/tmp/sill_1.04v0/q1?
Are all the files there? And what about the directory
/gpfsTMP/hqzhou/tmp/sill_1.04v0/q1/sill_1.04v0.save?
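
One way to answer this kind of question is to compare a per-q copy against
the directory it was copied from. The following is only a minimal sketch:
the helper name missing_in_copy and its two-argument form are illustrative
and are not part of Quantum ESPRESSO.

```shell
# List every entry (file or directory, at any depth) that exists under the
# source directory but is absent from the copy. Prints nothing when the copy
# is complete. Hypothetical helper, not a Quantum ESPRESSO tool.
missing_in_copy() {
    src=$1
    dst=$2
    ( cd "$src" && find . -print ) | sort | while IFS= read -r entry; do
        # Skip the source directory itself.
        [ "$entry" = "." ] && continue
        [ -e "$dst/$entry" ] || printf '%s\n' "${entry#./}"
    done
}
```

Assuming the layout used in this thread, it could be called as
missing_in_copy "$TMP_DIR/$system/$system.save" "$TMP_DIR/$system/q1/$system.save"
and any line it prints names something that did not reach the q1 copy.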

Andrea

> It's a little bit insane: this error only occurs when I compute the
> first q point (gamma). ph.x works fine when calculating the other
> q points. As you can see in the script snippet in my second post in this
> thread, I created the necessary directories and copied all needed
> files in advance, before distributing the jobs.
> 
> Any ideas?
> 
> huiqun zhou
> @earth sciences, nanjing university, china
> 
> ----- Original Message ----- 
> From: "Dal Corso Andrea" <dalcorso at sissa.it>
> To: "PWSCF Forum" <pw_forum at pwscf.org>
> Sent: Wednesday, July 07, 2010 8:06 PM
> Subject: Re: [Pw_forum] Distributing phonon calculations to different machines
> 
> 
> > On Tue, 2010-07-06 at 17:55 +0800, Huiqun Zhou wrote:
> >> Thanks, Andrea.
> >>
> >> As indicated in the script below, I have copied all files and directories
> >> created by the pw.x run.
> >>
> >> if test ! -d $TMP_DIR/${system}/q${i} ; then
> >>          mkdir $TMP_DIR/${system}/q${i}
> >>          cp -r $TMP_DIR/${system}/${system}.* $TMP_DIR/${system}/q${i}
> >> fi
> >>
> >
> > Are the files really there? If all the files are in the directory
> > $TMP_DIR/${system}/q${i} there is no reason why the ph.x code stops.
> > If they are not, I cannot help you. It is not a problem of ph.x.
> >
> >
> >> BTW, could you please describe in more detail the newly added
> >> information in INPUT_PH.html about distributing phonon calculations to
> >> a cluster? If I set wf_collect to true, there should be no relation
> >> in nproc and npool between the pw.x run and the two later ph.x runs,
> >> right? Taking AlAs in the GRID_example as an example, if I want to use
> >> a server with 8 CPU cores to do the calculation for each q point
> >> (8 servers in total), what are the values of images and pools?
> >>
> >>
> > You are right, the explanation refers only to the case
> > wf_collect=.false. However, image parallelization of ph.x is very
> > experimental, so be patient. At the moment it divides both q and irreps.
> > Load balancing on q only is not implemented.
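
The arithmetic behind the question about images can be sketched as follows.
This is an illustration only: the helper name procs_per_image is
hypothetical, and it assumes that the MPI processes are split evenly among
the images, so the total must be a multiple of the number of images.

```shell
# Print how many MPI processes each image receives, or return non-zero
# when the total cannot be divided evenly among the images.
# Hypothetical helper, not part of Quantum ESPRESSO.
procs_per_image() {
    total=$1
    nimage=$2
    [ "$nimage" -gt 0 ] || return 1
    [ $((total % nimage)) -eq 0 ] || return 1
    echo $((total / nimage))
}
```

For 8 servers with 8 cores each, procs_per_image 64 8 gives 8 processes per
image. Because the work is divided over both q points and irreps, as stated
above, an image does not necessarily correspond to exactly one q point.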
> >
> >
> > Andrea
> >
> >
> >
> >> Huiqun Zhou
> >> @Earth Sciences, Nanjing University, China
> >>
> >> ----- Original Message ----- 
> >> From: "Dal Corso Andrea" <dalcorso at sissa.it>
> >> To: "PWSCF Forum" <pw_forum at pwscf.org>
> >> Sent: Tuesday, July 06, 2010 4:14 PM
> >> Subject: Re: [Pw_forum] Distributing phonon calculations to different machines
> >>
> >>
> >> > On Tue, 2010-07-06 at 12:13 +0800, Huiqun Zhou wrote:
> >> >> Sorry, I sent an unfinished message.
> >> >>
> >> >> When using _ph0{prefix}.phsave, I got the error message shown in the
> >> >> previous message.
> >> >>
> >> >> Here is the snippet of my script for distributing lsf tasks:
> >> >>
> >> >> ......
> >> >> nq=`sed -n '2p' ./${system}_q${nq1}${nq2}${nq3}.dyn0`
> >> >>
> >> >> for ((i=1; i<=$nq; i++))
> >> >> do
> >> >>     if test ! -d $TMP_DIR/${system}/q${i} ; then
> >> >>         mkdir $TMP_DIR/${system}/q${i}
> >> >>         cp -r $TMP_DIR/${system}/${system}.* $TMP_DIR/${system}/q${i}
> >> >>     fi
> >> >>     if test ! -d $TMP_DIR/${system}/q${i}/_ph0${system}.phsave ; then
> >> >>         mkdir $TMP_DIR/${system}/q${i}/_ph0${system}.phsave
> >> >>         cp -r $TMP_DIR/${system}/_ph0${system}.phsave/* \
> >> >>               $TMP_DIR/${system}/q${i}/_ph0${system}.phsave
> >> >>     fi
> >> >> done
> >> >>
> >> >> for ((i=1; i<=$nq; i++))
> >> >> do
> >> >> cat > ${system}_q${i}.in << EOF
> >> >> phonons of ${system}
> >> >>  &inputph
> >> >>   tr2_ph = 1.0d-13,
> >> >>   alpha_mix(1) = 0.2,
> >> >>   prefix = '${system}',
> >> >>   ldisp = .true.,
> >> >>   recover = .true.
> >> >>   nq1 = ${nq1}, nq2 = ${nq2}, nq3 = ${nq3}
> >> >>   start_q = $i, last_q = $i
> >> >>   outdir = '$TMP_DIR/${system}/q${i}',
> >> >>   fildyn = '${system}_q${nq1}${nq2}${nq3}.dyn'
> >> >> ......
> >> >> EOF
> >> >> $ECHO "calculation of q point $i"
> >> >> bsub -a intelmpi -n $processes \
> >> >>      -R "span[ptile=8]" \
> >> >>      -J ${r}q${i}anda \
> >> >>      -oo ${system}_q${i}.out \
> >> >>      -eo ${system}_q${i}.err \
> >> >>      $PH_COMMAND -input ./${system}_q${i}.in
> >> >> done
> >> >>
> >> >>
> >> >> Huiqun Zhou
> >> >> @Earth Sciences, Nanjing University, China
> >> >>         ----- Original Message ----- 
> >> >>         From: Huiqun Zhou
> >> >>         To: pw_forum at pwscf.org
> >> >>         Sent: Tuesday, July 06, 2010 12:00 PM
> >> >>         Subject: [Pw_forum] Distributing phonon calculations to
> >> >>         different machines
> >> >>
> >> >>
> >> >>         Dear developers,
> >> >>
> >> >>         Please clarify which directory should be copied
> >> >>         when distributing phonon calculations
> >> >>         to different machines: _ph{prefix}.phsave
> >> >>         or _ph0{prefix}.phsave? The former is
> >> >>         described in the manual INPUT_PH.html; the latter is used in
> >> >>         the GRID_example.
> >> >>         Although no _ph{prefix}.phsave exists after the
> >> >>         preparatory run with start_irr=0 and last_irr=0,
> >> >>         using the former works OK at the cost of redundant
> >> >>         calculations.
> >> >>
> >> >>              Representation #  1 mode #   1
> >> >>
> >> >>              Self-consistent Calculation
> >> >>
> >> >>          %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
> >> >>              from davcio : error #        25
> >> >>              error while reading from file
> >> >>          %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
> >> >>
> >> >>              stopping ...
> >> >
> >> > Thank you for the message. I will correct the INPUT_PH documentation.
> >> > The correct directory is _ph0{prefix}.phsave.
> >> >
> >> > This message usually means that you have not copied all the required
> >> > files. Did you copy all the files produced by pw.x?
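
The copy step discussed here can be sketched as a small function that stops
on the first failure instead of continuing silently. This is a minimal
sketch, not the GRID_example script: the name prepare_q_dir and its
arguments are hypothetical, and it assumes the layout used in this thread,
with the pw.x output in $base/$prefix.* and the preparatory ph.x data in
$base/_ph0$prefix.phsave.

```shell
# Copy what ph.x needs for one q point into its own outdir, $base/q$q.
# Arguments: base directory, prefix, q-point index.
# Hypothetical helper; returns non-zero as soon as any step fails.
prepare_q_dir() {
    base=$1
    prefix=$2
    q=$3
    dest=$base/q$q
    mkdir -p "$dest/_ph0$prefix.phsave" || return 1
    # Everything pw.x wrote under this prefix ($prefix.save and so on).
    cp -r "$base/$prefix".* "$dest"/ || return 1
    # Contents of the directory named above, _ph0{prefix}.phsave.
    cp -r "$base/_ph0$prefix.phsave"/. "$dest/_ph0$prefix.phsave"/ || return 1
}
```

A call such as prepare_q_dir "$TMP_DIR/$system" "$system" "$i" || exit 1
inside the loop would make a failed or incomplete copy visible before the
job is submitted. Unlike the original snippet, this version copies again
even if the q directory already exists.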
> >> >
> >> > HTH,
> >> >
> >> > Andrea
> >> >
> >> >
> >> >
> >> >>
> >> >>         ______________________________________________________________
> >> >>
> >> >>         _______________________________________________
> >> >>         Pw_forum mailing list
> >> >>         Pw_forum at pwscf.org
> >> >>         http://www.democritos.it/mailman/listinfo/pw_forum
> >> > -- 
> >> > Andrea Dal Corso                    Tel. 0039-040-3787428
> >> > SISSA, Via Beirut 2/4               Fax. 0039-040-3787528
> >> > 34151 Trieste (Italy)               e-mail: dalcorso at sissa.it
> >> >
> >> >
> >>
> >>
> >
> >
> 
> 
-- 
Andrea Dal Corso                    Tel. 0039-040-3787428
SISSA, Via Beirut 2/4               Fax. 0039-040-3787528
34151 Trieste (Italy)               e-mail: dalcorso at sissa.it




