[Pw_forum] Distributing phonon calculations to different machines

Huiqun Zhou hqzhou at nju.edu.cn
Fri Jul 9 12:24:02 CEST 2010


Andrea,

Below is what I have in the q1/ directory; I think it includes everything.

[hqzhou@c01n05 ~]$ ls -l /gpfsTMP/hqzhou/tmp/sill_1.04v0/q1/
total 405184
drwxrwxr-x  2 hqzhou hqzhou    65536 Jul  9 18:17 _ph0sill_1.04v0.phsave
drwxrwxr-x 10 hqzhou hqzhou    65536 Jul  6 07:54 sill_1.04v0.save
-rw-rw-r--  1 hqzhou hqzhou 52060160 Jul  9 18:14 sill_1.04v0.wfc1
-rw-rw-r--  1 hqzhou hqzhou 52224000 Jul  9 18:14 sill_1.04v0.wfc2
-rw-rw-r--  1 hqzhou hqzhou 51773440 Jul  9 18:14 sill_1.04v0.wfc3
-rw-rw-r--  1 hqzhou hqzhou 51814400 Jul  9 18:14 sill_1.04v0.wfc4
-rw-rw-r--  1 hqzhou hqzhou 51855360 Jul  9 18:14 sill_1.04v0.wfc5
-rw-rw-r--  1 hqzhou hqzhou 51363840 Jul  9 18:14 sill_1.04v0.wfc6
-rw-rw-r--  1 hqzhou hqzhou 51896320 Jul  9 18:14 sill_1.04v0.wfc7
-rw-rw-r--  1 hqzhou hqzhou 51609600 Jul  9 18:14 sill_1.04v0.wfc8
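
A listing like the one above shows that the files exist, but not that they
are identical to the ones written by pw.x. As a minimal sketch, each
prefix.* entry in the source directory can be compared with its copy using
diff -rq. The function name verify_copy and its argument order are made up
for illustration; they are not part of Quantum ESPRESSO.

```shell
#!/bin/sh
# Hypothetical helper, not part of Quantum ESPRESSO.
# Usage: verify_copy SRC_DIR DST_DIR PREFIX
# Returns 0 only if every SRC_DIR/PREFIX.* entry has an identical
# counterpart in DST_DIR (directories are compared recursively).
verify_copy() {
    vc_src=$1
    vc_dst=$2
    vc_prefix=$3
    vc_found=0
    vc_status=0
    for vc_entry in "$vc_src/$vc_prefix".*; do
        # An unmatched glob stays literal; skip it.
        [ -e "$vc_entry" ] || continue
        vc_found=1
        vc_name=${vc_entry##*/}
        if ! diff -rq "$vc_entry" "$vc_dst/$vc_name" >/dev/null 2>&1; then
            echo "missing or different: $vc_dst/$vc_name"
            vc_status=1
        fi
    done
    if [ "$vc_found" -eq 0 ]; then
        echo "no $vc_prefix.* entries found in $vc_src"
        vc_status=1
    fi
    return "$vc_status"
}
```

With the paths from this thread the call would be
verify_copy /gpfsTMP/hqzhou/tmp/sill_1.04v0 /gpfsTMP/hqzhou/tmp/sill_1.04v0/q1 sill_1.04v0
The .wfc timestamps in the listing are later than that of the .save
directory, so the comparison is probably only meaningful before ph.x has
run in the copy.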

Thanks for your help.

huiqun zhou
@earth sciences, nanjing university, china

----- Original Message ----- 
From: "Dal Corso Andrea" <dalcorso at sissa.it>
To: "PWSCF Forum" <pw_forum at pwscf.org>
Sent: Thursday, July 08, 2010 2:40 PM
Subject: Re: [Pw_forum] Distributing phonon calculations to different machines


> On Thu, 2010-07-08 at 11:11 +0800, Huiqun Zhou wrote:
>> Andrea,
>>
>> I checked the directory, all files are there. See below,
>>
>> ls /gpfsTMP/hqzhou/tmp/sill_1.04v0/q1/_ph0sill_1.04v0.phsave/
>> data-file.xml      data-file.xml.2  data-file.xml.5  data-file.xml.8
>> data-file.xml.1    data-file.xml.3  data-file.xml.6
>> data-file.xml.1.0  data-file.xml.4  data-file.xml.7
>>
>
> What about the directory /gpfsTMP/hqzhou/tmp/sill_1.04v0/q1?
> Are all the files there? And what about the directory
> /gpfsTMP/hqzhou/tmp/sill_1.04v0/q1/sill_1.04v0.save?
>
> Andrea
>
>> It's a little bit insane: this error only occurs when I compute the
>> first q point (gamma). ph.x works fine when calculating the other
>> q points. As you can see from the script snippet in my second post in
>> this thread, I created the necessary directories and copied all needed
>> files in advance, before distributing the jobs.
>>
>> Any ideas?
>>
>> huiqun zhou
>> @earth sciences, nanjing university, china
>>
>> ----- Original Message ----- 
>> From: "Dal Corso Andrea" <dalcorso at sissa.it>
>> To: "PWSCF Forum" <pw_forum at pwscf.org>
>> Sent: Wednesday, July 07, 2010 8:06 PM
>> Subject: Re: [Pw_forum] Distributing phonon calculations to different
>> machines
>>
>>
>> > On Tue, 2010-07-06 at 17:55 +0800, Huiqun Zhou wrote:
>> >> Thanks, Andrea.
>> >>
>> >> As indicated in the script below, I have copied all files and
>> >> directories created by the pw.x run.
>> >>
>> >> if test ! -d $TMP_DIR/${system}/q${i} ; then
>> >>          mkdir $TMP_DIR/${system}/q${i}
>> >>          cp -r $TMP_DIR/${system}/${system}.* $TMP_DIR/${system}/q${i}
>> >> fi
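
One thing to note about a guard of this form: it copies only when the
q${i} directory does not exist yet, so a directory left incomplete by an
earlier, interrupted attempt would never be refreshed. A hedged sketch of a
stricter variant follows. The function name stage_q_dir is hypothetical, and
the sketch assumes that recopying the pw.x output before every submission is
acceptable and that no ph.x job is running in, or has partial results to keep
in, the target directory.

```shell
#!/bin/sh
# Hypothetical helper, not part of Quantum ESPRESSO.
# Usage: stage_q_dir TMP_DIR SYSTEM I
# Copies TMP_DIR/SYSTEM/SYSTEM.* into TMP_DIR/SYSTEM/qI and fails
# loudly if the directory cannot be created or the copy fails.
stage_q_dir() {
    sq_base=$1/$2
    sq_dest=$sq_base/q$3
    mkdir -p "$sq_dest" || { echo "cannot create $sq_dest" >&2; return 1; }
    # The glob must stay outside the quotes so that it expands.
    if ! cp -r "$sq_base/$2".* "$sq_dest"/; then
        echo "copy into $sq_dest failed" >&2
        return 1
    fi
}
```

Inside the loop this would be called as
stage_q_dir "$TMP_DIR" "$system" "$i" || exit 1
so that a failed copy stops the script before any job is submitted.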
>> >>
>> >
>> > Are the files really there? If all the files are in the directory
>> > $TMP_DIR/${system}/q${i} there is no reason why the ph.x code stops.
>> > If they are not, I cannot help you. It is not a problem of ph.x.
>> >
>> >
>> >> BTW, could you please describe in more detail the newly added
>> >> information in INPUT_PH.html about distributing phonon calculations
>> >> to a cluster? If I set wf_collect to true, there should be no relation
>> >> in nproc and npool between the pw.x run and the two later ph.x runs,
>> >> right? Taking AlAs in the GRID_example as an example, if I want to use
>> >> a server with 8 CPU cores to do the calculations for each q point
>> >> (8 servers in total), what are the values of images and pools?
>> >>
>> >>
>> > You are right, the explanation refers only to the case
>> > wf_collect=.false.. However, image parallelization of ph.x is very
>> > experimental, so be patient. At the moment it divides both q and
>> > irreps. Load balancing on q only is not implemented.
>> >
>> >
>> > Andrea
>> >
>> >
>> >
>> >> Huiqun Zhou
>> >> @Earth Sciences, Nanjing University, China
>> >>
>> >> ----- Original Message ----- 
>> >> From: "Dal Corso Andrea" <dalcorso at sissa.it>
>> >> To: "PWSCF Forum" <pw_forum at pwscf.org>
>> >> Sent: Tuesday, July 06, 2010 4:14 PM
>> >> Subject: Re: [Pw_forum] Distributing phonon calculations to different
>> >> machines
>> >>
>> >>
>> >> > On Tue, 2010-07-06 at 12:13 +0800, Huiqun Zhou wrote:
>> >> >> Sorry, I sent an unfinished message.
>> >> >>
>> >> >> When using _ph0{prefix}.phsave, I got the error message shown in
>> >> >> the previous message.
>> >> >>
>> >> >> Here is the snippet of my script for distributing LSF tasks:
>> >> >>
>> >> >> ......
>> >> >> nq=`sed -n '2p' ./${system}_q${nq1}${nq2}${nq3}.dyn0`
>> >> >>
>> >> >> for ((i=1; i<=$nq; i++))
>> >> >> do
>> >> >>     if test ! -d $TMP_DIR/${system}/q${i} ; then
>> >> >>         mkdir $TMP_DIR/${system}/q${i}
>> >> >>         cp -r $TMP_DIR/${system}/${system}.* $TMP_DIR/${system}/q${i}
>> >> >>     fi
>> >> >>     if test ! -d $TMP_DIR/${system}/q${i}/_ph0${system}.phsave ; then
>> >> >>         mkdir $TMP_DIR/${system}/q${i}/_ph0${system}.phsave
>> >> >>         cp -r $TMP_DIR/${system}/_ph0${system}.phsave/* $TMP_DIR/${system}/q${i}/_ph0${system}.phsave
>> >> >>     fi
>> >> >> done
>> >> >>
>> >> >> for ((i=1; i<=$nq; i++))
>> >> >> do
>> >> >> cat > ${system}_q${i}.in << EOF
>> >> >> phonons of ${system}
>> >> >>  &inputph
>> >> >>   tr2_ph = 1.0d-13,
>> >> >>   alpha_mix(1) = 0.2,
>> >> >>   prefix = '${system}',
>> >> >>   ldisp = .true.,
>> >> >>   recover = .true.
>> >> >>   nq1 = ${nq1}, nq2 = ${nq2}, nq3 = ${nq3}
>> >> >>   start_q = $i, last_q = $i
>> >> >>   outdir = '$TMP_DIR/${system}/q${i}',
>> >> >>   fildyn = '${system}_q${nq1}${nq2}${nq3}.dyn'
>> >> >> ......
>> >> >> EOF
>> >> >> $ECHO "calculation of q point $i"
>> >> >> bsub -a intelmpi -n $processes \
>> >> >>      -R "span[ptile=8]" \
>> >> >>      -J ${r}q${i}anda \
>> >> >>      -oo ${system}_q${i}.out \
>> >> >>      -eo ${system}_q${i}.err \
>> >> >>      $PH_COMMAND -input ./${system}_q${i}.in
>> >> >> done
>> >> >>
>> >> >>
>> >> >> Huiqun Zhou
>> >> >> @Earth Sciences, Nanjing University, China
>> >> >>         ----- Original Message ----- 
>> >> >>         From: Huiqun Zhou
>> >> >>         To: pw_forum at pwscf.org
>> >> >>         Sent: Tuesday, July 06, 2010 12:00 PM
>> >> >>         Subject: [Pw_forum] Distributing phonon calculations to
>> >> >>         different machines
>> >> >>
>> >> >>
>> >> >>         dear developers,
>> >> >>
>> >> >>         Please clarify which directory should be copied when
>> >> >>         distributing phonon calculations to different machines:
>> >> >>         _ph{prefix}.phsave or _ph0{prefix}.phsave? The former is
>> >> >>         described in the manual INPUT_PH.html, the latter is used
>> >> >>         in the GRID_example. Although no _ph{prefix}.phsave exists
>> >> >>         after the preparatory run with start_irr=0 and last_irr=0,
>> >> >>         using the former works OK at the cost of redundant
>> >> >>         calculations.
>> >> >>
>> >> >>              Representation #  1 mode #   1
>> >> >>
>> >> >>              Self-consistent Calculation
>> >> >>
>> >> >>              %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
>> >> >>              from davcio : error #        25
>> >> >>              error while reading from file
>> >> >>              %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
>> >> >>
>> >> >>              stopping ...
>> >> >
>> >> > Thank you for the message. I will correct the INPUT_PH
>> >> > documentation. The correct directory is _ph0{prefix}.phsave.
>> >> >
>> >> > This message usually means that you have not copied all the required
>> >> > files. Did you copy all the files produced by pw.x?
>> >> >
>> >> > HTH,
>> >> >
>> >> > Andrea
>> >> >
>> >> >
>> >> >
>> >> >>
>> >> >> 
>> >> >> ______________________________________________________________
>> >> >>
>> >> >>         _______________________________________________
>> >> >>         Pw_forum mailing list
>> >> >>         Pw_forum at pwscf.org
>> >> >>         http://www.democritos.it/mailman/listinfo/pw_forum
>> >> > -- 
>> >> > Andrea Dal Corso                    Tel. 0039-040-3787428
>> >> > SISSA, Via Beirut 2/4               Fax. 0039-040-3787528
>> >> > 34151 Trieste (Italy)               e-mail: dalcorso at sissa.it
>> >> >