[QE-users] Huge difference between Wall time and CPU time in electron-phonon calculation
Paolo Giannozzi
p.giannozzi at gmail.com
Wed Oct 7 14:22:36 CEST 2020
If your cluster has a local disk on each node (i.e., directly accessible
by the node but not visible from other nodes), you may try to write data to
that disk. It's a pain in the *** and may require some tweaking, but it
might work. You need to ensure that the directories are there (they are
created if missing only on the node performing I/O) and always run on the
same set of processors (and hope that MPI keeps the same ordering).
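
As an illustration of the "ensure that the directories are there" step, here is a minimal sketch (not part of QE) that builds one `ssh <host> mkdir -p <dir>` command per node. The node names and the scratch path are hypothetical placeholders, and it assumes passwordless ssh between nodes; the directory should be the one you set as `outdir` in your input files.

```python
import shlex

def mkdir_commands(hosts, scratch_dir):
    """Build one `ssh <host> mkdir -p <dir>` command per distinct node."""
    quoted = shlex.quote(scratch_dir)  # protect spaces or odd characters in the path
    # dict.fromkeys drops repeated host names while keeping their order
    return [["ssh", host, f"mkdir -p {quoted}"] for host in dict.fromkeys(hosts)]

if __name__ == "__main__":
    # Hypothetical node names and path: replace with those of your cluster.
    nodes = ["node01", "node02", "node01"]
    for cmd in mkdir_commands(nodes, "/local_scratch/qe_tmp"):
        print(" ".join(cmd))
```

The sketch only prints the commands. To actually create the directories, each list could be passed to `subprocess.run(cmd, check=True)` before launching the calculation, using the same node list for every run.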
Paolo
On Wed, Oct 7, 2020 at 1:21 PM pippo pippo <chiedoper1amico at gmail.com>
wrote:
> Dear Lorenzo and Michel,
>
> thank you for your reply. Yes, unfortunately I am using a small cluster
> and, for memory issues, I cannot use the SSD-based local scratch. Thus I
> am forced to use a non-parallel NFS. From your answers, I understand that,
> given the situation, the huge difference between the CPU and WALL time
> that I am observing is reasonable.
> Since I am facing a hardware limitation, I guess I cannot do much to
> improve the performance. I will try to find a solution with the person
> in charge of the cluster.
>
> Best Regards,
> Raffaello Bianco
> UPV/EHU - CFM
>
> On Wed, Sep 30, 2020 at 4:13 PM Michal Krompiec <michal.krompiec at gmail.com>
> wrote:
>
>> Dear Raffaello,
>> Are you using a local (preferably SSD-based) scratch drive, or a very
>> fast parallel file system?
>> Best wishes,
>> Michal Krompiec
>> Merck KGaA
>>
>> On Wed, 30 Sep 2020 at 15:05, Raffaello Bianco <
>> raffaello.bianco.it at gmail.com> wrote:
>>
>>> Dear QE users and developers,
>>>
>>> I am doing an electron-phonon coupling calculation in this way (I am
>>> using QE v 6.6).
>>> First, I have done an scf calculation. Then, I have done a phonon
>>> calculation where I have printed the dvscf files, with
>>>
>>> fildvscf = 'dvscf'
>>>
>>> Subsequently, I have done the electron-phonon calculation changing the
>>> k-mesh grid, with
>>>
>>> trans = .false.
>>> electron_phonon = 'simple'
>>>
>>> The calculation ends correctly, but for some q points I have noticed a
>>> huge difference between
>>> CPU and Wall time, like
>>>
>>> PHONON : 15h55m CPU 3d18h56m WALL
>>>
>>> From the report at the end of the output, the I/O routine davcio seems
>>> to be the culprit:
>>>
>>>
>>> General routines
>>> davcio : 107.89s CPU 263856.08s WALL ( 520331 calls)
>>>
>>> Parallel routines
>>>
>>> Electron-phonon coupling
>>> elphon : 41730.55s CPU 309708.87s WALL ( 1 calls)
>>> elphel : 41671.20s CPU 309625.04s WALL ( 60 calls)
>>>
>>> This calculation was done with 10 processors and npool = 10; if I use
>>> 40 processors and npool = 10 it is worse (as can probably be expected due
>>> to the higher number of I/O operations). I have looked at the
>>> documentation but I am not very familiar with these things, thus I still
>>> have several doubts. Any suggestions on tests to do or how to improve
>>> performance, or at least comments to clarify the problem, would be greatly
>>> appreciated.
>>>
>>> Thank you for your time,
>>>
>>> Best,
>>> Raffaello Bianco
>>>
>>>
>>> _______________________________________________
>>> Quantum ESPRESSO is supported by MaX (www.max-centre.eu)
>>> users mailing list users at lists.quantum-espresso.org
>>> https://lists.quantum-espresso.org/mailman/listinfo/users
--
Paolo Giannozzi, Dip. Scienze Matematiche Informatiche e Fisiche,
Univ. Udine, via delle Scienze 208, 33100 Udine, Italy
Phone +39-0432-558216, fax +39-0432-558222