[Pw_forum] GIPAW acceleration
Yasser Fowad AlWahedi
yaalwahedi at pi.ac.ae
Sun Jul 16 10:26:13 CEST 2017
Thanks for your support, and my apologies for the late reply. PW and GIPAW are compiled with GNU compilers and the Intel MKL libraries.
I am running DFT calculations of Ni2P clusters exposing various surfaces on two computational rigs:
1) The university cluster: each node consists of dual 8-core/8-thread Xeon CPUs clocked at 2.2 GHz with 64 GB RAM. I use only one node per simulation. For storage it uses a mechanical hard drive. (Later called C1.)
2) My home PC: equipped with an i7-5930K processor (6 cores/12 threads) clocked at 3.9 GHz with 128 GB RAM. For storage I use a Samsung 850 EVO SSD. (Later called C2.)
The table below summarizes the completed and running cases, with the finish time, actual or expected, assuming linear extrapolation.
Atoms  npool  Cores  k-points per pool  Computer  Time (hrs)
  30     2     16           17             C1        28.9
  38     1     16           25             C1        31.3
  49     1     16           34             C1       124.9*
  50     2     16           17             C1       474.6*
  52     1     10           34             C2       295.2*
* estimated time to finish
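For what it's worth, the starred estimates come from a simple linear extrapolation of elapsed time versus progress. A minimal sketch of that arithmetic (the elapsed hours and completed fraction below are hypothetical placeholders, not actual run data):

```python
def estimate_total_hours(elapsed_hours, fraction_done):
    """Linearly extrapolate the total run time from the fraction of
    the job completed so far."""
    if not 0.0 < fraction_done <= 1.0:
        raise ValueError("fraction_done must be in (0, 1]")
    return elapsed_hours / fraction_done

# Hypothetical example: 118.65 h elapsed with 25% of the job done.
print(estimate_total_hours(118.65, 0.25))  # -> 474.6
```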
I understand that the cases are different and as such will require more or less time to finish. But I noticed that the 50- and 52-atom cases, which are quite similar (comparable k-point counts and numbers of atoms) but were run on the two different systems, have substantially different expected finish times. My guess is that this is due to the SSD on C2 being used to write out the data: even though C2 uses fewer computational threads and handles more atoms, it is expected to finish faster.
I also noticed an interesting relation: GIPAW runs succeed only if the number of cores (np) is <= the number of k-points per pool. I checked this in the 38-atom case, which kept failing whenever I chose more processors than k-points per pool, although the SCF run finished successfully every time. I observed the same in other cases. Is this a general rule?
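The empirical rule above can be checked before submitting a job. This is just a hypothetical pre-flight helper implementing the relation as I observed it, not an official QE constraint:

```python
def gipaw_run_ok(ncores, kpoints_per_pool):
    """Empirical rule reported in this thread (not an official QE
    constraint): GIPAW runs succeeded only when the number of MPI
    processes did not exceed the number of k-points per pool."""
    return ncores <= kpoints_per_pool

print(gipaw_run_ok(16, 25))  # 38-atom case: 16 cores, 25 k-points -> True
print(gipaw_run_ok(32, 25))  # more cores than k-points per pool -> False
```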
Below is the timing output of the 38-atom case:
gipaw_setup : 0.46s CPU 0.50s WALL ( 1 calls)
greenf : 20177.91s CPU 20207.68s WALL ( 600 calls)
cgsolve : 20057.24s CPU 20086.82s WALL ( 600 calls)
ch_psi : 19536.93s CPU 19563.75s WALL ( 44231 calls)
h_psiq : 13685.97s CPU 13707.40s WALL ( 44231 calls)
h_psi : 44527.30s CPU 46802.35s WALL ( 5434310 calls)
apply_vel : 262.98s CPU 263.30s WALL ( 525 calls)
j_para : 559.19s CPU 560.39s WALL ( 675 calls)
biot_savart : 0.05s CPU 0.06s WALL ( 1 calls)
calbec : 39849.22s CPU 37474.79s WALL (10917262 calls)
fft : 0.12s CPU 0.15s WALL ( 42 calls)
ffts : 0.01s CPU 0.01s WALL ( 10 calls)
fftw : 8220.39s CPU 9116.72s WALL (27084278 calls)
davcio : 0.02s CPU 1.88s WALL ( 400 calls)
fft_scatter : 3533.10s CPU 3242.29s WALL (27084330 calls)
GIPAW : 112557.79s CPU 112726.12s WALL ( 1 calls)
From: pw_forum-bounces at pwscf.org [mailto:pw_forum-bounces at pwscf.org] On Behalf Of Davide Ceresoli
Sent: Thursday, July 13, 2017 8:30 PM
To: PWSCF Forum <pw_forum at pwscf.org>
Subject: Re: [Pw_forum] GIPAW acceleration
How many atoms? How many k-points? I/O can always be the reason, but in my experience, if the system is very large, the time is dominated by computation, not I/O.
You should get some speedup with diagonalization='cg' in GIPAW.
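For reference, a minimal sketch of where that setting goes in the gipaw.x input; the prefix and scratch directory below are placeholders that must match your own SCF run:

```fortran
&inputgipaw
  job             = 'nmr'
  prefix          = 'mysystem'     ! placeholder: same prefix as the SCF run
  tmp_dir         = './scratch/'   ! placeholder: same tmp_dir as the SCF run
  diagonalization = 'cg'
/
```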
Anyway, if I have time, I will introduce a "disk_io" variable in the input file, to try to keep more data in memory instead of on disk.
On 07/13/2017 10:02 AM, Yasser Fowad AlWahedi wrote:
> Dear GIPAW users,
> For NMR shift calculations, I am suffering from the extreme slowness
> of GIPAW. I have noticed that GIPAW frequently writes out intermediate
> results for restart purposes. In our clusters we have mechanical hard
> drives that store these data. Could that be a reason for the slowness?
> Yasser Al Wahedi
> Assistant Professor
> Khalifa University of Science and Technology
CNR Institute of Molecular Science and Technology (CNR-ISTM)
c/o University of Milan, via Golgi 19, 20133 Milan, Italy
Email: davide.ceresoli at istm.cnr.it
Phone: +39-02-50314276, +39-347-1001570 (mobile)