[QE-users] time consuming band structure calculation for a supercell
Lorenzo Paulatto
paulatz at gmail.com
Mon Dec 14 14:50:53 CET 2020
Hello,
I've had a look at the output and, apart from the cutoff, which appears a
bit too high (you are probably safe with 50/400 Ry for ecutwfc/ecutrho), I
see only two small problems:
1. the scf calculation is using 6 pools with 10 k-points, which means
that 4 pools have twice as much work to do as the others. In the ideal
case, the number of pools should be a divisor of the number of k-points
(i.e. 2, 5 or 10 in your case). Also, it is recommended that the number
of CPUs in a pool be a divisor of the number of CPUs on each computing
node, to avoid too much inter-node communication. In your case, the best
choice with 72 CPUs (on two nodes?) could be 2 pools. You may gain a bit
of time, but this is not going to change a lot. You should consider
using more CPUs if you have the budget. For example, 10 pools of 12 or
18 CPUs each.
2. The bands calculation runs on 12 CPUs and has a single k-point, while
each pool of the SCF one has up to 2 k-points. We would expect the
bands calculation to take about half as long as one scf step, i.e. about
50 seconds. However, the bands calculation has some trouble diagonalizing
the Hamiltonian, as you can see from this line:
ethr = 2.76E-12, avg # of iterations =120.0
while typically the very last scf diagonalization is
ethr = 2.98E-12, avg # of iterations = 3.3
This is because the scf calculation can start from a very good guess
for the wavefunctions, while the bands calculation cannot. It is still
faster than doing the entire scf procedure, but only by a factor of ~2.3.
Fortunately, you do not usually need the eigenvalues to a precision of
1e-12. You can set the threshold by hand using the keyword
diago_thr_init; I guess 1.d-6 should be tight enough. However, double
check what you get in the output, because I half suspect that it may
be overwritten by the value in the restart file.
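
To make the pool arithmetic of point 1 concrete, here is a minimal Python sketch. It assumes equal-weight k-points that are dealt out to the pools as evenly as possible; the function names pool_loads and parallel_efficiency are made up for this illustration and are not part of Quantum ESPRESSO.

```python
# Sketch: how evenly do k-points spread over pools?
# Assumption: all k-points cost the same and are shared out as evenly
# as possible. These helpers are illustrative, not part of pw.x.

def pool_loads(n_kpoints, n_pools):
    """Return the number of k-points handled by each pool."""
    base, extra = divmod(n_kpoints, n_pools)
    # 'extra' pools receive one additional k-point.
    return [base + 1] * extra + [base] * (n_pools - extra)


def parallel_efficiency(n_kpoints, n_pools):
    """Fraction of pool-time spent working (1.0 means perfectly balanced)."""
    loads = pool_loads(n_kpoints, n_pools)
    # Every pool has to wait for the most heavily loaded one.
    return n_kpoints / (n_pools * max(loads))


if __name__ == "__main__":
    for n_pools in (2, 5, 6, 10):
        loads = pool_loads(10, n_pools)
        eff = parallel_efficiency(10, n_pools)
        print(f"{n_pools:2d} pools: loads={loads} efficiency={eff:.2f}")
```

With 10 k-points, 6 pools give four pools with 2 k-points and two pools with 1, so the two lighter pools sit idle for about half of each step, while 2, 5 or 10 pools are balanced.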
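
For the double check suggested in point 2, a small script can pull the threshold and iteration count out of the output. The sketch below assumes the lines look like the two quoted above; the exact spacing in a real output file may differ, and diagonalization_report is a hypothetical helper name.

```python
import re

# Matches lines such as "ethr = 2.76E-12, avg # of iterations =120.0".
# The pattern is based only on the two lines quoted in this message.
ETHR_LINE = re.compile(
    r"ethr\s*=\s*([0-9.]+[EeDd][+-]?\d+),\s*avg # of iterations\s*=\s*([0-9.]+)"
)


def diagonalization_report(text):
    """Return a list of (ethr, avg_iterations) pairs found in the text."""
    pairs = []
    for match in ETHR_LINE.finditer(text):
        # Fortran may print the exponent with 'D'; float() needs 'E'.
        ethr = float(match.group(1).replace("D", "E").replace("d", "e"))
        pairs.append((ethr, float(match.group(2))))
    return pairs


if __name__ == "__main__":
    sample = """
     ethr =  2.98E-12,  avg # of iterations =  3.3
     ethr =  2.76E-12,  avg # of iterations =120.0
    """
    for ethr, iterations in diagonalization_report(sample):
        print(f"ethr = {ethr:.2e}  avg iterations = {iterations}")
```

If the ethr reported by the bands run is still of order 1e-12 after setting diago_thr_init, the value from the input was probably not the one used.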
cheers