Dear Pietro, Paolo and Davide,

Thank you for your hints. Indeed, by changing the number of CPUs the 
calculation *may* converge with QE6.3 as well. For example:

2pools-x-34cpus-x-2omp (i.e. #MPIxOpenMP cores = #cpus)

is OK, but

2pools-x-68cpus-x-1omp (i.e. #MPIxOpenMP cores = #cpus)

again does not converge, although I'm not asking for more tasks than 
CPUs (see Pietro's comment). Also, the KNL nodes in A2 should support 
4x hyperthreading, so I would not expect that asking for twice as many 
threads as allocated CPUs would be a problem; nor is it for QE6.0 
or for the inputs with the molecule/surface.

I thought this could be related to the size of the system, since I had no 
problems with the heavier molecule/surface case; however, the problem is 
also present for larger clean-Au(111) unit cells.

I can now circumvent the issue, thank you. I'd also be curious to know 
what the reason is...


