On Mon, Feb 26, 2018 at 10:26 AM, Laurens Siemons <laurenssiemons at hotmail.be> wrote:
> Does anybody have a suggestion why it does work on 1 node with 20 cores,
> but fails when I try to increase my nodes?
The final results do not depend upon the number of processors (modulo minor
numerical differences, within the convergence threshold) but intermediate
results may depend, due to a different starting point and to small
numerical differences. This may unfortunately cause calculations that
are "on the brink" to fail. I have no evidence that this is related to a
bug or to any other easily solvable problem.
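
A toy illustration (not QE code) of why the processor count can change intermediate numbers: floating-point addition is not associative, so splitting a sum over a different number of ranks changes the rounding. The function name `round_robin_sum` and the way values are distributed are assumptions made for this sketch, and the values are chosen to exaggerate the effect; in a real run the differences are far smaller.

```python
def round_robin_sum(values, nproc):
    """Sum values as nproc simulated ranks would: each rank adds up its
    own share, then the partial sums are combined in rank order."""
    partials = []
    for rank in range(nproc):
        partial = 0.0
        for v in values[rank::nproc]:  # this rank takes every nproc-th value
            partial += v
        partials.append(partial)
    total = 0.0
    for p in partials:
        total += p
    return total


# Deliberately extreme values so that the rounding is visible.
values = [1e16, 1.0, -1e16, 1.0]
serial = round_robin_sum(values, 1)    # one rank adds everything in order
parallel = round_robin_sum(values, 2)  # two ranks, then one final addition
print(serial, parallel)  # prints: 1.0 2.0
```

Both results are "correct" to within rounding of the largest terms; they simply accumulate the error differently, which is enough to push a borderline calculation one way or the other.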
The "Cholesky" problem signals that the overlap matrix is not positive
definite (has a zero or negative eigenvalue). When it happens, it is
invariably due either to a badly wrong structure (not the case here) or to
USPP/PAW with small negative values of augmentation charges. It is a known
problem and there is no simple solution available.
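
For readers unfamiliar with the error, here is a minimal sketch of how a Cholesky factorization detects a matrix that is not positive definite. It is a plain real-symmetric toy version written for illustration, not the routine QE calls; the function name `cholesky` and the 2x2 test matrices are assumptions of this example.

```python
import math


def cholesky(s):
    """Return lower-triangular L with L * L^T = s for a real symmetric
    matrix given as a list of rows. Raises ValueError if s is not
    positive definite."""
    n = len(s)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            acc = s[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                # A zero or negative pivot means a zero or negative eigenvalue.
                if acc <= 0.0:
                    raise ValueError(
                        f"pivot {i} is {acc:g}: matrix is not positive definite"
                    )
                low[i][i] = math.sqrt(acc)
            else:
                low[i][j] = acc / low[j][j]
    return low


good = [[1.0, 0.5], [0.5, 1.0]]  # eigenvalues 1.5 and 0.5
bad = [[1.0, 1.1], [1.1, 1.0]]   # eigenvalues 2.1 and -0.1

print(cholesky(good))
try:
    cholesky(bad)
except ValueError as err:
    print("problems computing cholesky:", err)
```

The factorization itself is what fails, so the error appears before any eigenvalues are computed.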
Paolo
> Laurens Siemons
> You can use cell_parameters together with A or celldm(1).
>
> On Sat, Feb 24, 2018 at 8:53 PM, Manu Hegde <mhegde at uwaterloo.ca> wrote:
>
> Hi,
> I do not know much about your system, but looking quickly at the crystal
> structure, there is something that might be causing a problem. It looks like
> you have set ibrav=0; in that case you have to use the CELL_PARAMETERS card.
> A=xx is not required. Also double-check your system with XCrySDen before
> starting the calculations.
> Manu
> (SFU)
>
> On Fri, Feb 23, 2018 at 10:19 AM, <elchatz at auth.gr> wrote:
>
> Hello Laurens Siemons,
>
> Although I am not one of the experts, I had the same problem in one of
> the scf runs I was doing for a GW calculation. Because of the high
> number of bands and ecutwfc that I needed to use and in order to get
> any results, I had to run the simulation on 100 cores. The strange
> thing for me too was that the first run I tried worked, but after that
> nothing did. After a few weeks of trying I was notified by our cluster
> services that I should not use more than 60 cores, as the I/O
> operations done by QE were too heavy and the disk could not
> cope. I have given up on GW since then, but if there is a solution to
> this problem, I would like to hear it too :S
> Eleni
>
>
> Quoting Laurens Siemons <laurenssiemons at hotmail.be>:
>
> > Dear all,
> >
> >
> > I'm a master's student in chemistry and I'm using QE (v. 6.1) for a relax
> > calculation of a rutile (101) slab with vacuum above it.
> >
> > I'm getting the famous error:
> >
> >
> >
> > %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
> > Error in routine cdiaghg (161):
> > problems computing cholesky
> >
> > %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
> >
> > I've read almost every related topic on the forum that I could find
> > and I tried a lot already to overcome this, like:
> > - changing values for ecutwfc and ecutrho
> > - changing mixing_beta
> > - changing functionals
> > - Tried to run the calculation with other input files (anatase 101, 001...)
> > - Changed diagonalization to 'cg' (in this case it calculates some
> > iterations but then crashes with the error: 'Error in routine
> > c_bands (1): >> too many bands are not converged')
> >
> > Nothing seems to help and I'm out of options... I even tried to run
> > a calculation of my predecessor (that has succeeded in the past) but
> > this also failed (he used an older version of QE though...).
> >
> > I'm posting my input file at the end here and I really hope somebody
> > can help me.
> >
> > Kind Regards,
> > Laurens Siemons
> >
> > &CONTROL
> > calculation = 'relax'
> > restart_mode = "from_scratch",
> > prefix = "testen",
> > pseudo_dir = '/data/antwerpen/204/vsc20442/pseudo/pslibrary.1.0.0/wc/PSEUDOPOTENTIALS'
> > outdir = '/data/antwerpen/204/vsc20442'
> > nstep = 100
> > /
> > &SYSTEM
> > ibrav = 0
> > A = 4.59631
> > nat = 36
> > ntyp = 2
> > ecutwfc = 60
> > ecutrho = 600
> > /
> > &ELECTRONS
> > electron_maxstep = 300
> > mixing_beta = 0.10
> > conv_thr = 1.0d-8
> > mixing_mode = 'local-TF'
> > diago_thr_init = 1e-4
> > /
> > &IONS
> > ion_dynamics = 'bfgs'
> > ion_positions = 'default'
> > /
> > CELL_PARAMETERS {alat}
> > 1.000000000000000 0.000000000000000 0.640859733133753
> > 0.000000000000000 2.000000000000000 0.000000000000000
> > 0.000000000000000 0.000000000000000 3.845158398802518
> > ATOMIC_SPECIES
> > O 15.99900 O.wc-n-kjpaw_psl.1.0.0.UPF
> > Ti 47.86700 Ti.wc-spn-kjpaw_psl.1.0.0.UPF
> > ATOMIC_POSITIONS {crystal}
> > Ti -0.000000000000000 -0.000000000000000 0.075000000000000 0 0 0
> > Ti -0.000000000000000 -0.000000000000000 0.408333333333333
> > Ti -0.000000000000000 0.500000000000000 0.241666666666667
> > Ti -0.000000000000000 -0.000000000000000 0.241666666666667
> > Ti -0.000000000000000 0.500000000000000 0.075000000000000 0 0 0
> > Ti -0.000000000000000 0.500000000000000 0.408333333333333
> > Ti 0.500000000000000 0.250000000000000 0.075000000000000 0 0 0
> > Ti 0.500000000000000 0.250000000000000 0.408333333333333
> > Ti 0.500000000000000 0.750000000000000 0.241666666666667
> > Ti 0.500000000000000 0.250000000000000 0.241666666666667
> > Ti 0.500000000000000 0.750000000000000 0.075000000000000 0 0 0
> > Ti 0.500000000000000 0.750000000000000 0.408333333333333
> > O 0.304303000000000 0.152151500000000 0.024282833333333 0 0 0
> > O 0.304303000000000 0.152151500000000 0.357616166666667
> > O 0.304303000000000 0.652151500000000 0.190949500000000
> > O 0.304303000000000 0.152151500000000 0.190949500000000
> > O 0.304303000000000 0.652151500000000 0.024282833333333 0 0 0
> > O 0.304303000000000 0.652151500000000 0.357616166666667
> > O 0.695697000000000 0.347848500000000 0.459050500000000
> > O 0.695697000000000 0.347848500000000 0.292383833333333
> > O 0.695697000000000 0.847848500000000 0.125717166666667 0 0 0
> > O 0.695697000000000 0.347848500000000 0.125717166666667 0 0 0
> > O 0.695697000000000 0.847848500000000 0.459050500000000
> > O 0.695697000000000 0.847848500000000 0.292383833333333
> > O 0.804303000000000 0.097848500000000 0.024282833333333 0 0 0
> > O 0.804303000000000 0.097848500000000 0.357616166666667
> > O 0.804303000000000 0.597848500000000 0.190949500000000
> > O 0.804303000000000 0.097848500000000 0.190949500000000
> > O 0.804303000000000 0.597848500000000 0.024282833333333 0 0 0
> > O 0.804303000000000 0.597848500000000 0.357616166666667
> > O 0.195697000000000 0.402151500000000 0.125717166666667 0 0 0
> > O 0.195697000000000 0.402151500000000 0.459050500000000
> > O 0.195697000000000 0.902151500000000 0.292383833333333
> > O 0.195697000000000 0.402151500000000 0.292383833333333
> > O 0.195697000000000 0.902151500000000 0.125717166666667 0 0 0
> > O 0.195697000000000 0.902151500000000 0.459050500000000
> > K_POINTS {automatic}
> > 4 4 6 1 1 1
>
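
On the point about using CELL_PARAMETERS together with A: with `CELL_PARAMETERS {alat}` the three vectors are given in units of the lattice parameter, so A (or celldm(1)) sets the actual scale. A small sketch of that conversion for the input above, in case it helps when checking the cell by hand; the helper names `scale_cell` and `volume` are made up for this example, and the numbers are copied from the posted input.

```python
A = 4.59631  # lattice parameter in Angstrom, from the &SYSTEM namelist

# CELL_PARAMETERS {alat}: one lattice vector per row, in units of A
cell_alat = [
    [1.0, 0.0, 0.640859733133753],
    [0.0, 2.0, 0.0],
    [0.0, 0.0, 3.845158398802518],
]


def scale_cell(cell, alat):
    """Multiply every component by alat to get the vectors in Angstrom."""
    return [[alat * x for x in row] for row in cell]


def volume(cell):
    """Cell volume as the absolute value of the 3x3 determinant."""
    (ax, ay, az), (bx, by, bz), (cx, cy, cz) = cell
    det = (ax * (by * cz - bz * cy)
           - ay * (bx * cz - bz * cx)
           + az * (bx * cy - by * cx))
    return abs(det)


cell_ang = scale_cell(cell_alat, A)
for row in cell_ang:
    print("%12.6f %12.6f %12.6f" % tuple(row))
print("volume (Angstrom^3): %.3f" % volume(cell_ang))
```

Note that the first vector has a non-zero z component, so the cell is not orthogonal; this is worth keeping in mind when viewing the slab and its vacuum in XCrySDen.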
