[Pw_forum] Problems computing cholensky

Kirk khrusallis at gmail.com
Mon Feb 26 21:10:38 CET 2018


I often see exactly this error when the number of MPI processes is close to
or greater than nr3, the third dimension of the FFT grid. In those cases,
splitting the MPI processes into task groups via -ntg, or using a mixture of
MPI and OpenMP to bring the MPI count down, often helps me.
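
The rule of thumb above can be sketched as a small helper that looks for an
MPI/OpenMP split keeping the MPI ranks per k-point pool well below nr3. This is
only an illustration: the function name, its parameters, and the 0.5 safety
margin are my own assumptions, not anything defined by Quantum ESPRESSO, and
nr3 has to be read from the "FFT dimensions" line of your own pw.x output.

```python
def suggest_layout(total_cores, nr3, npool=1, safety=0.5, max_threads=8):
    """Pick (mpi_ranks, omp_threads) so that ranks per pool stay below nr3.

    All names and the `safety` margin are illustrative assumptions,
    not Quantum ESPRESSO settings.
    """
    limit = safety * nr3  # keep ranks per pool comfortably under nr3
    for threads in range(1, max_threads + 1):
        if total_cores % threads:
            continue  # threads must divide the core count evenly
        ranks = total_cores // threads
        if ranks % npool:
            continue  # ranks must split evenly across the k-point pools
        if ranks // npool <= limit:
            return ranks, threads  # fewest threads that satisfies the limit
    return None  # no split found within max_threads


if __name__ == "__main__":
    # 4 nodes x 20 cores; nr3 = 60 is a made-up value for illustration
    print(suggest_layout(80, 60))
```

The returned pair would then go into the launch line as the MPI process count
(e.g. mpirun -np) and OMP_NUM_THREADS, which requires a pw.x build with OpenMP
enabled.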

Best

Kirk
ECE Department
Boston University


On Mon, Feb 26, 2018 at 4:26 AM, Laurens Siemons <laurenssiemons at hotmail.be>
wrote:

> Hi,
>
>
> Thanks all for the response. I tried to run Will DeBenedetti's script for
> his anatase (001) slab and the calculation runs without an error (on 4
> nodes and 20 cores per node). I don't understand though why his script does
> run and mine does not. Does anybody have an idea about this?
>
>
> I also tried to add 'ndiag=1' for some of my scripts like Mostafa Youssef
> suggested, but unfortunately without success.
>
> Somebody at my department suggested running it on 1 node with
> 20 cores. For some reason this does work. This will probably not be enough
> power to complete the calculation, but I don't get the error 'problems
> computing cholesky'. Does anybody have a suggestion why it does work on 1
> node with 20 cores, but fails when I try to increase my nodes? (except when
> I try to run Will's script)
>
>
> Thanks in advance,
>
> Laurens Siemons
>
>
> ------------------------------
> *From:* pw_forum-bounces at pwscf.org <pw_forum-bounces at pwscf.org> on behalf of
> Paolo Giannozzi <p.giannozzi at gmail.com>
> *Sent:* Sunday, 25 February 2018 8:33
> *To:* PWSCF Forum
> *Subject:* Re: [Pw_forum] Problems computing cholensky
>
> You can use cell_parameters together with A or celldm(1).
>
> On Sat, Feb 24, 2018 at 8:53 PM, Manu Hegde <mhegde at uwaterloo.ca> wrote:
>
> Hi,
> I do not know much about your system, but looking quickly at the crystal
> structure, there is something that might be causing the problem. It looks like
> you have set ibrav=0; in that case you have to use the CELL_PARAMETERS card,
> and A=xx is not required. Also double-check your system with XCrySDen before
> starting the calculations.
> Manu
> (SFU)
>
> On Fri, Feb 23, 2018 at 10:19 AM, <elchatz at auth.gr> wrote:
>
> Hello Laurens Siemons,
>
> Although I am not one of the experts, I had the same problem in one of
> the scf runs I was doing for a GW calculation. Because of the high
> number of bands and ecutwfc that I needed to use and in order to get
> any results, I had to run the simulation on 100 cores. The strange
> thing for me, too, was that the first run I tried worked, but after
> that none did. After a few weeks of trying I was notified by our cluster
> services that I should not use more than 60 cores as the I/O
> operations that are done by QE were too high and the disk could not
> cope. I gave up GW since then, but if there is a solution to this
> problem, I would like to hear it too :S
>
>
> Eleni
>
>
> Quoting Laurens Siemons <laurenssiemons at hotmail.be>:
>
> > Dear all,
> >
> >
> > I'm a master's student in chemistry and I'm using QE (v. 6.1) for a relax
> > calculation of a rutile 101 slab with a vacuum above it.
> >
> > I'm getting the famous error:
> >
> >
> >
> > %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
> >      Error in routine  cdiaghg (161):
> >       problems computing cholesky
> >
> > %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
> >
> > I've read almost every related topic on the forum that I could find
> > and I tried a lot already to overcome this, like:
> > - changing values for ecutwfc and ecutrho
> > - changing mixing_beta
> > - changing functionals
> > - Tried to run the calculation with other input files (anatase 101,
> > 001...)
> > - Changed diagonalization to 'cg' (In this case it calculates some
> > iterations but then crashes with the error: 'Error in routine
> > c_bands (1): >> too many bands are not converged')
> >
> > Nothing seems to help and I'm out of options... I even tried to run
> > a calculation of my predecessor (that has succeeded in the past) but
> > this also failed (he used an older version of QE though...).
> >
> > I'm posting my input file at the end here and I really hope somebody
> > can help me.
> >
> > Kind Regards,
> > Laurens Siemons
> >
> > &CONTROL
> >   calculation = 'relax'
> >   restart_mode = "from_scratch",
> >   prefix       = "testen",
> >   pseudo_dir = '/data/antwerpen/204/vsc20442/pseudo/pslibrary.1.0.0/wc/PSEUDOPOTENTIALS'
> >   outdir = '/data/antwerpen/204/vsc20442'
> >   nstep = 100
> > /
> > &SYSTEM
> >   ibrav = 0
> >   A =    4.59631
> >   nat = 36
> >   ntyp = 2
> >   ecutwfc = 60
> >   ecutrho = 600
> > /
> > &ELECTRONS
> >   electron_maxstep = 300
> >   mixing_beta = 0.10
> >   conv_thr =  1.0d-8
> >   mixing_mode = 'local-TF'
> >   diago_thr_init = 1e-4
> > /
> > &IONS
> >   ion_dynamics = 'bfgs'
> >   ion_positions = 'default'
> > /
> > CELL_PARAMETERS {alat}
> >   1.000000000000000   0.000000000000000   0.640859733133753
> >   0.000000000000000   2.000000000000000   0.000000000000000
> >   0.000000000000000   0.000000000000000   3.845158398802518
> > ATOMIC_SPECIES
> >    O   15.99900   O.wc-n-kjpaw_psl.1.0.0.UPF
> >   Ti   47.86700   Ti.wc-spn-kjpaw_psl.1.0.0.UPF
> > ATOMIC_POSITIONS {crystal}
> > Ti  -0.000000000000000  -0.000000000000000   0.075000000000000 0 0 0
> > Ti  -0.000000000000000  -0.000000000000000   0.408333333333333
> > Ti  -0.000000000000000   0.500000000000000   0.241666666666667
> > Ti  -0.000000000000000  -0.000000000000000   0.241666666666667
> > Ti  -0.000000000000000   0.500000000000000   0.075000000000000 0 0 0
> > Ti  -0.000000000000000   0.500000000000000   0.408333333333333
> > Ti   0.500000000000000   0.250000000000000   0.075000000000000 0 0 0
> > Ti   0.500000000000000   0.250000000000000   0.408333333333333
> > Ti   0.500000000000000   0.750000000000000   0.241666666666667
> > Ti   0.500000000000000   0.250000000000000   0.241666666666667
> > Ti   0.500000000000000   0.750000000000000   0.075000000000000 0 0 0
> > Ti   0.500000000000000   0.750000000000000   0.408333333333333
> >  O   0.304303000000000   0.152151500000000   0.024282833333333 0 0 0
> >  O   0.304303000000000   0.152151500000000   0.357616166666667
> >  O   0.304303000000000   0.652151500000000   0.190949500000000
> >  O   0.304303000000000   0.152151500000000   0.190949500000000
> >  O   0.304303000000000   0.652151500000000   0.024282833333333 0 0 0
> >  O   0.304303000000000   0.652151500000000   0.357616166666667
> >  O   0.695697000000000   0.347848500000000   0.459050500000000
> >  O   0.695697000000000   0.347848500000000   0.292383833333333
> >  O   0.695697000000000   0.847848500000000   0.125717166666667 0 0 0
> >  O   0.695697000000000   0.347848500000000   0.125717166666667 0 0 0
> >  O   0.695697000000000   0.847848500000000   0.459050500000000
> >  O   0.695697000000000   0.847848500000000   0.292383833333333
> >  O   0.804303000000000   0.097848500000000   0.024282833333333 0 0 0
> >  O   0.804303000000000   0.097848500000000   0.357616166666667
> >  O   0.804303000000000   0.597848500000000   0.190949500000000
> >  O   0.804303000000000   0.097848500000000   0.190949500000000
> >  O   0.804303000000000   0.597848500000000   0.024282833333333 0 0 0
> >  O   0.804303000000000   0.597848500000000   0.357616166666667
> >  O   0.195697000000000   0.402151500000000   0.125717166666667 0 0 0
> >  O   0.195697000000000   0.402151500000000   0.459050500000000
> >  O   0.195697000000000   0.902151500000000   0.292383833333333
> >  O   0.195697000000000   0.402151500000000   0.292383833333333
> >  O   0.195697000000000   0.902151500000000   0.125717166666667 0 0 0
> >  O   0.195697000000000   0.902151500000000   0.459050500000000
> > K_POINTS {automatic}
> > 4 4 6 1 1 1
>
>
>
> --
> Dr. Eleni Chatzikyriakou
> Computational Physics lab
> Aristotle University of Thessaloniki
> elchatz at auth.gr - tel:+30 2310 998109
>
> _______________________________________________
> Pw_forum mailing list
> Pw_forum at pwscf.org
> http://pwscf.org/mailman/listinfo/pw_forum
>
>
> --
> Paolo Giannozzi, Dip. Scienze Matematiche Informatiche e Fisiche,
> Univ. Udine, via delle Scienze 208, 33100 Udine, Italy
> Phone +39-0432-558216, fax +39-0432-558222
>