[Pw_forum] possible i/o bug in turbo_lanczos.x and turbo_davidson.x 5.3.0

Timrov Iurii iurii.timrov at epfl.ch
Thu Feb 4 16:46:10 CET 2016


Dear Giuseppe,

As far as I understand, the code crashes when it tries to write the "d0psi" vectors to disk. The first thing to do, I think, is to check that you have enough free disk space. If that is not the issue, then let's keep looking for the cause.
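
A quick way to check the free space, as a minimal sketch: the helper name free_space_gib is made up for illustration, and "." is only a placeholder for the directory you actually use as "outdir". Note that this reports the space free on the filesystem and may not reflect per-user quotas, so a quota limit could still cause a write error.

```python
import shutil


def free_space_gib(path):
    """Return the free space, in GiB, on the filesystem that holds `path`."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return usage.free / 2**30


# "." is a placeholder: replace it with the outdir used by the crashing run.
print(f"free space: {free_space_gib('.'):.1f} GiB")
```

If the number is small compared with the size of the wavefunction files already present in that directory, a full disk is the likely culprit.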

You may want to look at the routine TDDFPT/src/lr_solve_e.f90, lines 110-138, where the code writes the vectors to disk in parallel. Please make sure that "outdir" is the same in PWscf and in Lanczos/Davidson (and do not specify wfcdir). If this does not solve the problem, could you please also post the output of Lanczos/Davidson (preferably Lanczos)?
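
The consistency check above can be sketched in a few lines. This is only an illustration, not part of QE: the helpers read_setting and check_inputs are hypothetical names, the regular expression handles only simple quoted values, and the example strings are the relevant lines copied from the inputs quoted below.

```python
import os
import re


def read_setting(text, key):
    """Return the quoted string assigned to `key` in namelist-style text, or None."""
    pattern = rf"\b{re.escape(key)}\s*=\s*['\"]([^'\"]*)['\"]"
    match = re.search(pattern, text, flags=re.IGNORECASE)
    return match.group(1) if match else None


def check_inputs(pw_text, turbo_text):
    """List the differences that matter between a pw.x and a turbo_* input."""
    problems = []
    if read_setting(pw_text, "prefix") != read_setting(turbo_text, "prefix"):
        problems.append("prefix differs between the two inputs")
    pw_outdir = read_setting(pw_text, "outdir")
    turbo_outdir = read_setting(turbo_text, "outdir")
    # normpath removes trailing slashes so '/a/b/' and '/a/b' compare equal.
    if (pw_outdir is None or turbo_outdir is None
            or os.path.normpath(pw_outdir) != os.path.normpath(turbo_outdir)):
        problems.append(f"outdir differs: {pw_outdir!r} vs {turbo_outdir!r}")
    if read_setting(turbo_text, "wfcdir") is not None:
        problems.append("wfcdir is set in the turbo input")
    return problems


# Relevant lines taken from the inputs in the original message.
pw_text = """
    prefix='l0-5.3.0',
    outdir='/home/mattioligi/cocat/test_tddft/5.2.1/l0/5.3.0/run/tmp/',
"""
lanczos_text = """
    prefix="l0-5.3.0",
    outdir='/state/partition1/mattioligi/34339',
    wfcdir='/state/partition1/mattioligi/34339',
"""

for problem in check_inputs(pw_text, lanczos_text):
    print(problem)
```

If a job script copies the pw.x output to the scratch directory before the Lanczos/Davidson run, a difference in the input text alone is not necessarily an error, but it is the first thing worth ruling out.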

HTH

Best regards,
Iurii Timrov
Post-doctoral researcher
THEOS - École Polytechnique Fédérale de Lausanne
Lausanne, Switzerland


________________________________________
From: pw_forum-bounces at pwscf.org <pw_forum-bounces at pwscf.org> on behalf of Giuseppe Mattioli <giuseppe.mattioli at ism.cnr.it>
Sent: Thursday, February 4, 2016 11:34 AM
To: pw_forum at pwscf.org
Subject: [Pw_forum] possible i/o bug in turbo_lanczos.x and turbo_davidson.x    5.3.0

Dear All
I'm having problems running nontrivial calculations with turbo_davidson.x and turbo_lanczos.x in versions 5.2.1 and 5.3.0 of QE.
Let me say first that "trivial" runs (a CH4 molecule with the same pseudopotentials and cutoffs, but in a smaller 30 a.u.^3 cubic cell) work fine with all the tested versions.
The input files for a nontrivial case that leads to the crash should run on a decent PC in about 1 hr, so they provide a significant but not huge test. *Note* that if I run the same input files with version 5.1.1 (compiled in the very same environment), everything works fine, although more slowly! The 5.3.0 (and 5.2.1) crashes have been reproduced on two different machines (Intel, 8 cores, 16 GB RAM; AMD, 32 cores, 64 GB RAM), so they should not be considered erratic.

Here is the pw.x input. The PPs are quite old and can be found in the online library (or I can provide them on request).

 &control
    calculation = 'scf'
    restart_mode='from_scratch',
    prefix='l0-5.3.0',
    pseudo_dir = '/home/mattioligi/PP_UPF/',
    outdir='/home/mattioligi/cocat/test_tddft/5.2.1/l0/5.3.0/run/tmp/',
    nstep=300,
    max_seconds=80000,
    disk_io='low',
    tprnfor=.true.,
 /
 &system
    ibrav=1, celldm(1)=40.000000,
    nat=42, ntyp=4, nbnd=75,
    ecutwfc = 40.0,
    ecutrho = 320.0,
    nspin=1,
 /
 &electrons
    diagonalization='david',
    mixing_mode='plain'
    mixing_beta=0.1
    conv_thr=1.0d-8
    electron_maxstep=100
 /
 &ions
 /
ATOMIC_SPECIES
O    15.999    O_pbe.van.UPF
N    14.007    N.pbe-van_bm.UPF
C    12.011    C_pbe.van.UPF
H     1.008    H_pbe.van.UPF
ATOMIC_POSITIONS {angstrom}
C        4.815369179  12.355337788   8.111406911
C        5.639537337  12.072210478   7.018248617
C        6.373883049  10.886794669   6.974735758
H        5.707874252  12.778745273   6.179910928
C        4.734413944  11.441350355   9.166316558
H        4.235443595  13.287281698   8.140567718
C        6.304598307   9.977077773   8.041477142
H        7.012644682  10.659891408   6.111132336
C        5.477180541  10.260422385   9.138835842
H        4.092409998  11.653694694  10.031778418
H        5.418528381   9.546881383   9.971310698
N        7.058612774   8.759574945   8.006208499
C        6.384981399   7.544139013   8.340645249
C        6.997532612   6.588483316   9.168188787
C        5.084708421   7.308024697   7.864810575
C        6.325550737   5.410241765   9.493833204
H        8.006262126   6.776794433   9.557919083
C        4.414663626   6.134355690   8.210976959
H        4.597637090   8.055249046   7.224770074
C        5.030975670   5.176070562   9.020776666
H        6.819890970   4.670618768  10.138154855
H        3.397721512   5.964689741   7.832306200
H        4.503298572   4.249946635   9.284425745
C        8.412602212   8.773905175   7.652414992
C        9.197305040   9.938168667   7.841458619
C        9.043381168   7.634703664   7.098599788
C       10.533008285   9.972397555   7.486007356
H        8.740413757  10.828552107   8.290447985
C       10.383506998   7.674400214   6.758021800
H        8.466388332   6.717306584   6.931252215
C       11.175184928   8.838234071   6.927523312
H       11.098162573  10.894629696   7.663657304
H       10.849606517   6.778483121   6.322529487
C       12.554045113   8.768090174   6.529797787
C       13.538745611   9.729179498   6.474718127
H       12.882286114   7.769870632   6.203237321
C       13.338246843  11.096686263   6.810664645
N       13.160471613  12.223162736   7.083088078
C       14.914360413   9.407055683   6.034105289
O       15.832284936  10.221452163   5.956798921
O       15.091537629   8.085358800   5.710801225
H       16.043983143   8.016066678   5.436328923
K_POINTS {gamma}

And here are the turbo_lanczos.x and turbo_davidson.x input files:

lanczos

&lr_input
    prefix="l0-5.3.0",
    outdir='/state/partition1/mattioligi/34339',
    wfcdir='/state/partition1/mattioligi/34339',
    restart_step=6,
    restart=.false.
/
&lr_control
    itermax=12,
    ipol=4,
/

davidson

&lr_input
    prefix="l0-5.3.0",
    outdir='/state/partition1/mattioligi/34340',
    restart=.false.
/
&lr_dav
    num_eign=2
    num_init=4
    num_basis_max=10
    residue_conv_thr=1.0E-4
    start=0.1
    finish=1.5
    step=0.0002
    broadening=0.005
    reference=0.2
    p_nbnd_occ=5
    p_nbnd_virt=5
    poor_of_ram=.false.
    poor_of_ram2=.false.
/

In both cases and on both machines the CRASH report is something like

 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
     task #         1
     from davcio : error #        20
     error while writing from file "/state/partition1/mattioligi/34340/l0-5.3.0.d0psi.32"
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

I suppose that it is some kind of I/O error, but I would warmly welcome your opinion...:-)
Thank you in advance
Giuseppe

********************************************************
- Article 1 - Men are born and remain
free and equal in rights. Social distinctions
may be founded only upon the common good
- Article 2 - The aim of every political association
is the preservation of the natural and
imprescriptible rights of man. These rights are liberty,
property, security, and resistance to oppression.
********************************************************

   Giuseppe Mattioli
   CNR - ISTITUTO DI STRUTTURA DELLA MATERIA
   v. Salaria Km 29,300 - C.P. 10
   I 00015 - Monterotondo Stazione (RM), Italy
   Tel + 39 06 90672836 - Fax +39 06 90672316
   E-mail: <giuseppe.mattioli at ism.cnr.it>
   http://www.ism.cnr.it/english/staff/mattiolig
   ResearcherID: F-6308-2012
