[Pw_forum] Problem

Thu Dec 9 21:08:15 CET 2004

Well, I see this behavior almost every time I run vc-relax on an
Itanium2 cluster. I wondered whether it was due to MKL, but it is not.
However, pw.x runs fine on an IBM RS/6000 cluster.
A "definitely-go-to-hell" input file:
&control
  calculation = 'vc-relax'
  title = 'alpha-quartz'
  restart_mode = 'from_scratch'
  nstep = 500, iprint = 20
  tstress = .true., tprnfor = .true.
  dt = 10.0
  prefix = 'sio2'
  pseudo_dir = '../../pp/'
  outdir = './'
/
&system
  ibrav = 4, celldm(1) = 9.6, celldm(3) = 1.10,
  nat = 9, ntyp = 2
  ecutwfc = 80.0
  nosym = .true.
/
&electrons
  conv_thr = 1.0D-7
  mixing_mode = 'plain', mixing_beta = 0.2D0
  diagonalization = 'david'
/
&ions
  ion_dynamics = 'damp'
  minimization_scheme = 'damped-dyn', damp = 0.5
  reset_vel = .true.
/
&cell
  cell_dynamics = 'damp-pr'
  cell_factor = 1.5D0
/
ATOMIC_SPECIES
  Si 2.0 Si.pbe-rrkj.UPF
  O  2.0 O.pbe-rrkjus.UPF
ATOMIC_POSITIONS crystal
  Si  0.46880915  0.00000000  0.66666667
  Si  0.00000000  0.46880915  0.33333333
  Si -0.46880915 -0.46880915  0.00000000
  O   0.40622487  0.26840588  0.77904595
  O  -0.26840588  0.13576296  0.44752077
  O  -0.13576296 -0.40622487  0.11246441
  O   0.26840588  0.40622487 -0.77904595
  O  -0.40622487 -0.13576296 -0.11246441
  O   0.13576296 -0.26840588 -0.44752077
K_POINTS automatic
  4,4,4,1,1,1

Best Wishes,

Chao Cao

On Thu, 9 Dec 2004, Paolo Giannozzi wrote:

> On Thursday 09 December 2004 17:35, Chao Cao wrote:
> 
> > Thanks for the reply. But does this part of the manual actually apply
> > here? The program didn't actually crash: it kept running but was
> > doing weird stuff without producing any output.
> 
> I have occasionally seen this behavior in the past:
> 
> - when one MPI process encounters an error, while the others
>   don't. This happened on the SP3 in Princeton using k-point
>   parallelization: if the diagonalization crashed on a specific k-point,
>   the code hung. It was a problem with the diagonalization libraries
>   that was never clarified.
> 
> - when one MPI process encounters a condition that forces it
>   to follow a different flow from that of the other processes.
>   For instance: one process thinks that it has converged, while
>   the others don't. This used to happen on the first T3D: processes
>   that were doing exactly the same calculation produced slightly
>   different answers. Of course it was a problem of the operating
>   system, not of the code.
> 
> The only way I know to track down such problems is to follow the
> code flow and insert stops until you understand where and
> why things go wrong. Unfortunately it is a time-consuming
> process, and the origin of the problem can be quite subtle,
> or not even directly related to bugs in the code. Anyway: if
> you have a test that hangs reliably, please submit it.
> 
> Paolo
> 
> -- 
> Paolo Giannozzi             e-mail:  giannozz at nest.sns.it
> Scuola Normale Superiore    Phone:   +39/050-509876, Fax:-563513 
> Piazza dei Cavalieri 7      I-56126 Pisa, Italy
>
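
The second failure mode described above (one process decides it has converged while the others do not) can be illustrated with a toy model. This is only a sketch under stated assumptions: it is not PWscf code, the "ranks" are plain list entries rather than real MPI processes, and the function names and numerical values are hypothetical. It shows why a per-process test can split the ranks, and how letting a single rank decide and share the result keeps them on the same path. In a real MPI code that sharing step would typically be a broadcast or reduction of one logical flag.

```python
# Toy model, not PWscf code: no real MPI is involved, ranks are list entries.
# All names and values below are hypothetical and chosen for illustration.
THRESHOLD = 1.0e-7  # illustrative convergence threshold


def local_decisions(estimates, thr=THRESHOLD):
    """Each rank tests its own estimate, so decisions may differ."""
    return [e < thr for e in estimates]


def collective_decision(estimates, thr=THRESHOLD, root=0):
    """Only the root rank tests its estimate; every rank adopts the result.

    This mimics broadcasting one logical flag from the root process.
    """
    decision = estimates[root] < thr
    return [decision for _ in estimates]


# Four ranks doing "exactly the same calculation" that produced
# slightly different answers, straddling the threshold.
estimates = [0.9999e-7, 1.0001e-7, 0.9998e-7, 1.0002e-7]

local = local_decisions(estimates)
agreed = collective_decision(estimates)

print("local decisions: ", local)
print("agreed decisions:", agreed)
```

With the local test, ranks 0 and 2 would leave the loop while ranks 1 and 3 would keep iterating and wait in the next collective call, which is one way a run can hang without crashing or printing anything. With the shared decision all ranks take the same branch.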