[Q-e-developers] QE random number generator (randy) error: j out of range
Paolo Giannozzi
giannozz at democritos.it
Sun Nov 11 15:52:44 CET 2012
Oh well, I see. There are two problems here:
- the check in randy is bad: "errore" does nothing if the last
  argument is <= 0
- there is no check on the size of the seed. Apparently routine randy
  works only if the seed is smaller than ic=150889. I do not know
  where this routine comes from and who wrote it, so I am just guessing.
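
To make the second point concrete, here is a minimal sketch that reuses
the seed formula and the first-call transformation quoted further down in
this thread. It is not the svn fix. The module and function names, the
sample clock reading, and bounded_seed are hypothetical, and the value of
m is an assumption (the thread only names it "m"). Fortran's MOD takes the
sign of its first argument, so any seed larger than ic gives a negative
idum.

```fortran
module seed_demo
  implicit none
  ! ic is the value quoted in this thread; m is an ASSUMED modulus.
  integer, parameter :: ic = 150889, m = 714025
contains
  ! Seed formula as quoted from set_random_seed (line 85 in the thread).
  pure integer function time_seed(itime) result(iseed)
    integer, intent(in) :: itime(8)
    iseed = ( itime(8) + itime(6) ) * ( itime(7) + itime(4) )
  end function time_seed

  ! First-call transformation as quoted from randy (line 53 in the thread).
  pure integer function first_idum(iseed) result(idum)
    integer, intent(in) :: iseed
    idum = mod( ic - iseed, m )
  end function first_idum

  ! HYPOTHETICAL guard, not the actual fix: fold the seed into 0..ic-1
  ! so that ic - seed stays positive.
  pure integer function bounded_seed(iseed) result(s)
    integer, intent(in) :: iseed
    s = mod( abs(iseed), ic )
  end function bounded_seed
end module seed_demo

program show_seed_range
  use seed_demo
  implicit none
  integer :: itime(8), iseed

  ! Hypothetical clock reading in DATE_AND_TIME order: year, month, day,
  ! UTC offset in minutes, hour, minute, second, millisecond.
  itime = [2012, 11, 11, 480, 13, 30, 20, 500]

  iseed = time_seed(itime)
  ! UTC+8: the seed exceeds ic, so idum comes out negative.
  print '(a,i8,a,i8)', 'UTC+8   seed =', iseed, '  idum =', first_idum(iseed)
  ! Same seed after the hypothetical guard: idum is positive.
  print '(a,i8,a,i8)', 'bounded seed =', bounded_seed(iseed), &
        '  idum =', first_idum(bounded_seed(iseed))

  itime(4) = 0
  iseed = time_seed(itime)
  ! UTC: the seed stays below ic, so idum is positive without any guard.
  print '(a,i8,a,i8)', 'UTC     seed =', iseed, '  idum =', first_idum(iseed)
end program show_seed_range
```

Folding with MOD is only one possible choice. Any guard that keeps the
seed between 0 and ic-1 keeps ic - idum positive on the first call.
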
I have hopefully fixed this in the svn version. Your fix is also good,
as long as it prevents the seed from being too large. Not sure why this
problem was apparent only when PLUMED was used, though. Thank you for
reporting this rather funny bug.

Paolo

On Nov 11, 2012, at 13:31 , LIDing wrote:
> Hi,
>
> It seems that PLUMED is the cause of the problem.
>
> I tried ten different jobs with the same input, five of which have
> the -plumed option. 4 out of 5 plumed jobs crashed
> with the message "From randy: error #1038 j out of range", and
> those without -plumed just ran with no problem.
>
> So with -plumed, PWscf will crash randomly (with a large
> probability). I don't understand why.
>
> Here is another piece of info that may help:
>
> I found that set_random_seed uses current time components to
> generate a seed:
>   83   ! itime contains: year, month, day, time difference in minutes, hours,
>   84   ! minutes, seconds and milliseconds.
>   85   iseed = ( itime(8) + itime(6) ) * ( itime(7) + itime(4) )
> Here in China we have itime(4) as 480 (timezone UTC+8), which is a
> constant for any particular region,
> and quite likely iseed will be larger than ic = 150889. If
> iseed is negative or smaller than ic,
> which is the case in most European countries, everything will be fine.
>
> In randy, the seed is first processed as idum using:
>   53   idum = MOD( ic - idum, m )
> I think this is what generates the many negative numbers.
>
> If I remove the itime(4) in set_random_seed, PWscf never crashes
> with or without -plumed.
> I believe this correction has no problem since it is what a British
> user (UTC time, with itime(4) = 0) will get. I tested this with
> many runs.
>
> By the way, if j is negative, errore will do nothing (in
> error_handler.f90):
> 42 IF ( ierr <= 0 ) RETURN
> It means that many incorrect random numbers will be returned until
> a positive j stops the program.
>
> So here are the problems:
> 1. It seems that randy (and set_random_seed) does have some problems,
>    and removing the itime(4) in set_random_seed fixes the problem.
> 2. But why is it OK without -plumed?
> 3. Every time it crashes (with -plumed), j is 1038, not any other
>    value.
>
> Any suggestion?
>
> Thank you for your help!
>
> Regards,
> Li Ding
> Institute of Geology and Geophysics, Chinese Academy of Science
> Email: dingmaotu at 126.com
>
> At 2012-11-11 00:43:49, "Paolo Giannozzi" <giannozz at democritos.it> wrote:
> >It seems to me exceedingly unlikely that there is a bug in a simple
> >routine like randy. It is more likely that there is either a bug in
> >the compiler, or some array going out of bounds. In any case, it
> >would be important to know whether this happens only in conjunction
> >with PLUMED or not, and to have an input that produces this problem
> >
> >Paolo
> >
> >On Nov 10, 2012, at 15:57 , LIDing wrote:
> >
> >> Dear QE developers,
> >>
> >> I am using QE 4.3.2 with PLUMED 1.3 (PWscf with metadynamics), and
> >> encountered a problem. I think it may be a bug.
> >> The problem occurs most of the time, and only occasionally the
> >> error did not occur.
> >> The message was always the same:
> >>
> >>      Molecular Dynamics Calculation
> >>
> >>      Starting temperature = 9000.00 K
> >>
> >>      temperature is set once at start
> >>
> >>      mass Mg      =    24.30
> >>      mass Si      =    28.09
> >>      mass O       =    16.00
> >>      Time step    =    20.00 a.u.,  0.9676 femto-seconds
> >>
> >> %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
> >>      from randy : error #      1038
> >>      j out of range
> >> %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
> >>
> >>      stopping ...
> >>
> >> I checked the relevant sources, and found that it occurred right in
> >> the md_init call.
> >> The error was reported by the randy function in the random_numbers.f90
> >> file when it was first called by set_random_seed.
> >> In random_numbers.f90:
> >>
> >>   50       IF ( first ) THEN
> >>   51          !
> >>   52          first = .false.
> >>   53          idum = MOD( ic - idum, m )
> >>   54          !
> >>
> >> ic = 150889, if you pass a random seed that is larger than ic, then
> >> idum will be a negative number after line 53.
> >> I wrote a little program to call set_random_seed, most of the time
> >> idum will be a negative number and the test in randy
> >>   63       IF( j > ntab .OR. j < 1 ) call errore('randy','j out of range',j)
> >> fails. j will be a negative number when I test this.
> >>
> >> But in actual run, j is always 1038 whenever the error occurs,
> >> which does not match what I see in my little test program. It is
> >> very strange.
> >>
> >> If I remove the line 53, everything seems OK in my test program but
> >> I am not sure if this line should be corrected in QE.
> >> So is it a bug in QE, or I just made some mistakes (the random seed
> >> can be negative, and the error is caused by other problems)?
> >>
> >> I work on a Linux PC cluster with intel compilers and openmpi, and
> >> QE is linked with the mkl library.
> >>
> >> I am not a subscriber of this mail list. Contact me by email to
> >> dingmaotu at 126.com. Thank you!
> >>
> >> Regards,
> >> Li Ding
> >> Institute of Geology and Geophysics, Chinese Academy of Science
> >>
> >> _______________________________________________
> >> Q-e-developers mailing list
> >> Q-e-developers at qe-forge.org
> >> http://qe-forge.org/mailman/listinfo/q-e-developers
> >
> >---
> >Paolo Giannozzi, Dept of Chemistry&Physics&Environment,
> >Univ. Udine, via delle Scienze 208, 33100 Udine, Italy
> >Phone +39-0432-558216, fax +39-0432-558222

---
Paolo Giannozzi, Dept of Chemistry&Physics&Environment,
Univ. Udine, via delle Scienze 208, 33100 Udine, Italy
Phone +39-0432-558216, fax +39-0432-558222