
Re: help with random numbers

Subject: Re: help with random numbers
From: "Dimitre Novatchev dnovatchev@xxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Thu, 17 Nov 2022 02:57:04 -0000
I hope this information might be useful:
https://fxsl.sourceforge.net/articles/Random/Casting%20the%20Dice%20with%20FXSL-htm.htm#3._Testing_randomness_with_Monte_Carlo_integration

A test with 65536 random points was carried out using FXSL (XSLT 1.0). It
uses the Monte Carlo method to calculate the integrals of three well-known
functions:

f x = 4 / (1 + x^2),   x ∈ [0, 1]

f x = x,   x ∈ [0, 1]

f x = 1 / x,   x ∈ [1, 2]

The well-known results of the integration, pi, ½ and ln(2), were
calculated with good precision.
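
For readers who want to try the same check outside XSLT, here is a minimal Python sketch of Monte Carlo integration over the three functions. It assumes the sample-mean form of the method (average f at uniformly drawn points, times the interval length); the FXSL article may use a different variant, and the seed and the helper name mc_integrate are arbitrary choices made for this illustration.

```python
import math
import random


def mc_integrate(f, a, b, n, rng):
    """Estimate the integral of f over [a, b] as (b - a) * mean of f at n uniform points."""
    total = 0.0
    for _ in range(n):
        total += f(rng.uniform(a, b))
    return (b - a) * total / n


rng = random.Random(2022)  # fixed seed so the run is repeatable (arbitrary value)
N = 65536                  # same number of points as the test described above

estimates = {
    "pi": mc_integrate(lambda x: 4.0 / (1.0 + x * x), 0.0, 1.0, N, rng),
    "1/2": mc_integrate(lambda x: x, 0.0, 1.0, N, rng),
    "ln(2)": mc_integrate(lambda x: 1.0 / x, 1.0, 2.0, N, rng),
}
exact = {"pi": math.pi, "1/2": 0.5, "ln(2)": math.log(2.0)}

for name, value in estimates.items():
    print(f"{name:6s} estimate={value:.5f}  exact={exact[name]:.5f}")
```

With this many points the standard error of each estimate is on the order of 0.001 to 0.003, so agreement to two decimal places is what one should expect from a sound generator.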

This was done and published more than 20 years ago and is still usable  :)

Thanks,
Dimitre




On Wed, Nov 16, 2022 at 6:25 PM C. M. Sperberg-McQueen
cmsmcq@xxxxxxxxxxxxxxxxx <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:

>
> Thanks for your reply.
>
> "Michael Kay michaelkay90@xxxxxxxxx" <
> xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> writes:
>
> > ...
> >
> > Worth noting in passing that there's a bug in the spec: it says that
> > every xs:double value in the range 0 to 1 should be equally likely to
> > appear, but that's not what you actually want, because there are about
> > as many xs:double values in the range 0 to 0.1 as there are in the
> > range 0.1 to 1.0. Fortunately implementors are unlikely to have taken
> > much notice of that provision; but it does illustrate the dangers.
>
> You're right!  I was so confident that what it said was what it should
> say that I misread
>
>     The value of the number entry should be such that all eligible
>     xs:double values are equally likely to be chosen.
>
> as saying that all real values in the interval are equally likely to be
> chosen, i.e. that the values would be selected from a uniform
> distribution over the interval.
>
> At some point I will doubtless also want to select values from other
> distributions, too, but I suspect that there are ways to make that
> happen.
>
> > If you want a completely uniform distribution, consider using the
> > permute() option. But of course a uniform distribution is very far
> > from random (it's well known that a truly random sequence contains far
> > more duplicates than a typical person will expect).
>
> > I think the idea of fn:random-number-generator() is that you can
> > choose whether you want a repeatable sequence of random numbers or a
> > different sequence each time. For the latter, use current-dateTime() as
> > a seed.
>
> My situation, I regret to say, is that if possible I want both: I would
> like a given set of a hundred or a thousand or ten thousand simulation
> runs to be repeatable, and I would also like each run in such a set to
> have a different sequence.  (And, if possible, I would like each set to
> vary, in a reproducible way.)
>
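
One way to get both properties (a repeatable set, with a different sequence per run) is to derive each run's seed from a recorded set-level seed plus the run index. The Python sketch below illustrates the idea only; run_seed, simulate and SET_SEED are hypothetical names, and hashing with SHA-256 is an assumption of this sketch, not something proposed in the thread.

```python
import hashlib
import random


def run_seed(set_seed: int, run_index: int) -> int:
    """Derive a 64-bit seed from a set-level seed and a run index."""
    digest = hashlib.sha256(f"{set_seed}:{run_index}".encode("ascii")).digest()
    return int.from_bytes(digest[:8], "big")


def simulate(seed: int) -> float:
    """Placeholder for one simulation run; here it only draws one number."""
    return random.Random(seed).random()


SET_SEED = 20221116  # hypothetical; record this value to replay the whole set

first = [simulate(run_seed(SET_SEED, i)) for i in range(1000)]
again = [simulate(run_seed(SET_SEED, i)) for i in range(1000)]
other = [simulate(run_seed(SET_SEED + 1, i)) for i in range(1000)]

print("same set seed reproduces the runs:", first == again)
print("different set seed changes the runs:", first != other)
```

Because the hash spreads nearby inputs across the whole 64-bit range, consecutive run indexes do not produce neighbouring seeds.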
> But if I use current-dateTime() or some other method to introduce
> variation into the initial seed, I can of course record the seed used
> and use it again if I want to replicate the simulations.  I suspect that
> for my current code, which tends to use just the first two values of the
> sequence of numbers generated by a given seed, I will want more
> variation in the seed than current-dateTime() will give: a minimally
> conforming processor has about 3.1E14 different dateTime values (and the
> simulation runs in a set will vary only in minute, second, and
> millisecond, so maybe 1E5 different values), which is a lot less than
> 4.5E18, which the Web tells me is roughly the number of double precision
> values in the interval [0,1].
>
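
The two counts quoted above can be checked with a few lines of Python. This is a rough sketch: it assumes a minimal dateTime range of years 0001 to 9999 at millisecond precision and an average year of 365.2425 days, and it counts doubles by their IEEE 754 bit patterns.

```python
import struct


def double_bits(x: float) -> int:
    """Return the IEEE 754 bit pattern of x as an unsigned 64-bit integer."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]


# Non-negative doubles sort in the same order as their bit patterns,
# so the number of doubles in [0, 1] is bits(1.0) - bits(0.0) + 1.
doubles_in_unit_interval = double_bits(1.0) - double_bits(0.0) + 1

# Assumed range: 9999 years, milliseconds, average Gregorian year length.
MS_PER_YEAR = 365.2425 * 24 * 60 * 60 * 1000
datetime_values = 9999 * MS_PER_YEAR

print(f"doubles in [0, 1]: {doubles_in_unit_interval:.2e}")
print(f"dateTime values:   {datetime_values:.2e}")
print(f"ratio:             {doubles_in_unit_interval / datetime_values:.1e}")
```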
> (I believe I saw a story once about a flawed online poker system whose
> card-shuffling routine used a 32-bit random number to shuffle the cards, with
> the result that there were about 2^32 ≈ 4E9 possible hands, instead of
> 52! ≈ 8E67, and worse yet used milliseconds-since-midnight as the seed,
> so those who broke the system just needed to try a few hundred or a few
> thousand clock values to find the value that produced the hand they were
> holding, at which point they could see everyone else's hand, too.  This
> has led me to believe that -- especially if one is taking the first
> number generated from a given seed -- it pays if the range of possible
> seeds is about the same size as the range of possible results of the
> random number generator, and/or the range of possible phenomena being
> selected.)
>
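
The sizes in that story are easy to reproduce. The sketch below only does the arithmetic; it says nothing about how the actual poker system worked.

```python
import math

seed_states = 2 ** 32              # distinct seeds for a 32-bit generator
deck_orders = math.factorial(52)   # distinct orderings of a 52-card deck
ms_per_day = 24 * 60 * 60 * 1000   # seeds reachable from milliseconds since midnight

print(f"2^32       = {seed_states:.2e}")
print(f"52!        = {deck_orders:.2e}")
print(f"ms per day = {ms_per_day:.2e}")
print(f"bits needed to index every ordering: {math.log2(deck_orders):.1f}")
```

A seed therefore needs well over 200 bits before every ordering of the deck is even reachable, and a clock-derived seed covers far less than the 32-bit space.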
>
> > I'm sure you're right that you want a single "flat" sequence of random
> > numbers, you don't want a branching sequence; and achieving a "flat"
> > sequence when you're doing recursive tree traversal isn't
> > straightforward. Using an accumulator is an interesting idea. I've
> > previously used xsl:number level="any" to index into a sequence of
> > pre-allocated random numbers.
>
> That seems like a feasible idea:  before calling apply-templates for a
> given time slice I can generate a sequence of random numbers (it's easy
> to know in advance how many are needed) and index into it in the way you
> describe.
>
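
The indexing idea can be sketched outside XSLT as follows. This Python version threads a counter through the recursion, which plays the role that xsl:number level="any" plays in a stylesheet: each node gets the number at its document-order position. The Node class, the tree and the seed are made up for illustration.

```python
import random
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    name: str
    children: List["Node"] = field(default_factory=list)


def count_nodes(node: Node) -> int:
    """Number of nodes in the subtree, known before the traversal starts."""
    return 1 + sum(count_nodes(child) for child in node.children)


def assign(node: Node, numbers: List[float], start: int, out: Dict[str, float]) -> int:
    """Give each node the number at its document-order position.

    Returns the next unused index, so no value is handed out twice."""
    out[node.name] = numbers[start]
    next_index = start + 1
    for child in node.children:
        next_index = assign(child, numbers, next_index, out)
    return next_index


tree = Node("a", [Node("b", [Node("c")]), Node("d")])
rng = random.Random(7)  # arbitrary seed
pool = [rng.random() for _ in range(count_nodes(tree))]  # pre-allocated sequence

result: Dict[str, float] = {}
used = assign(tree, pool, 0, result)
print(used, result)
```

All values come from one flat sequence, so sibling subtrees never restart from the same generator state.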
> > You seem to be doing all the right things. You say the results are
> > "disappointing", and I feel I'd like to know more about what that
> > means.
>
> That's a good question.  The main symptom so far is that the first
> hundred times I ran the simulation, 80 of the runs produced the same
> trivial result: a birth-and-death process in which the initial
> individual died before reproducing - the runs differed on how long that
> individual lived, so they were not completely identical, but the family
> trees they produced were isomorphic: one node with birth and death
> dates.
>
> It is possible, of course, that that form of result is more probable
> than I had expected.  So with a little effort I calculated the
> probability that the birth and death rates I was using should produce
> the result.  The probability *is* higher than I had expected, but if my
> calculation is accurate it's about 0.32, not about 0.80.
>
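
To put a number on how surprising that is: if each run independently produces the trivial tree with probability 0.32, the chance of seeing it in at least 80 of 100 runs is a binomial tail. A short Python sketch of the calculation, assuming the runs are independent:

```python
from math import comb


def binomial_tail(n: int, k: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


n_runs, trivial_runs, p_trivial = 100, 80, 0.32  # figures from the message above

tail = binomial_tail(n_runs, trivial_runs, p_trivial)
expected = n_runs * p_trivial

print(f"expected trivial runs: {expected:.0f}")
print(f"P(at least {trivial_runs} of {n_runs}) = {tail:.3e}")
```

The tail probability is vanishingly small, which points at the random numbers (or the way they are consumed) and not at bad luck.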
> > ...
> > Specifically: a small change in the seed results in only a small
> > change in the first value in the sequence. (The Saxon implementation
> > calls Java's Random class passing a seed which is the hash code of the
> > seed passed at the XPath level. For integers, the hash code of a small
> > integer is the integer itself, which may well have something to do
> > with it.)
> >
> > If I multiply the supplied seed by 987654321 before passing it to
> > Java, the pattern looks a lot more "random":
> >
> > ...
> >
> > Alternatively, discard the first couple of items in the sequence.
>
> An experiment I just ran suggests that discarding two values and taking
> the third for each key does indeed lead to a much wider spread in the
> resulting numbers; even discarding a single value helps a lot.  Perhaps
> that will help.  Discarding ten appears to be more than is needed.
>
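
The effect is easy to reproduce with a Python port of the 48-bit linear congruential generator documented for java.util.Random. This is a sketch under assumptions: it feeds small integers straight in as seeds (the behaviour described above for integer hash codes), it is not Saxon's code, and the seed scaling is done in plain integer arithmetic, which may differ from what a Java implementation does on overflow.

```python
MULT = 0x5DEECE66D
MASK = (1 << 48) - 1


class JavaLikeRandom:
    """Python port of the 48-bit LCG documented for java.util.Random."""

    def __init__(self, seed: int):
        self.state = (seed ^ MULT) & MASK  # initial scramble

    def _next(self, bits: int) -> int:
        self.state = (self.state * MULT + 0xB) & MASK
        return self.state >> (48 - bits)

    def next_double(self) -> float:
        return ((self._next(26) << 27) + self._next(27)) * 2.0 ** -53


def nth_double(seed: int, skip: int) -> float:
    """Discard `skip` values from the generator, then return the next one."""
    rng = JavaLikeRandom(seed)
    for _ in range(skip):
        rng.next_double()
    return rng.next_double()


def spread(values):
    return max(values) - min(values)


seeds = range(100)
first = [nth_double(s, 0) for s in seeds]               # first value per seed
third = [nth_double(s, 2) for s in seeds]               # discard two, take the third
scaled = [nth_double(s * 987654321, 0) for s in seeds]  # multiply the seed first

print(f"spread of first values:        {spread(first):.4f}")
print(f"spread after discarding two:   {spread(third):.4f}")
print(f"spread with scaled seeds:      {spread(scaled):.4f}")
```

For seeds 0 to 99 the first values all land in a narrow band, while both remedies spread them across most of the unit interval.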
> I will try one or more of these approaches and see what happens.  If and
> when the simulation starts to produce results that match the
> probabilities I am able to calculate for some simple cases (like the
> one-individual case described above), then I will begin to have more
> confidence in my simulation.
>
> --
> C. M. Sperberg-McQueen
> Black Mesa Technologies LLC
> http://blackmesatech.com
