> >> With the formula M=10x0.9^n I calculate that after 15 random draws(d)
> >> one at a time with replacement from an urn with the ten digits 0 to 9
[quoted text clipped - 27 lines]
>
> - Show quoted text -
I make the probabilities for the number of unused digits after 15
draws with replacement from 10 to be
10 0
9 0.00000000000001
8 0.00000000147447
7 0.00000171007272
6 0.00021347398800
5 0.00637359166080
4 0.06360882287760
3 0.24720675819840
2 0.39304955289600
1 0.24359586451200
0 0.04595022432000
If P(n,k,m) is the probability of n remaining unused digits out of an
original k after m draws then
P(n,k,m) = P(n+1,k,m-1) * (n+1)/k + P(n,k,m-1) * (k-n)/k
starting with P(k,k,0) = 1
I make the expected value of this to be 2.05891132094649 and variance
about 0.986389 (sd about 0.993171).
>>> With the formula M=10x0.9^n I calculate that after 15 random draws(d)
>>> one at a time with replacement from an urn with the ten digits 0 to 9
[quoted text clipped - 23 lines]
> Thus I suspect there must be another formula than the binomial for
> predicting the variance. What formula did you use to get 0.986?
se16's figures are correct. The exact figures are:
mean: 2.05891132094649
variance: 0.9863889814197496485705566799
You can verify these by computing the exact probabilities for each
number of missing digits.
Let d(n) be a 10-tuple which contains the number of ways there can be
1..10 digits present with n picks. The probability of each number of
digits present is then just d(n)/10^n.
d(1)={10, 0, 0, 0, 0, 0, 0, 0, 0, 0}
p(1)={1, 0, 0, 0, 0, 0, 0, 0, 0, 0}
In this case, the probability is 1 that there is exactly one digit
present after one pick, and zero that there are any more than one
digit.
You can now calculate d(n) from d(n-1), continuing until you get to
d(15).
To get d(2), note that for each of the 10 ways to get one digit in
d(1), there is one way to get the same digit, and nine ways to get a
different digit. Therefore:
d(2)={10, 90, 0, 0, 0, 0, 0, 0, 0, 0}
p(2)={0.1, 0.9, 0, 0, 0, 0, 0, 0, 0, 0}
In general, you can get k digits in n picks by either having (k-1)
digits in (n-1) picks and picking a *new* digit, or by having k digits
in (n-1) picks and picking an *old* digit. So for example, the second
element of d(3) is arrived at by:
10 ways of having 1 digit in 2 picks times 9 ways of picking a
different digit
plus
90 ways of having 2 digits in 2 picks times 2 ways of picking one of
those 2 digits
or (10*9) + (90*2) = 90 + 180 = 270
The complete 10-tuple for n = 3 is:
d(3)={10, 270, 720, 0, 0, 0, 0, 0, 0, 0}
p(3)={0.01, 0.27, 0.72, 0, 0, 0, 0, 0, 0, 0}
I'll give one more example.
d(4)={10*1, 270*2, 720*3, 0*4, 0*5, 0*6, 0*7, 0*8, 0*9, 0*10}
+{0*10, 10*9, 270*8, 720*7, 0*6, 0*5, 0*4, 0*3, 0*2, 0*1}
={10, 630, 4320, 5040, 0, 0, 0, 0, 0, 0}
p{4)={0.001, 0.063, 0.432, 0.504, 0, 0, 0, 0, 0, 0}
So you just continue until you get d(15), then use the exact probabilities
to calculate the mean and variance of the number of missing digits.
For example, the mean for n=4 is:
mean(4)=0.001*9 + 0.063*8 + 0.432*7 + 0.504*6
=6.561
var(4)=0.001*(9-6.561)^2 + 0.063*(8-6.561)^2 + 0.432*(7-6.561)^2
+0.504*(6-6.561)^2
=0.378279
There is an exact formula for the mean and variance, but you can't use
the binomial variance formula. As was noted by se16:
mean(n) = 10*0.9^n
I derived a formula for the variance:
var(n) = 90*0.8^n - 100*0.81^n + 10*0.9^n
There's probably a simple way to derive or prove this, but I used a
method that is outside the scope of the rest of this message.
Basically, I noted that p(n) is a linear transformation of p(n-1),
wrote out the transition matrix (which is bidiagonal), calculated the
eigenvalues and eigenvectors of the matrix, wrote an expression for
the n'th power of that matrix, multiplied p(1) by the n'th power,
getting formulas for any n for the probabilities of 1..10 digits. I
then used these formulas to get the mean and variance of the number of
missing digits for any n.
Scott

Signature
Scott Hemphill hemphill@alumni.caltech.edu
"This isn't flying. This is falling, with style." -- Buzz Lightyear
Stig Holmquist - 06 Oct 2008 23:01 GMT
>>>> With the formula M=10x0.9^n I calculate that after 15 random draws(d)
>>>> one at a time with replacement from an urn with the ten digits 0 to 9
[quoted text clipped - 101 lines]
>
>var(n) = 90*0.8^n - 100*0.81^n + 10*0.9^n
Thank you for this last formula. It's what the doctor odered.
Bertil
>There's probably a simple way to derive or prove this, but I used a
>method that is outside the scope of the rest of this message.
[quoted text clipped - 7 lines]
>
>Scott
>>> With the formula M=10x0.9^n I calculate that after 15 random draws(d)
>>> one at a time with replacement from an urn with the ten digits 0 to 9
[quoted text clipped - 23 lines]
> Thus I suspect there must be another formula than the binomial for
> predicting the variance. What formula did you use to get 0.986?
Generally, you need on the order of 10^(2*n) simulations to get n
significant figures in your statistics, so you would need 100 to get
one significant figure, 10000 to get two significant figures, etc.
I ran 10^10 (that's 10 U.S. billion) experiments using GNU's
drand48() RNG. The results were:
mean: 2.05891
variance: 0.986385
Scott

Signature
Scott Hemphill hemphill@alumni.caltech.edu
"This isn't flying. This is falling, with style." -- Buzz Lightyear