We start by considering the following problem and then show how it relates to the binomial distribution.
Example 1: Suppose you start at point 0 and either walk 1 unit to the right or one unit to the left, where there is a 50-50 chance of either choice. Now repeat this process for a total of 100 times. What is your expected distance from the starting point?
This is a simple form of what is called a random walk problem. Define the random variables xi as follows:
Now let dn = your distance from the starting point after the nth trial. Thus
Thus to solve Example 1 we need to find the expected value E[d100].
Observation: Suppose you toss a fair coin 100 times. Per Property 1 of Binomial Distribution, the average number of heads is 100(.5) = 50 and the average number of tails is 50. But we know that 100 tosses won’t always result in 50 heads and 50 tails. So on average how many more or fewer heads are there than the average of 50? It is easy to see that the answer to this question is the average value of |# of heads – 50|. But what is this value?
If we view getting a head as going right and getting a tails as going left in our random walk problem, we see that the answer to this problem is also E[d100]. In general, the average number of heads after n tosses is n/2 and so
the average value of |# of heads after n tosses – n/2| = E[dn]
Let x be the random variable which takes the value 1 if heads is thrown and -1 if tails is thrown. Then by definition of mean and variance, x has mean (.5(1) + .5(-1))/2 = 0 and variance ((1–0)2 +(-1–0)2)/2 = 1.
By Corollary 1 of the Central Limit Theorem, for n sufficiently large, the sample mean of x is approximately normally distributed with mean 0 and standard deviation 1/. But the sample mean of x is . Thus
By Property 1 of Basic Characteristics of the Normal Distribution, it now follows that
But by algebra n/ = , and so
By Property 5 of Properties of Distributions Detailed (the proof of which uses calculus), it now follows that
we see that the answer to the problem posed in Example 1 is just under 8. In fact, based on the last observation above, this means that not only is it not necessarily the case that heads and tails break 50-50 in 100 tosses, but on average heads are 8 units away from 50, i.e. on average, heads and tails break 58-42 (although not necessarily is the 58 for heads).
Observation: Since, as we have seen above, ~ N(0,), the probability that dn < k for any value of k is NORMDIST(k, 0, SQRT(n), True) – NORMDIST(-k, 0, SQRT(n), True). E.g. P(d100 < 7) = .758 – .242 = .516.