Definition 1: The Poisson distribution has a probability distribution function (pdf) given by

$$f(x) = \frac{e^{-\mu}\mu^{x}}{x!}, \qquad x = 0, 1, 2, \ldots$$
The parameter μ is often replaced by λ. A chart of the pdf of the Poisson distribution for λ = 3 is shown in Figure 1.
Figure 1 – Poisson Distribution
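As a rough illustration of the values charted in Figure 1, the sketch below tabulates the pdf for λ = 3. The function name `poisson_pmf` is made up for this example, and the log-space evaluation is just one convenient way to avoid overflow in the factorial.

```python
import math

def poisson_pmf(x: int, mu: float) -> float:
    """P(X = x) for a Poisson distribution with mean mu (assumes mu > 0)."""
    # Evaluate exp(-mu) * mu**x / x! in log space so large x cannot overflow.
    return math.exp(-mu + x * math.log(mu) - math.lgamma(x + 1))

# Tabulate the pdf for lambda = 3, the case charted in Figure 1.
for x in range(11):
    print(f"{x:2d}  {poisson_pmf(x, 3.0):.6f}")
```

The probabilities peak at x = 2 and x = 3, which have equal probability when λ = 3.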
Observation: Some key statistical properties of the Poisson distribution are:
- Mean = µ
- Variance = µ
- Skewness = 1/√µ
- Kurtosis = 1/µ
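These properties can be checked numerically by summing over the pdf. The sketch below is only a sanity check, not a derivation. The helper names are hypothetical, the infinite sum is truncated at 200 terms (an assumption that is adequate for small µ), and the kurtosis computed is the excess kurtosis (fourth standardized moment minus 3), which is the quantity equal to 1/µ.

```python
import math

def poisson_pmf(x: int, mu: float) -> float:
    """P(X = x) for a Poisson distribution with mean mu (assumes mu > 0)."""
    return math.exp(-mu + x * math.log(mu) - math.lgamma(x + 1))

def poisson_moments(mu: float, terms: int = 200):
    """Mean, variance, skewness and excess kurtosis from a truncated sum."""
    probs = [poisson_pmf(x, mu) for x in range(terms)]
    mean = sum(x * p for x, p in enumerate(probs))
    var = sum((x - mean) ** 2 * p for x, p in enumerate(probs))
    skew = sum((x - mean) ** 3 * p for x, p in enumerate(probs)) / var ** 1.5
    excess_kurt = sum((x - mean) ** 4 * p for x, p in enumerate(probs)) / var ** 2 - 3
    return mean, var, skew, excess_kurt

mean, var, skew, kurt = poisson_moments(3.0)
print(mean, var, skew, kurt)
```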
Excel Function: Excel provides the following function for the Poisson distribution:
POISSON(x, μ, cum) where μ = the mean of the distribution and cum takes the values TRUE and FALSE
POISSON(x, μ, FALSE) = probability density function value f(x) at the value x for the Poisson distribution with mean μ.
POISSON(x, μ, TRUE) = cumulative probability distribution function F(x) at the value x for the Poisson distribution with mean μ.
Excel 2010/2013/2016 provide the additional function POISSON.DIST which is equivalent to POISSON.
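For readers working outside Excel, a minimal Python sketch of the same two behaviours is shown below. The function name `poisson` and its argument order are chosen here only to mirror the worksheet function; this is not Excel's implementation.

```python
import math

def poisson(x: int, mu: float, cum: bool) -> float:
    """Mimics POISSON(x, mu, cum): pdf value if cum is False, cdf if True.

    Assumes x is a non-negative integer and mu > 0.
    """
    def pmf(k: int) -> float:
        return math.exp(-mu + k * math.log(mu) - math.lgamma(k + 1))

    if cum:
        # F(x) = sum of f(k) for k = 0..x
        return sum(pmf(k) for k in range(x + 1))
    return pmf(x)

print(poisson(3, 2.0, False))   # f(3) for mean 2
print(poisson(3, 2.0, True))    # F(3) for mean 2
```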
Real Statistics Function: Excel doesn’t provide a worksheet function for the inverse of the Poisson distribution. Instead you can use the following function provided by the Real Statistics Resource Pack.
POISSON_INV(p, μ) = smallest integer x such that POISSON(x, μ, TRUE) ≥ p
Note that the maximum value of x is 1,024,000,000. A value higher than this indicates an error.
Observation: If the average number of occurrences of a particular event in an hour (or some other unit of time) is μ and the arrival times are random without any tendency to bunch up (i.e. the assumptions for what is called a Poisson process), then the probability of x events occurring in an hour is given by

$$f(x) = \frac{e^{-\mu}\mu^{x}}{x!}$$
Example 1: A large department store sells on average 100 MP3 players a week. Assuming that purchases are as described in the above observation, what is the probability that the store will have to turn away potential buyers before the end of the week if it stocks 120 players? How many MP3 players should the store stock in order to have a 99% probability of being able to supply a week's demand?
The probability that they will sell ≤ 120 MP3 players in a week is
POISSON(120, 100, TRUE) = 0.977331
Thus, the answer to the first problem is 1 – 0.977331 = 0.022669, or about 2.3%. We can answer the second question by using successive approximations until we arrive at the smallest x whose cumulative probability is at least 0.99. E.g. we could try x = 130, which is higher than 120. The cumulative Poisson is 0.998293, which is too high. We then pick x = 125 (halfway between 120 and 130). This yields 0.993202, which is a little too high, and so we try 123. This yields 0.988756, which is a little too low, and so we finally arrive at 124, which has a cumulative Poisson distribution of 0.991226.
Users of Excel 2010/2013/2016 can arrive at the same answer (124) by using the Real Statistics formula =POISSON_INV(0.99,100).
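Example 1 can also be reproduced with a short Python sketch. The function `poisson_inv` below is a hypothetical stand-in that follows the stated definition of POISSON_INV (smallest integer x whose cumulative probability is at least p) using a simple upward scan; it is not the Real Statistics implementation and makes no attempt to match its limits or error handling.

```python
import math

def poisson_pmf(x: int, mu: float) -> float:
    """P(X = x) for a Poisson distribution with mean mu (assumes mu > 0)."""
    return math.exp(-mu + x * math.log(mu) - math.lgamma(x + 1))

def poisson_cdf(x: int, mu: float) -> float:
    """P(X <= x) by direct summation of the pdf."""
    return sum(poisson_pmf(k, mu) for k in range(x + 1))

def poisson_inv(p: float, mu: float) -> int:
    """Smallest integer x with P(X <= x) >= p, found by scanning upward."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    x = 0
    cdf = poisson_pmf(0, mu)
    while cdf < p:
        x += 1
        term = poisson_pmf(x, mu)
        if term == 0.0 and x > mu:
            break  # floating-point precision exhausted; stop rather than loop forever
        cdf += term
    return x

# Example 1: mean demand of 100 players per week.
p_short = 1.0 - poisson_cdf(120, 100.0)   # chance that demand exceeds a stock of 120
stock = poisson_inv(0.99, 100.0)          # stock needed for 99% coverage
print(round(p_short, 6), stock)
```

The scan is linear in x, which is fine for means of this size; a bisection over x would be a natural alternative for very large means.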
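The 1–α confidence interval for the mean based on x events occurring (in a unit of time) is given by

$$\frac{\chi^2_{\alpha/2,\,2x}}{2} \;\le\; \mu \;\le\; \frac{\chi^2_{1-\alpha/2,\,2(x+1)}}{2}$$

where $\chi^2_{p,df}$ = CHISQ.INV(p, df). For Excel 2007, $\chi^2_{p,df}$ = CHIINV(1−p, df).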
Example 2: Suppose the number of radioactive particles that hit a screen per second follows a Poisson process, and that 5 hits occurred in one second. Find the 95% confidence interval for the mean number of hits per second.
Figure 2 shows the confidence intervals for various values of x and α.
Figure 2 – Confidence intervals for the Poisson mean
The requested confidence interval is
1.623486 ≤ μ ≤ 11.66833
as calculated by the formulas in cells C9 and D9:
Note that CHISQ.INV(p,0) returns the error value #NUM! for any value of p, and so we cannot use this formula to calculate the lower bound when x = 0 (cell C4). In any case, this value is zero.
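The interval in Example 2 can be cross-checked without a chi-square function. The sketch below relies on a standard identity between the chi-square and Poisson distributions (an equivalent formulation assumed here, not the worksheet's method): the lower bound is the mean at which P(X ≥ x) = α/2 and the upper bound is the mean at which P(X ≤ x) = α/2. Both are found by bisection, and the function names are hypothetical.

```python
import math

def poisson_cdf(k: int, mu: float) -> float:
    """P(X <= k) for a Poisson distribution with mean mu."""
    if mu <= 0.0:
        return 1.0  # all probability mass sits at zero
    return sum(math.exp(-mu + i * math.log(mu) - math.lgamma(i + 1))
               for i in range(k + 1))

def solve_mu(k: int, target: float) -> float:
    """Find mu with P(X <= k | mu) = target; the cdf decreases as mu grows."""
    lo, hi = 0.0, float(k + 1)
    while poisson_cdf(k, hi) > target:   # expand until the root is bracketed
        hi *= 2.0
    for _ in range(200):                 # plain bisection
        mid = (lo + hi) / 2.0
        if poisson_cdf(k, mid) > target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def poisson_mean_ci(x: int, alpha: float = 0.05):
    """Two-sided 1 - alpha confidence interval for the mean given x events."""
    lower = 0.0 if x == 0 else solve_mu(x - 1, 1.0 - alpha / 2.0)
    upper = solve_mu(x, alpha / 2.0)
    return lower, upper

print(poisson_mean_ci(5, 0.05))   # Example 2: 5 hits, 95% interval
```

Treating x = 0 as a special case with a lower bound of zero matches the note above about cell C4.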
Relationship with Binomial and Normal Distributions
Theorem 1: If the probability p of success of a single trial approaches 0 while the number of trials n approaches infinity and the value μ = np stays fixed, then the binomial distribution B(n, p) approaches the Poisson distribution with mean μ.
Observation: Based on Theorem 1 the Poisson distribution can be used to estimate the binomial distribution when n ≥ 50 and p ≤ .01, preferably with np ≤ 5.
Example 3: A company produces high precision bolts so that the probability of a defect is .05%. In a sample of 4,000 units what is the probability of having more than 3 defects?
We can solve this problem using the distribution B(4000, .0005), namely the desired probability is
1 – BINOMDIST(3, 4000, .0005, TRUE) = 1 – 0.857169 = 0.142831
We can also use the Poisson approximation as follows:
μ = np = 4000(.0005) = 2
1 – POISSON(3, 2, TRUE) = 1 – 0.857123 = 0.142877
As you can see the approximation is quite accurate.
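The comparison in Example 3 can be reproduced with the following sketch, which computes the exact binomial tail and its Poisson approximation side by side. The helper names are hypothetical, and `math.comb` requires Python 3.8 or later.

```python
import math

def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for a binomial distribution B(n, p)."""
    return sum(math.comb(n, i) * p ** i * (1.0 - p) ** (n - i)
               for i in range(k + 1))

def poisson_cdf(k: int, mu: float) -> float:
    """P(X <= k) for a Poisson distribution with mean mu (assumes mu > 0)."""
    return sum(math.exp(-mu + i * math.log(mu) - math.lgamma(i + 1))
               for i in range(k + 1))

n, p = 4000, 0.0005
exact = 1.0 - binom_cdf(3, n, p)          # P(more than 3 defects), binomial
approx = 1.0 - poisson_cdf(3, n * p)      # same tail with mu = np = 2
print(round(exact, 6), round(approx, 6), abs(exact - approx))
```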
Observation: The Poisson distribution can be approximated by the normal distribution, as shown in the following theorem.
Theorem 2: For μ sufficiently large (usually μ ≥ 20), if x has a Poisson distribution with mean μ, then x is approximately normally distributed with mean μ and standard deviation √μ, i.e. x ~ N(μ, √μ).
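To get a feel for the quality of the normal approximation, the sketch below compares the exact cumulative Poisson probability with a normal distribution of mean μ and standard deviation √μ, using μ = 100 and x = 120. The half-unit continuity correction in the second estimate is an assumption added here; it is a common refinement and is not part of the theorem as stated.

```python
import math

def poisson_cdf(k: int, mu: float) -> float:
    """P(X <= k) for a Poisson distribution with mean mu (assumes mu > 0)."""
    return sum(math.exp(-mu + i * math.log(mu) - math.lgamma(i + 1))
               for i in range(k + 1))

def normal_cdf(x: float, mean: float, sd: float) -> float:
    """Cumulative normal distribution via the error function."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

mu = 100.0
exact = poisson_cdf(120, mu)
plain = normal_cdf(120.0, mu, math.sqrt(mu))        # no continuity correction
corrected = normal_cdf(120.5, mu, math.sqrt(mu))    # with continuity correction (assumption)
print(exact, plain, corrected)
```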
Test for a Poisson Distribution
The index of dispersion of a data set or distribution is the variance divided by the mean.
Since the mean and variance of a Poisson distribution are equal, data that conforms to a Poisson distribution must have an index of dispersion approximately equal to 1. This fact can be used to test whether a data set has a Poisson distribution, as described in Goodness of Fit.
In fact in Goodness of Fit, we also show how to use the chi-square goodness-of-fit test to determine whether a data set follows a Poisson distribution.
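As a quick first check before a formal test, the index of dispersion of a sample can be computed directly. The sketch below takes the index as the sample variance divided by the sample mean (the usual definition). The function name and the counts are made up purely for illustration, and a value near 1 is only suggestive of a Poisson distribution, not a substitute for the goodness-of-fit test.

```python
import statistics

def index_of_dispersion(data) -> float:
    """Sample variance divided by sample mean (assumes a non-zero mean)."""
    return statistics.variance(data) / statistics.mean(data)

# Hypothetical event counts per unit of time, for illustration only.
counts = [2, 4, 3, 1, 5, 3, 2, 4, 3, 3]
print(index_of_dispersion(counts))
```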