Poisson Distribution

Basic Concepts

Definition 1: The Poisson distribution has a probability distribution function (pdf) given by

The parameter μ is often replaced by λ. A chart of the pdf of the Poisson distribution for λ = 3 is shown in Figure 1.

Figure 1 – Poisson Distribution

Observation: Some key statistical properties of the Poisson distribution are:

• Mean = µ
• Variance = µ
• Skewness = 1 /$\! \sqrt{\mu}$
• Kurtosis = 1/µ

Excel Function: Excel provides the following function for the Poisson distribution:

POISSON(x, μ, cum) where μ = the mean of the distribution and cum takes the values TRUE and FALSE

POISSON(x, μ, FALSE) = probability density function value f(x) at the value x for the Poisson distribution with mean μ.

POISSON(x, μ, TRUE) = cumulative probability distribution function F(x) at the value x for the Poisson distribution with mean μ.

Excel 2010/2013/2016 provide the additional function POISSON.DIST which is equivalent to POISSON.

Real Statistics Function: Excel doesn’t provide a worksheet function for the inverse of the Poisson distribution. Instead you can use the following function provided by the Real Statistics Resource Pack.

POISSON_INV(p, μ) = smallest integer x such that POISSON(x, μ, TRUE) ≥ p

Note that the maximum value of x is 1,024,000,000. A value higher than this indicates an error.

Poisson Process

If the average number of occurrences of a particular event in an hour (or some other unit of time) is μ and the arrival times are random without any tendency to bunch up (i.e. the assumptions for what is called a Poisson process) then the probability of x events occurring in an hour is given by

Example 1: A large department store sells on average 100 MP3 players a week. Assuming that purchases are as described in the above observation, what is the probability that the store will have to turn away potential buyers before the end if they stock 120 players? How many MP3 players should the store stock in order to make sure that it has a 99% probability of being able to supply a week’s demand?

The probability that they will sell ≤ 120 MP3 players in a week is

POISSON(120, 100, TRUE) = 0.977331

Thus, the answer to the first problem is 1 – 0.977331 = 0.022669, or about 2.3%. We can answer the second question by using successive approximations until we arrive at the correct answer. E.g. we could try x = 130, which is higher than 120. The cumulative Poisson is 0.998293, which is too high. We then pick x = 125 (halfway between 120 and 130). This yields 0.993202, which is a little too high, and so we try 123. This yields 0.988756, which a little too low, and so we finally arrive at 124, which has cumulative Poisson distribution of 0.991226.

Alternatively, you can arrive at the same answer (124) by using the Real Statistics formula =POISSON_INV(0.99,100).

Confidence Intervals

The 1–α confidence interval for the mean based on x events occurring (in a unit of time) is given by

where

For Excel 2007, χ2p,df  = CHIINV(1−p,df).

Example 2: Suppose the number of radioactive particles that hits a screen per second follows a Poisson process and suppose that 5 hits occurred in one second, find the 95% confidence interval for the mean number of hits per second.

Figure 2 shows the confidence intervals for various values of x and α.

Figure 2 – Confidence intervals for the Poisson mean

The requested confidence interval is

1.623486 ≤ μ ≤ 11.66833

as calculated by the formulas in cells C9 and D9:

=CHISQ.INV(B9/2,2*A9)/2

=CHISQ.INV.RT(B9/2,2*(A9+1))/2

Note that CHISQ.INV(p,0) = #NUM! for any value of p, and so we cannot use this formula to calculate the lower bound when x = 0 (cell C4). In any case, this value is zero.

Relationship with Binomial and Normal Distributions

Theorem 1: If the probability p of success of a single trial approaches 0 while the number of trials n approaches infinity and the value μ = np stays fixed, then the binomial distribution B(n, p) approaches the Poisson distribution with mean μ.

Observation: Based on Theorem 1 the Poisson distribution can be used to estimate the binomial distribution when n ≥ 50 and p ≤ .01, preferably with np ≤ 5.

Example 3: A company produces high precision bolts so that the probability of a defect is .05%. In a sample of 4,000 units what is the probability of having more than 3 defects?

We can solve this problem using the distribution B(4000, .0005), namely the desired probability is

1 – BINOMDIST(3, 4000, .0005, TRUE) = 1 – 0.857169 = 0.142831

We can also use the Poisson approximation as follows:

μ = np = 4000(.0005) = 2

1 – POISSON(3, 2, TRUE) = 1 – 0.857123 = 0.142877

As you can see the approximation is quite accurate.

Observation: The Poisson distribution can be approximated by the normal distribution, as shown in the following theorem.

Theorem 2: For n sufficiently large (usually n ≥ 20), if x has a Poisson distribution with mean μ, then x ~ N(μ, $\sqrt \mu$).

Test for a Poisson Distribution

The index of dispersion of a data set or distribution is the variance divided by the mean.

Since the mean and variance of a Poisson distribution are equal, data that conforms to a Poisson distribution must have an index of dispersion approximately equal to 1. This fact can be used to test whether a data set has a Poisson distribution, as described in Goodness of Fit.

In fact in Goodness of Fit, we also show how to use the chi-square goodness-of-fit test to determine whether a data set follows a Poisson distribution.

60 Responses to Poisson Distribution

1. Colin says:

Sir
Example 2 Question 1: “What is the probability that the store will run out of MP3 players in a week if they stock 120 players? ”

You wrote: “The probability that they will sell ≤ 120 MP3 players in a week is
POISSON(120, 100, TRUE) = 0.977331 ”

So the probability of sell <= 120 is 0.977331, so the demand exceeds the inventory stock which is 120 pics (ran out of stock) is 1-0.977331.

I think the probability the store ran out of stock is 1-0.977331.

Am I wrong?

• Charles says:

Hi Colin,
You have a sharp eye. Thanks for pointing out the mistake. I have now made the appropriate correction to the webpage.
Charles

2. Greg says:

http://www.real-statistics.com/binomial-and-related-distributions/poisson-distribution/

Typo: the setup for the question stipulates a 0.05% probability of occurrence and the calculated result agrees with that. However, the formula as shown below appears to use 0.06% [.0006] rather than 0.05%. Just a typo; the result is based on the .05% value.
1 – BINOMDIST(3, 4000, .0006, TRUE) = 1 – 0.857169 = 0.142831

• Charles says:

Greg,
Yes, you are correct. Thanks for catching this typo. I have now corrected it on the referenced webpage.
Charles

3. Pingback: Phil 3.20.15 | viztales

4. Hossen Bayezid says:

The average number of signals sent from a station is 3 per day which do not reach properly to the another station . Find the probability that the signals which are sent in a day but not reached properly is at least 3

• Charles says:

1-POISSON(2,3,TRUE) assuming a Poisson distribution.
Charles

5. Mariano says:

Excellent resource!

I have used it to sucessfully simulate a poisson process (counts of a radioactive material) and has helped me to make a worksheet to obtain detection tresholds and minimun detectable activity (they are dependant of sigma) for the contaminations sensors i use.

Keep the good work!

6. Alice says:

Hi Charles,

I wonder if it would be correct to use poisson distribution to determine safety stock for a product when the product has the following demand pattern for the last 12 months:

3 580
0
0
1 135
1 000
363
2 175
72
620
92
228
373

The reason behind this pattern is that one of the customers is buying given product in batches.

7. Israel says:

Hi Charles,

Many thanks for the explanation, i’m trying to find a way to apply this to sports betting using Excel, i’ve managed to locate a Poisson template and i’m wondering if there are any know ways of viewing football prediction results

• Charles says:

I don’t know any way of applying this to sports betting, but perhaps someone else in the community can help.
Charles

8. Anson says:

When n (from 5 to 10 and 20)increases what happens on the probability distribution graph?(binomial, poisson and normal)

• Charles says:

It really depends on what happens with the other parameters.

Binomial: If the other parameters in the BINOMDIST function are held constant then the cumulative distribution values decrease (e.g. compare BINOMDIST(4,n,.7,TRUE) for n = 5, 10, 20.

Poisson: If you assume that the mean of the distribution = np, then the cumulative distribution values decrease (e.g. compare POISSON(2,np,TRUE) where p = .5 for n = 5, 10, 20.

Normal: It really depends on how you are going to use n since NORMDIST doesn’t directly use n.

Charles

9. bitslayer says:

and each signup lasts for exactly 30 days
what is the probability that I will have at least a hundred people signed up at once?

• Charles says:

Assuming that you are looking for the probability that at least 100 people are signed up in any specific 30 day period, then the result should be =1-POISSON.DIST(99,60,TRUE), which has a value of 1.48E-06, a very small number.
Charles

10. HM says:

Charles, May I ask you a question about Poisson function on excel ? what would be the best way to ask you?

• Charles says:

You can ask as a Comment.
Charles

11. Able says:

Say a user can check out a maximum of 2000 licenses at a time. Multiple users can check out one or more licenses each hour for a product but not more than 2000. The product owner has log that shows data for each hour in the year (365 days). For each hour, the log contains the highest number (quantity) of licenses checked out in that hour. Using this log of every hour over the past year, how would you determine for the next year if 2000 licenses is enough based on this data using a Poisson distribution function in excel? What would be the average be? Note that in many cases, the licenses checked out one hour is the same licenses check out by the same person in a previous hour(the license continues to be check out by the same user over multiple hours or days). Do you believe this is a Poisson Distribution? Also, how would you calculate the mean?

• Charles says:

Able,

Please explain the premise more clearly. If a user checks out 2,000 in one hour, can he/she check out another 2,000 licenses in the next hour? (i.e. are these like checking out books in a library?) If you have say 10 users can they each check out 2,000 licenses in an hour (for a total of 20,000 licenses) or are they restricted to 2,000 licenses in aggregate?

What criterion are you using for determining whether “2000 licenses is enough based on this data using a Poisson distribution function in excel?” Do you mean that (1) in no hour is more than 2,000 licenses checked out or (2) the probability of more than 2,000 licenses being checked out is zero or (3) something else?

Charles

• Able says:

Charles,

1)If a user checks out 2000 licences one hour, in the next hour either their will be 0 licenses checkout out or the same 2000 licenses that were checked out in the previous hour(s). It all depends if a users software job has ended or not and the user does not require any more licenses.

2)If you have 10 users they are they restricted to 2,000 licenses in aggregate per hour.

3)I am looking for probability of more than 2,000 licenses being checked out is zero. I am also looking for the lowest license count where the probability of more than x number licenses being in use is 0.

• Charles says:

It appears that you are assuming an “arrival time” (i.e. checking out of licenses) that follows a Poisson distribution, but you haven’t said anything about the “service time” (i.e. for how long the license is checked out). If we assume, for example that the arrival time follows a Poisson distribution with mean m and a service time whose average is say n, then we have a typical queueing model.

Is this is what you are interested in?

Charles

• Able says:

Charles,

I do not have information regarding the service time, just the arrival time. Can Poisson or perhaps another statistical method be used to determine the probability of more than 2,000 licenses being checked out is zero with just the arrival time information?

12. Able says:

If a cumulative probability distribution function is the probability that a random variable will have a value less than or equal to X, what would be the function name for the opposite(he probability that a random variable will have a value grater than or equal to X, or even just greater than X)?

• Charles says:

It depends on the context. For example, in survival analysis this is called the survival function.
Charles

13. Mayukh Sarkar says:

Hi Charles,

I am doing a project on diagnostic errors in medicine. I want to show the number of errors per month over a 6 month period. Overall, for 200 events I have about 150 with errors and 50 without errors. How do I use poisson distribution in this case? I do know in each month how many errors occur. Do we need this data? I am using Excel, so what will be the formula?

• Charles says:

Mayukh,
You know that the probability of an error is 3/4. Now you need to know how many events occur in a month, and from this it is easy to calculate the average number of errors per month. If however you want to know the probability that you will have say m errors in the month (instead of the average number), then you can use the approach described on the referenced webpage.
Charles

14. Akhil Dayalu says:

Sir, I have 12 data sets how to check distribution on these data sets?

15. Anti says:

Hi.

How can I calculate this:

A fire brigade A is on average called out 5 times a day. 1 call takes on 1hour for the brigade. a) What is the probability that the men can rest at least 2 hours between the calls. b) What is the probability that the brigade B has to be sent out because the brigade A is still on its duty.

The answes should be a) 0.535 and b) 0.188

I would be really glad if you could help me…

• Charles says:

Sorry Anti, but this looks like a school assignment and I don’t want to answer such questions. I am happy to clarify concepts, but not provide such specific answers.
Charles

16. Zarish Saleem says:

Hy Charles..
I have to make an assignment on “fitting a poison distribution to observed data”
So tell me how many numbres of data will be suitable. ???
I shall be
very thankful to you.

• Charles says:

Zarish,
Sorry, but I don’t know how many numbers is suitable. In fact, first you need to define what “suitable” means.
Perhaps someone else has some insight into this issue.
Charles

17. Ed says:

I came across an article on the number of accidents per # man hours worked.
The article did not detail how the adjusted accident was derived. Neither did it specify how the UCL, Avg and LCL was derived based on the adjusted accident.
Need help to determine adjusted accident, UCL, Avg and LCL.

# man hours Accidents Adjusted accident
287287 4 3.843
270828 8 8.228
298531 4 3.630
271726 4 4.149
280771 7 6.961
282547 9 8.905
226415 18 20.685
226410 2 2.917
253482 2 2.429
270376 4 4.176
283216 15 14.843
278348 5 5.018
296736 5 4.633
308166 19 17.731
298507 9 8.466
295858 5 4.651
307692 5 4.414
273973 4 4.104
293849 4 3.718

Analysis was performed using adjusted accident with UCL=16.794, Avg=7.000, LCL=1.206

• Charles says:

Ed,
Without reading the article, I don’t know what is intended by these terms.
Charles

• Ed says:

Charles,

Thanks.
Ed

• Ed says:

Charles,
I still can’t figure out how the author “uses standard statistical theory to adjust the number of accidents based on the variation of the man-hours worked from month to month.”
Any help is much appreciated.

Thanks.

Ed

• Charles says:

Sorry Ed, but I haven’t had time to look into to this further. Have you been able to figure out what the author did?
Charles

18. Avi says:

Hi Charles,

Please help me with this problem as i do not understand how the standard deviation affects the processing time.
An analyst in the statistics office of a government department in Wellington
requests for reports. On average the office receives 19 requests per (40 hour) week, with the arrival pattern resembling a Poisson process. Once working on a request the processing time averages 2.0 hours with a standard deviation of 1.0 hours. After the request is processed the report is automatically emailed to the requester.

(a) What is the utilisation of the analyst?
(b) What is the average time a request waits before processing begins (Tq)?

Are my answers correct for (a) 0.2375 and (b) 0.6557?

• Charles says:

Avi,
Charles

19. Eugene says:

Dear Charles,
You state on this page that “The index of dispersion of a data set or distribution is the mean divided by the variance.” Isn’t the definition of the index of dispersion “the variance divided by the mean”?
Eugene.

• Charles says:

Eugene,
Yes, you are correct. I have just changed the webpage to correct this error.
Thanks for your help in making the website more accurate.
Charles

20. Lena says:

Hi Charles,

If my question is
A sample of 32 hotels show that the mean of numbers of guests is 55, and I would like to find a P(100<=x<=150), is it correct my formula is =POISSON(150,55,TRUE)-POISSON(99,55,TRUE) ?

Hope you can assist me in this matter.
Thanks a lot.

• Charles says:

Lena,
Assuming the data follows a Poisson distribution, then the formula you have given is correct.
Charles

• Lena says:

Do I need to check whether the data is poisson distribution?
Or I need to testing the normal distribution using QQ plot…and etc?

Thanks Charles.

• Charles says:

Lena,
1. If you know that the data should follow a Poisson distribution on theoretical grounds (e.g. there is a Poisson process), then you should be ok. One quick check to see whether data follows a Poisson process is to see whether the mean is roughly equal to the variance (as described on the website).
2. If you believe the data follows a Poisson distribution, then there is no reason to test for a normal distribution
Charles

• Lena says:

1. Does it possible that the data set simultaneously follow a Poisson distribution and normal distribution?

2. If it follow a Poisson distribution, how to I find the P(X=O) ? When mean= 0.36667, SD=0.4901.

Hope you can assist me on this, thanks Charles.

• Charles says:

Lena,
1. No
2. I am not sure what you mean by P(X=O), but you need to use the POISSON function.
Charles

• Lena says:

As Poisson distribution is approximately normal, so can I use normal distribution as well in calculate the probability of discrete random variable?

• Charles says:

Lena,
For n sufficiently large (usually n ≥ 20), if x has a Poisson distribution with mean μ, then x has an approximately normal distribution with mean μ and variance μ.
If n is sufficiently large, then, yes, you can use the normal distribution.
Charles

21. Dickson Voon says:

Hi Charles

Need help Poisson Distributiin.

Recently my team assignment selected Hotel beds unoccupied with 50 counts and our expected number=mean= lambda. However, my lambda = 34.36 that unable fit me to do the calculation.

My lecturer told to reduce lambda. My I know how to reduce the lambda for average of 34.36 for beds.

I found Lambda only able calculate not more than 10.

I hope to heard from you.

Thank you

• Charles says:

Dickson,
Sorry, but without knowing more about your scenario I don’t know how to reduce lambda.
Charles

• Romanee says:

Hi Charles, by using excel to calculate the probability. Wat is =1-poisson.dist(X,Mean,True/False) VS =poisson.dist(,Mean,True/False)? When to use 1-Poisson?

• Charles says:

Romanee,
If F(a) = the probably that x < = a, then 1 - F(a) = the probability that x > a.
Thus, =poisson.dist(X,Mean,True) is the probability that the Poisson distribution is less than or equal to X and
=1-poisson.dist(X,Mean,True) is the probability that the Poisson distribution is greater than X.
Note that this is not true if True is replaced by False.
Charles

22. A company publishes statistics concerning car quality. The initial quality score measures the number of problems per new car sold. For one​ year, Car A had 1.451.45 problems per car. Let the random variable X be equal to the number of problems with a newly purchased model A car. Complete​ (a) and​ (b) below.

A) If you purchased a model A​ car, what is the probability that the new car will have zero​ problems?
B) If you purchased a model A​ car, what is the probability that the new car more than 3 or fewer problems?

• Charles says:

Hasten,
Assuming that the assumptions for the Poisson distribution are satisfied, these sorts of problems are described on the referenced webpage. Here your mean is 1451.45.
Charles

23. Ainul says:

I want to determine poisson distribution with kolmogorov smirnov with excell and R. How to calculate manual of poisson distribution in excell by using kolmogorov-smirnov? And if i use software R : ks.test(data,ecdf(data)). Is it true?