# Negative Binomial and Geometric Distributions

### Negative Binomial Distribution

Definition 1: Under the same assumptions as for the binomial distribution, let x be a discrete random variable. The probability density function (pdf) for the negative binomial distribution is the probability of getting x failures before k successes where p = the probability of success on any single trial. Thus the pdf is

f(x) = C(x+k−1, x)pk(1−p)x

Excel Functions: Excel provides the following function regarding the negative binomial distribution:

NEGBINOMDIST(x, k, p) = the probability of getting x failures before y successes where p = the probability of success on any single trial; i.e. the pdf of the negative binomial distribution.

Excel 2010/2013 provide the following additional function: NEGBINOM.DIST(x, k, p, cum) where cum takes the values TRUE or FALSE. In particular, NEGBINOM.DIST(x, k, p, FALSE) = NEGBINOMDIST(x, k, p), while NEGBINOM.DIST(x, k, p, TRUE) = the probability of getting at most x failures before k successes, where p = the probability of success on any single trial; i.e. the cumulative probability function.

Real Statistics Function: Excel doesn’t provide a worksheet function for the inverse of the negative binomial distribution. Instead you can use the following function provided by the Real Statistics Resource Pack.

NEGBINOM_INV(p, k, pp) = smallest integer x such that NEGBINOM.DIST(x, k, pp, TRUE) ≥ p.

Note that the maximum value of x is 1,024,000,000. A value higher than this indicates an error. This function is only available for users of Excel 2010 or later.

Key statistical properties of the negative binomial distribution are:

• Mean = k(1 – p) ⁄ p
• Variance = k(1 – p) ⁄ p2
• Skewness = (2 – p) ⁄ $\!\sqrt{k(1-p)}$
• Kurtosis = (p2 – 6p + 6) ⁄ [k(1 – p)]

### Geometric Distribution

The geometric distribution is a special case of the negative binomial distribution, where k = 1. The pdf is

The cumulative distribution function (cdf) of the geometric distribution is

The pdf represents the probability of getting x failures before the first success, while the cdf represents the probability of getting at most x failures before the first success.

Observation: The geometric distribution is memoryless, which means that if you intend to repeat an experiment until the first success, then, given that the first success has not yet occurred, the conditional probability distribution of the number of additional trials required until the first success does not depend on how many failures have already occurred. The die one throws or the coin one tosses does not have a “memory” of any previous successes or failures. The geometric distribution is in fact the only memoryless discrete distribution that we will study.

Other key statistical properties of the geometric distribution are:

• Mean = (1 – p) ⁄ p
• Mode = 0
• Range = [0, ∞)
• Variance = (1 – p) ⁄ p2
• Skewness = (2 – p) ⁄ $\!\sqrt{1-p}$
• Kurtosis = 6 + p2 ⁄ (1 – p)

### 18 Responses to Negative Binomial and Geometric Distributions

1. MB says:

I’m using the NEGBINOM_INV(p, k, pp) function but I keep getting an error.

“Compile error in hidden module: Misc”

I tried the following NEGBINOM_INV(0.5, 2, 0.25)

I’m assuming that pp = p^2.

• MB says:

I figured out the “Compile error in hidden module: Misc” error. Apparently, this doesn’t work on the Mac version of Excel. It worked fine on my Excel 365 version.

I am still unclear about the “pp” in the argument. Is it p^2?

• Charles says:

MB,
pp is just the name of a variable. It is not p^2.
Charles

• MB says:

Charles,

What variable does pp denote?

MB

• Charles says:

MB,

The webpage states that

NEGBINOM_INV(p, k, pp) = smallest integer x such that NEGBINOM.DIST(x, k, pp, TRUE) ≥ p.

Perhaps, it would have been clearer if I had written this as

NEGBINOM_INV(alpha, k, p) = smallest integer x such that NEGBINOM.DIST(x, k, p, TRUE) ≥ alpha.

In any case, pp is the probability of success on any one trial (just like the p in the formula for BIONOM.DIST(x,n,p,cum)).

Charles

• Charles says:

MB,
NEGBINOM is not supported in Excel 2007 or the Mac version of Excel. This may be why you got the error message. pp is not equal to p^2.
See my response to your later comment.
Charles

2. Gami Nasir says:

“although I may indeed add a donation request sometime in the future so that I can recover some of my costs.”

When you do, please let me know at gami.nasir@gmail.com

Keep up the good work

Thanks

Gami

• Charles says:

Gami,
Thanks for your support. I hope to decide soon, but something keeps coming up that slows down the process.
Charles

3. Angy says:

Hello Dr. Charles,

I wanted to know if there was a way to calculate 95% confidence interval from data points that follow a negative binomial distribution.

I know that it is possible to get 95% CIs using the Poisson distribution in excel using CHIINV, see this link: http://www.nwph.net/Method_Docs/User%20Guide.pdf

I would greatly appreciate any suggestions,
Angy

• Charles says:

Angy,

In general you can calculate the 95% confidence for the mean as lower bound = mean – critical value at .05 x s.e. and upper bound = mean + critical value at .95 x s.e.

I have just updated the referenced webpage to give a formula for the variance of the negative binomial distribution. The webpage already describes a Real Statistics function NEGBINOM_INV which can be used to calculate the critical values. This is probably enough to calculate the confidence interval.

There are many descriptions on the web for calculating approximate intervals using the Poisson or normal distributions. The following webpage may be of help in using a normal approximation or calculating an exact value.

http://uu.diva-portal.org/smash/get/diva2:532980/FULLTEXT01.pdf

Charles

4. Gami Nasir says:

Hello Dr. Charles,

Is there a function in excel 2010 for the “Geometric Distribution”?

Thanks

PS: Why don’t you include a donation bottom at the end of each page?
People like me would be very happy to donate to the great work you are doing

• Charles says:

Hello Gami,

There is no explicit geometric distribution function. Instead you need to use the formula =NEGBINOM.DIST(x,1,p,cum).

I appreciate your support. I haven’t tried to raise any money from the site, although I may indeed add a donation request sometime in the future so that I can recover some of my costs.

Charles

5. Pierre McKenty says:

If NEGBINOM.DIST(x, y, p, TRUE) = the p(y) of “at most” x failures before a y success;
can I use 1-negbinom.dist(x, y, p, true) = for the p(y) of “at least” x failures before a y success ?

• Charles says:

Pierre,
1-negbinom.dist(x, y, p, true) = the probability of at least x+1 failures before y successes.
Charles

6. Jitendra kumar says:

Can you suggest me a real application of nagetive binomial distribution in reliability and survival analysis? Which are used as a life time model in reliability analysis.

• Charles says:

With a constant failure rate and a defined number of eliminations (i.e. deaths), the expected survival rate follows the negative binomial distribution. There are lots of examples of this in healthcare, econometrics, etc. You can look at the Survival Analysis webpages on the website for some examples.

Charles

7. Dr. Hassan Okasha says: