**Definition 1**: Given an experiment with the following characteristics:

- the experiment consists of
*n*independent trials, each with*k*mutually exclusive outcomes*E*_{i} - for each trial the probability of outcome
*E*is_{i}*p*_{i}

Let *x _{1 }…, x_{k}* be discrete random variables whose values are the number of times outcome

*E*occurs in

_{i}*n*trials. Then the probability distribution function for

*x*is called the

_{1 }…, x_{k}**multinomial distribution**and is defined as follows:

The case where *k* = 2 is equivalent to the binomial distribution.

**Example 1**: Suppose that a bag contains 8 balls: 3 red, 1 green and 4 blue. You reach in the bag pull out a ball at random and then put the ball back in the bag and pull out another ball. This experiment is repeated a total of 10 times. What is the probability that the outcome will result in exactly 4 reds and 6 blues?

The possible outcomes for each trial in this experiment are *E _{1}* = a red ball is drawn,

*E*= a green ball is drawn and

_{2}*E*= a blue ball is drawn. Thus

_{3}*p*= 3/8,

_{1}*p*= 1/8 and

_{2}*p*= 4/8,

_{3}*x*= 4,

_{1}*x*= 0 and

_{2}*x*= 6.

_{3}**Excel Function**: While Excel does not provide a function for the multinomial distribution, it does provide the following function:

**MULTINOMIAL**(*x _{1 }…, x_{k}*) =

*n*! / (

*x*!∙…∙

_{1}*x*!)

_{k}Thus we could also calculate the answer to Example 9.10 by using the formula

MULTINOMIAL(4,0,6)*(3/8)^4*(1/8)^0*(4/8)^6 = .064888

We can also use a range as the argument of MULTINOMIAL as in Figure 1.

**Figure 1 – Multinomial distribution**

We can use the following Excel array formula to calculate the same result

=PRODUCT(B9,B6:B8^B3:B5)

Alternatively, we can use the following more complicated non-array formula

=B9*EXP(SUMPRODUCT(B3:B5,LN(B6:B8)))

**Real Statistics Excel Function**: The following supplemental function in the Real Statistics Resource Pack can be used to calculate the multinomial distribution.

**MULTINOMDIST**(R1, R2) = the value of the multinomial pdf where R1 is a range containing the values *x _{1, }…, x_{k}* and R2 is a range containing the values

*p*

_{1, }…, p_{k}Referring to Figure 1, we have MULTINOMDIST(B3:B5,B6:B8) = 0.064888.

So. i get the probability from the stated calculations… how do i get the confidence intervals for each proportion?Is it somewhere on the real statistics site?

The site contains lots of examples of how to calculate confidence interval. Please be more specific about the confidence interval that you are looking for. Is it for a proportion distribution? If so, see the webpage Proportion Distribution.

Charles

Hi Charles,

I found the answer to my question already. No need to spend time on it.

Thanks

Hi,

I’m interested in the answer.

What would be the formula for Confidence Interval in this case ?

Thanks

Hi,

If you have a proportional multinomial distribution with probabilities p1, p2, …, pk for mutually exclusive events E1, E2, …, Ek. Then for any pi you can look at this as a binomial distribution with p = pi. 1-p is therefore the sum of the pj’s excluding pi. The confidence interval for the population version of pi is therefore as calculated for the proportional binomial distribution. This is described on the following webpage using the normal distribution approximation.

http://www.real-statistics.com/binomial-and-related-distributions/proportion-distribution/

Later today I plan to add a slightly more accurate version of the confidence interval due to Edwin Wilson, which can be used for both the binomial and multionomial distributions.

Charles

Charles,

I need to sort the weights of a large sample of people within different intervals (110kg), from which I calculate proportions.

Then I compute the confidence intervals.

So I think that I can use the CI of the proportional binomial distribution. Right?

Thanks

(110kg)

less than 50;50-69;70-89;90-109; more than 110kg

Sorry, but I don’t have enough information to provide a response.

Charles

Hi Charles,

I have a question that relates to a multinomial distribution (not even 100% sure about this) that I hope you can help me with.

If I take a sample (lets assume n=400) on a categorical variable that has more than two possible outcomes (e.g. blue, black, green, yellow) and plot the frequencies so that I can get the probabilities. E.g.:

black 10%

blue 25%

green 35%

yellow 30%

How could I compute the 95% confidence interval for those probabilities?

Any help is highly appreciated.

Thanks a lot again for creating such a great resource.

Kind regards,

Dirk

Sir

This formula (=B9*EXP(SUMPRODUCT(B3:B5,LN(B6:B8)))) is cool ! I love it!

Hi,

an alternative method to calculate the probability would be an array formula:

=PRODUCT(B9,B6:B8^B3:B5) + CTRL+SHIFT+ENTER

Hi Dirk,

Yes. I like your approach, especially since it is simpler than the one I wrote on the webpage. I have just added your suggestion to the referenced webpage. Thanks for your very useful comment.

Charles