Multinomial Distribution

Definition 1: Given an experiment with the following characteristics:

  • the experiment consists of n independent trials, each with k mutually exclusive outcomes Ei
  • for each trial the probability of outcome Ei is pi

Let x…, xk be discrete random variables whose values are the number of times outcome Ei occurs in n trials. Then the probability distribution function for x…, xk is called the multinomial distribution and is defined as follows:

Multinomial function

Here
image602

The case where k = 2 is equivalent to the binomial distribution.

Example 1: Suppose that a bag contains 8 balls: 3 red, 1 green and 4 blue. You reach in the bag pull out a ball at random and then put the ball back in the bag and pull out another ball. This experiment is repeated a total of 10 times. What is the probability that the outcome will result in exactly 4 reds and 6 blues?

The possible outcomes for each trial in this experiment are E1 = a red ball is drawn, E2 = a green ball is drawn and E3 = a blue ball is drawn. Thus p1 = 3/8, p2 = 1/8 and p3 = 4/8, x1 = 4, x2 = 0 and x3 = 6.

image611

Excel Function: While Excel does not provide a function for the multinomial distribution, it does provide the following function:

MULTINOMIAL(x…, xk) = n! / (x1!∙…∙xk!)

Thus we could also calculate the answer to Example 9.10 by using the formula

MULTINOMIAL(4,0,6)*(3/8)^4*(1/8)^0*(4/8)^6 = .064888

We can also use a range as the argument of MULTINOMIAL as in Figure 1.

Multinomial distribution Excel

Figure 1 – Multinomial distribution

We can use the following Excel array formula to calculate the same result

=PRODUCT(B9,B6:B8^B3:B5)

Alternatively, we can use the following more complicated non-array formula

=B9*EXP(SUMPRODUCT(B3:B5,LN(B6:B8)))

Real Statistics Excel Function: The following supplemental function in the Real Statistics Resource Pack can be used to calculate the multinomial distribution.

MULTINOMDIST(R1, R2) = the value of the multinomial pdf where R1 is a range containing the values x1, …, xk and R2 is a range containing the values p1, …, pk

Referring to Figure 1, we have MULTINOMDIST(B3:B5,B6:B8) = 0.064888.

13 Responses to Multinomial Distribution

  1. savvas says:

    So. i get the probability from the stated calculations… how do i get the confidence intervals for each proportion?Is it somewhere on the real statistics site?

    • Charles says:

      The site contains lots of examples of how to calculate confidence interval. Please be more specific about the confidence interval that you are looking for. Is it for a proportion distribution? If so, see the webpage Proportion Distribution.
      Charles

  2. Dirk says:

    Hi Charles,

    I found the answer to my question already. No need to spend time on it.

    Thanks

    • vbhdb says:

      Hi,

      I’m interested in the answer.
      What would be the formula for Confidence Interval in this case ?

      Thanks

      • Charles says:

        Hi,

        If you have a proportional multinomial distribution with probabilities p1, p2, …, pk for mutually exclusive events E1, E2, …, Ek. Then for any pi you can look at this as a binomial distribution with p = pi. 1-p is therefore the sum of the pj’s excluding pi. The confidence interval for the population version of pi is therefore as calculated for the proportional binomial distribution. This is described on the following webpage using the normal distribution approximation.
        http://www.real-statistics.com/binomial-and-related-distributions/proportion-distribution/

        Later today I plan to add a slightly more accurate version of the confidence interval due to Edwin Wilson, which can be used for both the binomial and multionomial distributions.

        Charles

        • vbhdb says:

          Charles,
          I need to sort the weights of a large sample of people within different intervals (110kg), from which I calculate proportions.
          Then I compute the confidence intervals.
          So I think that I can use the CI of the proportional binomial distribution. Right?
          Thanks

  3. Dirk says:

    Hi Charles,

    I have a question that relates to a multinomial distribution (not even 100% sure about this) that I hope you can help me with.
    If I take a sample (lets assume n=400) on a categorical variable that has more than two possible outcomes (e.g. blue, black, green, yellow) and plot the frequencies so that I can get the probabilities. E.g.:
    black 10%
    blue 25%
    green 35%
    yellow 30%

    How could I compute the 95% confidence interval for those probabilities?
    Any help is highly appreciated.
    Thanks a lot again for creating such a great resource.

    Kind regards,

    Dirk

  4. Colin says:

    Sir
    This formula (=B9*EXP(SUMPRODUCT(B3:B5,LN(B6:B8)))) is cool ! I love it!

    • Dirk says:

      Hi,

      an alternative method to calculate the probability would be an array formula:
      =PRODUCT(B9,B6:B8^B3:B5) + CTRL+SHIFT+ENTER

      • Charles says:

        Hi Dirk,
        Yes. I like your approach, especially since it is simpler than the one I wrote on the webpage. I have just added your suggestion to the referenced webpage. Thanks for your very useful comment.
        Charles

Leave a Reply

Your email address will not be published. Required fields are marked *