Hypothesis Testing for Binomial Distribution

Example 1: Suppose you have a die and suspect that it is biased towards the number three, and so run an experiment in which you throw the die 10 times and count that the number three comes up 4 times. Determine whether the die is biased.

Define x = the number of times the number three occurs in 10 trials. This random variable has the binomial distribution  where π is the population parameter corresponding to the probability of success on any trial. We use the following null and alternative hypotheses:

H0: π ≤ 1/6; i.e. the die is not biased towards the number 3
H1: π > 1/6

Setting α = .05, we have

P(x ≥ 4) = 1–BINOM.DIST(3, 10, 1/6, TRUE) =  0.069728 > 0.05 = α.

and so we cannot reject the null hypothesis that the die is not biased towards the number 3 with 95% confidence.

Example 2: We suspect that a coin is biased towards heads. When we toss the coin 9 times, how many heads need to come up before we are confident that the coin is biased towards heads?

We use the following null and alternative hypotheses:

H0π ≤ .5
H1π > .5

Using a confidence level of 95% (i.e. α = .05), we calculate

BINOM.INV(n, p, 1–α) = BINOM.INV(9, .5, .95) = 7

which means that if 8 or more heads come up then we are 95% confident that the coin is biased towards heads, and so can reject the null hypothesis.

We confirm this conclusion by noting that P(x ≥ 8) = 1–BINOM.DIST(7, 9, .5, TRUE) =  0.01953 < 0.05 = α, while P(x ≥ 7) = 1–BINOM.DIST(6, 9, .5, TRUE) = .08984 > .05.

Example 3: Historically a factory has been able to produce a very specialized nano-technology component with 35% reliability, i.e. 35% of the components passed its quality assurance requirements. They have now changed their manufacturing process and hope that this has improved the reliability. To test this, they took a sample of 24 components produced using the new process and found that 13 components passed the quality assurance test. Does this show a significant improvement over the old process?

We use a one-tailed test with null and alternative hypotheses:

H0: p ≤ .35
H1: p > .35

p-value = 1–BINOM.DIST(12, 24, .35, TRUE) = .04225 < .05 = α

and so conclude with 95% confidence that the new process shows a significant improvement.

54 Responses to Hypothesis Testing for Binomial Distribution

  1. Hello, I like your website. I wonder if you could help me with a problem. I am a custody evaluator and I want to examine 89 reports to assess for whether I am biased in my decisions, for either fathers or mothers. Probability of an outcome is of course 50:50. So the null hypothesis is mothers and fathers are equal. Can the binomial test be used to show examine if my outcomes depart from equality? How many decisions in either direction would be necessary to show a bias in 89 reports? Thanks, Marvin

    • Charles says:

      Hello Marvin,
      It sounds like your problem is equivalent to Example 2 on the referenced webpage with n = 89 and p = .5.

  2. Eliza says:

    Hi, this website is so helpful! Really glad I found it. I wonder if you could help me with a problem too. My English isn’t that good so I hope you understand. I did a discrimination test in school with two brands of popcorn. There where two cups of B-brand cheap popcorn and 1 cup of A-brand popcorn. The students that follow the same subject (statistics) where the test persons. There where 16 test persons, 8 of them pointed out the A brand popcorn as the different one, which was correct. Now I have to analyse these results with the five steps of hypothesis testing from the book of Agresti & Franklin. Can I use a binomial one tailed test? And is Ho: p=0.33 and Ha: p>0.33? I’m not sure if I’m doing it right and I’m not allowed to use my Texas Ti 83 calculator. And can I use a=0.05? I hope you can help me out or give me some hints. Thanks, Eliza.

    • Charles says:

      If I understand the problem correctly, then I believe that the approach you are taking is correct. Assume p = probability of selecting A-brand on any single trial = 1/3, based on the null hypothesis that people pick completely at random. You can then apply the binomial one-tailed test as you have described.

  3. Rochelle says:

    Hi! This site helps me a lot in answering some of our assignments. But there’s one problem that I, myself, can’t understand. Hope you can help me. Thank you!

    Suppose a coin is tossed 170 times and you observed that 50 heads and 120 tails. Now, if you are going to toss this coin 23 times, how many heads you must observe so that the probability of getting head is not equal to 5/17 on the average? (Hint: Hypothesis testing with interval estimation)

    • Charles says:

      I am reluctant to do your homework assignment, but I will give you a possible hint. If your were told that the coin is biased so that the probability of a head occurring is 5/17, then you could use the approach shown in Example 2 of the referenced webpage. Here 50 heads and 120 tails yields 50/170 = 5/17. Since you were told to use confidence intervals, you need to look beyond just the averages but at some interval around 5/17 (see how to calculate confidence intervals).

  4. Rochelle says:

    Based on the problem, the question was “how many heads you must observe so that the probability of getting head is not equal to 5/17 on the average?”. How was that?

    • Charles says:

      Remember that the probabilities that an event E occurs or doesn’t occur are interrelated, namely the probability that event E doesn’t occur is equal to 1 – the probability that it does occur. In your problem you need to look at confidence intervals. Inside the interval you are confident that some event occurs, while outside that interval you are confident that the event doesn’t occur.

  5. Erik says:

    Hello, I have a question about example 2, tossing a coin 9 times and the result of the Critbinom function is 7 heads.
    When I input that in my statistical program and choose Non-parametric statistics – Binomial test, using a test proportion of 0.5, it gives a p-value of 0.18 (2-tailed)!
    8 heads out of 9 tosses gives a p-value of 0.04 (2-tailed).
    Doesn’t that mean that we need 8 heads to be 95% confident that the coin is biased towards heads?

    • Charles says:

      I am not sure where you got your p-value from, but 1-BINOM.DIST(8,9,.5,TRUE) = 0.001953. Thus p-value = .003906, which is close to .004 not .04.

      • Erik says:

        Hello Charles,
        I got the p-value from my statistics program PSPP, it is similar to SPSS. I get the same result at this site:

        Here is the result:
        “Sign and binomial test
        Number of “successes”: 7
        Number of trials (or subjects) per experiment: 9
        Sign test. If the probability of “success” in each trial or subject is 0.500, then:
        The one-tail P value is 0.0898
        This is the chance of observing 7 or more successes in 9 trials.
        The two-tail P value is 0.1797
        This is the chance of observing either 7 or more successes, or 2 or fewer successes, in 9 trials. ”


  6. Erik says:

    I understand it now. Excel gives cumulative probability, which for 1 to 7 heads is 0.9804. But 7 heads in itself has an exact probability of 0.0731, see the table below.
    So when Excel says 7 heads is the critical value, it means that 8 and above is 95% confident. Adding the exact probability of 8 heads (0.01758) and 9 heads (0.00195), gives a probability of 0.01953. 2-tailed that is 0.04 like my statistics program says.
    So when Excel says BINOM.DIST(8,9,.5,TRUE) = 0.001953, it is the cumulative probabality of getting 1 to 8 heads, and that is the same as the probability of getting 9 heads.

    Number of Successes
    Number of Failures
    Exact Probability
    Cumulative Probability
    0 9 0.195% 0.195%
    1 8 1.758% 1.953%
    2 7 7.031% 8.984%
    3 6 16.406% 25.391%
    4 5 24.609% 50.000%
    5 4 24.609% 74.609%
    6 3 16.406% 91.016%
    7 2 7.031% 98.047%
    8 1 1.758% 99.805%
    9 0 0.195% 100.000%

  7. Michael says:

    Hi, Charles.
    I think the calculation of a critical value of events you demonstrate using CRITBINOM function is not correct. The critical value is defined as

    k_crit = min{k : P(X>=k) <= alpha}

    but what the Excel function returns is

    k_excel = min{k : P(X= 1- alpha} = min{k : 1- P(X<=k) k) k) P(X>=k), the difference being, of course, P(X=k).

    I am not sure, actually, if there is a simple way to get the correct critical value in Excel using CRITBINOM. Any suggestions?


    • Michael says:

      Sorry, don’t know what happed with formulas in my previous post above. The second one should read

      = min{k : P(X= 1- alpha}
      = min{k : 1- P(X<=k) k) <= alpha }

      Hope this one come out OK.

  8. Michael says:

    Nope, it did not come out again. Let me try for the last time. What CRITBINOM gives is: k_excel = min{k : P(X>k) <= alpha}.

    • Michael says:

      This time it worked. So, we see that k_crit k_excel since for discrete distributions, such as the binomial, P(X>=k) P(X>k).
      Thanks and sorry for polluting your site with several posts. Feel free to correct formulas in the first one and delete all the others.

      • Michael says:

        What’s going on? There were (should be) “not equal” signs between the k’s and P’s.
        PS: Maybe you should allow some LaTeX type support in comments. Cheers.

        • Charles says:

          Sorry, but I no longer understand what your final comment is. Please send one more comment which captures what you are trying to say without referring to any of the previous comments.

          • Michael says:

            Hi Charles,
            Sorry for the confusion. The point is that Excel function returns k for which P(X >k) is = k) is <= alpha. This two are not the same.

          • Michael says:

            Hi Charles,
            Sorry for the confusion. The point is that Excel function returns k for which a probability of observing a value strictly greater than k is less than or equal to alpha. But the critical values of k is defined as that for which a probability of observing a value greater than or equal to k is less than or equal to alpha. There, said it in words. This should now come out right.

          • Charles says:


            Let’s use a specific example. Suppose we are looking at a binomial distribution with n = 5 and p = .5. The probability of 0 successes is BINOM.DIST(0,5,.5,FALSE) = .03125 and the probability of 1 success is BINOM.DIST(1,5,.5,FALSE) = .15625. It also follows that the probability of 0 or 1 successes is given by .03125 + .15625 = .1875 or BINOM.DIST(1,5,.5,TRUE) = .1875.

            Not let alpha = .1875. The critical value as defined by Excel is BINOM.INV(5,.5,.1875) = 1, whereas BINOM.INV(5,.5,.18749999) = 1 and BINOM.INV(5,.5,.187500001) = 2. Some might consider the critical value for alpha = .1875 to be 2 instead of 1.


  9. Michael says:


    It’s not about what “some might consider”. It is about the definition. And the definition tells us that the critical value is the minimum number of events such that the probability of observing THAT MANY OR MORE events is LESS THAN OR EQUAL TO alpha. But what Excel function returns is the minimum number of events such that the probability of observing STRICTLY MORE events is LESS THAN OR EQUAL TO alpha.

    Example. I will also take n = 5 and p = 0.5. The cumulative probability of observing 4 events is BINOM.DIST(4, 5, 0.5, TRUE) = 0.96875. Let’s take alpha = 0.05. Then Excel function returns BINOM.INV(5, 0.5, 1 – 0.05) = 4. However, the probability of observing 4 or more events is 1 – BINOM.DIST(3, 5, 0.5, TRUE) = 1 – 0.8125 = 0.1875 > 0.05. That’s because Excel returns a value for which the probability of observing STRICTLY MORE events is less than or equal to alpha, and strictly more than 4 is 5, and the probability of observing 5 or more events is 1 – BINOM.DIST(4, 5, 0.5, TRUE) = 0.03125 < 0.05. So the correct number actually is 5, not 4.


    • Charles says:


      For any distribution with cumulative distribution function F(x), the inverse distribution function I(alpha) should equal the smallest x such that F(x) < = alpha (at least on the left tail), i.e. the largest value of x that is in the critical region of F (i.e. the region where the null hypothesis is rejected). Similarly, on the right tail, the inverse function should find the smallest value of x which is in the critical region (i.e. where the null hypothesis is rejected). For the binomial distribution F(x) = BINOM.DIST(x,n,p,TRUE). Thus I(alpha) should equal the smallest x such that BINOM.DIST(x,n,p,TRUE) <= alpha In your example n = 5, p = 0.5, alpha = .05. Now BINOM.DIST(1,5,.5,TRUE) = .1875 and BINOM.DIST(0,5,.5,TRUE) = .03125. Since .03125 <= .05 but .1875 > .05, the critical value is 0 and not 1. But BINOM.INV(5,.5,.05) = 1 and so Excel doesn’t find the right answer.

      If instead we take alpha = .95 (the right tail), in Excel we get BINOM.DIST(3,5,.5,TRUE) = .8125, BINOM.DIST(4,5,.5,TRUE) = .96875 and BINOM.DIST(5,5,.5,TRUE) = 1. But BINOM.INV(5,.5,.95) = 4, which is the smallest value where the null hypothesis is rejected, which I believe is the correct answer.

      So it seems that at the very least, Excel is inconsistent, producing the correct answer on one tail and the incorrect answer on the other tail.

      Of course, many of the tables of critical values that I have seen published use a different definition of critical value, namely the smallest value not in the critical region on the left tail and the largest value not in the critical region on the right (the critical region is again defined as the region where the null hypothesis is rejected). In this case, Excel is still incorrect on one tail and correct on the other tail for the binomial distribution.

      The issue of what is significant is also quite confusing in the literature. What happens when the p-value is exactly equal to .05 (or some other value of alpha). Usually based on the definition given, the null hypothesis is rejected when p <= .05; having said this whenever a significant result occurs, the result is written up as "p < .05" and not as "p <= .05". Charles

  10. Sarah says:


    I’ve got a set of data for occurrences of a health condition in a number of different geographical populations.

    To look to see if the rate in a given country is significantly different from the overall worldwide average rate is it valid us use BINOMDIST(no of cases, number in sample group, worldwide average rate, TRUE) and look to see if the value is 0.95 ?

    • Charles says:

      Do you have any evidence that this type of data has a binomial distribution (which if the number of countries is large enough is equivalent to having a normal distribution)?

      • António Teixeira says:


        I think that this situation also happens with the Real Statistics Resource Pack inverse functions for Poisson and Hypergeometric distributions.

  11. Suzybest says:

    Pls how do I find P-Value

  12. Mike says:


    I found this site very helpful. Thank you. However, I differ in opinion regarding the “critical value” that Excel returns and at what point one Rejects the null hypothesis at the level of alpha. Focusing on example 2; it seems that the critical value returned by Excel is the value which causes the cumulative probability to pass from the “Fail to Reject” region into the “Reject” region–however, since this is a discrete, rather than continuous distribution, there is no distinct point at which this transition occurs (we “jump” from one cumulative probability to the next). Which means that “enough” (as it cannot specifically be assigned given we are dealing with a discrete distribution) of the probability for the specific occurrence for the “critical value” returned by Excel exists in the “Fail to Reject” region that to be at a minimum level of alpha, one would only reject if one observed a number of events GREATER than the critical value returned by Excel.

    For instance, in example 2 the discrete probability of observing EXACTLY 7 heads is BINOMDIST(7, 9, .5, FALSE) = 0.0703, which is greater than alpha = 0.05 by itself! So the way I see it, to be confident at a minimum level of alpha, I could only reject if I observed greater than or equal to 8 heads. For this specific example, I do not see how it is possible to say I am rejecting p = 0.5 at a 95% percent confidence level if I observe 7 heads when there is a 7.o3% likelihood of observing the exact outcome of 7 out of 9 heads given p is actually equal to 0.5. I contend that my confidence level to reject at 7 or more heads (rather than 8 or more) is only BINOMDIST(6, 9, .5, TRUE) = 91.02%. The next confidence level that exists for this specific example is our “jump” to observing 8 or more heads with a corresponding confidence of BINOMDIST(7, 9, .5, TRUE) = 98.05%

    • Charles says:

      Thanks for your comment. António Teixeira has just written what I found to be a very clear description of how we should look at the this issue. See his comment on this webpage on 2015/10/19. Please let me know whether you agree with his approach.

  13. António Teixeira says:


    In reference to the messages posted earlier about the BINOM.INV function, we must take into account the following:
    1. The function does not follow the rules you presented for an inversion function.
    2. In fact, contrary to its name, the purpose of the function is not inversion but to answer the following type of question: what is minimum number of tosses of a coin for which there is a p% chance of at least x heads.
    3. This fact is reflected when saying that alpha is the criterion value and not significance level or type I error. The function does not even know if the user is considering the right or left tail.
    4. In fact the former CRITBINOM function had a more appropriate name.
    5. To achieve the answer of the question type presented in 2 it is inevitable that for left tails we have: value returned =critical value +1.
    6. In view of this: (1) the right tail c. value is correct; (2) the left tail critical value is always inflated by one and needs to be corrected.
    7. To perform this correction it is necessary to indicate to which tail if the value wanted refers.
    8. This can be achieved at spreadsheet level using formulas or with UDF function such as the one below:

    Option Explicit
    Option Base 1

    Function xlBinom_CV(n As Integer, p, alpha, nTails, pTail)

    ‘ n – sample size
    ‘ p – probability
    ‘ alpha – Probability of type I error
    ‘ nTails – number of tails (1,2)
    ‘ pTail -Position of the tail (1 – Lower, 2 – Upper)

    ‘ ATTENTION – This function is not fully tested

    With WorksheetFunction

    If nTails = 1 And pTail = 2 Then
    alpha = 1 – alpha
    ElseIf nTails = 2 And pTail = 1 Then
    alpha = alpha / 2
    ElseIf nTails = 2 And pTail = 2 Then
    alpha = 1 – alpha / 2
    End If

    If pTail = 1 Then
    xlBinom_CV = _
    .Binom_Inv(n, p, alpha) – 1
    xlBinom_CV = _
    .Binom_Inv(n, p, alpha)
    End If
    End With

    End Function

    • António Teixeira says:

      An additional indication about the VBA routine is that a valuie -1 indicates a non-existing left tail when P(x=0) > alpha or alpha/2.

    • Charles says:

      Thanks Antonio for the clear explanation. I will add this function to the Real Statistics Resource Pack to help people with this concept.

  14. António Teixeira says:


    1. I am still analyzing the subject and found that, for example, Mathematica and Maple return values equal to those of Excel. MINITAB provides the following type of output:
    Inverse Cumulative Distribution Function

    Binomial with n = 100 and p = 0,03

    x P( X <= x ) x P( X <= x )
    2 0,419775 3 0,647249

    Statistical software in general associates the inverse of the distribution function F(x) to quantiles, calculate using the criterion of the BINOM.INV function.

    2. In what concerns the code sent, I think there is one situation in which no correction should be made to the Excel value:

    xBI Result of the BINOM.INV function
    xLC Lef tail critical value

    x xLC Non rejection interval
    xBI >= xLC

    If F(xBI) = alpha
    THEN xBI = xLC
    Do not subtract 1 from xBI

    • Charles says:

      From what you are saying, it seems that there is a role for both a BINOM.INV function and a BINON.CRIT function, where sometimes the values are different.
      I didn’t completely understand the situation where the values should be equal (your item #2).

  15. kimani hames says:

    i appreciate the good job of this site

  16. allison says:

    Hello. I am a statistic student with a question. With hypothesis test proportion binomial distribution, is it possible to have a left tail? In example
    Ho: p=.062
    Hi:p< .o62
    And if it possible how would that be solved?

    • Charles says:

      Hi Allison,
      For Example 3 on the referenced webpage I calculated p-value = 1 – BINOM.DIST(13, 24, .35, TRUE). For the case of
      H0: p >=.062
      H1: p < .062
      I would calculate p-value = BINOM.DIST(x, n, p, TRUE).

  17. Anonymous says:

    What if the test value isn’t given and you have to guess and find the critical region?

  18. William says:


    As my understanding, p-value is the probability that, using a given statistical model, the statistical summary (such as the sample mean difference between two compared groups) would be the same as or more extreme than the actual observed results (Wikipedia), given the null hypothesis is true.

    As for Example 1, we try to find the probability of 4 or more times of #3, as the p-value, and compare it with α.

    dbinom(0,10,1/6), the density of 0 #3 is: .1615056,
    dbinom(1,10,1/6) =.3230112
    Therefore, sum(dbinom(1,10,1/6) + … +dbinom(3,10,1/6)) = .9302722
    Hence, the p-value of 4 or more #3 is:
    1 – sum(…) = 1 – .9302722 = .0697278, which is larger than .05, therefore, we fail to reject the null hypothesis.

    Example 2:
    dbinom(6,9,.5) = 0.1640625
    dbinom(7,9,.5) = .0703125
    dbinom(8,9,.5) = 0.01757812
    dbinom(9,9,.5) = 0.001953125
    Therefore, if 8 or more head come up, null hypothesis should be rejected.

    Example 3:
    Similar to Example 1:
    p-value = 1- pbinom(12,24, .35)= 1- .9577469 = .04225307, therefore, we reject null hypothesis. Same conclusion, but weaker.

    • Charles says:

      Thank you for catching these errors. I have now corrected the referenced webpage.
      On behalf of all the users of this website, I appreciate your help in improving the accuracy and quality of the website.

  19. Benson Okech says:

    Try to solve this for me.

    The number of credit card holders of a bank in two different cities (city – X and city – Y) settling their excess withdrawal amounts in time without attracting interest follows binomial distribution. The manager (collections) of the bank feels that the proportion of the number of such credit card holders in the city – X is not different from the proportion of the number of such credit card holders in the city – Y. to test his intuition, a sample of 200 credit card holders is taken from the city – X and it is found that 160 of them are settling their excess withdrawal amount in – time without attracting interest. Similarly a sample of 180 credit card holders is taken from the city – Y and it is found that 50 of them are settling their excess withdrawal amount in – time without attracting interest, check the intuition of the sales manager at a significance level of 0.05.

    • Charles says:

      This sounds like a homework assignment and I have decided that I shouldn’t do other people’a homework for them. In any case, whether or not this is a homework assignment, here is a hint: Look at the two sample hypothesis testing for the Proportion Distribution at
      Proportion Distribution

  20. Samuel J. says:

    What’s the approach using Excel for finding the upper limit of a sample proportion with a given level of confidence (1-alpha) for a one-tailed distribution.

  21. Caroline S. says:

    For the first and third examples, you use one less than the number of successes mentioned. For example, in the first example, a “3” was rolled 4 times, but in the excel function, you used 3 as the number of successes. Similarly, in the third example, there were 13 successes and you used 12 successes in the =1-BINOM.DIST function. What is your reasoning for doing this?

    • Charles says:

      In the first example, you want to find out the probability that three comes up 4 of more times (i.e. 4, 5, 6, 7, 8, 9 or 10 times). The BINOM.DIST(x,n,p,TRUE) function computes the probability that an event occurs at most x times (i.e. 0, 1, 2, …, x times). The probability that three comes up 4 or more times is equal to 1 minus the probability that three comes up at most 3 times, which is P(x ≥ 4) = 1–BINOM.DIST(3, 10, 1/6, TRUE).
      Another way to look at this is that P(x>=4) + P(x<4) = 1, and so P(x>=4) = 1 – P(x<4) = 1 - P(x<=3) = 1–BINOM.DIST(3, 10, 1/6, TRUE). Charles

  22. Charlène P. says:

    Dear Charles,

    I’m puzzled with a statistic issue.
    I would like to compare two diagnosis test applied on the same samples and same number of samples with the following results
    Test 2 + –
    test 1
    + 64 10
    – 3 36

    I wanted to use the McNemar test but apparently it is recommended to use a binomial test (or sign test?) in that case because of the low amount of patient (b+c = 13 (<25)).
    Could you tell me if this is correct and if yes, should I do a two-tailed test?
    Also, how do I run a binomial test when the answer is yes or no and not a percentage?
    Thank you in advance for your help,


Leave a Reply

Your email address will not be published. Required fields are marked *