# Chi-square Test for Normality

The chi-square goodness of fit test can be used to test the hypothesis that data comes from a normal distribution. In particular, we can use Theorem 2 of Goodness of Fit to test the null hypothesis:

H0: data are sampled from a normal distribution.

Example 1: 90 people were put on a weight gain program. The following frequency table shows the weight gain (in kilograms). Test whether the data is normally distributed with mean 4 kg and standard deviation of 2.5 kg.

Figure 1 – Frequency table and histogram for Example 1

We begin by calculating the probability that x ≤ b for b = 0, 1, …, 8, assuming a normal distribution with mean 4 and standard deviation 2.5. This probability is NORMDIST(b, 4, 2.5, TRUE). The probability that x is in the interval (a, b] is then NORMDIST(b, 4, 2.5, TRUE) – NORMDIST(a, 4, 2.5, TRUE). Multiplying these probabilities by the sample size of 90 gives the expected frequency of each interval.
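As a rough cross-check outside Excel, the same expected frequencies can be sketched in Python with the standard library's `statistics.NormalDist`, whose `cdf` method plays the role of NORMDIST(b, 4, 2.5, TRUE). The function name `expected_frequencies` and the interval labels are illustrative choices, not part of the original worksheet.

```python
from statistics import NormalDist

# Hypothesized distribution from Example 1: mean 4 kg, standard deviation 2.5 kg.
dist = NormalDist(mu=4, sigma=2.5)
n = 90  # sample size

# Upper endpoints b = 0, 1, ..., 8; the final interval above 8 is open-ended.
upper_bounds = list(range(0, 9))

def expected_frequencies(dist, bounds, n):
    """Expected counts for (-inf, b0], (b0, b1], ..., (b_last, inf)."""
    cdf = [0.0] + [dist.cdf(b) for b in bounds] + [1.0]
    # Each interval probability is F(b) - F(a), scaled by the sample size.
    return [n * (hi - lo) for lo, hi in zip(cdf, cdf[1:])]

expected = expected_frequencies(dist, upper_bounds, n)
labels = ["<=0"] + [f"({a},{a + 1}]" for a in range(8)] + [">8"]
for label, e in zip(labels, expected):
    print(f"{label:>6}: {e:6.2f}")
```

The ten expected counts sum to the sample size, and the two open-ended tail intervals come out just under 5.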

Figure 2 – Chi-square test based on known mean and standard deviation

We now perform the chi-square goodness of fit test. Since the observed and expected frequencies of the first and last intervals are less than 5, it is better to combine the 1st and 2nd intervals as well as the last and second-to-last intervals. The chi-square test statistic is 4.47, which is less than the critical value of CHIINV(.05,7) = 14.07, and so we conclude that there is a good fit. Note that df = number of intervals – 1 = 8 – 1 = 7 since the mean and standard deviation are given.
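The pooling and the test statistic can be sketched as follows. This is only an outline under stated assumptions: the helper names `pool_tails`, `chi_square_statistic` and `retains_h0` are hypothetical, and the observed counts have to be supplied from the frequency table in Figure 1, which is not reproduced here.

```python
def pool_tails(freqs):
    """Merge the first two and the last two cells, as done in Example 1."""
    if len(freqs) < 4:
        raise ValueError("need at least four cells to pool both tails")
    return [freqs[0] + freqs[1], *freqs[2:-2], freqs[-2] + freqs[-1]]

def chi_square_statistic(observed, expected):
    """Pearson statistic: sum of (O - E)^2 / E over the cells."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def retains_h0(observed, expected, critical_value):
    """True when the pooled statistic is below the critical value."""
    stat = chi_square_statistic(pool_tails(observed), pool_tails(expected))
    return stat < critical_value
```

With the ten observed and ten expected counts of Example 1, pooling leaves 8 cells, and the statistic would be compared with the critical value 14.07 quoted above.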

Example 2: In the above example, the population mean and variance were known. This is usually not the case. This time we will simply ask whether the above data comes from a normal population.

We first calculate the sample mean and variance as described in Frequency Tables, using the midpoint of each interval. For the first and last intervals, (-∞,0] and (8,∞), which have no midpoint, we need to choose acceptable representative values; we take these to be -1 (i.e. a weight loss of 1 kg) and 9 respectively.
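A minimal sketch of this grouped-data calculation is shown below. It assumes the sample variance uses the n – 1 denominator; the Frequency Tables page is the authority on the exact formula, and the function name `grouped_mean_variance` is illustrative. The frequencies themselves come from Figure 1 and are not reproduced here.

```python
def grouped_mean_variance(values, freqs):
    """Mean and sample variance when each value x occurs f times.

    Assumes the n - 1 denominator for the variance.
    """
    n = sum(freqs)
    mean = sum(f * x for x, f in zip(values, freqs)) / n
    ss = sum(f * (x - mean) ** 2 for x, f in zip(values, freqs))
    return mean, ss / (n - 1)

# Representative value per interval: -1 for the lowest interval, the
# midpoints 0.5, 1.5, ..., 7.5 for (0,1], ..., (7,8], and 9 for the highest.
representatives = [-1] + [a + 0.5 for a in range(8)] + [9]
```

Calling `grouped_mean_variance(representatives, freqs)` with the ten observed frequencies would give the two estimates used in the next step.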

Figure 3 – Calculation of mean and standard deviation for Example 1

We next test the null hypothesis that the data is normally distributed using the sample mean and variance (3.74 and 4.84 respectively, as seen in Figure 3) as estimates for the population mean and variance. As in Example 1, we combine the first two and the last two intervals so that all frequencies are at least 5. Once again we use a chi-square goodness of fit test based on 8 intervals, but this time, since the mean and variance are estimated parameters, per Theorem 3 of Goodness of Fit, we use df = 8 – 1 – 2 = 5.

Figure 4 – Chi-square test based on estimated mean and standard deviation

Since $\chi^2$ = 1.35 < 11.071 = $\chi^2_{crit}$, we again retain the null hypothesis that the data is normally distributed.
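The same decision can be expressed through a p-value. The sketch below avoids third-party libraries by evaluating the chi-square cumulative distribution with the standard power series for the regularized lower incomplete gamma function; the function name `chi2_cdf` is an illustrative choice, and a statistics package would normally be used instead.

```python
import math

def chi2_cdf(x, df):
    """P(X <= x) for a chi-square variable with df degrees of freedom,
    via the series for the regularized lower incomplete gamma P(df/2, x/2)."""
    if x <= 0:
        return 0.0
    a, z = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    k = 0
    while abs(term) > 1e-15 * abs(total):
        k += 1
        term *= z / (a + k)
        total += term
    return total * math.exp(a * math.log(z) - z - math.lgamma(a))

intervals, estimated_params = 8, 2
df = intervals - 1 - estimated_params   # 8 - 1 - 2 = 5
p_value = 1 - chi2_cdf(1.35, df)        # right-tail probability
print(df, round(p_value, 2))
```

For the statistic 1.35 with 5 degrees of freedom the p-value is roughly 0.93, far above 0.05, which agrees with retaining the null hypothesis.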

### 78 Responses to Chi-square Test for Normality

1. Gerd says:

Hello Charles,

first let me say ” great web-site” !!!
I’ve a question regarding observed freq resp. classes to calculate Chi-square value. In example one you combine classes to get more than five samples for each class! Is this number “five” specific? Isn’t there “room for manipulating the chi-square value” by adapting the number of classes?

Thanks and best regards,
Gerd

• Charles says:

Hello Gerd,
Generally you would like at least 5 sample items for each cell in the contingency table. With large contingency tables, a small percentage of cells with fewer than 5 items can be acceptable. Even with smaller contingency tables this may not cause big problems, but it is probably a better choice to use Fisher Exact Test in this case. In any case, you want to avoid using chi-square for contingency tables with an expected frequency of less than 1 in any cell.
Charles

2. Jen says:

Hi, in example two, you say “…using the sample mean and variance (3.74 and 2.20 respectively as see in Figure 3)…” but Figure 3 shows 2.20 as the standard deviation, not the variance- should “variance” or “2.20” be changed in this example?

• Charles says:

Hi Jen,
Thanks for identifying this typo. The 2.20 value is the standard deviation not the variance. I have now used the correct value for the variance (i.e. the value that appears in Figure 3). Thanks for catching this error.
Charles

3. Sandeep R says:

hello sir, i have asked for lognormal distribution problem in K-S test for which you replied. thank you very much.

If possible, please explain one lognormal distribution problem using the chi-square test. It would be a great help to people like me who are new to statistics.

thank you sir

• Charles says:

Hello Sandeep,
To replicate Example 1 and 2 on the referenced page with the log normal distribution instead of the normal distribution, just replace formulas of the form =NORMDIST(x,mean,stdev,TRUE) by =LOGNORMDIST(x,mean,stdev) or LOGNORM.DIST(x,mean,stdev,TRUE).
Charles
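For readers working outside Excel, the substitution described in this reply can be sketched in Python. The sketch relies on the fact that LOGNORMDIST(x, mean, stdev) treats mean and stdev as the parameters of ln(x); the function name `lognormal_cdf` is illustrative.

```python
import math
from statistics import NormalDist

def lognormal_cdf(x, mean, stdev):
    """P(X <= x) when ln(X) is normal with the given mean and stdev."""
    if x <= 0:
        return 0.0
    return NormalDist(mean, stdev).cdf(math.log(x))

def interval_probability(a, b, mean, stdev):
    """Probability that X falls in (a, b], as F(b) - F(a)."""
    return lognormal_cdf(b, mean, stdev) - lognormal_cdf(a, mean, stdev)
```

Expected frequencies then follow as before by multiplying each interval probability by the sample size.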

4. sundarrajan says:

sir. How to calculate for forecasting

• Charles says:

Sorry, but I don’t understand your comment.
Charles

5. Gustav says:

Hi Charles,

Thank you for the great article.

I’m confused. In example 2 you use a df of 5 (k-m-1 = 8-2-1). 2 since mean and variance are unknown but what causes the -1? I can see that you refer to Theorem 3 but according to wiki:
http://en.wikipedia.org/wiki/Goodness_of_fit

“where \nu is the number of degrees of freedom, usually given by N-n-1, where N is the number of observations, and n is the number of fitted parameters, ASSUMING THAT THE MEAN VALUE IS AN ADDITIONAL FITTED PARAMETER. ”

I guess the “-1” is due to the mean and the “n” is the additionally fitted parameters.
So for your example 2 it should be 8-1-1 = 6 as ONLY variance is an additional parameter?

Please correct me if i’m wrong.

Best Regards
Gustav

• Charles says:

Gustav,

I believe that in the example given in Wikipedia the population mean is unknown (and is estimated by the sample mean) and the population variance is known. Thus df = N-n-1 = N-1-1 = N-2. Here N = number of observations and n = number of fitted parameters = 1 in this case. If N were 8, then df = N-2 = 6.

In Example 2 of the referenced webpage, both the population mean and the population variance are unknown, and so n = 2. Since N = 8, we have df = N-n-1 = 8-2-1 = 5.

Charles

• Gustav says:

Thanks!

6. Mandy says:

Hello Sir.
I’m just wondering why we need to combine classes if the expected frequency is less than 5. Why 5 and not some other value? And how does it affect our results if we do not combine classes whose expected frequency is less than 5?

• Charles says:

The value 5 is a requirement for the use of the chi-square test. With larger contingency tables you can have some cells with fewer than 5 elements, but with smaller tables (e.g. 2 x 2) cells should have at least 5 elements. With fewer than 5 elements the results of the test won’t be reliable. See webpage http://www.real-statistics.com/chi-square-and-f-distributions/independence-testing/ for more details about this requirement.
Charles

7. Anthony says:

Can I use this procedure to test whether a sample data set came from a chi-square distribution? If not, how do I test for the chi-square distribution?

• Charles says:

Yes, you can use this procedure to test whether sample data fits a chi-square distribution. See http://www.real-statistics.com/chi-square-and-f-distributions/goodness-of-fit/, especially Theorem 3 and Example 4.
Charles

• Anthony says:

Thank you for your fast reply. I need a little further clarification. I wish to test a column of computed chi-square values that is 10,000 entries long. Applying Theorem 3 in Example 4, as you suggested, I would use CHIDIST(chi square, df) = CHIDIST(?,9999). What would be entered into the chi-square portion? I want to test the whole column, not just a single number as in Example 4, so would I just enter the column that contains the data? A histogram of the data leads me to believe that it does indeed fit the chi-square distribution. I just need a p-value to confirm it.

8. Jared says:

I have an unrelated question. I looked through the comments above and thought I would ask it here: I am performing a goodness of fit test and the mean and SD were given to me as percentages. I am not sure what to do with these values or how to convert them into numbers usable for my expected values.
Many thanks

• Charles says:

Jared,
It really depends on what these percentages represent, but the likely answer is that you simply multiply the percentages by the sample size.
Charles

9. John Wright says:

I’m looking for a non-traditional way to explain GOF.
In Example 2 with df=5 and Chi^2=1.35 is there about a 7% probability that we would be correct if we said the data were not normally distributed?
Does that imply that there is a 93% probability that the data are normally distributed?
Alternatively, if we try fits for several types of distributions we can say that there is a 7% chance that we are wrong if we reject normal; an x% chance that we are wrong if we reject uniform, etc.
Do we need to make the negative statement or can we make a positive statement?

• Charles says:

John,

No, this is not correct. You need to look at the conditional probabilities given that the null hypothesis is true.

“Suppose we perform a statistical test of the null hypothesis with α = .05 and obtain a p-value of p = .04, thereby rejecting the null hypothesis. This does not mean that there is a 4% probability of the null hypothesis being true, i.e. P(H0)=.04. What we have shown instead is that assuming the null hypothesis is true, the conditional probability that the sample data exhibits the obtained test statistic is 0.04; i.e. the probability of D given that H0 is true = P(D|H0)=.04 where D = the event that the sample data exhibits the observed test statistic.”

Charles

10. Affiq says:

Where do you get the values of ‘a’

• Charles says:

The value of a is simply the value of x prior to b in the frequency table. For this example, if b = 3 then a = 2.
Charles

11. Srikanth says:

Hello Mr.Charles,

It is my understanding that using the chi-square test, I can check the goodness of fit of my data. So I can check, for example, whether my data follows a binomial distribution with some probability of success.

Now, suppose I believe my data follows a chi-square distribution; how would I check it? I hope it is not an absurd question; if it is, my apologies.

• Charles says:

Hello,

This is certainly not an absurd question. You can use the chi-square goodness of fit test as described on the webpage
Goodness of Fit

Charles

12. Andres says:

Excellent guidance. Congratulations!

Andres Rubio
Finance Professor and Consultant

13. Andres says:

How can I test normality for a sample of 36 monthly returns in percentage for a stock?

Is N = 36 a large enough sample to reasonably test normality or should I increase N to say, 48 or 60…?

Thanks Charles!

• Charles says:

N = 36 should be a big enough sample. I suggest that you use a test like Shapiro-Wilk instead of Chi-square to test for normality.
Charles

14. Michael says:

Hey, great example!

I’m trying to use the Chi-Squared Goodness of Fit test to see if I can assume normality for further tests on my 2 samples of data. Basically I recorded battery drain times for 2 popular brands of batteries, 20 samples per brand. I want to see if I can assume normality for the 2 samples. What would my Ho and Ha be?

Thanks for the help!

• Charles says:

As stated on the referenced webpage, H0: data are sampled from a normal distribution, and so Ha: data are not sampled from a normal distribution.
Charles

15. Juan says:

Dear Charles,

I would like to thank you for this extremely useful resource !

I have a question regarding normality check via Chi-Square testing and sample size. I am applying your calculation to a case in which measurement of dust is involved. This means that there is a very large sample size. Since the dust grains measured are not really counted, but only weighted its amount in classified sizes, the results of frequency are given in percentage. Thus, I assume a sample size of 100, but I get extremely large X2 values, that, compared to an independent to the sample size X2 critical, and thus constant, make my conclusion always NOT NORMAL distribution. I fulfill all the criteria for the tests (more than 5 classes, larger frequency than 5, or grouped frequencies, etc) I’ve cross checked some of this distributions with Shapiro-W test and they are normally distributed.

I tried to lower and to increase the population number for the Chi2-testing keeping the % fractions but still, I either get too low frequencies or too large X2…

To rule out that I have overlooked something, I took your example “Norm Chi-sq 1” and multiplied the given frequencies by 10 or 100, and the same effect occurs. Is there any explanation for this phenomenon? Am I overlooking something? What would be your recommendation to proceed?

Thanks.

• Charles says:

Dear Juan,
I don’t completely understand the problem that you are having with the chi-square test, but this is not really a great test for normality. Shapiro-Wilk is usually one of the best tests for normality. I would also create a graph (e.g. Q-Q plot) to make sure.
Charles

• Juan says:

In short, what I mean is that the test seems very sensitive to sample size: if the sample size goes up, the calculated X2 goes up very much and it is then very easy to be out of normality… If you take your example and keep the ratios between frequencies (imagine that they were given as percentages) and you increase the “n”, the test changes drastically… is that a known effect?

• Charles says:

Juan,
Most statistical tests are sensitive to sample size. With very big samples it is often easier to find a significant effect.
Charles
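The effect Juan describes can be checked algebraically. As a sketch, assume the cell probabilities are fixed (as in Example 1, where the mean and standard deviation are given) and every observed frequency $O_i$ is multiplied by a constant $c$. The expected frequencies $E_i$ are the cell probabilities times the sample size, so they are multiplied by $c$ as well, and the statistic becomes

$$\chi^2_c = \sum_i \frac{(cO_i - cE_i)^2}{cE_i} = \sum_i \frac{c^2 (O_i - E_i)^2}{c\,E_i} = c \sum_i \frac{(O_i - E_i)^2}{E_i} = c\,\chi^2$$

The critical value depends only on the degrees of freedom, which do not change, so multiplying the frequencies by 10 or 100 multiplies the statistic by 10 or 100 against the same threshold. Percentages therefore cannot stand in for the actual counts.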

16. Gelo says:

Please, next time, indicate how the mean, variance in figure 3 are computed. Or, better yet, show the formulas for every equation so that we wouldn’t have to make guesses as to how they were computed.

• Charles says:

Gelo,
As stated on the referenced webpage, the calculation is described on the webpage Frequency Tables. You can also get the formulas from the Examples Workbook, which contains all the spreadsheets shown on the website. You can download this for free.
Charles


17. Laura says:

I learned that we use a normality test to check whether our data is normally distributed, and that the result decides which method we use for hypothesis testing: a parametric or a non-parametric test. With 1 sample, as in your example, it is easy to reach a conclusion. However, what about using a normality test when a problem has 2 or more samples? We test each sample for normality, right? So if one of the samples is not normally distributed, what should we conclude? This confuses me when I have 2 or more samples and need to decide on the method for hypothesis testing.
Thank you so much.

• Charles says:

Laura,
For tests such as ANOVA you need to test each group sample for normality. In a 3 x 3 design, this means that you need to test each of the 9 groups for normality. Remember though that ANOVA and many other tests are pretty robust for departures from normality. Happy New Year.
Charles

• Laura says:

Happy New Year, sir
In the case of 2 samples: if the population is normally distributed, I will use a test of 2 means for hypothesis testing. However, if the population is not normally distributed, my hypothesis test will be the Mann-Whitney U test for independent samples or the Wilcoxon Matched-Pairs Signed-Rank test for dependent samples.
So, my question is: what if 1 sample is normally distributed according to the normality test and the other is not? Can this case happen? The same question applies to 3 or more means, in order to decide between ANOVA and the Kruskal-Wallis test. I’m so confused about when to use a parametric test and when to use a non-parametric test.
Thank you so much, sir.

• Charles says:

Laura,
When comparing two samples, each sample should be normal. If one is normal and the other is not, then the test may not be valid. Even so, a t test is pretty robust to violations of normality. Generally, a problem occurs when one or both samples are far from symmetric. If both samples are skewed to the right, then you are probably better off using a nonparametric test (Mann-Whitney).
Charles

18. ardi says:

Hello Sir!
Happy new year!
I have a question about the motivation of students in learning English.
How can I check normality and homogeneity in order to compare who is more motivated, girls or boys? Motivation consists of integrative and instrumental motivation, but I have to do it manually. My question is how I could do this. Do you think I have to use chi-square or another way…?

• Charles says:

There are many tests for normality. In general, I suggest that you use the Shapiro-Wilk test. You should test both the boys sample and the girls sample for normality (separately). See the following webpage:
Shapiro-Wilk

There are also many tests for homogeneity of variances. I suggest that you use Levene’s test. See the webpage
Levene’s Test

If you use the t-test with unequal variances, then you don’t need to check for homogeneity of variances. See the webpage
t test with unequal variances

Charles

19. hedi says:

Hello Sir!
I have a question about the motivation of students in learning English.
How can I check normality and homogeneity in order to compare who is more motivated, girls or boys? Motivation consists of integrative and instrumental motivation, but I have to do it manually. My question is how I could do this. Do you think I have to use chi-square or another way…?

• Charles says:

Hedi,

There are many tests for normality. In general, I suggest that you use the Shapiro-Wilk test. You should test both the boys sample and the girls sample for normality (separately). See the following webpage:
Shapiro-Wilk

There are also many tests for homogeneity of variances. I suggest that you use Levene’s test. See the webpage
Levene’s Test

If you use the t-test with unequal variances, then you don’t need to check for homogeneity of variances. See the webpage
t test with unequal variances

Charles

• Robmat59 says:

Hi Charles
Great site, and v useful pages. Just wondering why you recommend SW over ChiSq (which is easy to implement and well-recognised). Is it because of better power ?
Thanks

• Charles says:

Better power and more accurate.
Charles

20. hedi says:

I forgot to write that there are 50 participants, 18 boys and 32 girls. I want to compare them. Are they normal and homogeneous or not? If the data is not normal, what should I use for the comparison?

21. sam says:

Dear Charles
I would like to ask how I can check the normality or the distribution of my data in Prism or Excel for biological data, e.g. western blotting data, in order to decide whether to use ANOVA or a nonparametric test.

22. John says:

Hi Charles,

In example 1, when you say: “The probability that x is in the interval (a, b] is then NORMDIST(b, 4, 2.5, TRUE) – NORMDIST(a, 4, 2.5, TRUE)” can you please tell me what is the meaning of “a”?. I have tried to do the calculations taking “a” as the frequency or fx or fx^2 but none of those work. Thanks

• Charles says:

John,
Here I am referring to cumulative probability, i.e. F(x). F(a) = the probability that the outcome is less than a. Thus, the probability that the outcome is between a and b is F(b) – F(a).
Charles

• Minh says:

Hi Charles,

Could you tell me value of “a” in this example. In a general case, how to choose value for “a”?
Thanks.

• Charles says:

Minh,
a can take any value. In fact, in Figure 2 you can see that a takes a variety of values.
Charles

23. Ben says:

Hi, I’m a bit of a noob in stats and I’m stuck with the Chi squared methods at the moment. I need to use it to test the normality of some data I’ve been supplied with (sample size of 40, sorted into 8 groups of 5), I’ve sorted it into ascending order, found the average values at the boundary of each group, and then used these to find the value i need to use to compare to a normal distribution curve, however I’m stuck trying to find out how to do this in excel? any help would be great thank you 🙂

• Charles says:

Ben,
Is there some reason why you are testing for normality in this way? Why can’t you simply test normality on all 40 elements? (although for some tests — e.g. Anova — you need to check each group for normality) Also, generally chi-square is not the best test for normality. Shapiro-Wilk is usually a better test.
Charles

24. Ramon Bernal says:

Hey
Good page
But, could you put Excel formula in order to calculate cell O15
Thanks

• Charles says:

Ramon,
The critical value is CHIINV(alpha,df) = CHIINV(.05,7) using Excel 2007 or CHISQ.INV.RT(alpha,df) = CHISQ.INV.RT(.05,7) using more recent versions of Excel.
Charles

• Ramon Bernal says:

That is =CHIINV(0.95, 7)
It is not =CHIINV(0.05, 7)

• Charles says:

CHIINV is the right-tailed inverse (equivalent to CHISQ.INV.RT) and so I think it is CHIINV(.05,7).
Charles

• Ramon Bernal says:

If you put =CHIINV(0.05,7) in Excel you get 2.16734991
If you put =CHIINV(0.95,7) in Excel you get 14.0671404
So, correct formula is the second
Best regards

• Charles says:

Ramon,
That’s interesting; when I enter =CHIINV(.05,7) on my computer I get 14.067… If I enter =CHISQ.INV(.05,7) I get 2.167… If I enter =CHISQ.INV.RT(.05,7) I get 14.067…
Charles
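The left-tailed and right-tailed inverses discussed in this exchange can be compared independently of Excel. The sketch below is a plain-Python approximation, not Excel's own algorithm: it evaluates the chi-square cumulative distribution with a series and inverts it by bisection. The function names `chi2_cdf`, `chi2_inv_left` and `chi2_inv_right` are illustrative.

```python
import math

def chi2_cdf(x, df):
    """P(X <= x) via the series for the regularized lower incomplete gamma."""
    if x <= 0:
        return 0.0
    a, z = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    k = 0
    while abs(term) > 1e-15 * abs(total):
        k += 1
        term *= z / (a + k)
        total += term
    return total * math.exp(a * math.log(z) - z - math.lgamma(a))

def chi2_inv_left(p, df):
    """x with P(X <= x) = p, the left-tailed inverse (like CHISQ.INV)."""
    if not 0 < p < 1:
        raise ValueError("p must lie strictly between 0 and 1")
    lo, hi = 0.0, 1.0
    while chi2_cdf(hi, df) < p:   # widen the bracket until it contains p
        hi *= 2
    for _ in range(100):          # bisection on the monotone CDF
        mid = (lo + hi) / 2
        if chi2_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def chi2_inv_right(alpha, df):
    """x with P(X > x) = alpha, the right-tailed inverse
    (like CHIINV and CHISQ.INV.RT)."""
    return chi2_inv_left(1 - alpha, df)

print(round(chi2_inv_right(0.05, 7), 3), round(chi2_inv_left(0.05, 7), 3))
```

The right-tailed inverse at 0.05 with 7 degrees of freedom reproduces the 14.067 critical value, while the left-tailed inverse at 0.05 gives 2.167, so the two numbers in this thread correspond to the two different tails.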

• Ramon Bernal says:

It’s interesting
Maybe Excel configuration
I will check it

25. Chris says:

Hi Charles,

Thanks for the excellent web page, extremely useful!

I am getting slightly confused when using different significance levels and whether or not we would accept the null hypothesis.

In your example, the test statistic is 1.35 and as this is less than the critical region CHIINV(0.05,5)=11.07 then we accept.

Imagine our test statistic was 12. Under a 5% significance level we would reject H0. But if we used a 1% significance level the critical region would be CHIINV(0.01,5)=15.09. This would mean we would reject the null hypothesis under 5% but accept under 1%.

However, I thought using a smaller significance level is ‘more reliable’. So I am confused that a sample could exhibit less Normal qualities, i.e. a higher test statistic, and still pass a ‘more robust test’.

Thanks!

• Charles says:

Chris,
The smaller the value of alpha, the smaller the critical region, i.e. the region where the null hypothesis is rejected. This means that for a lower alpha value, it is less likely that the null hypothesis would be rejected. This is consistent with your example. At 5% the data is not consistent with a normal population, while for 1% the data is consistent (enough) with a normal population.
Perhaps another way of looking at this is that at 1% the acceptance region is larger than the acceptance region at 5%. Also, at 5% we can afford to be wrong 1 out of 20 times, while at 1% we can afford to be wrong only 1 out of 100 times.
Charles

26. Guero says:

Hello Charles,
Can we use the Chi-Squared test for normality when we have actual sample data ? I see the two cases you presented are:
1)When data is presented in terms of frequency tables
2)When we are testing against a specific pair (mean, st.dev)

Now , I have 120 sample data points. Can I test whether these points come from a normal population by calculating the sample mean and sample deviation ( S.E ?) and applying method 2?

• Charles says:

Guero,
Yes, but in this case I suggest that you use the Lilliefors version of the test since you will get more accurate answers. See the webpage
Lilliefors Test
In general, I find that the Shapiro Wilk test for normality is more accurate than the chi-square approach. See the following webpage for information about Shapiro-Wilk
Charles

• Guero says:

Say we have a multilinear regression:
Y ~a1X1+a2X2+a3X3

We want the residuals (Y- (a1X1+a2X2+a3X3)) ,
to be normally distributed. If they are,

does it follow that the residuals Y|X1, Y|X2, Y|X3 ; Y|Xi
means Y restricted to Xi

( i.e., we regress Y against X1, holding X2=X3=0) are also
normally -distributed?

I guess this is equivalent to asking whether the residuals
for Y~a1X1+a2X2+a3X3 are jointly normal?

Hope I didn’t make this confusing and thanks again.
Guero.

27. Tanmay says:

Hi Charles,
I have data of students with age , gender , IQ scores and thumb size. I want to test normality can you guide how to proceed.

• Charles says:

Tanmay,
I suggest that you use the Shapiro-Wilk test and the QQ plots. These are described on the Real Statistics website.
Charles

28. nur says:

Hello, i wanna ask, if the data is normally distributed which means it is parametric, can i use chi square test, which actually for the parametric test?
Thank you.

• Charles says:

Nur,
It depends on what you want to use the chi square test for.
Charles

29. David says:

Now, I have 4 columns (4 categories), and each category has 10 points. Can we use the chi-square test for normality, and how can I do it (using only the chi-square test)? Since my lecturer only taught the chi-square test, I cannot apply another method such as the Lilliefors test.

• Charles says:

David,
Sorry, but I don’t understand the first sentence of your comment.
Note that Lilliefors test is the same as the Chi-square Goodness of Fit test using a different table of critical values. You should use Lilliefors test when you are estimating the mean and standard deviation from the data and the Chi-square test when the mean and standard deviation are known.
Charles

• David says:

For example, I have a biostatistics problem like this:

A scientist determined the effectiveness of segmental wire fixation in athletes with spondylolysis. Between 1993 and 2000, 20 athletes (6 women and 14 men) with lumbar spondylolysis were treated surgically with the technique. The following table gives the Japanese Orthopedics Association evaluation score for lower back pain syndrome for men and women prior to the surgery. The lower score indicates less pain.

Gender JOA scores
Female 14,13,24,21,20,21
Male 21,26,24,24,22,23,18,24,13,22,25,23,21,25

Give conclusion for the evaluation of the segmental wire fixation treatment between male and female?

So, this is the question. To solve this problem, I have to do 3 steps:
– test the variance (F -test)
– Normality test (Chi- square distribution) to determine the population is normally distributed or not.
– After using the normality test and depending on the condition’s question to apply ANOVA or kinds of non-parametric test.

I get stuck in question 2. Can I gather all data points in one group and use chi square test to find the population is normally distributed or not ?

• Charles says:

David,
Each group should be tested for normality. I suggest that you use the Shapiro-Wilk test instead of the chi-square test. If you use the chi-square test, I suggest that you use the Lilliefors version of the test.
Charles

30. Moin says:

Hi,
i have financial data for 80 firms for 10 years,2007-16,,,,with 3 explanatory and 1 moderating variable.
1- how can i check the normality of my data?
2- my R2 is very low (0.04); I even increased the sample size from 60 firms to 80, but the result is still the same, while the p-value is less than 0.01.

• Charles says:

Moin,
1. I suggest that you use the Shapiro-Wilk test to check for normality.
2. It may be that the data is not a good fit for the regression that you are conducting. If you are conducting multiple linear regression, then you should draw scatter plots (e.g. each independent variable vs. dependent variable). If these don’t look linear, then you may have a problem.
Charles

31. Martin says:

Hello Charles,
Could you please provide some insights or point to reference work that would explain why midpoints are used in Example 2? I understand we use them because the population’s mean and stdev are unknown, but I’d like to understand the mathematical intuition behind this. Thanks very much for your time and your awesome contribution to statistics on the WWW!

• Charles says:

Martin,
You need to pick some value and the midpoint seems a reasonable choice. If the data is heavily skewed, you might actually pick a different, more representative point.
I am pleased that you appreciate my contribution to statistics on the web. I am trying to do my part.
Charles