**Definition 1**: The **chi-square distribution** with *k* **degrees of freedom**, abbreviated χ^{2}(*k*), has probability density function

$$f(x) = \frac{x^{k/2-1}\, e^{-x/2}}{2^{k/2}\,\Gamma(k/2)}, \qquad x \ge 0$$

*k* does not have to be an integer and can be any positive real number.
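
As an illustration, the density of the chi-square distribution can be evaluated directly with Python's standard library. This is only a sketch: the function names `chi2_pdf` and `integrate_pdf` are made up for this example, and the check simply confirms numerically that the density integrates to about 1, including for a non-integer *k*.

```python
import math

def chi2_pdf(x, k):
    """Density of the chi-square distribution with k > 0 degrees of freedom."""
    if x < 0:
        return 0.0
    if x == 0:
        # Limit at the origin: infinite for k < 2, 1/2 for k = 2, otherwise 0.
        return math.inf if k < 2 else (0.5 if k == 2 else 0.0)
    # Work on the log scale so that a large k does not overflow the gamma function.
    log_f = ((k / 2 - 1) * math.log(x) - x / 2
             - (k / 2) * math.log(2) - math.lgamma(k / 2))
    return math.exp(log_f)

def integrate_pdf(k, upper=100.0, steps=100_000):
    """Midpoint-rule integral of the density over [0, upper]."""
    h = upper / steps
    return sum(chi2_pdf((i + 0.5) * h, k) for i in range(steps)) * h

print(chi2_pdf(2.0, 2))    # for k = 2 the density reduces to 0.5 * exp(-x/2)
print(integrate_pdf(5))    # should be close to 1
print(integrate_pdf(7.5))  # a non-integer k works as well
```

The upper integration limit of 100 is an arbitrary cut-off that is adequate for small *k*; it would need to grow for much larger degrees of freedom.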

More technical details about the chi-square distribution, including proofs of some of the propositions described below, are available on a separate page. Except for the proof of Corollary 2, knowledge of calculus is required.

**Observation**: The chi-square distribution is the gamma distribution where *α* = *k*/2 and *β* = 2.

**Property 1**: The χ^{2}(*k*) distribution has mean *k* and variance 2*k*.

**Observation**: The key statistical properties of the chi-square distribution are:

- Mean = *k*
- Median ≈ *k* – 2/3 for large *k*
- Mode = *k* – 2 for *k* > 2
- Range = [0, ∞)
- Variance = 2*k*
- Skewness = √(8/*k*)
- Kurtosis (excess) = 12/*k*

The following are the graphs of the pdf with degrees of freedom *df* = 5 and 10. As *df* grows larger the fat part of the curve shifts to the right and becomes more like the graph of a normal distribution.

**Figure 1 – Chart of chi-square distributions**

**Theorem 1**: Suppose *x* has standard normal distribution *N*(0, 1) and let *x*_{1}, …, *x*_{k} be *k* independent sample values of *x*. Then the random variable

$$\sum_{i=1}^{k} x_i^2$$

has a chi-square distribution χ^{2}(*k*).
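
Theorem 1 and Property 1 can be checked informally by simulation: if the sum of squares of *k* independent standard normal values follows χ^{2}(*k*), then over many repetitions its average should be near *k* and its variance near 2*k*. The sketch below assumes *k* = 5 and 20,000 repetitions; both values, like the helper name `sum_of_squares`, are arbitrary choices for illustration.

```python
import random
import statistics

random.seed(0)  # fixed seed so that the run is repeatable

def sum_of_squares(k):
    """One draw of x_1^2 + ... + x_k^2 with each x_i ~ N(0, 1)."""
    return sum(random.gauss(0.0, 1.0) ** 2 for _ in range(k))

k = 5
draws = [sum_of_squares(k) for _ in range(20_000)]

sim_mean = statistics.mean(draws)
sim_var = statistics.variance(draws)

print(f"simulated mean     = {sim_mean:.3f} (theory: {k})")
print(f"simulated variance = {sim_var:.3f} (theory: {2 * k})")
```

A simulation like this is not a proof, only a sanity check that the stated mean and variance are plausible.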

**Corollary 1**:

- If *x* has distribution *N*(0, 1) then *x*^{2} has distribution χ^{2}(1)
- If *x* ~ *N*(*μ, σ*) and *z* = (*x* – *μ*)/*σ* then over repeated samples *z*^{2} has distribution χ^{2}(1)
- If *x*_{1}, …, *x*_{k} are independent observations from a normal population with distribution *N*(*μ, σ*) and for each *i*, *z*_{i} = (*x*_{i} – *μ*)/*σ*, then the following random variable has a χ^{2}(*k*) distribution:

  $$\sum_{i=1}^{k} z_i^2$$

**Property 2**: If *x* and *y* are independent and *x* has distribution χ^{2}(*m*) and *y* has distribution χ^{2}(*n*), then *x* + *y* has distribution χ^{2}(*m + n*).

**Theorem 2**: If *x* is drawn from a normally distributed population *N*(*μ, σ*), then for samples of size *n* the sample variance *s*^{2} has distribution

$$s^2 \sim \frac{\sigma^2}{n-1}\,\chi^2(n-1)$$

**Corollary 2**: *s*^{2} is an unbiased, consistent estimator of the population variance

**Corollary 3**: If *x* is drawn from a normally distributed population *N*(*μ, σ*), then for samples of size *n* the random variable

$$\frac{(n-1)\,s^2}{\sigma^2}$$

has a χ^{2}(*n*–1) distribution.

**Property 3**: The mean of the sample variance *s*^{2} is *σ*^{2} and the variance is

$$\frac{2\sigma^4}{n-1}$$

Proof: This can be seen from the proof of Corollary 2.

**Excel Functions**: Excel provides the following functions:

**CHIDIST**(*x, df*) = the probability that the chi-square distribution with *df* degrees of freedom is ≥ *x*; i.e. 1 – *F*(*x*) where *F* is the cumulative chi-square distribution function.

**CHIINV**(*α, df*) = the value *x* such that CHIDIST(*x, df*) = *α*; i.e. the value *x* such that the right tail of the chi-square distribution with area *α* occurs at *x*. This means that *F*(*x*) = 1 – *α*, where *F* is the cumulative chi-square distribution function.

With Excel 2010/2013 there are a number of new functions (**CHISQ.DIST**, **CHISQ.INV**, **CHISQ.DIST.RT** and **CHISQ.INV.RT**) that provide equivalent functionality to CHIDIST and CHIINV, but whose syntax is more consistent with other distribution functions. These functions are described in Built-in Statistical Functions.

In Excel 2010 CHISQ.DIST(*x, df*, TRUE) is the cumulative distribution function for the chi-square distribution with *df* degrees of freedom, i.e. 1 – CHIDIST(*x, df*), and CHISQ.DIST(*x*, *df*, FALSE) is the pdf for the chi-square distribution.
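
To make the relationship between the cumulative function, the right tail, and the right-tail inverse concrete, here is a rough Python sketch that uses only the standard library. The function names are hypothetical and are not Excel's implementation. The cumulative function uses a series for the regularized lower incomplete gamma function (suitable for moderate *x* only), and the inverse is found by simple bisection.

```python
import math

def chi2_cdf(x, df):
    """F(x) for chi-square(df): series for the regularized lower incomplete gamma."""
    if x <= 0:
        return 0.0
    a, y = df / 2, x / 2
    term = 1.0 / a
    total = term
    n = 0
    # Sum y^n / (a (a+1) ... (a+n)) until the terms become negligible.
    while term > 1e-16 * total and n < 10_000:
        n += 1
        term *= y / (a + n)
        total += term
    return min(1.0, total * math.exp(-y + a * math.log(y) - math.lgamma(a)))

def chi2_right_tail(x, df):
    """Right-tail probability 1 - F(x), the quantity CHIDIST is described as returning."""
    return 1.0 - chi2_cdf(x, df)

def chi2_inv_right_tail(alpha, df):
    """Value x whose right-tail area is alpha, found by bisection."""
    lo, hi = 0.0, df + 10.0
    while chi2_right_tail(hi, df) > alpha:  # widen the bracket if needed
        hi *= 2
    for _ in range(100):
        mid = (lo + hi) / 2
        if chi2_right_tail(mid, df) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

x_crit = chi2_inv_right_tail(0.05, 16)
print(x_crit)                       # critical value for alpha = 0.05, df = 16
print(chi2_right_tail(x_crit, 16))  # right tail at that value, close to 0.05
print(chi2_cdf(x_crit, 16))         # cumulative value, close to 0.95
print(chi2_cdf(3.0, 2.5))           # non-integer df is handled the same way
```

The series converges slowly for very large *x*, so this sketch is meant for checking identities rather than for computing extreme tail probabilities.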

**Real Statistics Functions**: The Real Statistics Resource Pack provides the following functions.

**CHISQ_DIST**(*x, df, cum*) = GAMMA.DIST(*x, df*/2, 2, *cum*) = GAMMADIST(*x, df*/2, 2, *cum*)

**CHISQ_INV**(*p, df*) = GAMMA.INV(*p, df*/2, 2) = GAMMAINV(*p, df*/2, 2)

These functions provide better estimates of the chi-square distribution when *df* is not an integer. The first function is also useful in providing an estimate of the pdf for versions of Excel prior to Excel 2010, where CHISQ.DIST(*x, df*, FALSE) is not available.

The Real Statistics Resource Pack also provides the following functions:

**CHISQ_DIST_RT**(*x, df*) = 1 – CHISQ_DIST(*x, df*, TRUE)

**CHISQ_INV_RT**(*p, df*) = CHISQ_INV(1 – *p*, *df*)

**Example 1**: Suppose we take samples of size 10 from a population with normal distribution *N*(0, 2). Find the mean and variance of the sampling distribution of *s*^{2}.
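
One way to work through Example 1 is to apply Property 3, under which *s*^{2} has mean *σ*^{2} and variance 2*σ*^{4}/(*n* – 1), and then compare with a simulation. The sketch below assumes that the 2 in *N*(0, 2) is the standard deviation, following the *N*(*μ, σ*) notation used above; if it were the variance instead, the numbers would change. The seed and the number of replicates are arbitrary.

```python
import random
import statistics

random.seed(1)  # fixed seed so that the run is repeatable

n = 10
sigma = 2.0  # assumption: the 2 in N(0, 2) is the standard deviation

# Values given by Property 3
theory_mean = sigma ** 2                 # sigma^2
theory_var = 2 * sigma ** 4 / (n - 1)    # 2 sigma^4 / (n - 1)

# Simulation: sample variance of many samples of size n
s2_values = [
    statistics.variance([random.gauss(0.0, sigma) for _ in range(n)])
    for _ in range(20_000)
]
sim_mean = statistics.mean(s2_values)
sim_var = statistics.variance(s2_values)

print(f"theory:     mean = {theory_mean:.4f}, variance = {theory_var:.4f}")
print(f"simulation: mean = {sim_mean:.4f}, variance = {sim_var:.4f}")
```

Under this assumption the theoretical mean is 4 and the theoretical variance is 32/9, or about 3.56; the simulated values should land close to these.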

Hello Charles,

I think there is a small typo in Theorem 1, the sum should be from i=1 to i=k I believe, not i=n.

Many thanks,

Fred

Fred,

Thanks for catching this typo. I have just made the correction. I appreciate your help in improving the website.

Charles

Hello Charles, I would like to ask for your help. I measured p-bodies in different cell lines at different times. I have groups for 0, 1, 2, 3 and more p-bodies. I have two replicates for each cell line. May I use the chi-square test to check whether there is any difference? And how should I handle the replicates? Is it possible to sum the p-bodies for each replicate?

Thanks for your response.

Vendula,

You haven’t provided enough information for me to give you a definitive answer, but it doesn’t sound like a fit for the chi-square test of independence.

Charles

Hi Charles,

When I run a Chi-Sq Test in real statistics I get the following output:

| | Chi Sq | p-value | X-Critical | Sig | Cramer V |
|---|---|---|---|---|---|
| Pearson’s | 623.097 | 2.9E-122 | 26.296 | Yes | 0.345099 |

Since X-Critical is less than Chi-Sq, it gives the result that the variables are associated. In this case the p-value is > 0.05, so I assumed it’s not significant. Do we not consider the p-value?

Kind regards

Shri

Shri,

chisq-crit < chisq is equivalent to p-value < alpha. If the result is significant using the first inequality it will be significant using the second inequality and vice versa.

Charles
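
As a small illustration of this equivalence with the numbers from the output above: 2.9E-122 is scientific notation for 2.9 × 10^{-122}, which is far below 0.05. The sketch assumes a significance level of 0.05, which is the level mentioned in the question.

```python
chi_sq = 623.097     # test statistic from the output above
p_value = 2.9e-122   # "2.9E-122" means 2.9 times 10 to the power -122
x_crit = 26.296      # critical value from the output above
alpha = 0.05         # assumed significance level

significant_by_critical_value = chi_sq > x_crit
significant_by_p_value = p_value < alpha

print(significant_by_critical_value, significant_by_p_value)
```

Both comparisons give the same verdict, as the reply states.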

Four dice were thrown 112 times and the number of times 1 or 3 or 5 was thrown were as follows:

| Number of dice throwing 1 or 3 or 5 | 0 | 1 | 2 | 3 | 4 |
|---|---|---|---|---|---|
| Frequency | 10 | 25 | 40 | 30 | 7 |

Find the value of chi-square presuming that all dice were fair.

Devi,

See the webpage Independence Testing

Charles
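
For readers who want to see the arithmetic, the statistic in the dice question can be sketched as follows. This assumes the usual approach for such a problem: with fair dice, each die shows 1, 3 or 5 with probability 1/2, so the number of such dice among four follows a binomial(4, 1/2) distribution, and the observed frequencies are compared with the expected ones via the sum of (observed – expected)^{2}/expected.

```python
from math import comb

observed = [10, 25, 40, 30, 7]   # frequencies for 0, 1, 2, 3, 4 dice showing 1, 3 or 5
n_throws = sum(observed)         # 112 throws of four dice
n_dice = 4
p = 0.5                          # fair die: P(1 or 3 or 5) = 3/6

# Expected frequencies under binomial(4, 1/2)
expected = [
    n_throws * comb(n_dice, j) * p ** j * (1 - p) ** (n_dice - j)
    for j in range(n_dice + 1)
]

chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

print(expected)           # [7.0, 28.0, 42.0, 28.0, 7.0]
print(round(chi_sq, 3))   # 1.845
```

With five categories and no estimated parameters there are 4 degrees of freedom, and 1.845 is well below the 5% critical value of about 9.488, so these data give no evidence against the dice being fair.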

Hi Charles,

This might be a silly question, but I want to be clear on something:

Even though the chi-square distribution is χ^{2}(*k*), *k* would actually demarcate the *x* that’s in the PDF, correct?

Hi Jonathan,

No, *k* in χ^{2}(*k*) is the degrees of freedom.

Charles

Okay, perfect.

Thank you.

How do you reproduce this chi-square graph? I mean, what are the x- and y-axes?

All the examples on the website are contained in Excel spreadsheets that you can download for free. For this example, please go to the webpage http://www.real-statistics.com/free-download/real-statistics-examples-workbook/ and download the Real Statistics Examples Part 1 file.

Charles

Hi sir,

I have 200 measurements of a random variable for which I have estimated the mean and sigma. Now I want to estimate the error bars on the standard deviation using the chi-square function. I don’t know how to do that. Can you please help me with this?

Thanks

Karan

Karan,

I presume that you want to create a chart which shows error bars related to the standard deviation. The website contains the following two references, which should help you do this:

1) One Sample Hypothesis Testing of the Variance (using Chi-square)

2) Special Excel Charting Capabilities – towards the end of the webpage with the heading: Chart of standard error of the means

Charles
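
As a rough illustration of how Corollary 3 leads to an interval for the standard deviation: since (*n* – 1)*s*^{2}/*σ*^{2} has a χ^{2}(*n* – 1) distribution for normal data, a confidence interval for *σ* follows from two chi-square quantiles. The sketch below makes several assumptions: the data are normally distributed, the sample standard deviation `s = 1.5` is a made-up value, and the quantiles come from the Wilson-Hilferty approximation (quite accurate at 199 degrees of freedom) because Python's standard library has no exact chi-square inverse.

```python
from math import sqrt
from statistics import NormalDist

def chi2_quantile_wh(p, df):
    """Approximate chi-square quantile (left-tail probability p), Wilson-Hilferty."""
    z = NormalDist().inv_cdf(p)
    return df * (1 - 2 / (9 * df) + z * sqrt(2 / (9 * df))) ** 3

n = 200       # number of measurements, as in the question
s = 1.5       # hypothetical sample standard deviation
alpha = 0.05  # assumed level, giving a 95% interval

df = n - 1
# (n-1) s^2 / sigma^2 ~ chi-square(n-1), so dividing by the upper quantile
# gives the lower limit for sigma^2 and vice versa.
lower = s * sqrt(df / chi2_quantile_wh(1 - alpha / 2, df))
upper = s * sqrt(df / chi2_quantile_wh(alpha / 2, df))

print(f"approximate 95% interval for sigma: ({lower:.4f}, {upper:.4f})")
```

The interval is not symmetric around `s`, because the chi-square distribution is skewed; the two limits could be used as the lower and upper error bars.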

Sir

Property 3: The mean of the sample variance *s*^{2} is *σ*^{2} and the variance is ?

I cannot see the formula; it seems something is wrong with the picture.

Colin,

The variance is 2 times sigma raised to the 4th power divided by n-1.

This formula is displayed using latex. I hope there isn’t a problem with latex displays. Can you see the formula in Corollary 3? It also uses latex.

Charles

Sir

Thank you for your reply! I found the result in the proof section. Some pictures on this website do not display. It may be due to my internet connection. But it doesn’t matter, your website is fabulous.