The one-sample hypothesis test described in Hypothesis Testing using the Central Limit Theorem, which is based on the normal distribution, is appropriate when the standard deviation of the population is known and either the population is normally distributed or the sample is large enough for the Central Limit Theorem to apply.

The problem is that the standard deviation of the population is generally not known. One approach for addressing this is to use the standard deviation *s* of the sample as an approximation for the standard deviation *σ* for the population. In fact, as is described below, such an approach is possible using the *t* distribution.

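**Definition 1**: The (**Student’s**) ***t* distribution** with *k* **degrees of freedom**, abbreviated *T*(*k*), has probability density function given by

$$f(x) = \frac{\Gamma\left(\frac{k+1}{2}\right)}{\sqrt{k\pi}\,\Gamma\left(\frac{k}{2}\right)} \left(1 + \frac{x^2}{k}\right)^{-\frac{k+1}{2}}$$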

**Observations**: Key statistical properties of the *t* distribution are:

- Mean = 0 for *k* > 1
- Median = 0
- Mode = 0
- Range = (-∞, ∞)
- Variance = *k* ⁄ (*k* – 2) for *k* > 2
- Skewness = 0 for *k* > 3
- Kurtosis (excess) = 6 ⁄ (*k* – 4) for *k* > 4
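The properties listed above can be checked numerically. The following is a minimal sketch, assuming the standard closed form of the *t* density; the helper names `t_pdf` and `integrate` are illustrative and not part of Excel or the Real Statistics Resource. It integrates the density with Simpson's rule to confirm that the total area is 1 and that the variance is *k* ⁄ (*k* – 2).

```python
import math

def t_pdf(x, k):
    """Density of the t distribution with k > 0 degrees of freedom."""
    # Work on the log scale so that large k does not overflow the gamma function.
    log_norm = (math.lgamma((k + 1) / 2) - math.lgamma(k / 2)
                - 0.5 * math.log(k * math.pi))
    return math.exp(log_norm - (k + 1) / 2 * math.log1p(x * x / k))

def integrate(f, a, b, n=40_000):
    """Composite Simpson's rule on [a, b]; n must be even."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

k = 10  # illustrative choice; the variance formula needs k > 2
# The interval [-200, 200] is an assumption: the tails beyond it are negligible for k = 10.
area = integrate(lambda x: t_pdf(x, k), -200, 200)
variance = integrate(lambda x: x * x * t_pdf(x, k), -200, 200)
print("area     =", area)
print("variance =", variance, "vs k/(k-2) =", k / (k - 2))
```

The truncated integration range works here because the tails of *T*(10) decay quickly. For very small *k* the tails are much heavier and a wider range would be needed.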

The overall shape of the probability density function of the *t* distribution resembles the bell shape of a normally distributed variable with mean 0 and variance 1, except that it is a bit lower and wider. As the number of degrees of freedom grows, the *t* distribution approaches the standard normal distribution, and in fact the approximation is quite close for *k* ≥ 30.
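As a rough illustration of this convergence, the sketch below measures the largest gap between the *t* density and the standard normal density on a grid over [-5, 5]. The grid, the chosen values of *k*, and the helper names are assumptions made for the example.

```python
import math

def t_pdf(x, k):
    """Density of the t distribution with k degrees of freedom."""
    log_norm = (math.lgamma((k + 1) / 2) - math.lgamma(k / 2)
                - 0.5 * math.log(k * math.pi))
    return math.exp(log_norm - (k + 1) / 2 * math.log1p(x * x / k))

def normal_pdf(x):
    """Density of the standard normal distribution N(0, 1)."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

def max_gap(k, lo=-5.0, hi=5.0, steps=1000):
    """Largest absolute difference between the two densities on an even grid."""
    xs = (lo + i * (hi - lo) / steps for i in range(steps + 1))
    return max(abs(t_pdf(x, k) - normal_pdf(x)) for x in xs)

for k in (1, 5, 30, 100):
    print(f"k = {k:3d}  max |t - normal| = {max_gap(k):.5f}")
```

The gap shrinks steadily as *k* grows, which matches the claim that the two curves are already close at around 30 degrees of freedom.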

**Figure 1 – Chart of t distribution by degrees of freedom**

**Theorem 1**: If *x* has normal distribution *N*(*μ, σ*), then for samples of size *n* with sample mean *x̄* and sample standard deviation *s*, the random variable

$$t = \frac{\bar{x} - \mu}{s / \sqrt{n}}$$

has distribution *T*(*n* – 1).

Click here for a proof of Theorem 1.

**Corollary 1**: For samples of sufficiently large size *n*, the random variable

$$t = \frac{\bar{x} - \mu}{s / \sqrt{n}}$$

has approximately the distribution *T*(*n* – 1).

Proof: This follows from the theorem by the Central Limit Theorem.

**Observation**: The test statistic in the theorem and corollary is the same as

$$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$$

from the Central Limit Theorem, with the population standard deviation *σ* replaced by the sample standard deviation *s*. What makes this useful is that usually the standard deviation of the population is unknown while the standard deviation of the sample is known.
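To make the statistic concrete, here is a small sketch that computes it from a sample using only the Python standard library. The function name `t_statistic`, the data, and the hypothesized mean are all made up for illustration.

```python
import math
import statistics

def t_statistic(sample, mu0):
    """Return (t, df) for testing whether the population mean equals mu0."""
    n = len(sample)
    if n < 2:
        raise ValueError("at least two observations are needed to estimate s")
    mean = statistics.mean(sample)
    s = statistics.stdev(sample)  # sample standard deviation (divides by n - 1)
    t = (mean - mu0) / (s / math.sqrt(n))
    return t, n - 1

# Hypothetical data and hypothesized mean, chosen only for the example.
sample = [1, 2, 3, 4, 5]
t, df = t_statistic(sample, mu0=2)
print("t =", t, "with", df, "degrees of freedom")
```

Note that `statistics.stdev` uses the *n* – 1 divisor, which is the sample standard deviation *s* that the statistic calls for, whereas `statistics.pstdev` would give the population version.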

**Excel Functions**: Excel provides the following functions regarding the *t* distribution:

**TDIST**(*x, df*, *tails*) = the right tail at *x* of the Student’s *t* cumulative probability distribution function with *df* degrees of freedom when *tails* = 1 (for a one-tailed test). When *tails* = 2 (for a two-tailed test), TDIST(*x, df,* *tails*) is the sum of the right and left tails.

Since the *t* distribution is symmetric about *x* = 0, TDIST(*x, df*, 2) is simply 2 * TDIST(*x*, *df*, 1). Also note that *x* must be non-negative, but since the *t* distribution is symmetric about *x* = 0, the left tail when *x* < 0 is TDIST(*-x, df*, *tails*). Thus we can use the formula TDIST(ABS(x), *df*, *tails*) for any *x*. The cumulative probability distribution function is given by 1 – TDIST(*x, df*, 1) when *x* ≥ 0 and by TDIST(*-x, df*, 1) when *x* < 0.
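The relationships just described can be sketched outside Excel. The code below is an illustrative approximation, not Excel's implementation: `tdist_like` and `t_cdf` are hypothetical names, and the tail area is obtained by integrating the standard *t* density with Simpson's rule.

```python
import math

def t_pdf(x, df):
    """Density of the t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))

def area_0_to(x, df):
    """Area under the density between 0 and x >= 0 (composite Simpson's rule)."""
    n = 2 * max(1000, int(100 * x))  # even number of subintervals
    h = x / n
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return total * h / 3

def tdist_like(x, df, tails):
    """Right tail (tails=1) or both tails (tails=2) at x >= 0, like TDIST."""
    if x < 0:
        raise ValueError("x must be non-negative, as with TDIST")
    if tails not in (1, 2):
        raise ValueError("tails must be 1 or 2")
    right_tail = 0.5 - area_0_to(x, df)  # by symmetry, half the area lies right of 0
    return tails * right_tail

def t_cdf(x, df):
    """Cumulative distribution built from the one-tailed value, for any x."""
    if x >= 0:
        return 1 - tdist_like(x, df, 1)
    return tdist_like(-x, df, 1)

print(tdist_like(2.0, 10, 1), tdist_like(2.0, 10, 2))
print(t_cdf(-2.0, 10), t_cdf(2.0, 10))
```

The same pattern as TDIST(ABS(x), *df*, *tails*) applies here: calling `tdist_like(abs(x), df, tails)` works for any *x*.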

**TINV**(*p, df*) = *x* such that TDIST(*x, df*, 2) = *p*; i.e. TINV is the inverse of TDIST in the two-tailed case. For the one-tailed case simply double *p*; i.e. TINV(2\**p*, *df*) = *x* such that TDIST(*x, df*, 1) = *p*.
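One way to picture this inverse relationship is to search for the *x* whose two-tailed probability equals *p*. The sketch below does this by bisection. It is an assumption-laden illustration (the names `two_tail` and `tinv_like` are hypothetical, and Excel's own algorithm is not documented here), and it is intended for moderate values of *p*, since very small *p* forces a wide and slow integration range.

```python
import math

def t_pdf(x, df):
    """Density of the t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))

def two_tail(x, df):
    """Sum of the left and right tails at x >= 0 (Simpson's rule on [0, x])."""
    n = 2 * max(1000, int(100 * x))  # even number of subintervals
    h = x / n
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 2 * (0.5 - total * h / 3)

def tinv_like(p, df, tol=1e-10):
    """Find x >= 0 with two_tail(x, df) = p, mirroring the role of TINV."""
    if not 0 < p <= 1:
        raise ValueError("p must lie in (0, 1]")
    lo, hi = 0.0, 1.0
    while two_tail(hi, df) > p:      # widen the bracket until it contains x
        lo, hi = hi, 2 * hi
    while hi - lo > tol:             # two_tail decreases in x, so bisect
        mid = (lo + hi) / 2
        if two_tail(mid, df) > p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

alpha = 0.05
print("two-tailed critical value:", tinv_like(alpha, 10))
print("one-tailed critical value:", tinv_like(2 * alpha, 10))  # double p, as with TINV
```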

With Excel 2010/2013/2016 there are a number of new functions (**T.DIST, T.INV, T.DIST.RT, T.INV.RT** and **T.INV.2T**) that provide equivalent functionality to TDIST and TINV, but whose syntax is more consistent with other distribution functions. These functions are described in Built-in Statistical Functions.

**Real Statistics Function**: In all these Excel functions that support the t distribution, the value of *df* is rounded down to the next lower integer. Thus, *df* = 3.7 is treated the same as *df* = 3. Furthermore, for versions of Excel prior to Excel 2010 there is no function equivalent to **T.DIST**(*x,* *df*, FALSE), i.e. there is no function that provides the pdf for the t distribution.

To address these issues, the Real Statistics Resource provides the following functions:

**T_DIST**(*x, df, cum*), **T_DIST_RT**(*x, df*), **T_DIST_2T**(*x, df*)

**T_INV**(*p, df*), **T_INV_2T**(*p, df*)

Except for the fact that the *df* is not rounded, these functions are identical to their standard Excel counterparts.
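The effect of rounding *df* down can be seen with a small sketch. This is not the Real Statistics implementation: it simply evaluates the standard *t* density, which is well defined for non-integer degrees of freedom, and mimics the truncation described above with `math.floor`. The value *df* = 3.7 is taken from the text and *x* = 1 is an arbitrary choice.

```python
import math

def t_pdf(x, df):
    """Density of the t distribution; df may be any positive real number."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))

x, df = 1.0, 3.7
unrounded = t_pdf(x, df)                 # fractional df used as given
truncated = t_pdf(x, math.floor(df))     # df = 3.7 treated as df = 3
print("df = 3.7 kept:     ", unrounded)
print("df = 3.7 truncated:", truncated)
```

The two results differ slightly, which is the discrepancy that unrounded degrees of freedom avoid.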

Hey Charles, good job. I have a doubt in my project: I need to run a t test for each of my samples. I need it for the final analysis of my project, but I can’t work it out. Can you help me, please?

Gopi,

How to proceed depends on the specific t test that you need to perform. You can select the required test from the webpage http://www.real-statistics.com/students-t-distribution/. E.g., for the one-sample t test, see http://www.real-statistics.com/students-t-distribution/one-sample-t-test/.

Charles

Typo:

What makes this useful is that usually the standard deviation of the population is unknown while the standard devastation (sic) of the sample is known.

Peter,

Thanks for identifying this “devastating” typo. I have now corrected the webpage, although I am devastated that I didn’t catch it sooner. All joking aside, thanks for your help.

Charles

Hello Charles,

I’m having problems calculating my degrees of freedom, as I have to work out VaR using the t-distribution in Excel. I tried to rearrange the formula to d=2v/v-1, but I am getting -0.000314724, which doesn’t seem right. I got a #NUM! error when using the T.INV function.

If you could figure this problem out, please help!

Thank you

Heta

Hello Heta,

Sorry, but I don’t understand the situation you are describing. T.INV will produce an error if the df is less than 1.

If you send me an Excel file with your data and calculation I will try to figure out what is going on. You can get my email address at Contact Us.

Charles

Hi Sir Charles, my professor told me that when running a correlation in Excel there must be three tables; I think one of them is a t-test table. He told me that the final table shows which variable has the most significant correlation. Do you happen to know how to make these three tables?

Gilbert,

Sorry, but I don’t know what three tables he/she is referring to.

Charles

Hello Charles,

What is the meaning of the T.DIST results? For example, if I get 0.011489364 for T.DIST, what does this number imply?

This is the value of the *t* distribution at *x*. If the cum argument is TRUE, then this is the cumulative distribution value, i.e. the left-tail probability (often used to obtain the p-value), while if it is FALSE, then this is the probability density value.

Charles