# Two Sample t Test: equal variances

We now consider an experimental design where we want to determine whether there is a difference between two groups within the population. For example, suppose we want to test the effectiveness of a new drug for treating cancer. One approach is to create a random sample of 40 people, half of whom take the drug and half of whom take a placebo. For this approach to give valid results, it is important that people be assigned to each group at random. Such samples are independent.

When the population variances are known, hypothesis testing can be done using a normal distribution, as described in Comparing Two Means when Variances are Known. But population variances are not usually known. The approach we use instead is to pool sample variances and use the t distribution.

We consider three cases where the t distribution is used:

• Equal variances
• Unequal variances
• Paired samples

We deal with the first of these cases in this section.

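Theorem 1: Let $\bar{x}$ and $\bar{y}$ be the sample means of two sets of data of size $n_x$ and $n_y$ respectively. If x and y are normal, or $n_x$ and $n_y$ are sufficiently large for the Central Limit Theorem to hold, and x and y have the same variance, then the random variable

$$t = \frac{(\bar{x} - \bar{y}) - (\mu_x - \mu_y)}{s\sqrt{\frac{1}{n_x} + \frac{1}{n_y}}}$$

has distribution $T(n_x + n_y - 2)$ where

$$s^2 = \frac{(n_x - 1)s_x^2 + (n_y - 1)s_y^2}{(n_x - 1) + (n_y - 1)}$$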

Observation: $s^2$, as defined above, can be viewed as a way to pool $s_x^2$ and $s_y^2$, and so $s^2$ is referred to as the pooled variance. Also note that the degrees of freedom of t is the value of the denominator of $s^2$ in the formula given in Theorem 1.
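As a rough illustration of Theorem 1, the following Python sketch computes the pooled variance (the average of the two sample variances weighted by $n_x - 1$ and $n_y - 1$), the degrees of freedom and the t statistic. The function name `pooled_t` and the sample data are hypothetical and are not taken from this page.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(x, y, hypothesized_diff=0.0):
    """Return (t, df, pooled_variance) for two independent samples.

    Assumes both populations share the same variance.
    """
    nx, ny = len(x), len(y)
    df = (nx - 1) + (ny - 1)
    # Pooled variance: sample variances weighted by their degrees of freedom
    s2 = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / df
    # Standard error of the difference between the two sample means
    std_err = sqrt(s2 * (1 / nx + 1 / ny))
    t = (mean(x) - mean(y) - hypothesized_diff) / std_err
    return t, df, s2


# Made-up data for illustration only
x = [12, 15, 11, 18, 14]
y = [10, 13, 9, 12, 11]
t, df, s2 = pooled_t(x, y)
print(t, df, s2)
```

Note that the degrees of freedom returned are exactly the denominator used when pooling the variances.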

Click here for a proof of Theorem 1.

Real Statistics Excel Functions: The following supplemental functions are provided in the Real Statistics Resource Pack.

VAR_POOLED(R1, R2) = pooled variance of the samples defined by ranges R1 and R2, i.e. $s^2$ of Theorem 1

STDEV_POOLED(R1, R2) = pooled standard deviation of the samples defined by ranges R1 and R2, i.e. s of Theorem 1

STDERR_POOLED(R1, R2, b) = pooled standard error of the samples defined by ranges R1 and R2. This is equal to the denominator of t in Theorem 1 if b = TRUE (default) and equal to the denominator of t in Theorem 1 of Two Sample t Test with Unequal Variances if b = FALSE. When the sample sizes are equal, b = TRUE or b = FALSE yields the same result.

Observation: Each of these functions ignores all empty and non-numeric cells.
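For readers working outside Excel, the behavior described above can be sketched in Python. These are hypothetical analogues written from the descriptions on this page, not the Real Statistics implementation; in particular, treating booleans as non-numeric is an assumption.

```python
from math import sqrt
from statistics import variance


def _numeric(cells):
    # Keep only real numbers; skip None, strings and (by assumption) booleans
    return [c for c in cells
            if isinstance(c, (int, float)) and not isinstance(c, bool)]


def var_pooled(r1, r2):
    """Pooled variance of two samples (s^2 of the equal-variances theorem)."""
    x, y = _numeric(r1), _numeric(r2)
    return ((len(x) - 1) * variance(x) + (len(y) - 1) * variance(y)) / (len(x) + len(y) - 2)


def stdev_pooled(r1, r2):
    """Pooled standard deviation."""
    return sqrt(var_pooled(r1, r2))


def stderr_pooled(r1, r2, b=True):
    """Standard error of the difference in means.

    b=True uses the pooled variance (equal variances);
    b=False uses the separate sample variances (unequal variances).
    """
    x, y = _numeric(r1), _numeric(r2)
    if b:
        return sqrt(var_pooled(x, y) * (1 / len(x) + 1 / len(y)))
    return sqrt(variance(x) / len(x) + variance(y) / len(y))


# Made-up ranges containing an empty cell and a text cell
r1 = [12, 15, None, 11, "n/a", 18, 14]
r2 = [10, 13, 9, 12, 11]
print(var_pooled(r1, r2), stderr_pooled(r1, r2, True), stderr_pooled(r1, r2, False))
```

Because the two made-up samples have the same number of numeric entries, both settings of `b` give the same standard error, as stated above.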

Example 1: A marketing research firm tests the effectiveness of a new flavoring for a leading beverage using a sample of 20 people, half of whom taste the beverage with the old flavoring and half of whom taste the beverage with the new flavoring. The people in the study are then given a questionnaire which evaluates how enjoyable the beverage was. The scores are as in Figure 1. Determine whether there is a significant difference between the perception of the two flavorings.

Figure 1 – Data and box plot for Example 1

As we can see from the box plot in Figure 1 the data in each sample is reasonably symmetric and so we use the t test with the following null hypothesis:

H0: μ1 − μ2 = 0; i.e. there is no difference between the two flavorings

Since the sample variances are similar we decide that the population variances are also likely to be similar and so apply Theorem 1.

And so s = $\sqrt{16.05}$ = 4.01. Now,

Since p-value = TDIST(t, df, 2) = TDIST(2.18, 18, 2) = .043 < .05 = α (the third argument requests the two-tailed probability), we reject the null hypothesis, concluding that there is a significant difference between the two flavorings. In fact, the new flavoring is significantly more enjoyable.
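The two-tailed p-value is the area under the t density lying beyond ±t. As a sanity check that does not rely on Excel, the sketch below integrates the t density numerically using only the Python standard library. The function names are hypothetical, and Simpson's rule is just one convenient way to approximate the integral.

```python
from math import gamma, pi, sqrt


def t_pdf(x, df):
    """Density of the t distribution with df degrees of freedom."""
    c = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def two_tailed_p(t, df, steps=2000):
    """Two-tailed p-value via Simpson's rule on [0, |t|]; steps must be even."""
    a = abs(t)
    h = a / steps
    total = t_pdf(0, df) + t_pdf(a, df)
    for i in range(1, steps):
        # Simpson weights: 4 for odd interior points, 2 for even ones
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3          # area between 0 and |t|
    return 1 - 2 * area           # remaining area in both tails


# Values quoted in Example 1: t = 2.18 with 18 degrees of freedom
p = two_tailed_p(2.18, 18)
print(round(p, 3))
```

The result should come out close to the .043 reported above; small differences are expected because t = 2.18 is itself a rounded value.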

The same result can be obtained by use of Excel’s Two-Sample Assuming Equal Variances data analysis tool, the results of which are as follows.

Figure 2 – Output from Excel’s data analysis tool

Observation: The Real Statistics Resource Pack also provides a data analysis tool which supports the two independent sample t test, but provides additional information not found in the standard Excel data analysis tool. Example 3 in Two Sample t Test: Unequal Variances gives an example of how to use this data analysis tool.

Example 2: To investigate the effect of a new hay fever drug on driving skills, a researcher studies 24 individuals with hay fever: 12 who have been taking the drug and 12 who have not. All participants then entered a simulator and were given a driving test which assigned a score to each driver as summarized in Figure 3.

Figure 3 – Sample data and histograms for Example 2

As in the previous example, we plan to use the t-test, but with a sample this small we first need to check to see that the data is normally distributed (or at least symmetric). This can be seen from the histograms. Also the variances are relatively similar (15.18 and 17.88) and so we can again use the t-Test: Two-Sample Assuming Equal Variances data analysis tool to test the following null hypothesis:

H0: μcontrol = μdrug

Figure 4 – Two sample data analysis results

Since tobs = .10 < 2.07 = tcrit (or p-value = .921 > .05 = α) we retain the null hypothesis; i.e. at the 5% significance level the data provide no evidence of a difference between the two groups.

Observation: The t-test is quite robust even when the underlying distributions are not normal provided the sample size is sufficiently large (usually over 25 or 30). The t-test can be valid even with smaller sample sizes, provided the samples have similar shape and are not too skewed.

### Effect size

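The Cohen effect size d can be calculated as in One Sample t Test, namely:

$$d = \frac{|\mu_x - \mu_y|}{\sigma}$$

This is approximated by

$$d = \frac{|\bar{x} - \bar{y}|}{s}$$

where s is the pooled standard deviation of Theorem 1.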

Example 3: Find the effect size for the study in Example 2.

This means that the control group has a driving score 4.1% of a standard deviation more than the group that is taking the hay fever medication.
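The effect size can also be recovered from the reported test statistic. When the hypothesized difference is zero, the t statistic is the difference in sample means divided by s·sqrt(1/nx + 1/ny), while d is the same difference divided by s, so d = |t|·sqrt(1/nx + 1/ny). The sketch below applies this identity to the values quoted in Example 2 (t = .10 with 12 observations per group); the function names are hypothetical.

```python
from math import sqrt


def cohen_d(mean_x, mean_y, pooled_sd):
    """Cohen's d estimated from sample means and the pooled standard deviation."""
    return abs(mean_x - mean_y) / pooled_sd


def cohen_d_from_t(t, nx, ny):
    """Cohen's d recovered from an equal-variances t statistic (null difference 0)."""
    return abs(t) * sqrt(1 / nx + 1 / ny)


# Values quoted in Example 2: t = .10, 12 drivers in each group
d = cohen_d_from_t(0.10, 12, 12)
print(round(d, 3))
```

Since t = .10 is a rounded value, the result only approximately matches the 4.1% of a standard deviation stated above.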

### 57 Responses to Two Sample t Test: equal variances

1. Jo G. says:

I am using a t test to compare before and after weights using a diet. n=78. In excel, I am not sure whether to use 1, 2, or 3, under type in the formula box. Can you explain clearly the differences? Thanks.

• Charles says:

Jo,

You use type = 1 when the two samples are not independent. E.g. (1) when the first sample contains men and the second contains their wives or, as in your case (2) the first sample contains each person’s weight before dieting while the second sample contains their weight after dieting. In example (2) the same person is being sampled and so the samples can’t be independent.

Types 2 and 3 are used when the two samples are independent. E.g. 20 people are selected at random and half are randomly put in group 1 and half are randomly put in group 2. The difference between type 2 and type 3 relates to the variances of the populations from which the samples are drawn. If the variances are equal then use type = 2, while if the variances are unequal use type = 3. In reality the variances don’t have to be identical to use type = 2. Even if they are only close you will usually get good results. You usually judge two populations to have equal variances if the two samples have variances that are not too different; in fact even if one sample has a variance which is 4 times the other, the results will be pretty good even if you use type = 2.

For your situation, it looks like you want to use type = 1.

Charles

2. marzieh says:

Dear Charles
First, many thanks for sharing such valuable knowledge. I have a question regarding this page. In Example 1, while I understand that the null hypothesis is rejected because the two-tailed p-value is lower than 0.05, I cannot understand the sentence that you added at the end of the example: “In fact, the new flavoring is significantly more enjoyable.” I do not know how to recognize whether the dependent variable in sample 1 has increased or decreased significantly compared to sample 2. I would appreciate any advice, as I am stuck on this.

• Charles says:

Dear Marzieh,
Before you collect data and conduct the test you don’t know whether the mean of the sample will be in the right tail, left tail or neither. After the test you have evidence as to whether or not the sample mean is in one of the tails and if so which tail. It is on this basis that I drew a conclusion. Of course this conclusion won’t always be correct, but the evidence points in the direction indicated.
Charles

3. Noemi says:

Assuming I have 2 treatments and three trials per treatment. How should I solve the statistical problem of this?

• Charles says:

Noemi,
Are you saying that you have two treatments and a sample of 3 for each treatment? You aren’t going to do much with such a small sample, but it is probably best to use the Mann-Whitney test instead of a t-test in this case.
Charles

4. Swarup says:

I have 170 samples. In 2012 the mean is 12366 with std dev 3891, and in 2011 the mean is 12549 with std dev 4232. The correlation coefficient is .92879. Can it be concluded that the value of stock holding decreased? How can I solve this problem using Excel? I am having trouble determining the standard error.

• Charles says:

You are referencing the webpage regarding the two sample t-test with independent samples, but the correlation coefficient of .92879 shows that these samples are not likely to be independent. Perhaps you are trying to run a paired samples test, e.g. where the stock holdings are in, say, 170 different factories, comparing 2011 with 2012. In this case you might be able to use the fact that var(x-y) = var(x) + var(y) – 2*cov(x, y) and cov(x,y) = corr(x,y) * sqrt(var(x))*sqrt(var(y)). This is the best I can do with the information you have provided. I hope that it is helpful.
Charles
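The suggestion above can be sketched in Python using the identity var(x − y) = var(x) + var(y) − 2·cov(x, y), with cov(x, y) = corr(x, y)·sd(x)·sd(y). This assumes the data really are 170 paired observations, which the question does not confirm; the function name is hypothetical.

```python
from math import sqrt


def paired_t_from_summary(mean_x, mean_y, sd_x, sd_y, corr, n):
    """Paired t statistic from summary statistics only.

    Returns (t, standard_error, df). Assumes n matched pairs.
    """
    cov = corr * sd_x * sd_y
    # Variance of the pairwise differences x - y
    var_diff = sd_x ** 2 + sd_y ** 2 - 2 * cov
    std_err = sqrt(var_diff / n)
    t = (mean_x - mean_y) / std_err
    return t, std_err, n - 1


# Summary values quoted in the question (2012 first, then 2011)
t, std_err, df = paired_t_from_summary(12366, 12549, 3891, 4232, 0.92879, 170)
print(t, std_err, df)
```

The resulting t statistic would then be compared with the t distribution with n − 1 degrees of freedom, using a one-tailed test if the question is specifically whether holdings decreased.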

5. Dilshodjon says:

Hi Charles,
In the first example, can I know the meanings of data entered in Old and New columns?
For example, does the greater the number explain the greater the enjoyment?
What scale could be used?
Clarification could be helpful.
Thank you in advance for the post and the answer as well.

• Charles says:

Presumably the higher the score the more enjoyable the beverage (although for the t test it really doesn’t matter whether a higher score represents more or less enjoyment). Scale also really doesn’t matter, but let’s assume that each point represents the more enjoyment option in a series of True/False questions.
Charles

6. Nikita says:

What do you do if the number of cases are not equal? How would you calculate the df then?

• Charles says:

The formulas and functions on the referenced webpage don’t require that the number of cases be the same. Just use the formulas and functions shown on the webpage.
Charles

7. Kennedy says:

I typed a comment last evening asking for help trying to decide how to handle a statistics problem. I need to form a hypothesis where two groups of students at a Private Catholic School one group who lives in North Bethesda (120 observations) are said to receive higher grades on the SATs. Those students living in South Bethesda (132 observations) received lower test scores on the SAT. Sample mean is 86 and 87 respectively with 8.1 and 7.3 populations variances and an .01 level of significance. Is there evidence that students living in South Bethesda may be getting lower grades on the SATs?

• Charles says:

Kennedy,

I sent a response yesterday. I am resending it below:

Since you have the population variances you can use a two sample test using the normal distribution, as described in Theorem 1 of Comparing Two Means.

The null hypothesis is mean1 <= mean2 (these are population means). The test statistic is z = (m1-m2)/stdev, where m1-m2 = 88-87 = 1 (sample means) and stdev = sqrt(var) where var = v1^2/n1 + v2^2/n2 = 6.2^2/132 + 7.0^2/125. If NORMSDIST(z) > .99 then you reject the hypothesis that the workers receive the same pay. This is a one-tailed test. If you want a two-tailed test you need to replace .99 by .995.

If instead of the population variance you had the sample variances you would use Theorem 1 of Two Sample t Test instead.

Charles

8. jaynos cortes says:

Hello. Can anybody help me with my problem? I have a research and initially had 3 treatments, so I proposed ANOVA as my statistical tool in determining the significant difference. But along the experiment it happened that one of the 3 treatments did not respond to my procedure. So, my results have only 2 treatments to determine the difference. Is it applicable to shift to T-test instead of ANOVA since I only have to get the significant difference of two treatments because the third treatment has no variance?

• Charles says:

Yes you would use a t-test. You could use ANOVA, but in this case you would not include the treatment with no results. The results will be the same as for the t test, but the t test is probably easier to calculate and interpret. When you report your results, you should explain that you had planned to analyze three treatments and explain why you got no data for this treatment.
Charles

9. aldiz says:

Hi, I have a specific problem. I have 2 different distributions (samples) i.e. in measure in 2 different conditions. I want to perform a statistical test to detect a mean difference of at least 5 between the 2 samples. My H0 hypothesis is : M1 – M2 = 5, my H1 hypothesis is M1-M2 < 5 (because I want to prove that the difference between the 2 means is less than 5 units). How can I do that with Excel?

• Charles says:

You can use Excel’s t test data analysis tool as described on the referenced webpage. Insert 5 in the Hypothesized Mean Difference field and look at the one tail results.
Charles

• aldiz says:

OK, but can’t I get the p-value or the test statistic directly in a single Excel cell? (I want the Excel sheet to be set up in advance so that I just plug in the data.)

• Charles says:

If I understand correctly, you want to conduct a one-tailed two independent sample t test. In Excel this can be done using the Two-Sample Assuming Equal Variances data analysis tool. Click on the Data ribbon and choose Data Analysis. Select the Two-Sample Assuming Equal Variances data analysis tool (assuming the variances are similar; otherwise choose the Two-Sample Assuming Unequal Variances data analysis tool). Fill in the dialog box with the input ranges and fill in the Hypothesized Mean Difference field with the value 5.
Charles

10. aldiz says:

I also have another question. I want to perform a t test for means, I have a sample (with a mean and variance) and a reference value. The classic formula for the t statistic uses means and variances. I know how to compare it by using the t test formula. But what if my reference value has a standard deviation? Do I use the formula and assume my reference is a sample too and use n = 1 for my “reference sample”? For example, I have a sample with mean 3,4 and variance 1,2, and I want to compare it to a reference value of 3, and I assume that my reference value has a variance of 2…

• Charles says:

I don’t completely understand your question, especially since I don’t know what you mean by “reference value”. Based on how you describe the problem, the approach you suggest might be appropriate, although it is more likely that you need to use a sample size for the second “sample” equal to the sample size of the first sample.
Charles

• aldiz says:

OK, for example: I have a distribution of 50 temperatures, with the mean and standard deviation. Then I want to compare it with a reference value. For example, at that time of the day, in a particular place, the real temperature should be 20 degrees, with a standard deviation of ± 5 degrees. This reference value is not a distribution; I only have one value. In the formula, the result depends on n, but if I take n = 1 for the reference, the test will be very easy to pass. On the other hand, if I take n = 50 for both, it’s not really true because I only have one value for the reference…

• Charles says:

As you can see from Theorem 1 of the referenced webpage, for each sample you need to know the sample mean, sample variance and sample size of each sample to conduct a t test. Without this information, you won’t be able to conduct the test.
Charles

11. Jess says:

Hi

Are you still able to perform a Ttest when there are different numbers of participants for each component?

Thanks

• Charles says:

Jess,
You can perform a two sample t test even if the samples have a different number of elements.
Charles

• Jess says:

Perfect thanks for you help!

12. Rebecca says:

Hi,

I am trying to find out if there is a significant improvement between a groups mental wellbeing scores at 2 different points of time (ie. before intervention and after intervention).

Rather than having each individuals score to use for calculations, I have the mean score for each question on the questionnaire before intervention and the mean score after intervention.

For example my data table looks like this

| Question | Average Initial Score | Average Final Score |
| --- | --- | --- |
| Feeling useful | 2.5 | 4.6 |
| Feeling relaxed | 2.5 | 4.6 |
| Feeling loved | 2.5 | 4.6 |
| Feeling cheerful | 2.5 | 4.6 |

Using excel, would I use a t-test, two tailed, and type 1?

• Charles says:

Rebecca,
You should use the detailed data not the sums or average scores (since you also need to know the sample sizes and variances). This will be a paired samples (aka repeated measures) t test (which if I remember correctly is type 1). Generally you would use a 2 tailed test (unless you are very confident that the scores won’t go down after intervention).
Charles

13. Melody says:

Hello

How can i prove that the p-value of 2 sample t-test with equal variance is the same as the p-value of a one-way Anova with 2 samples?

Thank you

14. George Szmn says:

I just want to make sure that I am doing this correctly.
N1 126 Salaries for mlb team that gained a playoff spot btwn 2000 and 2014
N2 324 Salaries for mlb team that didnt gain a spot.
I compared the means using a non-paired t test, which gave a statistically significant difference.
I assumed that variances were equal Ftest 1.54 prob 0.999

can you correct me if i´m wrong

• Charles says:

George Szmn,
Unless for some strange reason the two samples are highly skewed (not likely to be a problem), the approach you used looks correct.
Charles

15. Ken says:

Can I use t test if the actual distribution not normal?

• Charles says:

Ken,
The t test assumes that the data is normal, but it is pretty forgiving of violations of normality. The test should work quite well unless the data is very skewed, in which case the Mann-Whitney test may be a better solution.
Charles

16. debashri boruah says:

Hi,
I have 2 different samples from 2 different states. I want to perform a statistical test to detect p-value. In case of comparing the same disease results of two different samples,which test we can use?

• Charles says:

If the assumptions for the t test are satisfied you can use the two sample t test. If the variances of the two samples are similar then you can use the two sample t test with equal variances. This test is pretty forgiving about what “similar” means. E.g. generally even if one sample has variance 4 and the other has variance 8, the equal variances test will usually work quite well. If the variances are very different (say one has variance 4 and the other has variance 40) then you should use the two sample t test with unequal variances. When in doubt use the two sample t test with unequal variances.
Charles

17. aparna says:

Hi
I am taking a survey on expected and perceived service levels and will then conduct a t test to find out whether the difference is significant or not. Do the participants have to be different for expected and perceived? And do I have to assume equal variances or unequal variances?

Thank you

• Charles says:

You can create two samples with different participants. If the sample variances are quite similar then you can use the equal variance test. Otherwise you should use the unequal variance test.

It is likely (depending on the exact nature of the surveys) that you can create one sample and test expected and perceived service levels with all the participants in the single sample. In this case you would use the paired t test.

Charles

• aparna says:

Thank you very much

18. Laudy Peters says:

Hi,
can you give a formula of T-Test for related measures if the S as X and S1 as Y?? Cause i don’t really understand about this. I’m a student of university and i have a subject about apply statistic.

• Charles says:

Sorry, but I don’t understand your question.
Charles

19. Vu Nguyen says:

Dear Charles,

Could you help to interpret the below result from pairs sample t-test for a group of inspector before & after applied kappa statistic:
. # of inspectors: 23

t-Test: Paired Two Sample for Means

| | Variable 1 | Variable 2 |
| --- | --- | --- |
| Mean | 0.002449451 | 0.001223162 |
| Variance | 6.37156E-06 | 8.40796E-07 |
| Observations | 23 | 23 |
| Pearson Correlation | 0.252600165 | |
| Hypothesized Mean Difference | 0 | |
| df | 22 | |
| t Stat | 2.392373093 | |
| P(T<=t) one-tail | 0.012857038 | |
| t Critical one-tail | 1.717144374 | |
| P(T<=t) two-tail | 0.025714076 | |
| t Critical two-tail | 2.073873068 | |

The result showed that, p value = 0.025<0.05, so we reject the null hypothesis.
It means that, there is a significant mean different between two period of times (before and after applied kappa) ?

Best regards,
Vu Nguyen

• Charles says:

Yes, if p value = 0.025 < 0.05, you would reject the null hypothesis, which means that there is a significant mean difference between the two periods of time (before and after).
Charles

20. Jessica says:

Charles,

I am in graduate school trying to write my research paper. I am struggling with the statistics I must include. I used a quasi experimental one group pretest posttest design. The pretest mean was 72.9 and the SD was 22.019. The posttest mean was 83.6 and the SD was 20.71. I am having trouble deciding which t-test to use? Do I use a one tailed t-test paired two sample for means?
Jessica

• Charles says:

Jessica,
It sounds like you have one sample and got a measurement for each member of the group at two moments in time (pretest and posttest). If this is correct, then you should use a paired samples t test (assuming the assumptions for a t test are satisfied).
Charles

• Jessica says:

Charles,

I read this on your website…Another approach is to take a sample of 20 people and have each person drink a glass of wine and take a memory test, and then have the same people drink a glass of beer and again take a memory test; finally we compare the results. This is the approach used with paired samples.
I am just confused by statistics in general. I guess my brain does not think that way.
I used an online calculator to do the one tailed t-test paired for two sample means and it gave me T value 1.899552, P value 0.031234 and result is significant at p<0.05. I do not understand what this means. Is there any way to put this in every day language?
Confused

• Charles says:

Jessica,

A lot of people have trouble understanding the basic concept of hypothesis testing. It does take some getting used to.

In your example, you are trying to test whether the paired samples come from populations whose means are the same (or at least equal enough so that any difference may be due to random processes). The way the testing works is that you assume this (null) hypothesis is true (i.e. the two samples do come from populations with the same mean). The testing shows that (based on the null hypothesis being true) the probability of obtaining a test statistic at least as large as the observed value of 1.899552 is only 3.1234%, which is quite low. Since this probability is less than 5%, we are doubtful that the null hypothesis is really true, and so we reject this hypothesis and conclude there is likely to be a significant difference between the means.

You can get more information on the webpage Hypothesis Testing.

Charles

• Jessica says:

I appreciate your help. The more I read, the more confused I seem to get. The figures I gave you were from an online calculator but when I put the numbers in Excel and did the t-test, I got different numbers with no explanation about if it is statistically significant or not.
Again, thank you for trying to help me.
Jessica

• Jessica says:

Charles,

Here I go again….
These are the results I got when I did a one tailed t-test paired: two sample for means.
What does this mean? Why is there a negative number when the posttest scores were higher than the pretest? Is this statistically significant or not?

t-Test: Paired Two Sample for Means

| | Variable 1 | Variable 2 |
| --- | --- | --- |
| Mean | 72.9 | 83.6 |
| Variance | 501.5413793 | 443.6965517 |
| Observations | 30 | 30 |
| Pearson Correlation | 0.899895057 | |
| Hypothesized Mean Difference | 0 | |
| df | 29 | |
| t Stat | -5.974725464 | |
| P(T<=t) one-tail | 8.54E-07 | |
| t Critical one-tail | 1.699127027 | |
| P(T<=t) two-tail | 1.70719E-06 | |
| t Critical two-tail | 2.045229642 | |

I hope you can clarify this for me.

• Charles says:

Jessica,
A p-value less than .05 is usually considered to be statistically significant. For the two-tailed test you show a value of 1.70719E-06. This is equivalent to the decimal number .00000170719, which is clearly less than .05. This is statistically significant.
I don’t see any negative number, unless you are referring to the exponent in 1.70719E-06. This is not a negative number. A negative exponent just moves the decimal point to the left (as described above).
Charles

21. Jessica says:

Charles,

The t-stat number was negative at -5.97472 and I conducted a one tailed test so the P is 8.54E-07. I am not sure why the t-stat was negative. The MD was 10.7 between pre and post.

22. isaac says:

hi, good article! but i have a question: when using a two sample t test (independent), when should you assume equal variances?

• Charles says:

To be on the safe side you can always use the unequal variances test. If the variances are close to equal the equal variances test will be quite similar to the unequal variances test.
Charles

23. aween says:

hello, this was really useful but could you please help me i do not know which type to use. i have two groups A and B and both groups did a PRE and POST test. the tests were the same in both groups

• Charles says:

I am not sure which types you are referring to. Equal vs unequal variances? single sample vs two independent samples vs paired samples? Please explain.
Charles