# When t test assumptions are violated

As we have discussed elsewhere, to use the t test for independent samples, the data in each sample must be normal (or at least symmetric) and outliers must not distort the results. For paired samples, the differences between the paired measurements must be normal (or at least symmetric), and there should be no significant outliers among these differences.

If one of these conditions is not met, we have the following choices:

• Check the data – in particular, make sure that the problematic data are true outliers and not copying errors
• Ignore the problem – not recommended, since the results of the test may then be unreliable
• Transform the variable – the Box-Cox transformation can be especially useful
• Use a non-parametric test
• Use robust estimators of the mean and variance – e.g. use the median (which is more resilient to outliers than the mean)
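
As a rough illustration of three of these choices (transforming, using a non-parametric test, and using the median), here is a minimal Python sketch. It assumes SciPy is available; the two samples are simulated, strictly positive, right-skewed data invented for this example (Box-Cox requires positive values), and estimating a single Box-Cox lambda from the pooled data is just one possible design choice.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Hypothetical right-skewed, strictly positive samples (illustration only)
a = rng.lognormal(mean=0.0, sigma=0.8, size=40)
b = rng.lognormal(mean=0.4, sigma=0.8, size=40)

# Choice: transform the variable.
# Estimate one Box-Cox lambda from the pooled data, then apply it to both groups
_, lam = stats.boxcox(np.concatenate([a, b]))
a_t = stats.boxcox(a, lmbda=lam)
b_t = stats.boxcox(b, lmbda=lam)
t_res = stats.ttest_ind(a_t, b_t, equal_var=False)

# Choice: use a non-parametric test (Mann-Whitney) on the raw data
u_res = stats.mannwhitneyu(a, b, alternative="two-sided")

# Choice: use a robust estimator of location (the median)
med_a, med_b = np.median(a), np.median(b)

print("Box-Cox lambda:", lam)
print("t test on transformed data, p =", t_res.pvalue)
print("Mann-Whitney, p =", u_res.pvalue)
print("medians:", med_a, med_b)
```

Note that a t test on transformed data compares means on the transformed scale, so its conclusion does not translate directly into a statement about the original means.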

### 20 Responses to When t test assumptions are violated

1. Colin says:

Sir
When the population’s variance is unknown and the sample size is large (e.g. >= 30), some people use the z-test while others prefer the t-test. What is your opinion? Which one is better?

• Charles says:

Colin,
For large values of n the results of both tests are very similar. I tend to use the t test.
Charles
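
One way to see how close the two tests become is to compare their two-tailed critical values as n grows. The sketch below assumes Python with SciPy; the helper name `critical_values` is made up for this illustration.

```python
from scipy import stats

def critical_values(n, alpha=0.05):
    """Return (t critical value with n - 1 df, z critical value), two-tailed."""
    t_crit = stats.t.ppf(1 - alpha / 2, df=n - 1)
    z_crit = stats.norm.ppf(1 - alpha / 2)
    return t_crit, z_crit

for n in (10, 30, 100, 1000):
    t_crit, z_crit = critical_values(n)
    print(f"n = {n:4d}  t = {t_crit:.3f}  z = {z_crit:.3f}")
```

The t critical value is always slightly larger than the z critical value, so the t test is the more conservative of the two, and the gap shrinks as n increases.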

2. jamila bibi says:

Sir, if the value of df > 50, how can we find the critical value for a particular data set?

• Charles says:

If you are using the t distribution, then you can employ the T.INV or T.INV.2T function to get the critical value.
Charles
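
Outside Excel, the same critical values can be obtained from the inverse of the t distribution for any df. This is a minimal Python sketch, assuming SciPy; the wrapper names `t_inv` and `t_inv_2t` are hypothetical and only mirror the Excel function names.

```python
from scipy import stats

def t_inv(p, df):
    """Left-tailed inverse of the t distribution (analogous to Excel's T.INV)."""
    return stats.t.ppf(p, df)

def t_inv_2t(alpha, df):
    """Two-tailed critical value (analogous to Excel's T.INV.2T)."""
    return stats.t.ppf(1 - alpha / 2, df)

# Example: df = 60, alpha = 0.05
print("one-tailed critical value:", t_inv(0.95, 60))
print("two-tailed critical value:", t_inv_2t(0.05, 60))
```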

3. jamila bibi says:

Sir, how can we know whether particular data have equal variances when this is not mentioned in the problem?

• Charles says:

You can use a statistical test (such as Levene’s test) to determine whether the variances are significantly different. See the webpage
Homogeneity of Variances

In any case, when in doubt use the t test with unequal variances. If the variances are equal, the result of this test will be very similar to the t test with equal variances.

Charles
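
As an illustration of this advice, the following Python sketch (assuming SciPy, with simulated data invented for the example) runs Levene's test and then both versions of the t test. The `center="median"` option is the Brown-Forsythe variant of Levene's test, which is SciPy's default.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
# Hypothetical samples drawn with the same population variance
x = rng.normal(loc=10, scale=2, size=30)
y = rng.normal(loc=11, scale=2, size=30)

# Levene's test: a small p-value suggests the variances differ
lev = stats.levene(x, y, center="median")

# t test assuming equal variances (pooled) and without that assumption (Welch)
pooled = stats.ttest_ind(x, y, equal_var=True)
welch = stats.ttest_ind(x, y, equal_var=False)

print("Levene p =", lev.pvalue)
print("pooled  t =", pooled.statistic, "p =", pooled.pvalue)
print("Welch   t =", welch.statistic, "p =", welch.pvalue)
```

With equal group sizes the two t statistics coincide and only the degrees of freedom differ, which is why the p-values are so close here.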

4. Takwa says:

Dear Charles,
In a between-group design, the normality assumption is met for one group but not for the other, according to tests for normality. In addition, the homogeneity of variance assumption is not violated. Can I proceed with the independent t-test even though the normality assumption is not met for one level of the independent variable, without switching to a non-parametric test?

• Charles says:

Takwa,
It depends on how far from normality the group is. If it is reasonably symmetric, then it is usually reasonable to use the t test.
Charles

• Takwa says:

Dear Charles,
The means and standard deviations for the two groups are (M = 8.22, SD = 3.75) and (M = 7.14, SD = 5.5) respectively.
The Shapiro-Wilk results for the two independent groups are p = .206 and p = .003 respectively.
Are these numbers enough to know how far from normality the group is? Are there special tests for symmetry?

• Charles says:

Takwa,
To check for symmetry you can inspect a box plot.
Charles
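
The quantities a box plot displays can also be summarized numerically. The sketch below assumes Python with SciPy; the function name `symmetry_summary` is made up for this illustration, and the quartile (Bowley) skewness is one possible measure, not necessarily the one used elsewhere on this site. It is zero when the median sits midway between the quartiles, which is what a symmetric box looks like.

```python
import numpy as np
from scipy import stats

def symmetry_summary(x):
    """Return moment skewness and quartile (Bowley) skewness for a sample."""
    x = np.asarray(x, dtype=float)
    q1, med, q3 = np.percentile(x, [25, 50, 75])
    # Compares the upper half of the box (q3 - med) with the lower half (med - q1)
    bowley = ((q3 - med) - (med - q1)) / (q3 - q1)
    return {"skewness": stats.skew(x), "bowley": bowley,
            "q1": q1, "median": med, "q3": q3}

print(symmetry_summary([1, 2, 3, 4, 5]))    # symmetric sample
print(symmetry_summary([1, 1, 1, 2, 10]))   # right-skewed sample
```

Values near zero for both measures are consistent with symmetry; how far from zero is acceptable remains a judgement call.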

5. Piero says:

Dear Dr. Charles,

I have to perform a t-test for two independent samples to compare an anthropometric measure between the male and female populations.
I use the Shapiro-Wilk test to check the normality assumption in both samples.
While the male sample passes the Shapiro-Wilk test, the female sample doesn’t (p = 0.013).
Homogeneity of variances is satisfied.

Now, the parametric t-test is significant (p = 0.038, two tails), but the Mann-Whitney test is not significant (p = 0.117, two tails).

The female sample (the one that didn’t pass Shapiro-Wilk) has kurtosis = 0.29, skewness = -0.55 and sample size n = 57. Even after looking at the QQ plot and box plot, I am still uncertain about symmetry.

In your opinion, which test should I trust more? Do you have any suggestion?

Thank you very much for any help
Best Regards
Piero

• Charles says:

Piero,
It is a good question.
With these values of kurtosis and skewness, I would have thought that the data would pass the SW test. Do you have a lot of ties (esp. in the female sample)? With a lot of ties, SW is not so accurate.
If you send me an Excel file with your data, I will try to give you my judgement.
Charles

• Piero says:

Hello Charles,

Of course I can send you an Excel file, thank you very much!
How can I send it?
There are no ties in the sample and no outliers or “strange” values (from a visual inspection of the data).

Best Regards
Piero

6. ahmed says:

If the sample size is 39 in one group and 7 in the other, can I use the independent t test,
or is the difference between the two group sizes too large?
And what difference in size between two groups is acceptable when using the independent t test to compare the means of 2 groups?

• Charles says:

Ahmed,
Yes, you can use the independent t test even when the group sizes are different. In this case the test is more sensitive to the variances being unequal, and so you may need to use the unequal variances version of the test.
Charles

• Ahmed says:

Thanks.
By the unequal variances version of the test, do you mean the Welch test?
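
For reference, the t test with unequal variances is commonly known as Welch's t test. With sample means $\bar{x}_1, \bar{x}_2$, sample variances $s_1^2, s_2^2$ and sample sizes $n_1, n_2$ (notation introduced here for illustration), its test statistic and approximate degrees of freedom (the Welch-Satterthwaite formula) are:

$$
t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}},
\qquad
df \approx \frac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^{2}}{\dfrac{\left(s_1^2/n_1\right)^{2}}{n_1 - 1} + \dfrac{\left(s_2^2/n_2\right)^{2}}{n_2 - 1}}
$$

Unlike the pooled version, the standard error weights each group's variance by its own sample size, so a small group with a large variance (such as the group of 7 above) is not masked by the larger group.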

7. Greg says:

Hello.

I have a slightly different question… In the 5 options at the top of the page for dealing with data that does not conform to a normal distribution, it is still not clear to me how best to deal with outliers when performing a 2-sample t-test to compare means.

For my problem, I want to compare the means of two groups (n=40) that are normally distributed except for 3 points that are clear outliers and significantly affect the mean and standard deviation.

Is it appropriate to simply throw these outliers out and ignore them for the t-test? I am reluctant to use a non-parametric test or transform the data if I can help it.

Thanks.

• Charles says:

Greg,
It depends on the impact of keeping the outliers. If the data is reasonably normal or at least symmetric, then you should be ok simply using the t test (including the outliers).
If you do decide to remove the outliers, then you should also report the results of some test with the outliers included.
Note that the Mann-Whitney is a reasonable test to use when you have outliers.
Charles
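
One way to follow this advice is to run the analysis both ways and report the results side by side. The Python sketch below assumes SciPy; the data are simulated, the three outlying values are invented for the example, and the 1.5 × IQR rule used to flag outliers is only one common convention.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
# Hypothetical groups of size 40 (illustration only)
g1 = rng.normal(loc=50, scale=5, size=40)
g2 = rng.normal(loc=53, scale=5, size=40)

# Replace three points in group 1 with clear (made-up) outliers
g1_out = g1.copy()
g1_out[:3] = [95.0, 102.0, 110.0]

def is_outlier(x, k=1.5):
    """Flag points beyond k * IQR from the quartiles (box plot rule)."""
    q1, q3 = np.percentile(x, [25, 75])
    iqr = q3 - q1
    return (x < q1 - k * iqr) | (x > q3 + k * iqr)

mask = is_outlier(g1_out)

# Report the t test with and without the flagged points, plus Mann-Whitney
with_out = stats.ttest_ind(g1_out, g2, equal_var=False)
without = stats.ttest_ind(g1_out[~mask], g2, equal_var=False)
mw = stats.mannwhitneyu(g1_out, g2, alternative="two-sided")

print("flagged outliers:", int(mask.sum()))
print("t test, outliers kept:    p =", with_out.pvalue)
print("t test, outliers removed: p =", without.pvalue)
print("Mann-Whitney, all data:   p =", mw.pvalue)
```

If the conclusions agree, the outliers matter little; if they disagree, that disagreement is itself worth reporting.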