As we have discussed elsewhere, to use the t-test for independent samples, the data in each sample must be normal (or at least symmetric) and the presence of outliers should not distort the results. In the case of paired samples the differences in measurements must be normal or at least symmetric and there shouldn’t be significant outliers in these difference measurements.

In case one of these conditions is not met, we have the following choices:

- Check the data – in particular, make sure that the problematic data are true outliers and not copying errors
- Ignore the problem – not recommended since this will usually cause problems
- Transform the variable – the Box-Cox transformation can be especially useful
- Use a non-parametric test
- Use robust estimators of the mean and variance – e.g. use the median (which is more resilient to outliers than the mean)
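
To illustrate the last option, here is a minimal sketch in Python (standard library only) of how the mean, the median and a trimmed mean react to a single outlier. The data values and the helper name `trimmed_mean` are made up for illustration and do not come from any real study.

```python
import statistics


def trimmed_mean(values, proportion=0.1):
    """Mean after dropping the given proportion of values from each tail."""
    ordered = sorted(values)
    k = int(len(ordered) * proportion)  # number of values cut from each end
    return statistics.mean(ordered[k:len(ordered) - k])


# Hypothetical measurements; the last value of `contaminated` is a deliberate outlier.
clean = [4.1, 4.4, 4.6, 4.8, 5.0, 5.1, 5.3, 5.5, 5.7, 5.9]
contaminated = clean[:-1] + [59.0]

for label, data in (("clean", clean), ("contaminated", contaminated)):
    print(
        label,
        "mean =", round(statistics.mean(data), 2),
        "median =", round(statistics.median(data), 2),
        "10% trimmed mean =", round(trimmed_mean(data, 0.1), 2),
    )
```

In this made-up data set the outlier roughly doubles the mean, while the median and the trimmed mean barely move.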

Dear Dr. Charles,

I have to perform a t-test for two independent samples to compare an anthropometric measure between the male and female populations.

I used the Shapiro-Wilk test to check the normality assumption in both samples.

While the male sample passes the Shapiro-Wilk test, the female sample doesn’t (p = 0.013).

Homogeneity of variances is satisfied.

Now, the parametric t-test is significant (p = 0.038, two-tailed), but the Mann-Whitney test is not significant (p = 0.117, two-tailed).

The female sample (the one that didn’t pass Shapiro-Wilk) has kurtosis = 0.29, skewness = -0.55, and sample size n = 57. Even after looking at the QQ plot and box plot, I am still uncertain about symmetry.

In your opinion, which test should I trust more? Do you have any suggestion?

Thank you very much for any help

Best Regards

Piero

Piero,

It is a good question.

With these values of kurtosis and skewness, I would have thought that the data would pass the SW test. Do you have a lot of ties (esp. in the female sample)? With a lot of ties, SW is not so accurate.

If you send me an Excel file with your data, I will try to give you my judgement.

Charles
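
For readers who work outside Excel, the checks discussed in this exchange (Shapiro-Wilk, skewness, kurtosis and a count of ties) can be sketched in Python with NumPy and SciPy. The sample below is randomly generated and only stands in for a data set of size 57; it is not the data from the question, and the location and scale values are arbitrary assumptions.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Hypothetical sample of size 57; NOT the data from the question above.
sample = rng.normal(loc=160.0, scale=6.0, size=57)

# Shapiro-Wilk test: a small p-value suggests a departure from normality.
w_stat, p_value = stats.shapiro(sample)

# bias=False applies the usual small-sample corrections.
skewness = stats.skew(sample, bias=False)
excess_kurtosis = stats.kurtosis(sample, fisher=True, bias=False)

# Count observations that share their value with at least one other observation.
_, counts = np.unique(sample, return_counts=True)
tied = int(counts[counts > 1].sum())

print(f"W = {w_stat:.3f}, p = {p_value:.3f}")
print(f"skewness = {skewness:.2f}, excess kurtosis = {excess_kurtosis:.2f}")
print(f"tied observations: {tied} of {sample.size}")
```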

Hello Charles,

Of course I can send you an Excel file, thank you very much!

How can I send it?

There are no ties in the sample and no outliers or “strange” values (from a visual inspection of data)

Best Regards

Piero

Piero,

You will find my email address at Contact Us.

Charles

Dear Charles,

In a between-group design, the normality assumption is met for one group but not for the other, according to tests for normality. The homogeneity of variance assumption is not violated. Can I proceed with the independent-samples t-test, even though the normality assumption is not met for one level of the independent variable, rather than switching to a non-parametric test?

Thank you for your cooperation.

Takwa,

It depends on how far from normality the group is. If it is reasonably symmetric, then it is usually reasonable to use the t test.

Charles

Dear Charles,

The mean and standard deviation for the two groups are (M = 8.22, SD = 3.75) and (M = 7.14, SD = 5.5) respectively.

The Shapiro-Wilk results for the two independent groups are p = .206 and p = .003 respectively.

Are these numbers enough to know how far from normality the group is? Are there special tests for symmetry?

Thank you for your help.

Takwa,

To check for symmetry you can use a box plot.

Charles
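
A box plot is a visual check, but the quantities behind it can also be computed directly. The following Python sketch (standard library only) compares the distance from the median to each quartile and reports the moment skewness. The function name `symmetry_summary` and both data sets are hypothetical, and the quartile-based measure used here (often called Bowley's skewness) is one possible summary among several.

```python
import statistics


def symmetry_summary(values):
    """Return the quartiles plus two rough indicators of asymmetry."""
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)
    # Moment skewness: near 0 for symmetric data, positive for a long right tail.
    skew = sum((v - mean) ** 3 for v in values) / (len(values) * sd ** 3)
    # Quartile skewness: compares the upper and lower halves of the box.
    bowley = ((q3 - q2) - (q2 - q1)) / (q3 - q1)
    return {"q1": q1, "median": q2, "q3": q3, "skew": skew, "bowley": bowley}


# Hypothetical data: one symmetric sample and one with a long right tail.
symmetric = [1, 2, 3, 4, 5, 6, 7, 8, 9]
right_skewed = [1, 1, 2, 2, 3, 4, 6, 9, 15]

print(symmetry_summary(symmetric))
print(symmetry_summary(right_skewed))
```

Values of both indicators close to zero are consistent with a roughly symmetric box; how far from zero is "too far" remains a judgement call.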

Sir, how can we know whether the data have equal variances when this is not stated in the problem?

You can use a statistical test (such as Levene’s test) to determine whether the variances are significantly different. See the webpage

Homogeneity of Variances

In any case, when in doubt use the t test with unequal variances. If the variances are equal, the result of this test will be very similar to the t test with equal variances.

Charles
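
As a rough illustration of this advice, the sketch below uses SciPy to run Levene's test and then both versions of the t test on two randomly generated groups. The group means, standard deviations and sizes are arbitrary assumptions. Note that `scipy.stats.levene` centers on the median by default (the Brown-Forsythe variant), which may differ from the version of Levene's test you use elsewhere.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
# Hypothetical groups drawn with the same population variance.
group_a = rng.normal(loc=8.0, scale=4.0, size=40)
group_b = rng.normal(loc=7.0, scale=4.0, size=40)

# Levene's test: a small p-value suggests the variances differ.
levene_stat, levene_p = stats.levene(group_a, group_b, center="median")

# t test assuming equal variances (pooled) and without that assumption (Welch).
pooled = stats.ttest_ind(group_a, group_b, equal_var=True)
welch = stats.ttest_ind(group_a, group_b, equal_var=False)

print(f"Levene: W = {levene_stat:.3f}, p = {levene_p:.3f}")
print(f"Pooled: t = {pooled.statistic:.3f}, p = {pooled.pvalue:.4f}")
print(f"Welch:  t = {welch.statistic:.3f}, p = {welch.pvalue:.4f}")
```

With equal group sizes the two t statistics coincide and only the degrees of freedom differ, so the p-values are very close whenever the sample variances are similar.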

Sir, if df > 50, how can we find the critical value for a particular data set?

If you are using the t distribution, then you can use Excel’s T.INV or T.INV.2T function to get the critical value.

Charles
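
Outside Excel, the same critical values can be obtained from the inverse CDF of the t distribution. Here is a minimal Python sketch using `scipy.stats.t.ppf`; the helper name `t_critical` is made up for illustration. The two-tailed case is meant to play the role of T.INV.2T(alpha, df).

```python
from scipy import stats


def t_critical(alpha, df, two_tailed=True):
    """Upper critical value of the t distribution for significance level alpha."""
    tail_prob = alpha / 2 if two_tailed else alpha
    return stats.t.ppf(1 - tail_prob, df)


# Works for any df, including values beyond the range of printed tables.
for df in (50, 60, 120):
    print(df, round(t_critical(0.05, df), 4), round(t_critical(0.05, df, two_tailed=False), 4))
```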

Sir

When the population’s variance is unknown and the sample size is large (e.g. n >= 30), some people use the z-test while others prefer the t-test. What is your opinion? Which one is better?

Colin,

For large values of n the results of both tests are very similar. I tend to use the t test.

Charles
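
One way to see how similar the two tests become is to compare their critical values as n grows. The Python sketch below uses SciPy; the helper name `critical_gap` and the chosen sample sizes are illustrative assumptions, and df = n - 1 corresponds to the one-sample case.

```python
from scipy import stats


def critical_gap(n, alpha=0.05):
    """Difference between the two-tailed t and z critical values for sample size n."""
    t_crit = stats.t.ppf(1 - alpha / 2, df=n - 1)
    z_crit = stats.norm.ppf(1 - alpha / 2)
    return t_crit - z_crit


for n in (10, 30, 100, 1000):
    print(n, round(critical_gap(n), 4))
```

The gap is always positive, so the t test is slightly more conservative, and it shrinks toward zero as n increases.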