When t test assumptions are violated

As we have discussed elsewhere, to use the t test for independent samples, the data in each sample must be normally distributed (or at least symmetric), and any outliers present should not distort the results. For paired samples, the differences between the paired measurements must be normally distributed (or at least symmetric), and there shouldn’t be significant outliers among these differences.

If one of these conditions is not met, we have the following choices:

  • Check the data – in particular, make sure that the problematic data are true outliers and not copying errors
  • Ignore the problem – not recommended, since the test results may then be unreliable
  • Transform the variable – the Box-Cox transformation can be especially useful
  • Use a non-parametric test
  • Use robust estimators of the mean and variance – e.g. use the median (which is more resilient to outliers than the mean)
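
To illustrate the last option, here is a minimal Python sketch that uses only the standard library. The sample values are invented for illustration, and `trimmed_mean` and `mad` are hypothetical helper names rather than functions from any particular package. The sketch compares the mean with the median and a trimmed mean when a single extreme value is present.

```python
import statistics

# Hypothetical sample with one extreme value (48.0); the data are made up.
sample = [4.1, 4.8, 5.0, 5.2, 5.3, 5.5, 5.9, 6.1, 6.4, 48.0]


def trimmed_mean(values, proportion=0.1):
    """Mean after dropping the given proportion of points from each tail."""
    ordered = sorted(values)
    k = int(len(ordered) * proportion)  # number of points cut from each end
    trimmed = ordered[k:len(ordered) - k] if k else ordered
    return statistics.mean(trimmed)


def mad(values):
    """Median absolute deviation, a robust measure of spread."""
    center = statistics.median(values)
    return statistics.median([abs(v - center) for v in values])


print("mean:        ", statistics.mean(sample))
print("median:      ", statistics.median(sample))
print("trimmed mean:", trimmed_mean(sample))
print("stdev:       ", statistics.stdev(sample))
print("MAD:         ", mad(sample))
```

In this made-up sample the single extreme value pulls the mean well above the bulk of the data, while the median and the trimmed mean stay close to it.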

10 Responses to When t test assumptions are violated

  1. Takwa says:

    Dear Charles,
    In a between-group design, tests for normality show that the normality assumption is met for one group but not for the other. The homogeneity of variance assumption is not violated. Can I proceed with the independent-samples t test even though the normality assumption is not met for one level of the independent variable, without switching to a non-parametric test?
    Thank you for your cooperation.

    • Charles says:

      It depends on how far from normality the group is. If it is reasonably symmetric, then it is usually reasonable to use the t test.
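
      One rough way to judge symmetry is to compute the sample skewness, which is close to zero for symmetric data. The sketch below is only an illustration in Python using the standard library: `sample_skewness` is a hypothetical helper name, the data are invented, and how large a skewness is "too large" remains a judgment call rather than a fixed rule. The formula is the adjusted Fisher-Pearson coefficient, the same one used by Excel's SKEW function.

```python
import math


def sample_skewness(values):
    """Adjusted Fisher-Pearson sample skewness (needs at least 3 points)."""
    n = len(values)
    if n < 3:
        raise ValueError("skewness needs at least three observations")
    mean = sum(values) / n
    # Sample standard deviation (divisor n - 1)
    s = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    if s == 0:
        raise ValueError("skewness is undefined when all values are equal")
    return n / ((n - 1) * (n - 2)) * sum(((v - mean) / s) ** 3 for v in values)


# Invented examples: one symmetric sample and one with a long right tail
symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 1, 2, 2, 3, 10]

print(sample_skewness(symmetric))
print(sample_skewness(right_skewed))
```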

      • Takwa says:

        Dear Charles,
        The means and standard deviations for the two groups are (M = 8.22, SD = 3.75) and (M = 7.14, SD = 5.5) respectively.
        The Shapiro-Wilk results for the two independent groups are p = .206 and p = .003 respectively.
        Are these numbers enough to tell how far from normality the group is? Are there special tests for symmetry?
        Thank you for your help.

  2. jamila bibi says:

    Sir, how can we know whether particular data have equal variances when this is not mentioned in the problem?

    • Charles says:

      You can use a statistical test (such as Levene’s test) to determine whether the variances are significantly different. See the webpage
      Homogeneity of Variances

      In any case, when in doubt use the t test with unequal variances. If the variances are equal, the result of this test will be very similar to the t test with equal variances.
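
      The comparison can be sketched in Python with the standard library only. The function names `welch_t` and `pooled_t` and the sample data are hypothetical, chosen for illustration. The unequal-variances version uses the Welch-Satterthwaite approximation for the degrees of freedom.

```python
import math
import statistics


def pooled_t(a, b):
    """t statistic and df for the t test assuming equal variances."""
    na, nb = len(a), len(b)
    va, vb = statistics.variance(a), statistics.variance(b)
    # Pooled variance: weighted average of the two sample variances
    sp2 = ((na - 1) * va + (nb - 1) * vb) / (na + nb - 2)
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(sp2 * (1 / na + 1 / nb))
    return t, na + nb - 2


def welch_t(a, b):
    """t statistic and approximate df for the t test with unequal variances."""
    na, nb = len(a), len(b)
    va, vb = statistics.variance(a), statistics.variance(b)
    se2 = va / na + vb / nb  # squared standard error of the mean difference
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(se2)
    # Welch-Satterthwaite degrees of freedom (generally not an integer)
    df = se2 ** 2 / ((va / na) ** 2 / (na - 1) + (vb / nb) ** 2 / (nb - 1))
    return t, df


# Invented data with different sample sizes and spreads
group_a = [8.1, 7.4, 9.0, 8.6, 7.9, 8.3]
group_b = [6.2, 9.8, 4.1, 11.5, 7.0, 5.3, 10.2, 8.8]

print("pooled:", pooled_t(group_a, group_b))
print("welch: ", welch_t(group_a, group_b))
```

      When the two samples have the same size, the two t statistics coincide and only the degrees of freedom can differ.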


  3. jamila bibi says:

    Sir, if df > 50, how can we find the critical value for particular data?

    • Charles says:

      If you are using the t distribution, then you can use Excel’s T.INV (left-tailed) or T.INV.2T (two-tailed) function to get the critical value.
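
      Outside Excel, the same critical values can be approximated numerically. The following is a rough pure-Python sketch, not the algorithm Excel uses: it integrates the t density with Simpson's rule and inverts the result by bisection. The names `t_pdf`, `t_cdf`, `t_inv` and `t_inv_2t` are hypothetical and simply mirror the Excel function names.

```python
import math


def t_pdf(x, df):
    """Density of the t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def t_cdf(x, df):
    """P(T <= x), using Simpson's rule on [0, |x|] and symmetry about 0."""
    a = abs(x)
    steps = max(1000, 2 * int(100 * a))  # even number of subintervals
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area


def t_inv(p, df, tol=1e-8):
    """Left-tailed inverse: the value x with P(T <= x) = p."""
    if not 0 < p < 1:
        raise ValueError("p must be strictly between 0 and 1")
    if p < 0.5:
        return -t_inv(1 - p, df, tol)  # use symmetry for the lower tail
    lo, hi = 0.0, 1.0
    while t_cdf(hi, df) < p:  # expand the bracket until it contains x
        lo, hi = hi, hi * 2
    while hi - lo > tol:  # bisection
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def t_inv_2t(alpha, df):
    """Two-tailed critical value: x with P(|T| > x) = alpha."""
    return t_inv(1 - alpha / 2, df)


# Example: two-tailed critical value at alpha = 0.05 with df = 60
print(t_inv_2t(0.05, 60))
```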

  4. Colin says:

    When the population’s variance is unknown and the sample size is large (e.g. >= 30), some people use the z test while others prefer the t test. What is your opinion? Which one is better?
