When t test assumptions are violated

Assumptions

As described in One-sample t-Test, to use a one-sample t-test, you need to make sure that the data in the sample is normally distributed or at least reasonably symmetric. In particular, you need to make sure that the presence of outliers does not distort the results.

The situation for the paired t-test is similar, in that you need to make sure that the differences in the data pairs are normal or at least reasonably symmetric, and that the presence of outliers in these differences does not distort the results.

For the t-test on independent samples, the data in each sample must be normal or at least reasonably symmetric, and the presence of outliers in either sample must not distort the results.
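
The checks described above can be sketched in Python. This is only an illustration, assuming NumPy and SciPy are available; the helper name check_t_test_assumptions, the 1.5 × IQR outlier rule and the simulated data are assumptions made for the example, not part of the original article. For a paired t-test the function would be applied to the differences, and for independent samples to each sample separately.

```python
import numpy as np
from scipy import stats


def check_t_test_assumptions(x, alpha=0.05):
    """Summarize normality, symmetry and outliers for one sample.

    Hypothetical helper for illustration only.
    """
    x = np.asarray(x, dtype=float)
    _, shapiro_p = stats.shapiro(x)        # Shapiro-Wilk normality test
    skewness = stats.skew(x)               # close to 0 for symmetric data
    q1, q3 = np.percentile(x, [25, 75])
    iqr = q3 - q1
    # Tukey's rule: flag points more than 1.5 * IQR beyond the quartiles
    outliers = x[(x < q1 - 1.5 * iqr) | (x > q3 + 1.5 * iqr)]
    return {
        "shapiro_p": float(shapiro_p),
        "normality_rejected": bool(shapiro_p < alpha),
        "skewness": float(skewness),
        "n_outliers": int(outliers.size),
    }


# Simulated data (an assumption for the example): a normal sample,
# then the same sample with one extreme value appended.
rng = np.random.default_rng(0)
sample = rng.normal(loc=10, scale=2, size=40)
sample_with_outlier = np.append(sample, 40.0)

print(check_t_test_assumptions(sample))
print(check_t_test_assumptions(sample_with_outlier))
```

A summary like this is a starting point for judgement rather than a decision rule; a histogram or QQ plot is usually worth inspecting as well.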

When assumptions are not met

In case one of these conditions is not met, we have the following choices:

In particular, a non-parametric test has the following advantages:

  • Uses more robust estimators – e.g. the median instead of the mean, since it is more resilient to outliers
  • Often uses data ranks instead of the raw data, which sidesteps the normality assumption, since the ranks depend only on the ordering of the data and not on the shape of its distribution.
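
A small sketch of these two points, assuming NumPy and SciPy and using made-up numbers: a single extreme value moves the mean a lot but the median only slightly, and its rank is the same whether it is slightly or vastly larger than the other values.

```python
import numpy as np
from scipy import stats

# Made-up data for illustration
data = np.array([4.1, 4.8, 5.0, 5.2, 5.5, 5.9, 6.3])
with_outlier = np.append(data, 60.0)

# Effect of one extreme value on the mean versus the median
mean_shift = with_outlier.mean() - data.mean()
median_shift = np.median(with_outlier) - np.median(data)
print(f"mean shift:   {mean_shift:.2f}")
print(f"median shift: {median_shift:.2f}")

# Ranks only record ordering: 60.0 and 6.4 both receive the top rank
ranks_extreme = stats.rankdata(with_outlier)
ranks_mild = stats.rankdata(np.append(data, 6.4))
print(ranks_extreme)
print(ranks_mild)
```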

References

Howell, D. C. (2010) Statistical methods for psychology (7th ed.). Wadsworth, Cengage Learning.
https://labs.la.utexas.edu/gilden/files/2016/05/Statistics-Text.pdf

Kim, T. K., Park, J. H. (2019) More about the basic assumptions of t-test: normality and sample size. Korean Journal of Anesthesiology
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6676026/

Laerd Statistics (2018) Independent t-test using SPSS statistics
https://statistics.laerd.com/spss-tutorials/independent-t-test-using-spss-statistics.php

69 thoughts on “When t test assumptions are violated”

  1. Hello Charles
    Many thanks for your feed
    I got stuck on this question and would appreciate your help

    B. Provide evidence that you met the necessary assumptions of an independent samples t-test
    According to my data
    The value of W is not statistically significant given
    (p=0.981)

    As per our presented lessons, the assumptions are those regarding the scale of measurement (in this case it is not valid), random sampling, normality of the data distribution, adequacy of sample size, and the equality of variance in the given standard deviation.
    According to our data above: The value of W is not statistically significant given
    (p=0.981)
    Moreover, this should assume that the presented distribution of score is not statistically significant, nor does it provide a normal distribution, suggesting that this test is not the most ‘significant’ test available, and this would lead us into performing a Welch’s Test, as informed in class.

  2. Hi all,

    I have a similar question as well for my thesis. I am testing whether there are differences in two groups – entrepreneurship students who were obliged to create their own venture during their master program and entrepreneurship students who did not have to create a venture during their master program. With that, I am testing whether the first group has a higher number of created firms. Yet, the number of firms was not normally distributed. Therefore, I wanted to perform the Mann-Whitney U test, however, the assumption of equal distributions was violated (test for homogeneity of variance based on Median/median adjusted df resulted in significance level of <.001). Therefore, I am a bit lost whether I can perform other tests that can still indicate (statistical) differences between the groups. Your help will be tremendously appreciated!

    • Hello Marijn,
      What test for the assumption of equal distributions did you use?
      Was this the test that resulted in p-value < .001? What is the p-value for the Mann-Whitney U test?
      Charles

  3. Hi Charles,
    I have a data set with a limited number of participants (6-10 per group) but with 4 data points (neuroimaging outputs) per participant. I can average the data for each participant and then do a t-test on the data. However, this approach reduces the complexity of the data set and information by a factor of four. Is there a significance test (i.e. significance relative to zero) I could do that would utilize the number of data points per subject or even the variability within subject? I looked at non-parametric alternatives to the t-test but they all seem to have the independence assumption, which would be violated by pooling the data (i.e. not averaging).
    Thanks!
    Noelle

  4. Hello!

    I have a simple question. I have one sample with 260 data points. I would like to compare its mean to a given number, but I can’t use the one-sample t-test, because my data is not normally distributed (I tested it with a Kolmogorov-Smirnov test). What can I do now? I wasn’t able to find the non-parametric version of the one-sample t-test.

    Thanks in advance, Csaba

  5. Hi Charles,

    Is the homoscedasticity assumption also included in the case of paired t tests? If so, I understand that Welch test is not applicable for paired samples. Is there any other test to solve the problem of inequality of variances in paired data? Thanks in advance,

    Wilson

    • Wilson,
      There is no homogeneity of variance requirement for the paired t-test. Since this test is essentially a test of one sample, namely the differences between the pairs, there is only one variance and not two.
      Charles
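
The point in the reply above, that a paired t-test is a one-sample test on the differences, can be checked numerically. This is a minimal sketch assuming SciPy and simulated before/after measurements (the data and variable names are assumptions for the example):

```python
import numpy as np
from scipy import stats

# Simulated paired measurements (assumed for illustration)
rng = np.random.default_rng(1)
before = rng.normal(50, 10, size=20)
after = before + rng.normal(2, 3, size=20)

# Paired t-test on the two columns
paired = stats.ttest_rel(after, before)

# One-sample t-test on the differences against a mean of 0
one_sample = stats.ttest_1samp(after - before, popmean=0.0)

print(paired.statistic, paired.pvalue)
print(one_sample.statistic, one_sample.pvalue)
```

Both calls give the same statistic and p-value, which is why only the variance of the differences enters the test.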

      • Thank you Charles, it makes sense to me when dealing with one sample. However, occasionally I have seen people doing paired t tests for matched data (case-control studies). In this case, we would be working with two samples, so my question would be whether it is appropriate using a paired t-test, and if so, how to address the potential problem of heteroscedasticity. Thanks a lot for your comments.

        Wilson

  6. Hello Charles,

    - Is this statement right: “Often variances can differ between the two groups being tested, suggesting performing an unequal t-test by default”?

    - I am performing a bivariate analysis using a t-test on multiple independent variables with continuous dependent variables. I find it quite confusing to decide whether I should use a non-parametric test or keep using a t-test. Firstly, my DV shows significance using the Shapiro test, while the histogram shows skewness to the left. I read in multiple resources that if the sample size is >50 I should not worry about the normality assumption, and in my case I have 180 subjects. Other resources mention that a 1.5 size ratio between the two samples should not be exceeded to use the t-test safely.

    • Sara,
      1. You can use the equal variance version of the t-test even when the variance of one sample is twice the other or even more, but when in doubt use the unequal variance version of the test.
      2. If the sample sizes are reasonably similar and the data are symmetric you should be able to use the t-test. If the Shapiro test shows that the data is not normally distributed and the data is clearly skewed, then large samples are not sufficient to satisfy the normality requirement. When in doubt, use a non-parametric test.
      Charles

  7. Hi, as some of my data was skewed, I transformed it using a non-parametric test (Wilcoxon Signed Rank) in SPSS. This provided normalised scores. Can I now run a paired t-test analysis?

    Many thanks

    • Claire,
      Wilcoxon’s Signed Ranks test is used in place of a paired t test.
      You should use the results from this test and not perform a paired t test on the results from Wilcoxon’s Signed Ranks test.
      Charles
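
As a rough illustration of the reply above, Wilcoxon's signed-ranks test is run directly on the original paired data and its result is reported as is. The sketch assumes SciPy rather than SPSS, and the skewed before/after data are simulated for the example:

```python
import numpy as np
from scipy import stats

# Simulated skewed paired data (assumed for illustration)
rng = np.random.default_rng(2)
before = rng.exponential(scale=5.0, size=30)
after = before + rng.exponential(scale=2.0, size=30) - 1.0

# Signed-ranks test on the pairs; this replaces the paired t-test
res = stats.wilcoxon(after, before)
print(res.statistic, res.pvalue)

# Equivalent call on the differences themselves
res_diff = stats.wilcoxon(after - before)
print(res_diff.statistic, res_diff.pvalue)
```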

  8. Hello Charles,
    Let’s say I have 2 datasets, one violating Levene’s test while the other does not. Should I transform both datasets or just the one violating the test assumption? Thanks!

    • Yolanda,
      You may be able to use the version of the t-test that does not require that the variances of the two samples are equal.
      Levene’s test is used to determine whether a number of datasets have the same variance. I don’t know how you are able to use Levene’s test on just one dataset.
      Charles

  9. One of the assumptions of a t-test is that there is a Random sample of data from the population. However, if my population had to meet criteria to complete the survey, how would this be random? What does this do to my study?

    • Kerri,
      Please explain why you think the sample that you are considering using is not a random sample. Sometimes the criteria that are imposed just mean that the population that you are studying is smaller than you had originally thought.
      Charles

    • It depends. If the data is close to normally distributed, probably there is little harm, but for data that is highly skewed the accuracy of the t-test may be questionable.
      Charles

  10. Charles; here’s a really back-to-basics question. I’m doing a t-test where the population mean and standard deviation are unknown. It’s required that the SAMPLE be normally distributed, NOT the unknown population; correct? I’ve read some confusing stuff that seems to say it’s a requirement of the population. I just want to clarify. Thanks.

    • Hello Richard,
      The requirement is that the population be normally distributed, but since you usually don’t have access to the population, you use the sample as a substitute.
      Charles

  11. I have a little challenge. Trying to run a one sample T-test using around 60,000 sample size. However the data is still heavily skewed to the right. For simplicity, the mean of the whole sample is roughly 3.5 but I have data points as big as 3500. Shapiro Wilk’s Normality test was failed. I had thought of using non-parametric test but not sure because I am assuming I’m still good to go with the T-test given the Central Limit Theorem. Do you advise I drop the outliers and if yes how many deviations away from the mean? Or do you recommend I continue with the T-test with them?

  12. Charles, excellent summary. Can I please expand on your May 15, 2020 exchange with Alina? Before using the t-test, I was using the Inter-Quartile Range test to check my sample for outliers. But does the Central Limit Theorem mean I can ignore the IQR for samples of 30 or more, and proceed with the t-test regardless? What about for a sample of, say 50, that has an outlier?
    Thank you!

    • Hello Richard,
      1. Although the central limit theorem is valid, you will often run into large samples that are not normally distributed. You can use a variety of tests to see whether the sample comes from a normally distributed population (e.g. the Shapiro-Wilk test).
      2. The t-test will work pretty well provided the data is not too far from normally distributed (e.g. the data is symmetric).
      3. With a sample of size 50, even if normally distributed, you can expect that there will be an “outlier”.
      Charles
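
Point 3 of the reply above can be explored with a small simulation. This is a sketch under stated assumptions: NumPy only, the 1.5 × IQR rule as the definition of an "outlier", and a hypothetical helper has_iqr_outlier. It estimates how often a perfectly normal sample of size 50 contains at least one flagged point.

```python
import numpy as np


def has_iqr_outlier(x):
    """True if any point lies more than 1.5 * IQR beyond the quartiles."""
    q1, q3 = np.percentile(x, [25, 75])
    iqr = q3 - q1
    return bool(np.any((x < q1 - 1.5 * iqr) | (x > q3 + 1.5 * iqr)))


rng = np.random.default_rng(3)
n_sims = 5000

# Draw many normal samples of size 50 and count those with a flagged point
hits = sum(has_iqr_outlier(rng.normal(size=50)) for _ in range(n_sims))
fraction = hits / n_sims
print(f"share of normal samples (n=50) with at least one flagged point: {fraction:.2f}")
```

A sizeable share of the simulated samples contains a flagged point, so an IQR "outlier" on its own is not evidence that the data are non-normal.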

    • Thank you Charles. So it sounds like the blanket statement I’ve heard, “Because of the Central Limit Theorem, you can assume normality with a sample size of 30,” really isn’t true. Correct? I have a sample of n=150. Shapiro-Wilk indicates the data is NOT normal, with 7 outliers. I would think I would NOT trust the t-test on this sample??? THANK YOU!

      • Hello Richard,
        The t test might still be ok if the data is reasonably symmetric, but it looks like Mann-Whitney might be a better approach (assuming two independent samples).
        Charles

        • Charles, my problem keeps expanding, beyond what is reasonable to ask in this setting. Are you available for a fee for a more in depth discussion?

  13. Hi,
    Can you please help me with this question

    A random sample of n=25 with X=25 is taken from a population of 1000 with a population standard deviation of σ=30. Suppose that we know that the population from which the sample is taken is not normally distributed. Find the 95% confidence interval for the unknown population mean. 

    We’ve only studied the z-test and t-test for confidence intervals, so we definitely have to apply one of those. But I can’t figure out which one or why.

  14. Dear Charles,

    I would like to have your suggestions about this point.
    A 20-question questionnaire on a Likert scale (possible values 0,1,2) has been submitted to two groups of children (groups have unequal size).
    I have to verify if there is a statistical difference between answers of groups A and B.
    So, for each kid of the two groups, I computed the sum of scores of the 20 questions and then I did a t-test for independent samples on this sum of scores between the two groups.

    Homogeneity of variances with the F-Test is satisfied and also the data passed the normality test, as I verified both with Shapiro-Wilks and by visual inspection of QQ plots.
    But my doubt is: by considering that the raw data comes from a Likert scale with only 3 possible values for each question (0,1 or 2), is it correct to make a parametric test or should I do a non-parametric test?

    Moreover, in case the t-test assumptions were violated, which is the most appropriate test to perform (Welch’s, Kruskal-Wallis…)?

    Thank you very much!
    Best Regards

    • Hi Piero,
      Since you are comparing the total scores and the assumptions are satisfied, then I think that you are safe to use the two independent sample t test.
      If the assumptions were not met, I would try the Mann-Whitney test. Welch’s ANOVA and Kruskal-Wallis are used when the assumptions for ANOVA are not met.
      Charles

  15. This is very interesting and to deal with the outlier or data normality issues, what if I plug in median value instead of mean and median absolute deviation instead of standard deviation while calculating the t test results. What are your thoughts around this approach?

  16. Dear Charles
    I’ve been analysing data gathered from a questionnaire given to tourists. The data is not normally distributed and has many outliers. In order to identify a representative estimator, instead of the mean, I applied the Huber M-estimator. Do you think that was a good option?
    Right now I want to compare the expenses of 2 groups, male=213 and female=188. Both groups violate the normal distribution and equal variances assumptions. What should I do? I’ve already tried several normalisation procedures, but the only one that fits is log10, which seems to me a good option since I have expenses =0.
    I’ve read your suggestions and right now I have the following doubt: can/should I apply the Yuen-Welch test or prefer the non-parametric tests, namely Mann-Whitney? Are there other options?
    Thank you in advance for the suggestions that you can share.

    • Isabel,
      Generally the Mann-Whitney would be used, although you need to interpret the results properly since the homogeneity of variances assumption is not met. You might be able to use a t test if normality is not too badly violated (e.g. if there is symmetry). Yuen-Welch is also possible if the normality assumption violation is due to outliers.
      Charles

      • The violation of normality is due to outliers. On the other hand, doesn’t Mann-Whitney also require equal variances? If so, maybe it would be better to apply Kolmogorov-Smirnov (KS), which doesn’t have this kind of assumption. Which would you prefer: KS or Welch’s test?
        Thank you

        • Isabel,
          Mann-Whitney doesn’t require equal variances, but if you plan to use it to test the medians, then it does require equal variances.
          All things being equal, I would use Welch’s test. If outliers are the problem, I suggest that you first determine why you have these outliers.
          Charles

    • Jyotsana,
      They change the results of the analysis. E.g. suppose you calculate the p-value to be .03, a significant result. But because the assumptions of the test have been violated it turns out that the p-value is actually .12, which is not a significant result. Unfortunately it is not easy (or possible) to figure out how far off the p-value actually is.
      Charles
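
One way to get a feel for how far off a p-value can be is simulation. The sketch below is an assumption-laden illustration, not a general result: it draws small samples (n = 10) from a strongly skewed lognormal distribution whose true mean is known, runs a one-sample t-test of that true mean, and records how often the test rejects at the nominal 5% level.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
n, n_sims, alpha = 10, 5000, 0.05

# The mean of a lognormal(0, 1) distribution is exp(0.5), so the null
# hypothesis tested below is true in every simulated sample.
true_mean = np.exp(0.5)
samples = rng.lognormal(mean=0.0, sigma=1.0, size=(n_sims, n))

# One t-test per row of simulated data
pvalues = stats.ttest_1samp(samples, popmean=true_mean, axis=1).pvalue

# If the assumptions held, this would be close to alpha
rejection_rate = float(np.mean(pvalues < alpha))
print(f"nominal level: {alpha}, observed rejection rate: {rejection_rate:.3f}")
```

With skewed data and a small sample the observed rate generally comes out above the nominal level, which is the sense in which the reported p-value cannot be taken at face value.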

  17. Hi Sir,

    I want to compare mean differences of the same sample before and after intervention. I got 48 samples. I tested for distribution of the differences using Shapiro-Wilk Test and got a result of normally distributed differences. But then testing for outliers in the differences, I got one outlier score. Can I just include the outlier and proceed with paired sample T-test? Or should I use Wilcoxon signed-rank test?

    Follow-up question, what is the number of samples to consider when choosing between parametric and nonparametric tests? Thanks!

  18. Hello.

    I have a slightly different question… In the 5 options at the top of the page for dealing with data that does not conform to a normal distribution, it is still not clear to me how best to deal with outliers when performing a 2-sample t-test to compare means.

    For my problem, I want to compare the means of two groups (n=40) that are normally distributed except for 3 points that are clear outliers and significantly affect the mean and standard deviation.

    Is it appropriate to simply throw these outliers out and ignore them for the t-test? I am reluctant to use a non-parametric test or transform the data if I can help it.

    Thanks.

    • Greg,
      It depends on the impact of keeping the outliers. If the data is reasonably normal or at least symmetric, then you should be ok simply using the t test (including the outliers).
      If you do decide to remove the outliers, then you should also report the results of some test with the outliers included.
      Note that the Mann-Whitney is a reasonable test to use when you have outliers.
      Charles

  19. If the sample size is 39 in one group and 7 in the other, can I use the independent t-test,
    or is the difference between the two group sizes too large?
    And what is the largest difference in group sizes for which the independent t-test can still be used to compare the means of 2 groups?

  20. Dear Dr. Charles,

    I have to perform a T-test for two independent samples to compare an anthropometric measure between males and females populations.
    I use Shapiro-Wilk test to check Normality assumption in both samples.
    While male sample passes the Shapiro-Wilk test, the female sample doesn’t (p = 0.013).
    Homogeneity of variances is satisfied.

    Now, the parametric t-test is significant (p = 0.038, two tails), but the Mann-Whitney test resulted not significant (p=0.117, two tails).

    The female sample (the one that didn’t pass Shapiro-Wilk) has Kurtosis = 0.29, Skewness = -0.55, sample size n = 57. Even looking at QQ or Box-Plot diagrams, I am still uncertain about symmetry.

    In your opinion, which test should I trust more? Do you have any suggestion?

    Thank you very much for any help
    Best Regards
    Piero

    • Piero,
      It is a good question.
      With these values of kurtosis and skewness, I would have thought that the data would pass the SW test. Do you have a lot of ties (esp. in the female sample)? With a lot of ties, SW is not so accurate.
      If you send me an Excel file with your data, I will try to give you my judgement.
      Charles

  21. Dear Charles,
    In a between-group design, the normality assumption is met for one group and not for the other, according to tests for normality. Besides, the homogeneity of variance assumption is not violated. Can I proceed with the independent t-test even if the normality assumption is not met for one level of the independent variable, without switching to the non-parametric one?
    thank you for your cooperation.

    • Takwa,
      It depends on how far from normality the group is. If it is reasonably symmetric, then it is usually reasonable to use the t test.
      Charles

      • Dear Charles,
        The mean and standard deviation for the groups involved are (M = 8.22, SD = 3.75) and (M = 7.14, SD = 5.5) respectively.
        Shapiro-Wilk results for the two independent groups are p = .206 and p = .003 respectively.
        Are these numbers enough to know how far from normality the group is? Are there special tests for symmetry?
        Thank you for your help.

    • You can use a statistical test (such as Levene’s test) to determine whether the variances are significantly different. See the webpage
      Homogeneity of Variances

      In any case, when in doubt use the t test with unequal variances. If the variances are equal, the result of this test will be very similar to the t test with equal variances.

      Charles
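
The advice in the reply above can be sketched as follows, assuming SciPy and two simulated samples drawn with the same standard deviation (the data and the choice of the median-centered version of Levene's test are assumptions for the example). Equal group sizes are used here, in which case the two t statistics coincide and only the degrees of freedom differ.

```python
import numpy as np
from scipy import stats

# Simulated samples with equal population variances (assumed for illustration)
rng = np.random.default_rng(5)
a = rng.normal(10.0, 2.0, size=40)
b = rng.normal(11.0, 2.0, size=40)

# Levene's test for equality of variances (median-centered variant)
levene_p = stats.levene(a, b, center="median").pvalue

# t-test with equal variances (pooled) and with unequal variances (Welch)
pooled = stats.ttest_ind(a, b, equal_var=True)
welch = stats.ttest_ind(a, b, equal_var=False)

print(f"Levene p-value: {levene_p:.3f}")
print(f"pooled t-test:  t = {pooled.statistic:.3f}, p = {pooled.pvalue:.4f}")
print(f"Welch t-test:   t = {welch.statistic:.3f}, p = {welch.pvalue:.4f}")
```

When the variances really are similar, the two p-values are close, so little is lost by defaulting to the unequal variance version.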
