Two Sample t Test: unequal variances

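Theorem 1: Let x̄ and ȳ be the sample means and sx and sy be the sample standard deviations of two sets of data of size nx and ny respectively. If x and y are normal, or nx and ny are sufficiently large for the Central Limit Theorem to hold, then the random variable

$$t = \frac{(\bar{x} - \bar{y}) - (\mu_x - \mu_y)}{\sqrt{\dfrac{s_x^2}{n_x} + \dfrac{s_y^2}{n_y}}}$$

(where μx and μy are the corresponding population means) has distribution T(m) where

$$m = \frac{\left(\dfrac{s_x^2}{n_x} + \dfrac{s_y^2}{n_y}\right)^2}{\dfrac{(s_x^2/n_x)^2}{n_x - 1} + \dfrac{(s_y^2/n_y)^2}{n_y - 1}}$$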

Observation: The nearest integer to m can be used.

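An alternative calculation (Satterthwaite’s correction) of m (which has the same value) is as follows

$$m = \frac{\left(\dfrac{1}{n_x} + \dfrac{u}{n_y}\right)^2}{\dfrac{1}{n_x^2 (n_x - 1)} + \dfrac{u^2}{n_y^2 (n_y - 1)}} \qquad \text{where } u = \frac{s_y^2}{s_x^2}$$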

Observation: This theorem can be used to test the difference between sample means even when the population variances are unknown and unequal. The resulting test, called Welch’s t-test, has at most (nx – 1) + (ny – 1) degrees of freedom, the number used in the case where the variances were equal, and generally fewer. When nx and ny are equal, the value of t in Theorem 1 is the same as that in Theorem 1 of Two Sample t Test with Equal Variances; if the sample variances are also similar, the degrees of freedom are approximately the same as well.
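
For readers who want to check the calculation outside Excel, here is a minimal Python sketch of the t statistic and the degrees of freedom m from Theorem 1. It is only an illustration: the function names and the sample data are hypothetical and are not taken from the Real Statistics Resource Pack. The second function uses an algebraically equivalent form of m obtained by dividing numerator and denominator by sx⁴, so that only the variance ratio u = sy²/sx² is needed.

```python
from math import sqrt
from statistics import mean, variance


def welch_t_and_df(x, y, hypothesized_diff=0.0):
    """Return (t, m) for two independent samples with possibly unequal variances."""
    nx, ny = len(x), len(y)
    a = variance(x) / nx  # sx^2 / nx (sample variance uses the n - 1 denominator)
    b = variance(y) / ny  # sy^2 / ny
    t = (mean(x) - mean(y) - hypothesized_diff) / sqrt(a + b)
    m = (a + b) ** 2 / (a ** 2 / (nx - 1) + b ** 2 / (ny - 1))
    return t, m


def welch_df_from_ratio(x, y):
    """Same m, written in terms of the variance ratio u = sy^2 / sx^2."""
    nx, ny = len(x), len(y)
    u = variance(y) / variance(x)
    numerator = (1 / nx + u / ny) ** 2
    denominator = 1 / (nx ** 2 * (nx - 1)) + u ** 2 / (ny ** 2 * (ny - 1))
    return numerator / denominator


# Hypothetical data, used only to exercise the functions
x = [13, 17, 19, 11, 20, 15, 18, 9, 12, 16]
y = [12, 8, 6, 16, 12, 14, 10, 18, 4, 11]

t, m = welch_t_and_df(x, y)
print("t =", t)
print("m =", m, "(equal-variance df would be", len(x) + len(y) - 2, ")")
print("m from ratio form =", welch_df_from_ratio(x, y))
```

The value of m always lies between the smaller of nx – 1 and ny – 1 and the equal-variance value (nx – 1) + (ny – 1).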

Real Statistics Function: The Real Statistics Resource Pack provides the following supplemental function.

DF_POOLED(R1, R2) = degrees of freedom for the two sample t test for samples in ranges R1 and R2, especially when the two samples have unequal variances (i.e. m in Theorem 1).

Excel Function: Excel provides the function TTEST to handle the various two sample t-tests.

TTEST(R1, R2, tails, type) = p-value of the t-test for the difference between the means of two samples R1 and R2, where tails = 1 (one-tailed) or 2 (two-tailed) and type takes the values:

  1. the samples have paired values from the same population
  2. the samples are from populations with the same variance
  3. the samples are from populations with different variances

These three types correspond to the Excel data analysis tools

  • t-Test: Paired Two Sample for Means
  • t-Test: Two-Sample Assuming Equal Variances
  • t-Test: Two-Sample Assuming Unequal Variances

Note that the type 3 TTEST uses the value of the degrees of freedom as indicated in Theorem 1 unrounded, while the associated data analysis tool rounds the degrees of freedom as indicated in the theorem to the nearest integer. We will explain the type 1 TTEST in Paired Sample t Test.

This function ignores all empty and non-numeric cells. In what follows, the value of alpha is assumed to be .05.
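
To make the meaning of the type 2 and type 3 p-values more concrete, the following Python sketch mimics what TTEST is described as doing for independent samples. It is an assumption-laden illustration, not Excel's actual implementation: the function names are hypothetical, the data are made up, and the t distribution tail area is approximated by numerical integration (Simpson's rule) rather than by a library routine. The one-tailed value is taken as half the two-tailed value.

```python
from math import exp, lgamma, log, pi, sqrt
from statistics import mean, variance


def t_pdf(x, df):
    """Density of the t distribution with df (possibly fractional) degrees of freedom."""
    log_c = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * log(df * pi)
    return exp(log_c - (df + 1) / 2 * log(1 + x * x / df))


def t_two_tailed_p(t, df, steps=2000):
    """Approximate P(|T| > |t|) using Simpson's rule on [0, |t|] (steps must be even)."""
    b = abs(t)
    if b == 0:
        return 1.0
    h = b / steps
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = 2 * total * h / 3  # area between -|t| and |t|
    return max(0.0, 1 - central_area)


def ttest_p(x, y, tails=2, equal_var=False):
    """p-value for two independent samples: equal_var=True ~ type 2, False ~ type 3."""
    nx, ny = len(x), len(y)
    vx, vy = variance(x), variance(y)
    if equal_var:
        df = nx + ny - 2
        pooled = ((nx - 1) * vx + (ny - 1) * vy) / df
        se = sqrt(pooled * (1 / nx + 1 / ny))
    else:
        a, b = vx / nx, vy / ny
        se = sqrt(a + b)
        df = (a + b) ** 2 / (a ** 2 / (nx - 1) + b ** 2 / (ny - 1))  # unrounded
    t = (mean(x) - mean(y)) / se
    p_two = t_two_tailed_p(t, df)
    return p_two if tails == 2 else p_two / 2


# Hypothetical data, used only to exercise the functions
x = [13, 17, 19, 11, 20, 15, 18, 9, 12, 16]
y = [12, 8, 6, 16, 12, 14, 10, 18, 4, 11]

print("type 2, two-tailed:", ttest_p(x, y, tails=2, equal_var=True))
print("type 3, two-tailed:", ttest_p(x, y, tails=2, equal_var=False))
```

With equal sample sizes the two versions share the same t statistic, so any difference in p-value comes only from the smaller degrees of freedom in the unequal variances case.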

Example 1: In Example 1 of Two Sample t Test with Equal Variances, we assumed that the population variances were equal since the sample variances were almost the same. We now repeat the analysis assuming that the variances are not necessarily equal.

We use the Excel formula TTEST(A4:A13,B4:B13,2,3). The first two parameters represent the data for each sample (without labels). The 3rd parameter indicates that we desire a two-tailed test and the 4th parameter indicates a type 3 test. Since

TTEST(A4:A13,B4:B13,2,3) = 0.043456 < .05 = α

we reject the null hypothesis. Note that if we use the type 2 test, TTEST(A4:A13,B4:B13,2,2) = 0.043053, the result won’t be very different, which is consistent with our assumption that the population variances are almost equal.

Example 2: We repeat the analysis from Example 1 but with different data for the new flavoring.

Figure 1 – Sample data and box plots for Example 2

Clearly, the sample variances are quite unequal. Using the T.TEST function with type = 3 we get

T.TEST(A4:A13, B4:B13, 2, 3) = 0.05773 > .05 = α

and so this time we cannot reject the null hypothesis (for the two-tailed test). Note that if we had used the test with equal variances, namely T.TEST(A4:A13, B4:B13, 2, 2) = 0.048747 < .05 = α, then we would have rejected the null hypothesis.

We can also use Excel’s t-Test: Two-Sample Assuming Unequal Variances data analysis tool to get the same result (see Figure 2).

Figure 2 – Data analysis for the data from Figure 1

Observation: Generally, even if one variance is up to 4 times the other, the equal variance assumption will give good results. This rule of thumb is clearly violated in Example 2, and so we need to use the t test with unequal population variances.
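
The rule of thumb above can be sketched in a few lines of Python. This is only an illustration under the stated rule (larger sample variance at most 4 times the smaller); the function names, the threshold parameter and the data are hypothetical, and a formal test of equal variances may be preferable in practice.

```python
from statistics import variance


def variance_ratio(x, y):
    """Ratio of the larger sample variance to the smaller one (always >= 1)."""
    vx, vy = variance(x), variance(y)
    return max(vx, vy) / min(vx, vy)


def suggested_ttest_type(x, y, max_ratio=4.0):
    """Return 2 (equal variances) if the ratio is within the rule of thumb, else 3."""
    return 2 if variance_ratio(x, y) <= max_ratio else 3


# Hypothetical samples: same spread, then a much larger spread
a = [1, 2, 3, 4, 5]
b = [11, 12, 13, 14, 15]
c = [0, 5, 10, 15, 20]

print(variance_ratio(a, b), suggested_ttest_type(a, b))
print(variance_ratio(a, c), suggested_ttest_type(a, c))
```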

Real Statistics Data Analysis Tool: The Real Statistics Resource Pack provides a data analysis tool called T Tests and Non-parametric Equivalents, which combines the analyses for equal and unequal variances, as well as providing confidence intervals and the Cohen effect size. A second measure of effect size is also provided, which we will study in Dichotomous Variables and the t-test.

Example 3: Repeat Example 2 using the Real Statistics data analysis tool.

Press Ctrl-m and select T Tests and Non-parametric Equivalents from the menu. Fill in the dialog box that appears as shown in Figure 3.

Figure 3 – Dialog box for T Tests and Non-parametric Equivalents

Choose the Two independent samples and T test options and press OK. The output appears in Figure 4.

Figure 4 – Real Statistics data analysis for data from Figure 1

We can see from Figure 4 that the degrees of freedom have been reduced from 18 to 11.208 under the assumption of unequal variances. We can get this same value by using the formula =DF_POOLED(A4:A13, B4:B13).

Observation: The input data for the two independent sample t test can have missing data, indicated by empty cells or cells with non-numeric data. Such cells will be ignored in the analysis.

154 Responses to Two Sample t Test: unequal variances

  1. Colin says:

    It seems (Sp)*sqrt(1/n1 + 1/n2) = sqrt((S1)^2/n1 + (S2)^2/n2). But I cannot prove it.
    Sp means the square root of the pooled variance
    sqrt() is the square root function in Excel

    The package you provide uses “sqrt((S1)^2/n1 + (S2)^2/n2)” to calculate the standard error and t-value for both cases of “equal variance” and “unequal variance”

    But I usually use “(Sp)*sqrt(1/n1 + 1/n2)” to calculate the standard error and t-value for “equal variance” and “sqrt((S1)^2/n1 + (S2)^2/n2)” to calculate the standard error and t-value for “unequal variance”.

  2. Ding says:


    I have several questions after reading your post.

    1. Is there a scientific way (equation or theory) that clearly defines in which case variances of two data sets are equal or unequal?

    2. I am not sure if I get your point: if the two values obtained from type 2 and type 3 (Excel t test) do not differ greatly, then it suggests equality of variance? If not, the opposite?

    3. What does the considerable reduction of df mean in your example? Sorry, I am not from a mathematics background. Can you explain it to me in detail?

    4. I have two independent samples, n=6, to compare in the Excel t test. But I found no evidence to prove their variance equality. Can you suggest some ideas?

    Thank you very much for your help. I look forward to your reply.

    Have a good day.


    • Charles says:


      1. There are a number of techniques for determining whether variances of two (or more) data sets are approximately equal, including graphical approaches and the commonly used Levene’s test. See the webpage for more information.

      2. No, even when the type 2 and type 3 p-values are very similar, the variances may be noticeably different. Generally the variances need to be very different before you will see any real difference between the type 2 and type 3 tests.

      3. A smaller value of df changes the p-value. Obviously for the example I have given the smaller value of df doesn’t change the p-value that much.

      4. In this case, use the unequal variance test. With such a small sample, there is also a risk that the normality assumption may not be satisfied, in which case you may want to use a non-parametric test such as Mann-Whitney (see the webpage


  3. Donna says:

    Are you beginning with a significance level of 5% or 10% for your 2-tailed test?

    What if the value you get is 0.03 for the t-test? For example
    TTEST(A4:A13,B4:B13,2,2) =0.03
    Do you reject the null hypothesis? What about the 2 tails?
    Do large values have to be taken into consideration? What If I get 0.98?
    Thank you for your help!

    • Charles says:

      The TTEST assumes that alpha = 5%.
      If TTEST(A4:A13,B4:B13,2,2) = 0.03 then the null hypothesis is rejected since .03 < .05. This is the two-tailed test (since the third argument is 2). If you want the one-tailed test you use the formula TTEST(A4:A13,B4:B13,1,2), which will have a value which is half that of the two-tailed test, and so once again you would reject the null hypothesis (since .03/2 = .015 < .05). If you get a p-value = 0.98 you couldn't reject the null hypothesis since .98 > .05.

  4. Olukayode Adedayo Babarinde says:

    I have samples from the same source. I have used two different methods to analyse them, and I am trying to compare the two methods.
    1. Can I use paired t-test?
    2. Are the samples dependent or independent?
    3. What do I do if the null hypothesis is rejected when t-calculated is greater than t-critical but p-value is greater than 0.05?
    4. Tell me which method to use.
    thank you

    • Charles says:

      1. It depends on what you mean by the samples are from the same source. If “source” means “population”, then probably you shouldn’t use the paired sample t test. But if “source” means the same “subjects” then the paired test is the one you should use. See for more details.

      2. This is related to the first question. You need to supply more information before I can answer this question.

      3. If you are using a right-tailed test then it should never happen that t-calculated is greater than t-critical but p-value is greater than 0.05. If you are using a left-tailed test, then this just means that you can’t reject the null hypothesis.

      4. See my answer to your first question.


  5. SAM says:

    Can you please help me with my research study? I don’t know how to calculate the p-value using the t-test.
    thank you! 🙂

  6. Jam says:

    t-Test: Two-Sample Assuming Unequal Variances

    Mean                           0.205416667   0.184527932
    Variance                       0.000385934   0.000686411
    Observations                   20            19
    Hypothesized Mean Difference   0
    df                             33
    t Stat                         2.805852172
    P(T<=t) one-tail               0.004176129
    t Critical one-tail            1.692360258
    P(T<=t) two-tail               0.008352257
    t Critical two-tail            2.034515287

    • Charles says:

      Assuming that alpha = .05, since p-value (two-tailed) = 0.00835 < .05 = alpha, you reject the hypothesis that the two populations (from which the samples came) have the same mean. Charles

  7. Tanya says:

    May I ask what the formula for the df (degrees of freedom) is? I noticed that the value for the df is different when I use the t-test with unequal variances and with equal variances.

  8. Tripti Sharma says:

    Hello Charles,
    I would like to know whether I am using the right t test for my data. I have two data sets of male life span with means 31.15 and 19.05, variances 287.1 and 217.6, N1 = 79, N2 = 78. I am using the two sample t test assuming equal variances. The other data set is the number of eggs laid, with means 36.59 and 15.1, variances 1130.399 and 238.32, N1 = 41, N2 = 10. For this data set, I am also using the two sample t test assuming equal variances. Which p-value should I consider for my result, one-tail or two-tail? Am I using the correct statistical analysis? If not, please suggest what I should use.

    • Charles says:

      If your goal is to determine whether the two populations have the same mean, then the two sample t test assuming equal variances seems like a good choice provided the assumptions for the test are met (principally that the data is not highly skewed).

      For the second example, I suggest that you use two sample t test assuming unequal variances.


  9. Dawn Wright says:

    Hi Charles,
    I noticed the formula for the two sample, independent t-statistic calculates the absolute value [=(ABS(H5-H6-J3))/G16] . Other software packages I have used do not use the absolute value and thus can produce negative t-statistics. Is this something I am misunderstanding?

    • Charles says:

      The sign is not particularly important since it depends only on which of the means is subtracted from the other. The p-value is identical. I used the absolute value since Excel’s two tailed formula — TDIST(t,df,2) or TDIST.2T(t,df) — requires a positive value for t.

  10. Niklas Leuschner says:

    Hello, I am not sure what T-Test to use for one of my experiments. I am measuring if there is a significant difference in the abundance of a species in two different habitats.

  11. nanthinie says:

    hi sir,

    I’m doing 2 independent samples mean t-test with unequal variances to verify the comparison in the performance of the GDP Growth between 2 countries (Jordan & Morocco).. I’m not sure of which sign to use in Null Hypothesis and also in Alternative Hypothesis.. Is it = & ≠ or ≤ & > or ≥ & < ?

  12. IHateMath says:

    Can you post the Unequal variance with a simpler examples?

  13. Quinton says:

    Good Afternoon

    I am trying to justify that the current method of sample taking is not representative. I have data from an online analyser that analyses the material/ore as it is produced. We then take a few grab samples for laboratory analysis. I am not sure, but I think the two sample t-test would be the best fit for me. FYI I have done the F-test for the two samples, and the null hypothesis that the variances of the two samples are equal was rejected. I now want to perform the t-test to show that the sample means are not the same, thus justifying that the grab samples are not sufficient and we need continuous online samplers. Am I on the right track? Please help

    • Charles says:

      If I am understanding correctly, you want to use the t-test for independent samples with unequal variances to test whether the two samples come from populations with the same mean. This seems like a reasonable approach to determine whether the grab samples are sufficient. Since you have already found a significant difference in the variances, you already have evidence that the grab samples are not sufficient.

      • Quinton says:

        Hi Charles, Thank you sooo so much for replying. To put some more clarity: I have more than 40 000 data points from an online analyser. This comes from one day's production. Then I have a grab sample of 50 rocks (ore particles) that I re-analysed. I basically put it over the analyser 5 times, so I have 250 datapoints. If this sample was representative, I assume that the cumulative histograms of the two distributions (40 000 and 250 datapoints) should lie more or less on the same graph. Visually this is not the case. With my limited knowledge of inferential statistics, the t-test with unequal variances seems to be the best option for comparing the two populations. Is this correct, since the sample sizes are different? Is there another way that I can prove that the sample is not representative in a “fancy” way? Kind Regards

        • Charles says:

          The t test is fancy enough. You can use the t test with unequal samples.

          One caution: since you have put each sample through the analyzer 5 times, the 250 datapoints are not independent, which violates one of the assumptions for the t test. You might do better averaging the 5 values for each rock to arrive at 50 data points, which you would compare with the 40,000 data points. Another, more complicated approach is to perform ANOVA with repeated measures.


  14. pi says:

    Sir, I am wondering whether I could compare the t-test, Welch and Mann-Whitney in terms of the mean.

    I am referring to the journal article “should i use nonparametric method on two apparently non normal distribution”.

    Some people said that this is not logical. However, I did find some books claiming that under additional assumptions (the two distributions have the same shape and differ only by a shift of location), Mann-Whitney can be used to compare their means.

    • Charles says:

      Generally, if you can satisfy the assumptions for the t test, you should use the t test; otherwise, provided the shapes of the two distributions are similar, you should use Mann-Whitney. The loss in power of using Mann-Whitney is pretty small even when the assumptions for the t test are satisfied, and so when in doubt you might as well use Mann-Whitney.

  15. Yow says:

    Hello. Is this suitable if I have 10 respondents, which will be taking medication and be observed for their blood pressure for 10 days, to know if the medication is significant? or should I do one t-test for each of the respondent? Not really sure.
    Sorry for the bad english.

    • Charles says:

      Unfortunately, I don’t understand your question.

    • Learner says:

      I guess, you want to study the effect of “medication” on “blood pressure” of patients (Is this medication significantly contributing for curing Blood pressure?). There might be two approaches:
      1. You need to collect data from two groups of BP patients, namely a treatment group (those who are taking medication) and a control group (without medication). To keep the effects of any other factor minimal, trials should be randomized.
      2. Collect data measuring the blood pressure of patients before and after taking medication. Again, to keep the effects of any other factor minimal, trials should be randomized.

      So, finally you will have data of BP of two different groups. You can apply t-test. I believe for first case; you can apply independent sample t-test (with unequal variance) and for second case you can apply paired t-test.

      If Professor approves the approach.

      • 4th Year Psych says:

        I think it’s actually a within-subjects t-test, comparing pre-treatment BP with post-. I think you want to calculate the mean and SD of the BP for your 10 participants before they started the medication, and again after. Then you would compare those.

  16. Athina Crilley says:

    Hello, I’m doing a t-test on part of a set of data using excel
    1 mean is 1.6 with SD of 0.79, the other has a mean of 6.6 and a SD of 1.34. i’ve done the t test, selecting the first mean and SD as ‘array 1’ and the second lot as ‘array 2’. it’s a two-tailed test with unequal variance. I’ve got a p value of 0.48, which seems very high. have i done it correctly?

    • Charles says:

      No, the arrays should contain the raw data, not the mean and standard deviation. You can perform the t test using TDIST or T.DIST using the means and standard deviations.
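
As a rough sketch of the calculation from summary statistics mentioned in the reply above, the following Python code computes the unequal variances t statistic and degrees of freedom from the means, standard deviations and sample sizes. The function name is hypothetical, and the sample sizes (5 and 5) are an assumption made purely for illustration, since they are not given in the question. The resulting t and df could then be passed to a t distribution function such as T.DIST.2T to get the two-tailed p-value.

```python
from math import sqrt


def welch_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """Return (t, df) for the unequal variances t test from summary statistics."""
    a = sd1 ** 2 / n1
    b = sd2 ** 2 / n2
    t = (mean1 - mean2) / sqrt(a + b)
    df = (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))
    return t, df


# Means and SDs from the question; n1 = n2 = 5 is a made-up assumption
t, df = welch_from_summary(1.6, 0.79, 5, 6.6, 1.34, 5)
print("t =", t, "df =", df)
```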

  17. Niez says:

    I have a problem with my research. My lecturer told me to use both the equal & unequal t-test, but I don’t understand the difference between the equal & unequal t-test.

    My research was about the efficiency of conventional and Islamic banks from 2008 to 2015. The efficiency was measured by four (4) financial ratios.
    1) return on asset between conventional & islamic bank
    2) net profit margin between conventional & islamic bank
    3) debt ratio between conventional & islamic bank.
    4) earning per share between conventional & islamic bank.

    Is it logical to use both equal & unequal to run the data in Excel, and how?

    • Charles says:


      In this situation, equal and unequal refer to the variances of the two samples (actually the populations, but the samples serve as surrogates for the populations). You can calculate both versions (equal and unequal variances) of the t test using either Excel’s data analysis tools or the Real Statistics data analysis tools. For more information, see the referenced webpage, or the following webpage for the equal variances version of the t test.

      The t test is used to determine whether there is a significant difference in the means between two samples. This sounds like a reasonable test to use for the problems you have listed.


  18. Jaclyn says:

    I have 2 questions:
    1-why would I get 2 different T values when I run ttests in excel and spss?
    2- I have a student who did a pre and post test but did not match up the ID numbers correctly. What kind of ttest can she use? I am assuming she cannot use paired? Thanks

    • Charles says:


      1. You should get the same values. If you send me an Excel file with your data and results I will try to see what has happened.

      2. The student will need to match up the ID numbers to be able to run any type of analysis.


  19. Cardre says:

    Hi Guys,

    I am doing research involving 65 samples at two different cycles, and seeing the impact these cycles (A & B) would have on the samples. Which t-test would be best to use and why?

  20. Andrea says:

    I am comparing three types of breathing during the shooting performance, but i have no the same number of people in each groups. So the situation seems like this:
    A:1, 2, 3, 4, 5, 6, 7, 8, 9
    B:1, 2, 3, 4, 5, 6, 7
    C: 1, 2, 3, 4
    Is it possible to evaluate it by t-test? What is the method???

    • Charles says:

      You don’t need equal sample sizes to use the t test. But you are comparing more than two samples, and so you need to use one-way ANOVA instead of the t test. See the following webpage: One-way ANOVA.

  21. Peaches says:

    How would I write up the results of a Two-Sample Assuming Unequal Variances with the results with the mean (variable 1 -3.11; variable 2 – 3.04), variance 0.022 & 0.029,
    observations 159 & 332, df 351, t Stat 4.53, P(T<=t) two-tail 8.15
    I need to know how to write this information up in a detailed format.

    • Charles says:

      I have not checked to see whether the t stat and df you calculated are correct, but T.DIST.2T(4.53,351) = 8.10E-06 and not the p-value you report (the E-06 part is important).

      When you report your results, you need to relate the statistical results to the real-world problem you were studying. I will suppose, for illustrative purposes, that you are testing whether a particular training course is effective in reducing accidents. I will also suppose that the p-value is 8.10E-06, and so you have a significant result.

      Using APA-like guidelines you would say something along the following lines:

      On average participants achieved better test scores after the training course (M = -3.11, SE = 0.15, N = 159) than those who did not take the training course (M = -3.04, SE = 0.17, N = 332). The difference is significant t(351) = 4.53, p < .001 (two-tailed); this represents a xx-sized effect of d = xx.

      Note that I used the standard error instead of the variance. You should also report the effect size.

      Charles

      • Peaches says:

        The variables are positive numbers. Would I use variance instead of standard error?

        Thank you.

        • Charles says:

          That the variables are positive numbers is not relevant, You can certainly use the variance, but generally the standard error is reported.

  22. Serna says:

    Hello sir Charles!
    I am one of those people who gets their brains crumpled like hell when it comes to statistics.
    I just want to know what t test I should use to know if there is a significant difference between my experimental values and a fixed theoretical value.
    for example, exptl values are 1, 2, 3 and my theoretical values are 2, 2, 2

  23. Ravi says:

    Hi Charles,
    great and very helpful website!
    I just have a small question: I calculated the total bacterial numbers in the blood of 20 boys at three different time points i.e., at age 1 yr, 3 yr and 5 yr. I am confused which type of t-test should I use to calculate the statistical difference between the different time points?

    Many thanks in advance.


    • Charles says:

      Hi Ravi,

      The t test can only be used with pairs and not triplets. Thus you would have to perform up to three paired t tests: 1 yr – 3 yr, 1 yr – 5 yr, 3 yr – 5 yr. With three tests, there is more chance for experimentwise error, and so if you usually use alpha = .05, you would have to reduce the value of alpha, say to .05/3 = .0167.

      The usual approach in this case, is to start by using a different test, namely Repeated Measures ANOVA. This will test whether there is a significant difference between all three times. If there is, then there are follow up tests to pinpoint where the differences lie.

      I suggest that you look at the ANOVA and Repeated Measures ANOVA part of the website.


      • Ravi says:

        Dear Charles,
        Thank you so much for your quick response. I got your point!
        By the way, if I wish to compare the data of cell numbers only between two time points i.e., 1yr and 5 yr, which type of excel t-test shall then be appropriate?
        Many thanks once again.

  24. Eric says:

    Hi Charles,

    With unequal variances, which degree of freedom is reported in the text describing the results ? The adjusted Welch df or the “natural” df (n1+n2-2) ?

    Example : (t(df?)=2.78; p=0,004)

    Can’t find an answer on this on the web or in textbooks…

    Thanx in advance for considering this,


  25. Tevita says:

    Sir, using the two sample t-test(welch) to compare the mean of two samples…how do I work out the standard deviation for both. Thanks.

    • Charles says:

      The standard deviation for data in range R1 is calculated by STDEV.S(R1).

      The standard error for the two sample t-test (Welch) is the denominator of the first formula in Theorem 1 of the referenced website.


      • Fofo says:

        Hi Charles
        I am not good with statistics, but I found out that Excel has a t-test function, and I used it to calculate the t-test for my data. However, I don’t know how to interpret the t-test result or what it means. Would you please help me with that?

        • Charles says:

          The t test tests whether the means of two populations are equal based on a sample from each population. Also look at the following webpage for more information:
          Two Sample t Test

  26. Alex says:

    Hi Charles if the formula for Equal Variances is T= (xbar1 – xbar2) – (mu1 – mu2)/ SQRT (1/n1+1/n2), then what would be the formula if it were unequal variances?

  27. Soledad Torres-Guijarro says:

    Suppose I am comparing two data sets, x1 and x2. The sample mean of x1 is larger than the sample mean of x2, their variances are different, and my hypothesis is mean(x1)>mean(x2). If I got it right, T.TEST(x1;x2;1;3) gives the probability of mean(x1)>mean(x2). Then why does T.TEST(x2;x1;1;3) give the same result? I would expect T.TEST(x2;x1;1;3) to be smaller than T.TEST(x1;x2;1;3).
    Thank your for your help, and for this useful tool and explanations.

    • Charles says:

      This function doesn’t return the probability that mean(x1)>mean(x2). It returns the p-value of the test, which is different. In fact, if you flip the x1 and x2 values, the result for the test remains the same. See Null and Alternative Hypothesis for more details about how to interpret a p-value.

  28. Leonie says:

    Hi, I am new to statistics so would like some help please

    If I have a balance intervention which all participants underwent, and would like to establish and analyse whether the right leg or left leg was more effective at improving in balance, am I correct in using a t-test for independent samples.

    Also how do I assume equal or unequal variance. All of the figures are different and varying therefore do I use unequal variance. I would like to use excel to analyse my data.

    Many thanks.

    • Charles says:

      Assuming that you are comparing each person’s right leg with his/her left leg, you should use a paired t test. This is because the right and left legs are not independent (since they belong to the same person).

  29. Cait G says:

    Hi Charles!
    I am completing research analysis in regards to the effect of different variables on the level of mental illness stigma. I am testing how one’s age affects the level of negative stigma, as well as how one’s previous exposure to mental illness affects the level of negative stigma.

    I am at the point in my analysis where there was no significant correlation between age and level of stigma, so my professor suggested dividing the ages into two groups (a younger group and an older group) and performing a t-test on the stigma results in order to see if there is any relationship there. So I have done that in Excel, I have selected the stigma results from each age group and compared them in a t-test two sample unequal variance test. My question is: in the results, the only thing I can see that is relevant to a p-value for significance is listed as:

    P(T<=t) one-tail 0.284053007
    t Critical one-tail 1.71088208
    P(T<=t) two-tail 0.568106014
    t Critical two-tail 2.063898562

    I know normally a p-value is a lower case p, so are those upper case P's not a p-value? If not, what am I doing wrong in order to find the statistical significance of my findings? Also, how do I decide whether or not I want a one-tail or two-tail value (as they are very different)?

    Thank you!

    • Charles says:

      Dear Cait,
      The uppercase P is indeed the p-value. Generally, you should use the two-tailed t test. In this case, both the one and two tailed tests yield a result which is not significant. See Null Hypothesis for more details about the number of tails.

  30. Pixie bliu says:

    hi there,

    I have sampled 2 different habitats to determine whether tree species vary between the 2 sites. To do a ttest am i putting the raw data in or the mean, variance worked out frm each habitat.?

    Thank you

    • Charles says:

      You should generally conduct the t test on the raw data and not the mean/variance. Without knowing more about the specifics of your scenario I can’t say much more.

  31. bri says:

    I had my students run an experiment over 15 days where they measure the growth (budding) of lemna plants under different colors of light using white as a control. They then graphed the raw data (5 trials of each color), then got the slope of the linear trendline as the rate. I want them to compare each rate of growth to white using a t-test.

    my expectation was each graph would have the 5 trials for that color (so 5 lines = 5 rates). Then they were basically comparing av rate for red to av rate for white using a t-test, then av rate for blue to av rate to white, etc. They were t-testing just the 5 averages to the other 5 averages. My question is for degrees of freedom. Would it be 5-2=3, or would they need to use all of the data points (so 15 days x 5 trials = 75 -2 = 73DF)?

    Also, when excel does the t-test it calc the p value so does it already take DF into account?

    Whereas if they used an online calculator, they’d need to calc DF because they’d be given the t-calc, correct?

    thanks so much!

    • Charles says:


      If I understand the problem correctly, you are comparing averages of one color vs white over the 15 days. If so, I would use df = 5+5-2 = 8 if this is an independent samples test (5 plants getting white light vs 5 different plants getting red light) and df = 5-1 = 4 if this is a paired samples test (5 plants getting white light and separately getting red light).

      You could instead use ANOVA on the averages taking all 5 colors into account. You could also use repeated measures ANOVA instead of taking averages. Finally you could use ANOVA with a fixed factor for color and repeated measures factor for time.

      When Excel does the t test on the raw data (via T.TEST or TTEST) it calculates the df inside the software. When it uses the T.DIST, TDIST and other distribution functions, the user needs to supply the value for the df.


  32. Maria Wachira says:


    I have two distinct samples: ESG performance of South African companies and ESG performance of Mauritian companies. I ran a t test to establish whether they are distinct from each other, and I can reject the null hypothesis. However, if I want to know whether the performance from one sample (i.e. South Africa) affects the ESG performance of the other sample (Mauritian companies), what should I do? I would be grateful for any assistance.

    Thank you!


    • Charles says:

      Putting statistics to the side, please give me an example (or examples) of how ESG performance in South African companies can affect the performance of Mauritian companies.

      • Maria Wachira says:

        Hi Charles. Thank you for responding. Essentially, using organizational theory, in particular institutional theory, we say that companies that operate in close proximity to each other tend to conform to certain established norms of behavior. In some cases, businesses may follow practices adopted by larger and more established firms, which is what we tend to call mimetic pressure. So the grounds for forming the hypothesis are that, since companies in South Africa are in many ways more established than Mauritian companies, Mauritian companies could imitate their practices (in my case ESG reporting). Hope that makes sense.

        • Charles says:

          Thanks for your clarification.
          Regarding your original question, first we need to decide on how to measure “whether the performance from one sample affects the performance of the other sample”. It is easy to measure “correlation”, but it is more difficult to measure “causation” or “influence”. I don’t really know how you can measure this.

          • Maria Wachira says:


            Thank you very much. Yes, I have carried out correlation but I see perhaps I may need to look beyond statistical testing and carry out interviews with regulators of accounting information or specific case studies in these countries. But thank you so much for your help.

  33. Vijay Kumar Keerthivasan says:


    This was a very helpful article.
    I have the experimental data on temperatures from 2 sets of experiments that involve heating up liquids under specific conditions. One set of data is for water, where I did 5 experiments and recorded the final temperature values. The other set of data is for salt water (brine), where I did 6 experiments and recorded the final temperature values. I would like to compare the results of water and brine. From chemical data, the final temperatures of brine are expected to be lower than those of water. So I know that I would like to do a one-sided t-test.
    However, I am new to statistical methods and was wondering how I can use Excel to do such a test. Should my ‘Variable 1 Range’ in the Excel data analysis tool be water or brine, or does it matter for a one-sided test? I want to make sure that I am checking for the case that brine temperatures are lower than water and not checking for the reverse scenario. Thanks a lot for your help.

    • Charles says:

      It shouldn’t matter which variable you list first. You will get the same result in either case. In fact you will see both the 1 tailed and 2 tailed results.
      When you say that you have done 6 experiments, do you mean 6 repetitions of the same experiment or 6 different experiments?

  34. Moustafa says:

    I have two fungal organisms: one is the wild type (parent strain) and the other is a mutant type of the same strain. I would like to compare gene expression in the two organisms. Which type of t-test should be used to determine whether the difference in gene expression is significant?

    • Charles says:

      I would need to know more details, but it sounds likely that you need a two sample t test.

  35. Ben Kerns says:

    I’m conducting a test to determine if there is a quality difference between diaper brands. Unfortunately, my sample size is 12: 7 participants for size 3 and 5 participants for size 4. My original plan was to conduct a t-Test: Paired Two Sample for Means (H0: mu BENCHMARK BRAND – mu PROPOSED BRAND = 0, HA: mu BENCHMARK BRAND – mu PROPOSED BRAND ≠ 0) at the 5% level of significance. However, after I ran the test in Excel, my two-tail p-value is higher than I’d like. This is leading me to think I should use the two sample t-Test: unequal variances. Regardless, my question is: with a small sample size, which of the statistical tests mentioned above is ideal for comparing two samples? Or do you need more info to answer?

    • Charles says:

      Irrespective of the outcome, you can’t use the paired t test when the samples are independent. You need to use the independent t test. You are correct that you shouldn’t expect too much with such small samples (unless the sample means are quite different). You can check the power of the test as described on the Power of the t test.

  36. Abiola says:

    Hi sir,
    I am to determine if factors affecting employee turnover are the same as factors affecting employee retention. I have a frequency distribution table stating how many respondents consider each factor relevant to retention and turnover. So my data arrays are frequency counts for each factor. Array 1 for retention and Array 2 for turnover for the same factor. Example
    Pay 28% 14%
    Met expectations 16% 12%
    Trainings 8% 4%
    How do I apply the t-test to this analysis?

  37. John says:

    Thanks for the great article! I do have one follow-up question, however. I am still unclear as to which test to use based on the number of observations.

    To give an example, I am looking to compare two columns of data; column A holds performance data before a change was made and column B holds performance data after a change was made. Both columns are for the same individual. The null hypothesis would be that there is no change in performance after the change is made. Column A has 30 observed values (n=30) and column B has 12 observed values (n=12). Is the data in column A and B still paired meaning I would use the two sample t-test for equal variances or is it unpaired due to the difference in n values meaning it would be a two sample t-test for unequal variances?

    Thanks for your time, I look forward to hearing from you!


    • Charles says:

      To use a paired test, (1) the sizes of the two groups must be the same, (2) each element in column A must be independent of the other elements in column A (in particular, they can’t be from the same subject), and (3) each pair of elements in the same row must be from the same individual.

  38. D. Johnson says:

    In the example, the T.Test (type 3) function and the Real Statistics tool both return a two-tailed p of 0.05773 — but Excel’s data analysis tool returns 0.0582. What accounts for this slight discrepancy? Thanks!

  39. Maireen Reformina says:

    Hi Sir,
    Will you help me interpret my data in t-Test: Two-Sample Assuming Unequal Variances
    t-Test: Two-Sample Assuming Unequal Variances

                                   Generic   Branded
    Mean                           2.079     2.126
    Variance                       0.070     0.024
    Observations                   11        11
    Hypothesized Mean Difference   0
    df                             16
    t Stat                         -0.512
    P(T<=t) one-tail               0.308
    t Critical one-tail            1.746
    P(T<=t) two-tail               0.616
    t Critical two-tail            2.120

    • Charles says:

      Assuming a significance level of alpha = .05, the fact that the p-value > alpha indicates that you can’t reject the null hypothesis that the samples come from populations with equal means.
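
As a rough cross-check of the Excel output in this comment, the t statistic and degrees of freedom can be recomputed from the summary values alone. This is a minimal sketch, assuming the usual Welch (unequal variance) formulas; the function name `welch_from_summary` is hypothetical, and because the means and variances in the comment are rounded to three decimals, the recomputed t is only close to the reported -0.512.

```python
import math

def welch_from_summary(mean_x, var_x, n_x, mean_y, var_y, n_y):
    """Return (t, df) for the unequal-variance two-sample t test.

    Hypothetical helper: works from summary statistics, not raw data.
    """
    se2_x = var_x / n_x  # squared standard error contributed by sample x
    se2_y = var_y / n_y  # squared standard error contributed by sample y
    t = (mean_x - mean_y) / math.sqrt(se2_x + se2_y)
    # Unrounded degrees of freedom (the value m in Theorem 1)
    df = (se2_x + se2_y) ** 2 / (se2_x ** 2 / (n_x - 1) + se2_y ** 2 / (n_y - 1))
    return t, df

# Summary values copied from the Generic vs Branded output above
t_stat, df = welch_from_summary(2.079, 0.070, 11, 2.126, 0.024, 11)

# Two-tailed critical value copied from the same output (alpha = .05, df = 16)
t_crit_two_tail = 2.120
reject_null = abs(t_stat) > t_crit_two_tail

print(round(t_stat, 3), round(df, 2), reject_null)
```

Comparing |t| with the critical value leads to the same conclusion as comparing the p-value with alpha: the null hypothesis is not rejected.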

  40. agbidi samue says:

    Which t test formula should I use to test my hypothesis if the population is 79 and 101?
    Hypothesis: there is no significant difference in mean score between male and female teachers in regards to capacity building

  41. Satarupa says:

    I have selected the return of a particular stock to study the impact of a stock split. I have taken the return for 3 months before and after the split. I want to use a t test. I also want to test whether the return after the split is higher than before. I want to do the same with another variable, i.e. turnover. Please guide me in this regard.

  42. Vivian says:

    I conducted a lab to try to reject the null hypothesis: “The rate of cellular respiration/oxygen consumption of a pea (plant) is the same as cricket (animal).”

    Would the t-test be appropriate?:
    Minutes Cricket Pea
    6 0.015 0
    9 0.035 0.04
    12 0.045 0.045

    So this 1st trial’s t-test result was .85 > .5, meaning that the difference between the rates of cellular respiration in the pea and cricket is not significant, thus we failed to reject the null, correct? And if so, when is a t-test appropriate? I referred to the link below:

    • Charles says:

      It sounds like the t test could be appropriate, but I don’t know how you calculated the value of .85 or where the .5 came from.
      By the way, how many peas and crickets were sampled and which version of the t test did you use?

  43. Kuunani says:

    So say I had a t-Test: Two-Sample Assuming Equal Variances

                                   Variable 1     Variable 2
    Mean                           4.0875         8
    Variance                       5.267857143    18.28571429
    Observations                   8              8
    Pooled Variance                11.77678571
    Hypothesized Mean Difference   0
    df                             14
    t Stat                         -1.81237697
    P(T<=t) one-tail               0.045002328
    t Critical one-tail            1.761310136
    P(T<=t) two-tail               0.090004655
    t Critical two-tail            2.144786688

    • Kuunani says:

      how would I explain this in everyday language?

      • Charles says:

        Assuming that you are performing a two-tailed test, the fact that p-value = .09 > .05 = alpha indicates that there is no statistical evidence for rejecting the null hypothesis that the samples come from populations with equal means.

        Two cautions though:
        1. The variances are not equal and so you should probably use the t test assuming unequal variances. I don’t expect the test to be that much different, but you should check this out.
        2. The calculation of the t stat doesn’t seem correct. t = the difference between the means divided by the pooled standard deviation times the square root of the sum of the reciprocals of the sample sizes. Thus t = (4-8) / (sqrt(11.78)*sqrt(1/8+1/8)) = -2.33.
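
The arithmetic in caution 2 can be reproduced with a short script. This is a minimal sketch of the pooled (equal variance) formula described in the reply; the function name `pooled_t` is hypothetical. It uses the unrounded mean of 4.0875 from the comment, so it gives roughly -2.28 rather than the -2.33 obtained with the mean rounded to 4. Either way, the result does not match the t Stat of -1.81 in the posted output.

```python
import math

def pooled_t(mean_x, var_x, n_x, mean_y, var_y, n_y):
    """Return (t, pooled_var, df) for the equal-variance two-sample t test.

    Hypothetical helper: works from summary statistics, not raw data.
    """
    df = n_x + n_y - 2
    # Pooled variance: weighted average of the two sample variances
    pooled_var = ((n_x - 1) * var_x + (n_y - 1) * var_y) / df
    # Standard error = pooled standard deviation times
    # the square root of the sum of the reciprocals of the sample sizes
    se = math.sqrt(pooled_var) * math.sqrt(1 / n_x + 1 / n_y)
    return (mean_x - mean_y) / se, pooled_var, df

# Summary values copied from the posted output
t_stat, pooled_var, df = pooled_t(4.0875, 5.267857143, 8, 8, 18.28571429, 8)

print(round(t_stat, 2), round(pooled_var, 4), df)
```

The recomputed pooled variance and df agree with the posted output, which suggests that the mean or the t Stat in the comment was mistyped.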


  44. Courtney says:

    Hello, I am currently doing a project in class based on a survey our class created. Our professor told us all to form our own hypothesis based on the data. I am having trouble creating my hypothesis and figuring out which test to perform. I really want to compare female vs. male coping mechanisms given social media. Would it be inappropriate to hypothesize that females tend to have more appropriate coping mechanisms than males when it comes to social media? Also, would I just use an unpaired t-test?

    • Charles says:

      You have not provided enough information to determine what is the appropriate hypothesis, but what you have proposed at least sounds plausible. Two sample t test could be appropriate.

  45. Lee says:

    Hey, would I be able to use the t-test with unequal variances on my data? I am comparing issues in medication (grouping them under main headings) and using the outcome (resolved by “A”), so e.g. Grp 1 med v Grp 2 med, with the outcome being resolved by “A” (being 1) or resolved by “other” (being 0). My data size also ranges from 1 to 53 (I’m also thinking of excluding those with a size < 6). Would it be possible to use the t-test with unequal variances, or would another test be more appropriate?

  46. Giridhar Kaushik R says:

    Supposing we get a p-value greater than alpha for a one-tailed t test, can we look at the t stat and t critical values to compare the two arrays?

    If yes, how do we do that?

    • Charles says:

      Sorry, but I don’t understand your question. You can test using the p-value or the critical value. The conclusion will be the same.

  47. Andy Watkin says:

    Hi, can you explain the computational formula Excel uses for the two sample mean t-test for samples with unequal variances? I’ve attached the Microsoft web address which shows the equations used but little else. In particular, what are the delta sub o, m and n variables? Thanks …Andy

  48. shane says:

    How do I determine the significance level .05 in the test statistic using unequal variances? I have 50 and 100 observations…

  49. shane says:

    What is E in the two-tail p-value?

  50. Nhi Nguyen says:

    Hi Charles,

    I need to test the statistical difference between two means. Both samples are related to one factor (e.g. net sales); however, each subject in one sample can experience the same value several times, and the samples are unequal in size. So can you please advise which test I should use?

    Thank you for your help!

    • Charles says:

      Are you saying that some subjects are measured multiple times (yielding potentially different measurements)?

      • Nhi Nguyen says:

        Hi Charles,
        I’m not sure. I will explain more clearly. For example, we have 2 samples with net sales data. Sample 1 includes firms with characteristic 1, and sample 2 consists of firms with characteristic 2. An example of sample 1 is as follows.
        Obs   Firm     Year   Net sales
        1     Firm A   1      1,000
        2     Firm B   1      1,200
        3     Firm A   1      1,000
        4     Firm A   2      1,500
        5     Firm B   1      1,200
        6     Firm B   2      2,000
        7     Firm A   1      1,000
        8     Firm C   1      3,000
        9     Firm B   2      2,000

        Sample 2 is similar, but the 2 samples are unequal in size. I need to test the difference between the two means of net sales.
        Thank you for your help.

        • Charles says:

          The fact that the samples are unequal in size is not a problem. The problem is that certain firms have multiple measurements (e.g. A and B). We could use repeated measures ANOVA based on year, but again I see multiple measurements. In this case, though, the multiple measurements are all identical, and so it looks like your data is really only:
          1 Firm A 1 1,000 (also sample 3, 7)
          4 Firm A 2 1,500
          2 Firm B 1 1,200 (also sample 5)
          6 Firm B 2 2,000 (also sample 9)
          8 Firm C 1 3,000
          Now the only problems are: (1) Firm C is missing data for year 2 and (2) you don’t have much data.
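
The reduction described in this reply can be sketched in a few lines. This is only an illustration, assuming the nine example rows are stored as (firm, year, net sales) tuples; the variable names are hypothetical.

```python
# (firm, year, net_sales) rows copied from the example in the comment above
rows = [
    ("Firm A", 1, 1000),
    ("Firm B", 1, 1200),
    ("Firm A", 1, 1000),
    ("Firm A", 2, 1500),
    ("Firm B", 1, 1200),
    ("Firm B", 2, 2000),
    ("Firm A", 1, 1000),
    ("Firm C", 1, 3000),
    ("Firm B", 2, 2000),
]

# Identical repeated measurements collapse to a single row
unique_rows = sorted(set(rows))

# Collect the years observed for each firm to spot missing data
firm_years = {}
for firm, year, _sales in unique_rows:
    firm_years.setdefault(firm, set()).add(year)
missing_year_2 = [firm for firm, years in firm_years.items() if 2 not in years]

for row in unique_rows:
    print(row)
print("Firms without year 2 data:", missing_year_2)
```

Dropping duplicates this way is only reasonable when the repeated rows are genuinely the same measurement, as they appear to be in this example.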

          • Nhi Nguyen says:

            Hi Charles,
            Thank you for your answer. It’s just an example, not my real data. I need to test means between two samples across firm and year. You mean that I should use repeated measures ANOVA. However, I think that test looks like it is just for indicator variables (0,1), not for continuous data. So do you think I can delete the multiple measurements and keep only 1 observation for 1 firm in 1 year? I mean that for the above example, only 5 observations would remain.

          • Charles says:

            Nhi Nguyen,
            It really depends on what hypotheses you want to test. You can also take the average for a year when you have multiple years. Again it depends on what you are trying to discover.

  51. Davina says:

    Hi Sir,
    Will you help me interpret my data in t-Test: Two-Sample Assuming Unequal Variances
    t-Test: Two-Sample Assuming Unequal Variances

                                   Lethrinus nebulosus   Siganus Sutor
    Mean                           0.244083333           0.351354167
    Variance                       0.015854884           0.028630815
    Observations                   24                    24
    Hypothesized Mean Difference   0
    df                             42
    t Stat                         -2.491592797
    P(T<=t) one-tail               0.008375702
    t Critical one-tail            1.681952358
    P(T<=t) two-tail               0.016751403
    t Critical two-tail            2.018081679
    Thank you

  52. Amna Majeed says:


    For the calculation, where is the u_x and u_y coming from in the data set?
    I don’t have the population means, I only have my sample means

    • Charles says:

      Excellent question. If your null hypothesis is that the population means are equal, then you don’t need to know u_x and u_y, since from the null hypothesis u_x – u_y = 0.
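
The point made in this reply can be written out explicitly. The following is a sketch, assuming the test statistic of Theorem 1 has the usual unequal-variance form, with \mu_x and \mu_y standing for the population means written as u_x and u_y in the question:

```latex
\[
t \;=\; \frac{(\bar{x} - \bar{y}) - (\mu_x - \mu_y)}
             {\sqrt{\dfrac{s_x^2}{n_x} + \dfrac{s_y^2}{n_y}}}
\quad\xrightarrow{\;H_0:\ \mu_x - \mu_y = 0\;}\quad
t \;=\; \frac{\bar{x} - \bar{y}}
             {\sqrt{\dfrac{s_x^2}{n_x} + \dfrac{s_y^2}{n_y}}}
\]
```

Under this null hypothesis only the sample means, sample variances and sample sizes are needed.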

  53. John says:

    Sir, can you please interpret my data for me. I am testing to see if these two mangrove species have equal reflectance wavelength.
    Mean                           1.915770641    1.597215839
    Variance                       1.015396402    0.421691398
    Observations                   1159           1159
    Hypothesized Mean Difference   0
    df                             1978
    t Stat                         9.046575705
    P(T<=t) one-tail               1.72E-19
    t Critical one-tail            1.645624349
    P(T<=t) two-tail               3.43E-19
    t Critical two-tail            1.961164035


  54. Sarah says:

    Hi there, This is the prompt, and this is the data. We are typing a results section and need some assistance.
    We would like to know whether we can increase the population of an endangered salamander by adding coarse woody debris to the forest. We located 40 forest stands and added coarse woody debris to a randomly selected group of 20 of them and left the other 20 as-is. Two years later, we surveyed for salamanders and computed the population. Here are the data, expressed as salamanders per hectare:
    with CWD as-is
    t-Test: Two-Sample Assuming Unequal Variances

                                   Variable 1     Variable 2
    Mean                           17.375         23.55
    Variance                       51.26602564    41.68974359
    Observations                   40             40
    Hypothesized Mean Difference   0
    df                             77
    t Stat                         -4.050687903
    P(T<=t) one-tail               6.04193E-05
    t Critical one-tail            1.664884537
    P(T<=t) two-tail               0.000120839
    t Critical two-tail            1.991254395

  55. sisila says:

    How do we determine the degrees of freedom for a one sample test and a two sample test? Why?

  56. Can you please interpret the result for me:
    t-Test: Two-Sample Assuming Unequal Variances

                                   Variable 1     Variable 2
    Mean                           12.78571429    12.78571429
    Variance                       16.33516484    10.7967033
    Observations                   14             14
    Hypothesized Mean Difference   0
    df                             25
    t Stat                         0
    P(T<=t) one-tail               0.5
    t Critical one-tail            1.708140761
    P(T<=t) two-tail               1
    t Critical two-tail            2.059538553


    t-Test: Two-Sample Assuming Unequal Variances

                                   Variable 1     Variable 2
    Mean                           18.64285714    21.71428571
    Variance                       27.78571429    31.45054945
    Observations                   14             14
    Hypothesized Mean Difference   0
    df                             26
    t Stat                         -1.493174652
    P(T<=t) one-tail               0.073714206
    t Critical one-tail            1.70561792
    P(T<=t) two-tail               0.147428413
    t Critical two-tail            2.055529439

    • Charles says:

      Since the p-value is larger than alpha = .05, you can’t reject the null hypothesis that the two populations have the same mean. Here I am referring to the p-values of 1 and 0.147428413 for the two examples.

      • The 1st data is the result of the pretest of the students without the intervention. Variable 1 is the control group while variable 2 is the experimental group. After the intervention, I administered the post test (same question in the pretest) to see if the intervention can yield significant result in improving the scores of the students. Thus data 2 is the result after the intervention is done. Variable 1 is the control with no intervention while variable 2 is the experimental group. Thanks Charles.

        • When I used the mean percentage score (MPS) between the groups, variable 1 got 53% while variable 2 got 62%. Can I state in my action research that there was an increase in students’ performance by a statistically significant 8.78% relative to the mean score?

          • Charles says:

            I am not sure what the mean percentage score means in your context, but assuming that the mean for variable 1 is 53% and the mean for variable 2 is 62% and assuming that you have conducted some test (probably the t test as in your previous comments) and obtained a significant result, you can say that variable 2 has a significantly larger mean than variable 1. Although the difference between the sample means is 8.78%, this difference may not be true for the population. You can give a confidence interval that captures the spirit of this conclusion.

      • Am I using the correct statistical tool for my study? Thanks for helping me a lot, Charles.

        • Since I want to know if there is a significant difference between the scores of the two groups. Thanks Charles

          • Charles says:

            Yes, you can use a two sample t test to determine whether there is a significant difference between the mean scores of the two groups, provided the assumptions of the t test are met, esp. independent samples and normality (or at least not too far from normality).

  57. Ernest Ruto says:

    Hi, please, if I am
    1. comparing the mean lengths of the right and left hands, what t-test do I use?
    2. comparing the mean length of the hand in Africa and the UK, what test do I use?

  58. Ankit Gupta says:

    Hi Charles,
    What is the actual formula to calculate the degrees of freedom for the t-test with unequal variances? Please also explain why it is different from the equal variance case.

  59. Vaidya says:

    Hi Charles,

    If I am comparing the means of, say, four groups with unequal variances, I will have to go pairwise. So it would be 4C2 combinations. Now I can get different pairs with significantly different means. Is there any way we can claim that a specific group’s mean is significantly different from all the others?
    Suppose the groups are G1,G2,G3,G4.
    Diff of G1,G2 : Significant
    Diff of G2,G3: Significant
    Diff of G1,G4: not Significant
    Diff of G1,G3: not Significant
    … all other combinations insignificant …

    Is there any way to reach a conclusion saying that the mean of G3 is significantly different from all the others and G3 is the main culprit?

    Many Thanks.

    • Charles says:

      You may not have one group whose mean is significantly different from the others. It might be that groups A and B are significantly different and all the other pairs are not significantly different. The type of tests that you are referring to are typically dealt with as follow-up tests to ANOVA (like t tests but with more than 2 groups). The typical follow up test is Tukey’s HSD test. Since you have unequal variances you could use the Games-Howell test after ANOVA, although since you have unequal variances you should use Welch’s ANOVA instead of ANOVA as your “omnibus” test. The problem with doing 4C2 separate tests is that this approach inflates the type I error way beyond .05 (what is called experiment-wise error).
      All these topics are addressed on the Real Statistics website: just enter the appropriate topic in the Search tab on the right side of the webpage.
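
To give a feel for how quickly the error rate grows with 4C2 separate tests, here is a minimal sketch. It assumes, purely for illustration, that the six tests are independent and each is run at alpha = .05; pairwise tests on the same groups are not truly independent, so the figure is indicative only.

```python
from math import comb

alpha = 0.05   # per-test significance level
groups = 4     # G1, G2, G3, G4

# Number of pairwise comparisons: 4C2
num_tests = comb(groups, 2)

# Probability of at least one type I error across all tests,
# under the simplifying assumption that the tests are independent
experimentwise_error = 1 - (1 - alpha) ** num_tests

print(num_tests, round(experimentwise_error, 4))
```

Under this assumption, the chance of at least one false positive is roughly five times the nominal .05 level, which is why a follow-up test designed for multiple comparisons is preferred.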

  60. Godseed says:

    Hey Charles, please, this is urgent. Discuss with an example in each case (equal and unequal variance) the estimation of a confidence interval for the mean with an unknown standard deviation and a small sample.

    • Charles says:

      Figure 4 shows the confidence interval for both the equal and unequal variance cases for a specific example. The calculation is shown on the following webpage:
      This webpage addresses the one sample case. The two sample case is similar and uses the two sample value for the standard error and the difference between the two sample means in place of the difference between the sample mean and the hypothetical mean (often set to zero).
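
The two sample interval described in this reply can be sketched as follows. This assumes the usual forms of the standard error, with t_crit the two-tailed critical value of the t distribution for the chosen alpha, m the degrees of freedom from Theorem 1, and s_p^2 denoting the pooled variance (a symbol introduced here for illustration):

```latex
\[
(\bar{x} - \bar{y}) \;\pm\; t_{crit} \cdot s.e.
\]
\[
\text{unequal variances: } s.e. = \sqrt{\frac{s_x^2}{n_x} + \frac{s_y^2}{n_y}},
\quad df = m
\]
\[
\text{equal variances: } s.e. = s_p \sqrt{\frac{1}{n_x} + \frac{1}{n_y}},
\quad s_p^2 = \frac{(n_x - 1)s_x^2 + (n_y - 1)s_y^2}{n_x + n_y - 2},
\quad df = n_x + n_y - 2
\]
```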
