Comparing two means when variances are known

Theorem 1: Let x̄ and ȳ be the means of two samples of size nx and ny respectively. If x and y are normal or nx and ny are sufficiently large for the Central Limit Theorem to hold, then x̄ – ȳ has a normal distribution with mean μx – μy and standard deviation

$$\sigma_{\bar{x}-\bar{y}} = \sqrt{\frac{\sigma_x^2}{n_x} + \frac{\sigma_y^2}{n_y}}$$

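Proof: Since the samples are random, x̄ and ȳ are normally and independently distributed. By the Central Limit Theorem and Properties 1 and 2 of Basic Characteristics of the Normal Distribution, we know that x̄ – ȳ is normally distributed with mean

$$\mu_{\bar{x}-\bar{y}} = \mu_{\bar{x}} - \mu_{\bar{y}} = \mu_x - \mu_y$$

and standard deviation

$$\sigma_{\bar{x}-\bar{y}} = \sqrt{\sigma_{\bar{x}}^2 + \sigma_{\bar{y}}^2} = \sqrt{\frac{\sigma_x^2}{n_x} + \frac{\sigma_y^2}{n_y}}$$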
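The theorem can be checked informally by simulation. The following is a minimal sketch, not part of the original proof: the population parameters, sample sizes and number of repetitions are hypothetical values chosen only for illustration. It draws many pairs of samples and compares the mean and standard deviation of the simulated differences with the values the theorem predicts.

```python
import random
from math import sqrt
from statistics import fmean, stdev

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical population parameters and sample sizes (illustration only)
mu_x, sigma_x, n_x = 100.0, 16.0, 20
mu_y, sigma_y, n_y = 95.0, 10.0, 40
reps = 5000

# Repeatedly draw two samples and record the difference of their means
diffs = []
for _ in range(reps):
    x_bar = fmean(random.gauss(mu_x, sigma_x) for _ in range(n_x))
    y_bar = fmean(random.gauss(mu_y, sigma_y) for _ in range(n_y))
    diffs.append(x_bar - y_bar)

# Values predicted by Theorem 1
theory_mean = mu_x - mu_y
theory_sd = sqrt(sigma_x**2 / n_x + sigma_y**2 / n_y)

print(f"mean: simulated {fmean(diffs):.3f}, theory {theory_mean:.3f}")
print(f"sd:   simulated {stdev(diffs):.3f}, theory {theory_sd:.3f}")
```

The simulated values should land close to the theoretical ones, with the gap shrinking as the number of repetitions grows.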

Hypothesis Testing: When the populations are normal or the sample sizes are sufficiently large, we can use the above theorem to compare two population means. The theorem requires that the population standard deviations be known, which is usually not the case. Often, especially with large samples, the sample standard deviations can be used as approximations for the population standard deviations. We can also employ the t-test (see Two Sample t-Test with Equal Variances and Two Sample t-Test with Unequal Variances), which doesn’t require that the variances be known and is especially useful when the sample sizes are small.
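Outside Excel, the test described above might be sketched in Python as follows. This is an illustrative sketch using only the standard library; the function name `two_sample_z_test` and the input values are hypothetical and do not come from the article.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z_test(mean_x, mean_y, sigma_x, sigma_y, n_x, n_y):
    """Return (z, two-tailed p-value) for H0: mu_x = mu_y with known sigmas."""
    # Standard deviation of the difference of sample means (Theorem 1)
    std_err = sqrt(sigma_x**2 / n_x + sigma_y**2 / n_y)
    z = (mean_x - mean_y) / std_err
    # Two-tailed p-value: probability of a z-score at least this extreme
    p_value = 2 * NormalDist().cdf(-abs(z))
    return z, p_value


# Hypothetical inputs chosen only for illustration
z, p = two_sample_z_test(mean_x=52.0, mean_y=50.0,
                         sigma_x=5.0, sigma_y=5.0, n_x=40, n_y=50)
print(f"z = {z:.3f}, two-tailed p = {p:.4f}")
```

The function takes summary statistics rather than raw data, so it works equally well when the two samples have different sizes.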

Excel Tools: Excel provides a data analysis tool called z-Test: Two Sample for Means to automate the hypothesis testing process (as shown in Example 1).

Example 1: The height of 5-year-old boys in a certain country is known to be normally distributed with mean 95 cm and standard deviation 16 cm. A firm is selling a nutrient which it claims will significantly increase the height of children. In order to demonstrate its claim, it selects a random sample of 60 four-year-old boys, half of whom are given the nutrient for one year and half of whom are not. Given that the heights of the boys at 5 years of age are as shown in Figure 1, determine whether the nutrient is effective in increasing height.


Figure 1 – Two sample test using z-scores

In addition to the raw data, Figure 1 shows how to calculate the z-score for the difference between the sample means based on a normal population with a known standard deviation of 16 (i.e. a known variance of 16² = 256). Here the null hypothesis H0 is

$$\mu_x - \mu_y = 0$$

or equivalently

$$\mu_x = \mu_y$$

This is a two-tail test, which is why the p-value (in cell I12) is doubled. Since p-value = .008 < .05 = α, we reject the null hypothesis and conclude that there is a significant difference in mean height between the boys who take the nutrient and those who don’t.
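As a check on the arithmetic, the z-score can be worked out by hand. This assumes the two sample means are 95.74 and 106.69 (the values quoted in the reply to comment 3 below) and that each group contains 30 boys:

$$z = \frac{\bar{x}-\bar{y}}{\sqrt{\sigma^2/n_x + \sigma^2/n_y}} = \frac{95.74 - 106.69}{\sqrt{256/30 + 256/30}} = \frac{-10.95}{4.131} \approx -2.65$$

$$p = 2\,\Phi(-2.65) \approx .008$$

Here Φ denotes the standard normal cumulative distribution function.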

We can also use Excel’s data analysis tool to automatically calculate the z-score from the sample data (although we must first reorganize the data in the form of either a single row or single column). Figure 2 shows the output of the data analysis tool for Example 1.


Figure 2 – Output of z-Test: Two Sample for Means data analysis tool

Looking at the two-tail results, we see once again that .008 < .05 (or alternatively |z| = 2.65 > 1.96 = z-crit), and so we reject the null hypothesis.
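The critical value of 1.96 can be reproduced outside Excel. The following is a small sketch using Python's standard library; the variable names are illustrative, and the z-score is simply the rounded value reported above.

```python
from statistics import NormalDist

alpha = 0.05
z = -2.65  # z-score reported for Example 1

# Two-tailed critical value: the point leaving alpha/2 in the upper tail
z_crit = NormalDist().inv_cdf(1 - alpha / 2)

# Reject H0 when the z-score is more extreme than the critical value
reject_h0 = abs(z) > z_crit
print(f"z-crit = {z_crit:.2f}, reject H0: {reject_h0}")
```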

13 Responses to Comparing two means when variances are known

  1. Tong Sin Keng says:


    Thank you for the reply. Suppose I wish to test whether two dice have the same mean and SD by taking large samples. The SD formula in the theorem increases the SD by a factor of sqrt(2) compared with the single-sample z-test. The consequence is that the theorem leads to a z-score lower by a factor of sqrt(2). Could you please comment on this? Many thanks in advance.

    • Charles says:

      This is the way the mathematics works out, at least when the two samples have the same size and standard deviation. I guess one way to look at this is that with two samples you have added standard deviation from the mean (which in this case is the difference between two means).
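To make the factor of sqrt(2) explicit: if both samples have the same size n and the same standard deviation σ, the formula in Theorem 1 reduces to

$$\sigma_{\bar{x}-\bar{y}} = \sqrt{\frac{\sigma^2}{n} + \frac{\sigma^2}{n}} = \sqrt{2}\,\frac{\sigma}{\sqrt{n}}$$

which is sqrt(2) times the standard error σ/√n of the one-sample z-test. For the same difference in means, the z-score is therefore smaller by that factor.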

  2. Tong Sin Keng says:

    Charles, thank you very much for the article. I am new to the application of statistics. I am writing a document on methods of measuring the uniformity of the distribution of sequences created by random number generators and irrational numbers. The theorem is just what I have been looking for. Your website will be referenced. Does the theorem have a name? Many thanks

  3. Jonathan Bechtel says:

    Hi Charles,

    2 Questions:

    Since the question is…….”Is the nutrient significantly better at INCREASING height?”…….wouldn’t that imply a 1-tailed test and hence more appropriate to use NORMSINV(.05) to arrive at the most appropriate Z-Crit value?

    You used NORMSDIST in this example, but omitted using NORMDIST. I re-ran the results using NORMDIST(95, 106.69, 4.131182, TRUE) and got a different answer. My intuition is that NORMDIST is best for 1 sample testing and not two sample testing, which is why you only used NORMSDIST instead. Am I correct here or is there something I’m missing?

    • Charles says:


      Just because you are interested in increasing height does not mean that you should use a one-tailed test. You could use a one tailed test, if you are certain that the nutrient won’t decrease height. Usually it is safer to use the two-tailed test.

      NORMDIST(x, m, s, TRUE) is equivalent to NORMSDIST((x-m)/s) and has nothing to do with 1 or 2 sample testing. For the problem on the webpage, the equivalent version of the p-value using NORMDIST instead of NORMSDIST is =2*NORMDIST(95.74-106.69,0,4.13,TRUE).
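The relationship between the two functions can be checked numerically: standardizing x by subtracting the mean and dividing by the standard deviation gives the same cumulative probability. The sketch below uses Python's `statistics.NormalDist` as a stand-in for the Excel functions, so the correspondence in the comments is an analogy rather than Excel's own implementation.

```python
from statistics import NormalDist

# Difference in sample means, mean under H0, and standard error from the reply
x, m, s = 95.74 - 106.69, 0.0, 4.13

# Analogue of NORMDIST(x, m, s, TRUE): cumulative normal with mean m and sd s
direct = NormalDist(mu=m, sigma=s).cdf(x)

# Analogue of NORMSDIST((x - m) / s): standard normal applied to the z-score
standardized = NormalDist().cdf((x - m) / s)

# Two-tailed p-value, mirroring =2*NORMDIST(95.74-106.69,0,4.13,TRUE)
p_value = 2 * direct
print(f"direct = {direct:.6f}, standardized = {standardized:.6f}, p = {p_value:.3f}")
```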


  4. Isaac Hayford says:

    Can you please help me solve the following question: In Norway, the distribution of birth weights for full-term infants whose gestational age is 40 weeks and whose mothers did not smoke during pregnancy is approximately normal with mean 3500 grams and standard deviation 430 grams (Bellinger et al., 1995; New England Journal of Medicine 332:549-555). An investigator plans to conduct a study to determine whether or not the mean birth weight of full-term babies whose mothers smoked throughout pregnancy is different from that of babies of non-smoking mothers.

    Suppose the investigator believes that the true mean birthweight for the infants from smoking mothers could be as low as 3200 grams or as high as 3800 grams (i.e. he anticipates conducting a two-tailed test) with the true variability being the same within each of the two groups. He intends to design a balanced CRD (i.e. equal sample sizes) in weighing babies from randomly selected mothers from each of the two groups.

    a) Now, the investigator wants to risk a 10% or less chance of failing to detect a mean difference between the two groups of mothers. Suppose the investigator intends to eventually analyze the data assuming that the variance(s) are known. What sample sizes for each of the two groups would be needed for this study?

    b) Obviously, the investigator will not be able to assume the variance(s) as known when he analyzes the data and intends to publish the results. Readdress the question in (a) given this more normal circumstance.

    c) What power would be afforded from sample sizes of 10 babies per each of the two groups if a conventional t-test was going to be used to analyze the data?

    d) What should be the sample sizes for the two groups if the investigator desires the 95% t-based CI on the mean difference to be no greater than 50 grams?

  5. Anurag says:

    Hi, I want to know how you calculated the population variance.

  6. Celina says:


    Can I follow example 1 even if the sample sizes are different?

    Basically I want to compare the mean of two samples with different sample sizes (in Excel). I have the mean, the variance and the sample size for both.

