**Univariate case**

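When the variances of the two populations are unequal (as indicated by notably unequal sample variances), we use a modified version of the t-test. In particular, we use the following t-statistic:

$$t = \frac{(\bar{x} - \bar{y}) - (\mu_x - \mu_y)}{\sqrt{\dfrac{s_x^2}{n_x} + \dfrac{s_y^2}{n_y}}}$$

We now test the null hypothesis H_{0}: *μ_{x} = μ_{y}* using the fact that *t* ~ *T*(*m*) where *m* is defined as

$$m = \frac{\left(\dfrac{s_x^2}{n_x} + \dfrac{s_y^2}{n_y}\right)^2}{\dfrac{(s_x^2/n_x)^2}{n_x - 1} + \dfrac{(s_y^2/n_y)^2}{n_y - 1}}$$

(see Two Sample t Test with Unequal Variances). As we have seen several times now, this is equivalent to

$$t^2 \sim F(1, m)$$

where *t*^{2} can be expressed as:

$$t^2 = (\bar{z} - \mu_z)\left(\frac{s_x^2}{n_x} + \frac{s_y^2}{n_y}\right)^{-1}(\bar{z} - \mu_z)$$

where *z̄* = *x̄* – *ȳ* and *µ_{z} = µ_{x} – µ_{y}*.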
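The univariate computation can be sketched in a few lines of Python. This is a minimal illustration, assuming the statistic is the usual unequal-variance (Welch) form with Satterthwaite degrees of freedom *m*; the function name `welch_t` and the data are made up for the example and do not come from the Real Statistics software.

```python
import math
import statistics

def welch_t(x, y, mu_diff=0.0):
    """Return (t, m): the unequal-variance t statistic and its degrees of freedom."""
    nx, ny = len(x), len(y)
    vx = statistics.variance(x) / nx   # s_x^2 / n_x
    vy = statistics.variance(y) / ny   # s_y^2 / n_y
    t = (statistics.mean(x) - statistics.mean(y) - mu_diff) / math.sqrt(vx + vy)
    m = (vx + vy) ** 2 / (vx ** 2 / (nx - 1) + vy ** 2 / (ny - 1))
    return t, m

# Illustrative data only (not taken from the article)
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0]
t, m = welch_t(x, y)
print(t, m, t ** 2)   # t**2 is the quantity compared with F(1, m)
```

Note that *m* is generally not an integer and always lies between min(*n_{x}*, *n_{y}*) – 1 and *n_{x}* + *n_{y}* – 2.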

**Multivariate case**

We now look at a multivariate version of the problem, namely to test whether the population means of the *k* × 1 random vectors *X* and *Y* are equal, i.e. the null hypothesis H_{0}: *μ_{X} = μ_{Y}*, under the assumption that the covariance matrices are not necessarily equal.

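**Definition 1**: The **modified two sample Hotelling’s T-square test statistic** is

$$T^2 = \left(\bar{X} - \bar{Y} - (\mu_X - \mu_Y)\right)^T \left(\frac{S_X}{n_X} + \frac{S_Y}{n_Y}\right)^{-1} \left(\bar{X} - \bar{Y} - (\mu_X - \mu_Y)\right)$$

where *X̄* and *Ȳ* are the sample mean vectors, *S_{X}* and *S_{Y}* are the sample covariance matrices, and *n_{X}* and *n_{Y}* are the sample sizes.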

**Observation**: Note the similarity between the expression for *T*^{2} and the expression for *t*^{2} given above. Also note that if *n_{X} = n_{Y}*, then this definition of *T*^{2} is equivalent to that in Definition 1 of Hotelling’s T-square for Independent Samples.
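The following is a minimal sketch of how *T*^{2} might be computed from raw data under the null hypothesis *μ_{X} = μ_{Y}*. It assumes the statistic is the quadratic form of the mean difference in the inverse of *S_{X}*/*n_{X}* + *S_{Y}*/*n_{Y}*. The helper names (`mean_vector`, `cov_matrix`, `solve`, `modified_t2`) and the data are hypothetical, and only the standard library is used; in practice a linear algebra library would replace the hand-written helpers.

```python
def mean_vector(rows):
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def cov_matrix(rows):
    """Sample covariance matrix (divisor n - 1)."""
    n, mu = len(rows), mean_vector(rows)
    k = len(mu)
    return [[sum((r[i] - mu[i]) * (r[j] - mu[j]) for r in rows) / (n - 1)
             for j in range(k)] for i in range(k)]

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    k = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        for r in range(c + 1, k):
            f = aug[r][c] / aug[c][c]
            for j in range(c, k + 1):
                aug[r][j] -= f * aug[c][j]
    x = [0.0] * k
    for i in reversed(range(k)):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, k))
        x[i] = (aug[i][k] - s) / aug[i][i]
    return x

def modified_t2(xs, ys):
    """T^2 for H0: mu_X = mu_Y without pooling the covariance matrices."""
    nx, ny = len(xs), len(ys)
    sx, sy = cov_matrix(xs), cov_matrix(ys)
    k = len(sx)
    s = [[sx[i][j] / nx + sy[i][j] / ny for j in range(k)] for i in range(k)]
    d = [a - b for a, b in zip(mean_vector(xs), mean_vector(ys))]
    w = solve(s, d)                      # w = S^{-1} d
    return sum(di * wi for di, wi in zip(d, w))

# Illustrative data only: 5 and 6 observations of k = 2 variables
xs = [[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0], [5.0, 6.0]]
ys = [[2.0, 1.0], [4.0, 5.0], [6.0, 4.0], [8.0, 9.0], [10.0, 7.0], [12.0, 12.0]]
t2 = modified_t2(xs, ys)
print(t2)
```

With *k* = 1 the same function returns the square of the unequal-variance t-statistic, which is the similarity noted in the Observation.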

**Theorem 1**: For *n_{X}* and *n_{Y}* sufficiently large, *T*^{2} ~ *χ*^{2}(*k*).
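As a sketch of how Theorem 1 could be applied, the large-sample p-value is the right-tail probability of *χ*^{2}(*k*) at the observed *T*^{2}. The code below computes that tail for integer *k* with the standard library only, using the recurrence for the regularized upper incomplete gamma function. The function name `chi2_sf` and the numbers are assumptions made for illustration; a statistics library would normally supply this function.

```python
import math

def chi2_sf(x, k):
    """P(chi-square(k) > x) for a positive integer k."""
    if x <= 0:
        return 1.0
    h = x / 2.0
    if k % 2 == 0:
        a, q = 1.0, math.exp(-h)               # Q(1, h)
    else:
        a, q = 0.5, math.erfc(math.sqrt(h))    # Q(1/2, h)
    # Q(a + 1, h) = Q(a, h) + h^a e^{-h} / Gamma(a + 1)
    while a < k / 2:
        q += math.exp(a * math.log(h) - h - math.lgamma(a + 1))
        a += 1.0
    return q

t2, k = 12.3, 3          # hypothetical T^2 value and number of variables
p_value = chi2_sf(t2, k)
print(p_value)
```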

**Observation**: For small *n_{X}* and *n_{Y}*, the chi-square approximation for *T*^{2} is not sufficiently accurate, and a better approximation is given by the following theorem.

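**Theorem 2**: Under the null hypothesis,

$$F = \frac{n - k}{k(n - 1)}\, T^2 \sim F(k, m)$$

where *n* = *n_{X}* + *n_{Y}* – 1 and *m* is defined as follows:

$$\frac{1}{m} = \sum_{i \in \{X, Y\}} \frac{1}{n_i - 1} \left( \frac{(\bar{X} - \bar{Y})^T S^{-1} \left(\dfrac{S_i}{n_i}\right) S^{-1} (\bar{X} - \bar{Y})}{T^2} \right)^2, \qquad S = \frac{S_X}{n_X} + \frac{S_Y}{n_Y}$$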
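The sketch below shows one way to compute *F* and *m* from summary statistics. It assumes the commonly cited form of this approximation: *F* = (*n* – *k*)/(*k*(*n* – 1)) · *T*^{2} with *n* = *n_{X}* + *n_{Y}* – 1, and 1/*m* equal to the sum over the two samples of *r_{i}*^{2}/(*n_{i}* – 1), where *r_{i}* is the share of *T*^{2} contributed by *S_{i}*/*n_{i}*. The function names and the input values are hypothetical.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    k = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        for r in range(c + 1, k):
            f = aug[r][c] / aug[c][c]
            for j in range(c, k + 1):
                aug[r][j] -= f * aug[c][j]
    x = [0.0] * k
    for i in reversed(range(k)):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, k))
        x[i] = (aug[i][k] - s) / aug[i][i]
    return x

def quad_form(mat, v):
    """Return v' M v."""
    k = len(v)
    return sum(v[i] * mat[i][j] * v[j] for i in range(k) for j in range(k))

def f_and_df(d, sx, sy, nx, ny):
    """Return (T^2, F, m) from the mean difference d and covariance matrices."""
    k = len(d)
    s = [[sx[i][j] / nx + sy[i][j] / ny for j in range(k)] for i in range(k)]
    w = solve(s, d)                          # w = S^{-1} d
    t2 = sum(di * wi for di, wi in zip(d, w))
    rx = quad_form(sx, w) / nx / t2          # share of T^2 due to S_X / n_X
    ry = quad_form(sy, w) / ny / t2          # share of T^2 due to S_Y / n_Y
    df_m = 1.0 / (rx ** 2 / (nx - 1) + ry ** 2 / (ny - 1))
    n = nx + ny - 1
    f = (n - k) / (k * (n - 1)) * t2
    return t2, f, df_m

# Hypothetical summary statistics for k = 2 variables
d = [1.0, -0.5]                              # X-bar minus Y-bar
sx = [[2.0, 0.5], [0.5, 1.0]]
sy = [[4.0, -1.0], [-1.0, 3.0]]
t2, f, df_m = f_and_df(d, sx, sy, nx=10, ny=15)
print(t2, f, df_m)
```

Since the two shares sum to 1, *m* always lies between min(*n_{X}*, *n_{Y}*) – 1 and *n_{X}* + *n_{Y}* – 2, and for *k* = 1 it reduces to the degrees of freedom of the univariate case.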

If *F > F_{crit}*, where *F_{crit}* is the right-tail critical value of *F*(*k*, *m*) at significance level *α*, then we reject the null hypothesis.
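Comparing *F* with *F_{crit}* is equivalent to comparing the p-value with *α*. The sketch below computes the right-tail probability of *F*(*k*, *m*), allowing non-integer *m*, with the standard library only; it uses the well-known continued-fraction evaluation of the regularized incomplete beta function. The function names and the input values are hypothetical, and a statistics library would normally provide this tail probability directly.

```python
import math

def _betacf(a, b, x, max_iter=300, eps=1e-14):
    """Continued fraction for the incomplete beta function (modified Lentz)."""
    tiny = 1e-300
    qab, qap, qam = a + b, a + 1.0, a - 1.0
    c = 1.0
    d = 1.0 - qab * x / qap
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for i in range(1, max_iter + 1):
        i2 = 2 * i
        # even and odd steps of the recurrence
        for aa in (i * (b - i) * x / ((qam + i2) * (a + i2)),
                   -(a + i) * (qab + i) * x / ((a + i2) * (qap + i2))):
            d = 1.0 + aa * d
            d = 1.0 / (d if abs(d) > tiny else tiny)
            c = 1.0 + aa / c
            c = c if abs(c) > tiny else tiny
            delta = d * c
            h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h

def betai(a, b, x):
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    bt = math.exp(math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                  + a * math.log(x) + b * math.log(1.0 - x))
    if x < (a + 1.0) / (a + b + 2.0):
        return bt * _betacf(a, b, x) / a
    return 1.0 - bt * _betacf(b, a, 1.0 - x) / b

def f_sf(f, d1, d2):
    """Right-tail probability P(F(d1, d2) > f)."""
    if f <= 0.0:
        return 1.0
    return betai(d2 / 2.0, d1 / 2.0, d2 / (d2 + d1 * f))

f_stat, k, m, alpha = 4.0, 2, 12.5, 0.05     # hypothetical values
p_value = f_sf(f_stat, k, m)
print(p_value, "reject H0" if p_value < alpha else "do not reject H0")
```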

**Example 1**: Repeat Example 1 of Hotelling’s T-square for Independent Samples using the data in Figure 1.

**Figure 1 – Data for Example 1**

Once again, we employ Box’s test, obtaining the results shown in Figure 2.

**Figure 2 – Box’s Test for Example 1**

This time we see that p-value < *α* = .05, and so we conclude that there is evidence that the covariance matrices are unequal (or that the data are not multivariate normally distributed). We have somewhat forced the issue, however, since a significance level of *α* = .001 rather than *α* = .05 is usually used for Box’s Test.

As a result we will use the *T*^{2} test with unequal covariance matrices. This analysis is shown in Figure 3.

**Figure 3 – Analysis for Example 1**

We conclude there is a significant difference between the drug and the placebo in treating the symptoms.

**Confidence intervals**

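The simultaneous 1 – *α* confidence interval for *μ_{i}*, the *i*th component of *μ_{X} – μ_{Y}*, is given by the expression

$$\bar{x}_i - \bar{y}_i \pm \sqrt{\frac{k(n-1)}{n-k}\, F_{crit}}\; \sqrt{\frac{s_{X,i}^2}{n_X} + \frac{s_{Y,i}^2}{n_Y}}$$

where *F_{crit}* is the right-tail critical value of *F*(*k*, *m*) at level *α*, *n* and *m* are as in Theorem 2, and *s*^{2}_{X,i} and *s*^{2}_{Y,i} are the *i*th diagonal elements of the sample covariance matrices.

For *n* sufficiently large, we could use the following expression instead

$$\bar{x}_i - \bar{y}_i \pm \sqrt{\chi^2_{crit}}\; \sqrt{\frac{s_{X,i}^2}{n_X} + \frac{s_{Y,i}^2}{n_Y}}$$

where *χ*^{2}_{crit} is the right-tail critical value of *χ*^{2}(*k*) at level *α*.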

Once again we can use Bonferroni confidence intervals instead.
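The large-sample intervals can be sketched as follows. The code assumes each interval has the form (mean difference) ± multiplier × √(*s*^{2}_{X,i}/*n_{X}* + *s*^{2}_{Y,i}/*n_{Y}*), where the simultaneous multiplier is the square root of the *χ*^{2}(*k*) critical value. For the Bonferroni version it uses a normal quantile at 1 – *α*/(2*k*), which is a large-sample simplification; with small samples a t quantile would be used instead. All function names and numbers are hypothetical.

```python
import math
from statistics import NormalDist

def chi2_sf(x, k):
    """P(chi-square(k) > x) for a positive integer k."""
    if x <= 0:
        return 1.0
    h = x / 2.0
    if k % 2 == 0:
        a, q = 1.0, math.exp(-h)
    else:
        a, q = 0.5, math.erfc(math.sqrt(h))
    while a < k / 2:
        q += math.exp(a * math.log(h) - h - math.lgamma(a + 1))
        a += 1.0
    return q

def chi2_crit(alpha, k):
    """Right-tail critical value of chi-square(k), found by bisection."""
    lo, hi = 0.0, 1.0
    while chi2_sf(hi, k) > alpha:
        hi *= 2.0
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if chi2_sf(mid, k) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def large_sample_intervals(d, var_x, var_y, nx, ny, alpha=0.05):
    """Return a list of (simultaneous, Bonferroni) intervals, one per variable."""
    k = len(d)
    c_sim = math.sqrt(chi2_crit(alpha, k))
    c_bon = NormalDist().inv_cdf(1 - alpha / (2 * k))
    out = []
    for di, vx, vy in zip(d, var_x, var_y):
        se = math.sqrt(vx / nx + vy / ny)
        out.append(((di - c_sim * se, di + c_sim * se),
                    (di - c_bon * se, di + c_bon * se)))
    return out

# Hypothetical mean differences and per-variable sample variances
intervals = large_sample_intervals(d=[1.0, -0.5], var_x=[2.0, 1.0],
                                   var_y=[4.0, 3.0], nx=60, ny=80)
for sim, bon in intervals:
    print(sim, bon)
```

For a small number of variables the Bonferroni intervals are typically narrower than the simultaneous ones, which is the usual reason for preferring them.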

Dear Prof. Charles Zaiontz,

I found your website as I searched for multivariate versions of t-test for unequal covariance. Your description really helps me, but I’d like to see reference papers about formulations on the multivariate case of Hotelling’s T-square test with unequal covariance matrices. I’d appreciate it if you’d help me with it.

Best regards,

K. Lee.

K. Lee,

Please see the Bibliography webpage on the Real Statistics website. A good reference is [PS2].

Charles

How is the calculation for the p-value made in this case if n_X and n_Y are different?

Kay,

The calculation for the p-value shown on this webpage does not require that n_X = n_Y. In fact, in the example given, n_X is not equal to n_Y. The calculation is shown in Figure 3 (using Theorem 2).

Charles

Charles,

I love this website. Thank you.

I’m currently dealing with a zero inflated dataset. I’m applying the Hotelling’s T2 test to these data and was wondering if there are issues with having a lot of zeros in the data.

I’m comparing two sites (control and experimental) using fish densities at 1 m depth increments (total of 15). The sample size for each 1 m depth increment is 50. So a matrix that is [50,15].

Garrett,

Thanks for letting me know that you love the website.

Regarding your question, unfortunately I don’t have any experience with Hotelling’s T2 test with zero-inflated data, and so I am not able to answer your question.

Charles

This is what I was looking for. Thank you for your work.

Dear Charles,

Thank you very much for this website, it helps a ton in helping me understand Hotelling’s test.

I was wondering whether you can explain to me why Hotelling’s test uses the F distribution? I cannot seem to connect the test with the F distribution.

Thank you a lot for your help,

Lakyn

Lakyn,

To give a precise answer to your question would require the proof of Theorem 2 on the referenced webpage, which is too technical for our purposes. One way to motivate why the F distribution might be involved:

– the univariate version of the Hotelling’s test is the t-test, which uses the t distribution

– the t distribution can be expressed via the F distribution since if t has distribution T(df), then t^2 has distribution F(1,df)

Charles

Dear Charles,

Thank you for your reply!

I was just wondering where is the referenced webpage? The proof is something that I will be very interested in looking at.

Sincerely,

Lakyn

Lakyn,

The referenced webpage is simply the webpage where you made your comment (Hotelling’s T-square Test with Unequal Covariance Matrices in this case).

I believe the following website has the proof (although I have not read this article myself):

http://www.math.sci.hiroshima-u.ac.jp/stat/TR/TR12/TR12-19.pdf

Charles

Dear Prof. Charles Zaiontz,

I studied the Hotelling test in order to evaluate whether a single sample (I have just one value for each variable, so n_x = 1) could belong to a population for which I have n_y values of the different variables. In your opinion, is the Hotelling procedure with unequal covariance matrices appropriate? How can I overcome the problem of obtaining zero in the denominator (probably indicating that I have 2 equal observations of the p variables)? If in your opinion this is not the right procedure, can you suggest a more appropriate one, please?

Thanks for your attention.

Best regards

Ensia,

Are you saying that your sample consists of just one element in each group? In this case, you shouldn’t expect much no matter what statistical test you use.

Charles