# Hotelling’s T-square Test Additional Topics

### Confidence Intervals

Since we know there is a significant difference between drug and placebo in treating at least one of the 3 symptoms, we would like to identify which symptoms are different.

Example 1: For the data from Example 2 of Hotelling’s T2 for Independent Samples, determine for which symptoms the drug is significantly different from the placebo.

As we did in the one-sample and paired sample cases we now seek to find confidence intervals for each of the symptoms. Once again we consider both the simultaneous 95% confidence intervals and the Bonferroni 95% confidence intervals.

To determine the simultaneous 95% confidence intervals, we note (as in the one-sample case) that the 1 – α confidence hyper-ellipse for the population mean difference vector μ = μXμY is given by

where T2 is as in Definition 1 of Hotelling’s T2 for Independent Samples. Thus we are looking for values of μX – μY which fall within the hyper-ellipse given by the equation

From the 1 – α confidence hyper-ellipse, we can also calculate simultaneous confidence intervals for any linear combination of the means of the individual random variables. For example, for the linear combination

(where μi= μX,i – μY,i), the simultaneous 1 – α confidence interval is given by the expression

where the pooled covariance matrix S = [sij].

For the case where c = μi the simultaneous 1 – α confidence interval is given by the expression

The simultaneous confidence intervals for Example 1 are as shown in Figure 1.

Figure 1 – Simultaneous 95% confidence intervals

Since 0 is in the confidence interval for Pressure and Aches, we conclude there is no significant difference between the drug and placebo for these symptoms.

Since the end points of the confidence interval for Fever are both negative, we conclude that patients taking the drug have significantly less fever than those who take the placebo.

As in the one-sample case, if we are only interested in looking at single variables and not linear combinations, we will be better off using Bonferroni confidence intervals since these intervals will tend to be narrower. We now turn our attention to this analysis and use the following formulas for the 1 – α confidence intervals:

where the tcrit is based on a significance level of α/k. The relevant calculations are given in Figure 2.

Figure 2 – Bonferroni 95% confidence intervals

The confidence intervals in Figure 2 are narrower than those in Figure 1, but the results are similar.

### Effect size

The Mahalanobis Distance can be used as a measure of effect size, where

For Example 1 this is

### Assumptions

Hypothesis testing using the T2 statistic for two independent random vectors X and Y is based on the following assumptions:

1. Each of the random vectors has a common population mean vector
2. X and Y have a common population covariance matrix Σ
3. X and Y are multivariate normally distributed
4. Each of the samples is done randomly and independently

Normality: That X and Y are normally distributed implies that each variable in X and Y is normal (or at least roughly symmetric). This can be tested as described in Testing for Normality and Symmetry (box plots, QQ plots, histograms, etc.). You can also produce a scatter diagram for each pair of variables in X and each pair of variables in Y. If the random vectors are multivariate normally distributed then each plot should look roughly like an ellipse. These are not sufficient to show that X and Y are multivariate normally distributed, but it may be the best you will be able to do. Fortunately, Hotelling’s T-square test is relatively robust to violations of normality.

Also, if nX and nY are sufficiently large then the Multivariate Central Limit Theorem holds and so we can assume that the normality assumption is met.

Common covariance matrix: In the univariate case for two sample hypothesis testing of the means, the t-test can be used provided the variances of the two samples are not too different, especially if the sample sizes are equal.

Similarly in the multivariate case, Hotelling’s T-square test can be used provided nX = nY and the sample covariance matrices don’t look too terribly different.

We can use Box’s test to check the null hypothesis that the two sample covariance matrices are equal. The caution here is that this test is very sensitive to violations of normality (even though the Hotelling’s T-square test is not very sensitive to such violations). For Example 1 of Hotelling’s T2 for Independent Samples, Box’s test yields the results shown in Figure 3.

Figure 3 – Box’s test

Since p-value > α = .001, we cannot reject the null hypothesis that the covariance matrices are equal. See Box’s Test for more details about Box’s Test.

### 6 Responses to Hotelling’s T-square Test Additional Topics

1. Almudena says:

Dear Charles,

I’m a phd Student and I have some problems with my data. To sump up, I have one factor with two groups and 2 variables for each one. After remove the incomplete cases and non numeric data, the first group has 72 cases and the second 57. The Box test has a significant outcome, so I reject the null hypothesis of equal covariances matrix. Anyway I tried a MANOVA analysis an T2-Hotelling test. In both of test, I have a significant outcome, but since the simultaneous and Bonferroni intervals are different, with MANOVA the factor accounts for the difference in means vector observed whereas in T2-Hotelling test the two dependent variables are ok. Why?
How should I interpret the results? Because I focus in Hotelling outcome, I meet significant differences but not due to these variables and focusing on MANOVA the same variables play a role in the significant difference?

Thanks in advance and congratulations for the really useful website.

• Charles says:

Almudena,
I am pleased that you have found the website useful.
Sorry, but without seeing your data and analysis, I am unable to comment on the issues that you have raised.
Charles

• Almudena says:

Charles,
May I send you a mail with my excel file (data and analysis)?

Thanks

2. Marco says:

Hello. Awesome web.

In Figure 1, how do you calculate “s” and “se”? I’m trying in Excel and I don’t reach the same values. I’m writing something wrong.

• Charles says:

Marco,
For Fever s_1 (cell N29) is calculated by =SQRT(((N19-1)*N27+(N20-1)*N28)/(N19+N20-2))
se_1 (cell N30) is calculated by =N29*SQRT(1/N19+1/N20)
Charles