Hotelling’s T-square Test Additional Topics

Confidence Intervals

Since we know there is a significant difference between drug and placebo in treating at least one of the 3 symptoms, we would like to identify which symptoms are different.

Example 1: For the data from Example 2 of Hotelling’s T² for Independent Samples, determine for which symptoms the drug is significantly different from the placebo.

As we did in the one-sample and paired-sample cases, we now seek confidence intervals for each of the symptoms. Once again we consider both the simultaneous 95% confidence intervals and the Bonferroni 95% confidence intervals.

To determine the simultaneous 95% confidence intervals, we note (as in the one-sample case) that the 1 – α confidence hyper-ellipse for the population mean difference vector μ = μX – μY is given by

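$$P\left(T^2 \le \frac{k(n_X+n_Y-2)}{n_X+n_Y-k-1}\,F_{crit}\right) = 1-\alpha$$

where T² is as in Definition 1 of Hotelling’s T² for Independent Samples, k is the number of variables, nX and nY are the sample sizes, and Fcrit is the right-tail critical value of F(k, nX + nY – k – 1) at significance level α. Thus we are looking for values of μX – μY which fall within the hyper-ellipse given by the equation

$$\left(\bar{X}-\bar{Y}-(\mu_X-\mu_Y)\right)^T\left[\left(\frac{1}{n_X}+\frac{1}{n_Y}\right)S\right]^{-1}\left(\bar{X}-\bar{Y}-(\mu_X-\mu_Y)\right) \le \frac{k(n_X+n_Y-2)}{n_X+n_Y-k-1}\,F_{crit}$$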

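From the 1 – α confidence hyper-ellipse, we can also calculate simultaneous confidence intervals for any linear combination of the means of the individual random variables. For example, for the linear combination

$$c = \sum_{i=1}^{k} c_i \mu_i$$

(where μi = μX,i – μY,i), the simultaneous 1 – α confidence interval is given by the expression

$$\sum_{i=1}^{k} c_i(\bar{x}_i-\bar{y}_i) \pm \sqrt{\frac{k(n_X+n_Y-2)}{n_X+n_Y-k-1}\,F_{crit}\left(\frac{1}{n_X}+\frac{1}{n_Y}\right)\sum_{i=1}^{k}\sum_{j=1}^{k} c_i c_j s_{ij}}$$

where the pooled covariance matrix S = [sij], x̄i and ȳi are the sample means of the i-th variable, k is the number of variables, and Fcrit is the right-tail critical value of F(k, nX + nY – k – 1) at significance level α.

For the case where c = μi the simultaneous 1 – α confidence interval is given by the expression

$$\bar{x}_i-\bar{y}_i \pm \sqrt{\frac{k(n_X+n_Y-2)}{n_X+n_Y-k-1}\,F_{crit}\left(\frac{1}{n_X}+\frac{1}{n_Y}\right)s_{ii}}$$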
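As a rough illustration of the interval for a linear combination, the sketch below evaluates it in plain Python. It assumes the standard two-sample form, estimate ± sqrt(k(nX + nY – 2)/(nX + nY – k – 1) · Fcrit · (1/nX + 1/nY) · c′Sc), where S is the pooled covariance matrix. The function name `simultaneous_ci`, its parameters, and all the numbers are made up for illustration and are not the Example 1 data. The F critical value is passed in rather than computed (in Excel it could be obtained with F.INV.RT).

```python
import math


def simultaneous_ci(diff, S, c, n_x, n_y, f_crit):
    """Simultaneous confidence interval for sum(c[i] * mu[i]).

    diff   : sample mean differences, xbar[i] - ybar[i]
    S      : pooled covariance matrix as a list of rows
    c      : coefficients of the linear combination
    f_crit : right-tail critical value of F(k, n_x + n_y - k - 1) at level alpha
    """
    k = len(diff)
    # Convert the F critical value into the T-square critical value
    t2_crit = k * (n_x + n_y - 2) / (n_x + n_y - k - 1) * f_crit
    # Point estimate of the linear combination
    estimate = sum(ci * di for ci, di in zip(c, diff))
    # Quadratic form c' S c
    quad = sum(c[i] * c[j] * S[i][j] for i in range(k) for j in range(k))
    half_width = math.sqrt(t2_crit * (1 / n_x + 1 / n_y) * quad)
    return estimate - half_width, estimate + half_width


# Hypothetical inputs: 3 variables, 10 observations per group
diff = [-1.0, 0.5, 0.2]
S = [[4.0, 1.0, 0.0], [1.0, 9.0, 2.0], [0.0, 2.0, 16.0]]
F_CRIT = 3.24  # approximate critical value of F(3, 16) at alpha = 0.05

for i in range(3):
    # c picks out a single variable, i.e. the case c = mu_i
    c = [1 if j == i else 0 for j in range(3)]
    print(i, simultaneous_ci(diff, S, c, 10, 10, F_CRIT))
```

A variable whose interval excludes 0 would be flagged as significantly different. Passing a coefficient vector such as `[1, -1, 0]` gives the interval for the contrast μ1 – μ2 instead of a single mean difference.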

The simultaneous confidence intervals for Example 1 are as shown in Figure 1.

Figure 1 – Simultaneous 95% confidence intervals

Since 0 lies in the confidence intervals for both Pressure and Aches, we conclude that there is no significant difference between the drug and the placebo for these symptoms.

Since the end points of the confidence interval for Fever are both negative, we conclude that patients taking the drug have significantly less fever than those who take the placebo.

As in the one-sample case, if we are only interested in looking at single variables and not linear combinations, we will be better off using Bonferroni confidence intervals since these intervals will tend to be narrower. We now turn our attention to this analysis and use the following formulas for the 1 – α confidence intervals:

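$$\bar{x}_i-\bar{y}_i \pm t_{crit}\sqrt{\left(\frac{1}{n_X}+\frac{1}{n_Y}\right)s_{ii}}$$

where tcrit is the two-tailed critical value of the t distribution with nX + nY – 2 degrees of freedom at a significance level of α/k, k is the number of variables, and sii is the i-th diagonal element of the pooled covariance matrix S. The relevant calculations are given in Figure 2.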

Figure 2 – Bonferroni 95% confidence intervals

The confidence intervals in Figure 2 are narrower than those in Figure 1, but the results are similar.
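The Bonferroni calculation can be sketched in the same way. The code below assumes the usual form (x̄i – ȳi) ± tcrit · sqrt((1/nX + 1/nY) · sii), with tcrit taken as the two-tailed critical value at α/k with nX + nY – 2 degrees of freedom. The function name `bonferroni_cis` and the inputs are hypothetical, not the Example 1 data, and the value 2.64 is only an approximate critical value for 18 degrees of freedom and α/k = .05/3; the exact value should come from a t table or a statistics package.

```python
import math


def bonferroni_cis(diff, S, n_x, n_y, t_crit):
    """Bonferroni confidence intervals for each mean difference.

    diff   : sample mean differences, xbar[i] - ybar[i]
    S      : pooled covariance matrix as a list of rows
    t_crit : two-tailed t critical value at alpha / k, df = n_x + n_y - 2
    """
    se_factor = 1 / n_x + 1 / n_y
    intervals = []
    for i, d in enumerate(diff):
        # Only the diagonal element s_ii is needed for a single variable
        half_width = t_crit * math.sqrt(se_factor * S[i][i])
        intervals.append((d - half_width, d + half_width))
    return intervals


# Hypothetical inputs: 3 variables, 10 observations per group
diff = [-1.0, 0.5, 0.2]
S = [[4.0, 1.0, 0.0], [1.0, 9.0, 2.0], [0.0, 2.0, 16.0]]
T_CRIT = 2.64  # approximate value, see the note above

for (lower, upper) in bonferroni_cis(diff, S, 10, 10, T_CRIT):
    # An interval that excludes 0 indicates a significant difference
    print(lower, upper, "significant" if lower > 0 or upper < 0 else "not significant")
```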

Effect size

The Mahalanobis distance can be used as a measure of effect size, where

$$D = \sqrt{(\bar{X}-\bar{Y})^T S^{-1} (\bar{X}-\bar{Y})}$$

For Example 1 this is

image9060
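For readers who want to compute this effect size outside a spreadsheet, here is a minimal sketch in plain Python. It assumes the usual definition D = sqrt(d′ S⁻¹ d), where d is the vector of sample mean differences and S is the pooled covariance matrix. The helper names `solve` and `mahalanobis_distance` and the numbers are illustrative only. Rather than inverting S, the code solves S z = d by Gaussian elimination.

```python
import math


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are not modified
    M = [[float(v) for v in A[i]] + [float(b[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular or nearly singular")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for j in range(col, n + 1):
                M[r][j] -= factor * M[col][j]
    # Back substitution
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def mahalanobis_distance(diff, S):
    """Mahalanobis distance sqrt(diff' S^-1 diff)."""
    z = solve(S, diff)  # z = S^-1 diff
    return math.sqrt(sum(d * zi for d, zi in zip(diff, z)))


# Hypothetical mean differences and pooled covariance matrix
print(mahalanobis_distance([-1.0, 0.5, 0.2],
                           [[4.0, 1.0, 0.0], [1.0, 9.0, 2.0], [0.0, 2.0, 16.0]]))
```

Under this definition, the T² statistic for the null hypothesis of equal mean vectors equals nX·nY/(nX + nY) · D², so D can be read as T² with the sample-size factor removed.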

Assumptions

Hypothesis testing using the T² statistic for two independent random vectors X and Y is based on the following assumptions:

  1. The observations in each sample come from a population with a common mean vector (μX for X and μY for Y)
  2. X and Y have a common population covariance matrix Σ
  3. X and Y are multivariate normally distributed
  4. Each sample is drawn randomly and independently

Normality: If X and Y are multivariate normally distributed, then each variable in X and Y is normally distributed (so each should look at least roughly symmetric). This can be checked as described in Testing for Normality and Symmetry (box plots, QQ plots, histograms, etc.). You can also produce a scatter diagram for each pair of variables in X and each pair of variables in Y. If the random vectors are multivariate normally distributed, then each plot should look roughly like an ellipse. These checks are not sufficient to show that X and Y are multivariate normally distributed, since they test only necessary conditions, but they may be the best you will be able to do. Fortunately, Hotelling’s T-square test is relatively robust to violations of normality.

Also, if nX and nY are sufficiently large, then by the Multivariate Central Limit Theorem the sample mean vectors are approximately multivariate normally distributed, and so we can treat the normality assumption as met.

Common covariance matrix: In the univariate case for two sample hypothesis testing of the means, the t-test can be used provided the variances of the two samples are not too different, especially if the sample sizes are equal.

Similarly in the multivariate case, Hotelling’s T-square test can be used provided nX = nY and the sample covariance matrices don’t look too terribly different.
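To look at the two sample covariance matrices side by side, and to form the pooled matrix S, something like the following sketch could be used. It assumes each sample is a list of observations (rows) with one value per variable, uses the n – 1 divisor for each sample covariance matrix, and pools them as ((nX – 1)SX + (nY – 1)SY)/(nX + nY – 2). The function names and the tiny data set are invented for illustration.

```python
def mean_vector(sample):
    """Column means of a sample given as a list of observation rows."""
    n = len(sample)
    k = len(sample[0])
    return [sum(row[j] for row in sample) / n for j in range(k)]


def covariance_matrix(sample):
    """Sample covariance matrix with divisor n - 1."""
    n = len(sample)
    k = len(sample[0])
    m = mean_vector(sample)
    return [[sum((row[i] - m[i]) * (row[j] - m[j]) for row in sample) / (n - 1)
             for j in range(k)] for i in range(k)]


def pooled_covariance(x, y):
    """Pooled covariance matrix of two independent samples."""
    n_x, n_y = len(x), len(y)
    s_x, s_y = covariance_matrix(x), covariance_matrix(y)
    k = len(s_x)
    # Weight each sample covariance matrix by its degrees of freedom
    return [[((n_x - 1) * s_x[i][j] + (n_y - 1) * s_y[i][j]) / (n_x + n_y - 2)
             for j in range(k)] for i in range(k)]


# Invented data: two variables, a handful of observations per group
x = [[1.0, 2.0], [3.0, 4.0], [5.0, 9.0], [4.0, 6.0]]
y = [[2.0, 1.0], [4.0, 3.0], [3.0, 5.0], [6.0, 7.0]]

print("S_X   :", covariance_matrix(x))
print("S_Y   :", covariance_matrix(y))
print("pooled:", pooled_covariance(x, y))
```

Printing SX and SY next to each other gives an informal check of whether the two matrices look similar; it does not replace a formal test.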

We can use Box’s test to check the null hypothesis that the two population covariance matrices are equal. The caution here is that this test is very sensitive to violations of normality (even though Hotelling’s T-square test is not very sensitive to such violations). For Example 1 of Hotelling’s T² for Independent Samples, Box’s test yields the results shown in Figure 3.

Figure 3 – Box’s test

Since p-value > α = .001, we cannot reject the null hypothesis that the covariance matrices are equal. See Box’s Test for more details.
