Multivariate Normal Distribution: Basic Concepts

Univariate case

A random variable x has a normal distribution if its probability density function (pdf) can be expressed as

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$

Here e is the constant 2.7183…, and π is the constant 3.1415…

The normal distribution is completely determined by the parameters μ (mean) and σ (standard deviation). We use the abbreviation N(μ, σ) to refer to a normal distribution with mean μ and standard deviation σ, although for comparison with the multivariate case it would actually be better to use the abbreviation N(μ, σ²) where σ² is the variance.
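
As a quick numerical check of the formula above, here is a minimal Python sketch (the values μ = 2 and σ = 1.5 are arbitrary illustrations) that evaluates the pdf directly and compares it with scipy.stats.norm.pdf:

```python
import numpy as np
from scipy.stats import norm

def normal_pdf(x, mu, sigma):
    """Evaluate the univariate normal pdf directly from the formula."""
    return np.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * np.sqrt(2 * np.pi))

mu, sigma = 2.0, 1.5                     # arbitrary illustrative parameters
x = np.linspace(-3, 7, 5)
print(normal_pdf(x, mu, sigma))          # direct evaluation of the formula
print(norm.pdf(x, loc=mu, scale=sigma))  # SciPy returns the same values
```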

Multivariate case

Definition 1: A random vector X has a multivariate normal distribution with mean vector μ and covariance matrix Σ, written X ~ N(μ, Σ), if X has the following joint probability density function:

$$f(X) = \frac{1}{\sqrt{(2\pi)^k |\Sigma|}}\, e^{-\frac{1}{2}(X-\mu)^T \Sigma^{-1} (X-\mu)}$$

Here |Σ| is the determinant of the population covariance matrix Σ. The exponent of e is −1/2 times the product of the transpose of X − μ, the inverse of Σ, and X − μ, which has dimension (1 × k) × (k × k) × (k × 1) = 1 × 1, i.e. a scalar. Thus f(X) yields a single value. The expression (2π)ᵏ|Σ| under the square root can also be written as |2πΣ|, since multiplying each of the k rows of Σ by 2π multiplies the determinant by (2π)ᵏ.
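
To make the matrix formula concrete, the following Python sketch (the mean vector and covariance matrix below are arbitrary illustrations) evaluates the density directly and compares it with scipy.stats.multivariate_normal; it also verifies the identity (2π)ᵏ|Σ| = |2πΣ|:

```python
import numpy as np
from scipy.stats import multivariate_normal

def mvn_pdf(x, mu, sigma):
    """Evaluate the multivariate normal pdf directly from the matrix formula."""
    k = len(mu)
    diff = x - mu
    exponent = -0.5 * diff @ np.linalg.inv(sigma) @ diff
    coeff = 1.0 / np.sqrt((2 * np.pi) ** k * np.linalg.det(sigma))
    return coeff * np.exp(exponent)

mu = np.array([1.0, 2.0])                       # example mean vector
sigma = np.array([[2.0, 0.6], [0.6, 1.0]])      # example covariance matrix
x = np.array([1.5, 1.0])

print(mvn_pdf(x, mu, sigma))                              # direct evaluation
print(multivariate_normal(mean=mu, cov=sigma).pdf(x))     # same value from SciPy

# Determinant identity: (2*pi)^k |Sigma| = |2*pi*Sigma|
k = len(mu)
print(np.isclose((2 * np.pi) ** k * np.linalg.det(sigma),
                 np.linalg.det(2 * np.pi * sigma)))        # True
```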

Definition 2: The expression

$$D^2 = (X-\mu)^T \Sigma^{-1} (X-\mu)$$

which appears in the exponent of e, is called the squared Mahalanobis distance between X and μ. We can also define the squared Mahalanobis distance for a sample to be

$$D^2 = (X-\bar{X})^T S^{-1} (X-\bar{X})$$

where S is the sample covariance matrix and X̄ is the sample mean vector. In MANOVA we give an example of how to calculate this value and also introduce the supplemental function MDistSq which calculates this value automatically.
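
The MDistSq function computes this in Excel; as a rough NumPy equivalent, the following sketch (with made-up sample data) computes the squared Mahalanobis distance of a point from the sample mean using the sample covariance matrix S:

```python
import numpy as np

def mdist_sq(x, data):
    """Squared Mahalanobis distance from x to the sample mean,
    using the sample covariance matrix S (rows of `data` are observations)."""
    xbar = data.mean(axis=0)
    S = np.cov(data, rowvar=False)        # unbiased sample covariance matrix
    diff = x - xbar
    return diff @ np.linalg.inv(S) @ diff

# Made-up sample: 5 observations on 2 variables
data = np.array([[2.0, 3.0], [4.0, 5.0], [3.0, 7.0], [5.0, 4.0], [6.0, 6.0]])
print(mdist_sq(np.array([3.5, 5.0]), data))
```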

Observation: If k = 1 then the above definition is equivalent to the univariate normal distribution. If k = 2 the result is a three-dimensional bell-shaped surface (as shown in Figure 1).

Property 1: If X ~ N(μ, Σ) where all the xj in X are independent, then the population covariance matrix is a diagonal matrix [aij] with $a_{jj} = \sigma_j^2$ for all j and $a_{ij} = 0$ for all i ≠ j, and so the joint probability density function simplifies to

$$f(X) = \prod_{j=1}^{k} f_{\mu_j,\sigma_j}(x_j)$$

where each $f_{\mu_j,\sigma_j}(x_j)$ is the univariate normal pdf of xj with mean μj and standard deviation σj.
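
Here is a small numerical illustration of Property 1, using an arbitrary diagonal covariance matrix: the joint pdf equals the product of the univariate pdfs.

```python
import numpy as np
from scipy.stats import multivariate_normal, norm

mu = np.array([0.0, 1.0, -2.0])      # example means
sigmas = np.array([1.0, 0.5, 2.0])   # example standard deviations
Sigma = np.diag(sigmas ** 2)         # diagonal covariance => independent components

x = np.array([0.3, 0.8, -1.5])
joint = multivariate_normal(mean=mu, cov=Sigma).pdf(x)
product = np.prod(norm.pdf(x, loc=mu, scale=sigmas))  # product of univariate pdfs
print(np.isclose(joint, product))    # True
```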

Property 2: If $y = \sum_{j=1}^{k} c_j x_j = C^T X$, where C is the k × 1 vector [cj], and X ~ N(μ, Σ), then y has a normal distribution with mean $\sum_{j=1}^{k} c_j \mu_j = C^T \mu$ and variance $\sum_{i=1}^{k} \sum_{j=1}^{k} c_i c_j \sigma_{ij} = C^T \Sigma C$; i.e. $y \sim N(C^T \mu, C^T \Sigma C)$.

Observation: The unbiased estimates of the population mean and variance of y are given by the sample mean $\sum_{j=1}^{k} c_j \bar{x}_j = C^T \bar{X}$ and the sample variance $\sum_{i=1}^{k} \sum_{j=1}^{k} c_i c_j s_{ij} = C^T S C$, where $s_{ij} = \text{cov}(x_i, x_j)$.
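
Property 2 can be illustrated by simulation. The sketch below (all parameter values are arbitrary) draws many samples of X, forms y = CᵀX, and compares the empirical mean and variance of y with Cᵀμ and CᵀΣC:

```python
import numpy as np

rng = np.random.default_rng(0)
mu = np.array([1.0, -2.0, 0.5])                   # example mean vector
Sigma = np.array([[2.0, 0.3, 0.1],
                  [0.3, 1.0, -0.2],
                  [0.1, -0.2, 1.5]])              # example covariance matrix
C = np.array([0.5, 1.0, -1.0])                    # example coefficient vector

X = rng.multivariate_normal(mu, Sigma, size=200_000)  # rows are draws of X
y = X @ C                                             # y = C^T X for each draw

print(y.mean(), C @ mu)               # empirical vs. theoretical mean C^T mu
print(y.var(ddof=1), C @ Sigma @ C)   # empirical vs. theoretical variance C^T Sigma C
```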

Observation: When k = 2, the joint pdf of X depends on the parameters μ1, μ2, σ1, σ2, and ρ. A plot of the distribution for different values of the correlation coefficient ρ is displayed in Figure 1.


Figure 1 – Bivariate normal density function

Observation: Suppose X has a multivariate normal distribution. For any constant c, the set of points X whose Mahalanobis distance from μ equals c traces out an ellipsoid in k dimensions (an ellipse when k = 2). The value of the probability density function at all these points is the constant

$$f(X) = \frac{1}{\sqrt{(2\pi)^k |\Sigma|}}\, e^{-c^2/2}$$

Let’s take a look at the situation where k = 2. In this case, we have

$$\Sigma = \begin{bmatrix} \sigma_1^2 & \rho\sigma_1\sigma_2 \\ \rho\sigma_1\sigma_2 & \sigma_2^2 \end{bmatrix}$$

Thus
$$|\Sigma| = \sigma_1^2 \sigma_2^2 (1-\rho^2)$$
and so
$$\Sigma^{-1} = \frac{1}{\sigma_1^2 \sigma_2^2 (1-\rho^2)} \begin{bmatrix} \sigma_2^2 & -\rho\sigma_1\sigma_2 \\ -\rho\sigma_1\sigma_2 & \sigma_1^2 \end{bmatrix}$$
Hence
$$(X-\mu)^T \Sigma^{-1} (X-\mu) = \frac{z_1^2 - 2\rho z_1 z_2 + z_2^2}{1-\rho^2}$$
where
$$z_1 = \frac{x_1 - \mu_1}{\sigma_1} \qquad z_2 = \frac{x_2 - \mu_2}{\sigma_2}$$

Finally, note that the equation

$$\frac{z_1^2 - 2\rho z_1 z_2 + z_2^2}{1-\rho^2} = c^2$$

is an ellipse centered at μ = (μ1, μ2).

Observation: Note that when ρ = 0, indicating that x1 and x2 are uncorrelated, the ellipse takes the form z1² + z2² = c², which is a circle. When ρ = ±1, indicating that x1 and x2 are perfectly correlated, the ellipse degenerates into the straight line z2 = ±z1.

Observation: Using the above calculations, when k = 2 the multivariate normal pdf is

$$f(x_1, x_2) = \frac{1}{2\pi \sigma_1 \sigma_2 \sqrt{1-\rho^2}}\, e^{-\frac{z_1^2 - 2\rho z_1 z_2 + z_2^2}{2(1-\rho^2)}}$$
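
The following sketch (with arbitrary parameter values) confirms that this explicit bivariate formula agrees with the general matrix form as implemented by scipy.stats.multivariate_normal:

```python
import numpy as np
from scipy.stats import multivariate_normal

def bivariate_pdf(x1, x2, mu1, mu2, s1, s2, rho):
    """Bivariate normal pdf via the explicit z1, z2 formula."""
    z1, z2 = (x1 - mu1) / s1, (x2 - mu2) / s2
    expo = -(z1**2 - 2*rho*z1*z2 + z2**2) / (2 * (1 - rho**2))
    return np.exp(expo) / (2 * np.pi * s1 * s2 * np.sqrt(1 - rho**2))

mu1, mu2, s1, s2, rho = 0.0, 1.0, 1.0, 2.0, 0.6   # arbitrary example parameters
Sigma = np.array([[s1**2, rho*s1*s2], [rho*s1*s2, s2**2]])

print(bivariate_pdf(0.5, 1.5, mu1, mu2, s1, s2, rho))
print(multivariate_normal(mean=[mu1, mu2], cov=Sigma).pdf([0.5, 1.5]))  # same value
```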

Property 3: If X ~ N(μ, Σ), then the squared Mahalanobis distance between X and μ has a chi-square distribution with k degrees of freedom.

Observation: This property is an extension of Corollary 1 of Chi-square Distribution. We can interpret the property as follows. Let c² = the critical value of the chi-square distribution with k degrees of freedom for α = .05. Then the probability that X falls within the ellipse defined by c, i.e. $(X-\mu)^T \Sigma^{-1} (X-\mu) \le c^2$, is 1 – α = .95.
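
A Monte Carlo sketch of Property 3 and this observation (the dimension and parameter values below are arbitrary): about 95% of draws from N(μ, Σ) should satisfy (X–μ)ᵀΣ⁻¹(X–μ) ≤ c², where c² is the chi-square critical value.

```python
import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(1)
k = 2
mu = np.array([1.0, -1.0])
Sigma = np.array([[2.0, 0.8], [0.8, 1.0]])

X = rng.multivariate_normal(mu, Sigma, size=100_000)
diff = X - mu
# Squared Mahalanobis distance for each draw (row of X)
d2 = np.einsum('ij,jk,ik->i', diff, np.linalg.inv(Sigma), diff)

c2 = chi2.ppf(0.95, df=k)    # chi-square critical value for alpha = .05
print((d2 <= c2).mean())     # approximately 0.95
```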

NOTE: The Real Statistics supplemental functions and data analysis tools are not yet available. They will be provided in the Real Statistics Resource Pack Release 2.0 which will be available shortly.
