**Univariate case**

A random variable *x* has a normal distribution if its probability density function (pdf) can be expressed as

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$

Here *e* is the constant 2.7183…, and *π* is the constant 3.1415…
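As a quick illustration, the univariate pdf can be evaluated directly from its definition. The sketch below is only an example: the function name `normal_pdf` and the sample values are made up for illustration, and the result is cross-checked against `statistics.NormalDist` from the Python standard library.

```python
import math
from statistics import NormalDist


def normal_pdf(x: float, mu: float, sigma: float) -> float:
    """Density of N(mu, sigma) at x, where sigma is the standard deviation."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


# Illustrative values only (not taken from the text).
x, mu, sigma = 1.5, 1.0, 2.0
print(normal_pdf(x, mu, sigma))      # value from the formula
print(NormalDist(mu, sigma).pdf(x))  # value from the standard library
```

The two printed values should agree up to floating-point rounding.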

The normal distribution is completely determined by the parameters *μ* (mean) and *σ* (standard deviation). We use the abbreviation *N*(*μ*, *σ*) to refer to a normal distribution with mean *μ* and standard deviation *σ*, although for comparison with the multivariate case it would actually be better to use the abbreviation *N*(*μ*, *σ*^{2}), where *σ*^{2} is the variance.

**Multivariate case**

**Definition 1**: A *k* × 1 random vector *X* has a **multivariate normal distribution** with mean vector *μ* and covariance matrix *Σ*, written *X* ~ *N*(*μ*, *Σ*), if *X* has the following joint probability density function:

$$f(X) = \frac{1}{\sqrt{(2\pi)^k\,|\Sigma|}}\, \exp\!\left(-\tfrac{1}{2}(X-\mu)^T \Sigma^{-1} (X-\mu)\right)$$

Here |*Σ*| is the determinant of the population covariance matrix *Σ*. The exponent of *e* consists of the product of the transpose of *X* – *μ*, the inverse of *Σ* and *X* – *μ*, which has dimension (1 × *k*) × (*k* × *k*) × (*k* × 1) = 1 × 1, i.e. a scalar. Thus *f*(*X*) yields a single value. The quantity (2*π*)^{k} |*Σ*| under the square root can also be expressed as |2*πΣ*|.
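The density in Definition 1 can be sketched in plain Python. The version below is one possible approach, not the method used by any particular package: it assumes *Σ* is symmetric and positive definite and uses a Cholesky factorization *Σ* = *LL*^{T}, so that the quadratic form and the determinant are obtained without forming *Σ*^{-1} explicitly. The names `cholesky` and `mvn_pdf` are hypothetical.

```python
import math


def cholesky(sigma):
    """Lower-triangular L with L L^T = sigma (sigma symmetric positive definite)."""
    k = len(sigma)
    L = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(i + 1):
            s = sigma[i][j] - sum(L[i][m] * L[j][m] for m in range(j))
            if i == j:
                if s <= 0:
                    raise ValueError("sigma is not positive definite")
                L[i][j] = math.sqrt(s)
            else:
                L[i][j] = s / L[j][j]
    return L


def mvn_pdf(x, mu, sigma):
    """Density of N(mu, sigma) at the point x (all arguments are plain lists)."""
    k = len(mu)
    L = cholesky(sigma)
    d = [xi - mi for xi, mi in zip(x, mu)]
    # Solve L w = x - mu by forward substitution;
    # then w . w equals (x - mu)^T sigma^{-1} (x - mu).
    w = []
    for i in range(k):
        w.append((d[i] - sum(L[i][m] * w[m] for m in range(i))) / L[i][i])
    dist_sq = sum(wi * wi for wi in w)
    # sqrt(|sigma|) is the product of the diagonal entries of L.
    sqrt_det = math.prod(L[i][i] for i in range(k))
    return math.exp(-0.5 * dist_sq) / ((2.0 * math.pi) ** (k / 2.0) * sqrt_det)


# Illustrative covariance matrix: sigma_1 = 2, sigma_2 = 1, correlation 0.6.
print(mvn_pdf([1.0, 0.5], [0.0, 0.0], [[4.0, 1.2], [1.2, 1.0]]))
```

For *k* = 1 the function reduces to the univariate normal pdf with variance `sigma[0][0]`.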

**Definition 2**: The expression

$$(X-\mu)^T \Sigma^{-1} (X-\mu)$$

which appears in the exponent of *e*, is called the squared **Mahalanobis distance** between *X* and *μ*. We can also define the squared Mahalanobis distance for a sample to be

$$(X-\bar{X})^T S^{-1} (X-\bar{X})$$

where *S* is the sample covariance matrix and *X̄* is the sample mean vector. In MANOVA we give an example of how to calculate this value and also introduce the supplemental function **MDistSq** which calculates this value automatically.
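To make the sample version concrete, here is a minimal sketch for two-dimensional data. It is not the implementation of **MDistSq**; the function name `sample_mahalanobis_sq` and the data set are invented for illustration, and the 2 × 2 inverse of *S* is written out explicitly.

```python
def sample_mahalanobis_sq(point, data):
    """Squared Mahalanobis distance from `point` to the mean of 2-D `data`,
    using the sample covariance matrix S (n - 1 in the denominator)."""
    n = len(data)
    m1 = sum(r[0] for r in data) / n
    m2 = sum(r[1] for r in data) / n
    s11 = sum((r[0] - m1) ** 2 for r in data) / (n - 1)
    s22 = sum((r[1] - m2) ** 2 for r in data) / (n - 1)
    s12 = sum((r[0] - m1) * (r[1] - m2) for r in data) / (n - 1)
    det = s11 * s22 - s12 * s12
    if det <= 0:
        raise ValueError("sample covariance matrix is singular")
    d1, d2 = point[0] - m1, point[1] - m2
    # (d1, d2) S^{-1} (d1, d2)^T with the 2x2 inverse written out explicitly
    return (s22 * d1 * d1 - 2.0 * s12 * d1 * d2 + s11 * d2 * d2) / det


# Hypothetical sample of four observations (x1, x2).
data = [(2.0, 1.0), (4.0, 3.0), (6.0, 2.0), (8.0, 6.0)]
for row in data:
    print(row, sample_mahalanobis_sq(row, data))
```

A useful sanity check is that the squared distances of the sample points themselves always add up to (*n* – 1)*k*, which is 6 for this sample.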

**Observation**: If *k* = 1 then the above definition is equivalent to the univariate normal distribution. If *k* = 2 the graph of the pdf is a three-dimensional bell-shaped surface (as shown in Figure 1).

**Property 1**: If *X* ~ *N*(*μ*, *Σ*) where all the *x*_{j} in *X* are independent, then the population covariance matrix is a diagonal matrix [*a*_{ij}] with *a*_{jj} = *σ*_{j}^{2} for all *j* and *a*_{ij} = 0 for all *i* ≠ *j*, and so the joint probability density function simplifies to

$$f(X) = \prod_{j=1}^{k} f_j(x_j)$$

where each *f*_{j} is the univariate normal pdf of *x*_{j} with mean *μ*_{j} and standard deviation *σ*_{j}.
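Property 1 can be checked numerically. The sketch below compares the multivariate pdf for a diagonal covariance matrix with the product of the univariate pdfs; the function names and the sample values are assumptions made for this example only.

```python
import math
from statistics import NormalDist


def joint_pdf_diagonal(x, mu, sigma):
    """Multivariate normal pdf when the covariance matrix is diag(sigma_j^2)."""
    k = len(x)
    dist_sq = sum(((xj - mj) / sj) ** 2 for xj, mj, sj in zip(x, mu, sigma))
    det = math.prod(sj ** 2 for sj in sigma)  # determinant of a diagonal matrix
    return math.exp(-0.5 * dist_sq) / math.sqrt((2.0 * math.pi) ** k * det)


def product_of_marginals(x, mu, sigma):
    """Product of the univariate normal pdfs f_j(x_j)."""
    return math.prod(NormalDist(mj, sj).pdf(xj) for xj, mj, sj in zip(x, mu, sigma))


# Illustrative values for k = 3; sigma holds standard deviations.
x = [0.5, -1.0, 2.0]
mu = [0.0, 0.0, 1.0]
sigma = [1.0, 2.0, 0.5]
print(joint_pdf_diagonal(x, mu, sigma))
print(product_of_marginals(x, mu, sigma))
```

The two printed values should agree up to rounding error.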

**Property 2**: If *y* = *c*_{1}*x*_{1} + ⋯ + *c*_{k}*x*_{k} = *C*^{T}*X*, where *C* is the *k* × 1 vector [*c*_{j}], and *X* ~ *N*(*μ*, *Σ*), then *y* has a normal distribution with mean *C*^{T}*μ* and variance *C*^{T}*ΣC*; i.e. *y* ~ *N*(*C*^{T}*μ*, *C*^{T}*ΣC*).
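The mean and variance in Property 2 are simple sums, as the following sketch shows. The helper name `linear_combination_params` and the numbers are hypothetical; the example uses *y* = *x*_{1} – *x*_{2}.

```python
def linear_combination_params(c, mu, sigma):
    """Mean C^T mu and variance C^T Sigma C of y = C^T X for X ~ N(mu, Sigma)."""
    k = len(c)
    mean = sum(c[j] * mu[j] for j in range(k))
    variance = sum(c[i] * sigma[i][j] * c[j] for i in range(k) for j in range(k))
    return mean, variance


# Hypothetical example: y = x1 - x2
c = [1.0, -1.0]
mu = [3.0, 1.0]
sigma = [[4.0, 1.2], [1.2, 1.0]]
mean, variance = linear_combination_params(c, mu, sigma)
print(mean, variance)
```

Here the variance equals var(*x*_{1}) + var(*x*_{2}) – 2 cov(*x*_{1}, *x*_{2}) = 4 + 1 – 2.4 = 2.6, and the mean is 3 – 1 = 2.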

**Observation**: The unbiased estimates for the population mean and population variance of *y* are given by the sample mean *C*^{T}*X̄* and the sample variance *C*^{T}*SC*, where *S* = [*s*_{ij}] and *s*_{ij} = cov(*x*_{i}, *x*_{j}).

**Observation**: When *k* = 2, the joint pdf of *X* depends on the parameters *μ*_{1}, *μ*_{2}, *σ*_{1}, *σ*_{2}, and *ρ*. A plot of the distribution for different values of the correlation coefficient *ρ* is displayed in Figure 1.

**Figure 1 – Bivariate normal density function**

**Observation**: Suppose *X* has a multivariate normal distribution. For any constant *c*, the set of points *X* which have a Mahalanobis distance from *μ* of *c* traces out an ellipsoid in *k* dimensions (an ellipse when *k* = 2). The value of the probability density function at all these points is the constant

$$\frac{1}{\sqrt{(2\pi)^k\,|\Sigma|}}\, e^{-c^2/2}$$

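Let’s take a look at the situation where *k* = 2. In this case, we have

$$\Sigma = \begin{bmatrix} \sigma_1^2 & \rho\sigma_1\sigma_2 \\ \rho\sigma_1\sigma_2 & \sigma_2^2 \end{bmatrix}, \qquad |\Sigma| = \sigma_1^2\sigma_2^2(1-\rho^2)$$

$$\Sigma^{-1} = \frac{1}{\sigma_1^2\sigma_2^2(1-\rho^2)} \begin{bmatrix} \sigma_2^2 & -\rho\sigma_1\sigma_2 \\ -\rho\sigma_1\sigma_2 & \sigma_1^2 \end{bmatrix}$$

$$(X-\mu)^T \Sigma^{-1} (X-\mu) = \frac{z_1^2 - 2\rho z_1 z_2 + z_2^2}{1-\rho^2}, \qquad z_j = \frac{x_j - \mu_j}{\sigma_j}$$

Finally, note that the equation

$$\frac{z_1^2 - 2\rho z_1 z_2 + z_2^2}{1-\rho^2} = c^2$$

is an ellipse centered at *μ* = (*μ*_{1}, *μ*_{2}).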

**Observation**: Note that when *ρ* = 0, indicating that *x*_{1} and *x*_{2} are uncorrelated, the ellipse takes the form *z*_{1}^{2} + *z*_{2}^{2} = *c*^{2}, which is a circle in the standardized coordinates (*z*_{1}, *z*_{2}). As *ρ* approaches ±1, indicating that *x*_{1} and *x*_{2} are completely correlated, the ellipse collapses to a straight line.
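The constant-distance contours for *k* = 2 can be generated explicitly. The sketch below is one way to do it, under the assumption |*ρ*| < 1: it maps points of the unit circle through *c* times the Cholesky factor of the covariance matrix, then confirms that each resulting point has squared Mahalanobis distance *c*^{2}, using the standardized values *z*_{j} = (*x*_{j} – *μ*_{j})/*σ*_{j}. The function names and parameter values are illustrative.

```python
import math


def ellipse_points(mu, s1, s2, rho, c, n=8):
    """Return n points at Mahalanobis distance c from mu for a bivariate normal
    with standard deviations s1, s2 and correlation rho (|rho| < 1)."""
    pts = []
    for i in range(n):
        t = 2.0 * math.pi * i / n
        u1, u2 = math.cos(t), math.sin(t)  # point on the unit circle
        # Multiply (u1, u2) by c times the Cholesky factor of the covariance matrix.
        x1 = mu[0] + c * s1 * u1
        x2 = mu[1] + c * s2 * (rho * u1 + math.sqrt(1.0 - rho * rho) * u2)
        pts.append((x1, x2))
    return pts


def mahalanobis_sq(x, mu, s1, s2, rho):
    """Squared Mahalanobis distance written in terms of the z-scores."""
    z1 = (x[0] - mu[0]) / s1
    z2 = (x[1] - mu[1]) / s2
    return (z1 * z1 - 2.0 * rho * z1 * z2 + z2 * z2) / (1.0 - rho * rho)


# Illustrative parameters: every printed distance should be close to c^2 = 2.25.
mu, s1, s2, rho, c = (1.0, -2.0), 2.0, 1.0, 0.6, 1.5
for p in ellipse_points(mu, s1, s2, rho, c):
    print(p, mahalanobis_sq(p, mu, s1, s2, rho))
```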

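**Observation**: Using the above calculations, when *k* = 2 the multivariate normal pdf is

$$f(x_1, x_2) = \frac{1}{2\pi\sigma_1\sigma_2\sqrt{1-\rho^2}}\, \exp\!\left(-\frac{z_1^2 - 2\rho z_1 z_2 + z_2^2}{2(1-\rho^2)}\right)$$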
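A direct transcription of the bivariate density into Python might look as follows. It is only a sketch: the function name `bivariate_normal_pdf` and the test values are assumptions, and the density is written in terms of the standardized values *z*_{j} = (*x*_{j} – *μ*_{j})/*σ*_{j}.

```python
import math
from statistics import NormalDist


def bivariate_normal_pdf(x1, x2, mu1, mu2, s1, s2, rho):
    """Bivariate normal density with standard deviations s1, s2 and correlation rho."""
    if s1 <= 0 or s2 <= 0 or not -1.0 < rho < 1.0:
        raise ValueError("need s1 > 0, s2 > 0 and -1 < rho < 1")
    z1 = (x1 - mu1) / s1
    z2 = (x2 - mu2) / s2
    one_minus_rho_sq = 1.0 - rho * rho
    # Squared Mahalanobis distance expressed through the z-scores
    q = (z1 * z1 - 2.0 * rho * z1 * z2 + z2 * z2) / one_minus_rho_sq
    return math.exp(-0.5 * q) / (2.0 * math.pi * s1 * s2 * math.sqrt(one_minus_rho_sq))


# With rho = 0 the density factors into the two univariate densities.
joint = bivariate_normal_pdf(0.5, -1.0, 0.0, 0.0, 1.0, 2.0, 0.0)
marginals = NormalDist(0.0, 1.0).pdf(0.5) * NormalDist(0.0, 2.0).pdf(-1.0)
print(joint, marginals)
```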

**Property 3**: If *X* ~ *N*(*μ, Σ*), then the squared Mahalanobis distance between *X* and *μ* has a chi-square distribution with *k* degrees of freedom.

**Observation**: This property is an extension of Corollary 1 of Chi-square Distribution. We can interpret the property as follows. Let *c*^{2} be the critical value of the chi-square distribution with *k* degrees of freedom for *α* = .05. Then the probability that *X* will fall within the ellipse defined by *c*, i.e. (*X–μ*)^{T} *Σ*^{-1} (*X–μ*) ≤ *c*^{2}, is 1 – *α* = .95.
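Property 3 can be illustrated with a small simulation for *k* = 2. The sketch below relies on the fact that the chi-square cdf with 2 degrees of freedom is 1 – *e*^{–*x*/2}, so the critical value has the closed form –2 ln *α*; for other values of *k* a statistical library would be needed instead. The function names, parameter values, and seed are arbitrary choices for this example.

```python
import math
import random


def chi2_critical_2df(alpha):
    """Critical value of the chi-square distribution with 2 degrees of freedom.
    For 2 df the cdf is 1 - exp(-x/2), so it can be inverted in closed form."""
    return -2.0 * math.log(alpha)


def coverage_estimate(s1, s2, rho, alpha=0.05, n=50_000, seed=0):
    """Fraction of simulated bivariate normal points (mean 0) whose squared
    Mahalanobis distance is at most the critical value."""
    rng = random.Random(seed)
    crit = chi2_critical_2df(alpha)
    r = math.sqrt(1.0 - rho * rho)
    inside = 0
    for _ in range(n):
        u1, u2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        # Correlated pair with standard deviations s1, s2 and correlation rho
        x1 = s1 * u1
        x2 = s2 * (rho * u1 + r * u2)
        z1, z2 = x1 / s1, x2 / s2
        d_sq = (z1 * z1 - 2.0 * rho * z1 * z2 + z2 * z2) / (1.0 - rho * rho)
        inside += d_sq <= crit
    return inside / n


print(chi2_critical_2df(0.05))            # about 5.99
print(coverage_estimate(2.0, 1.0, 0.6))   # should be close to 0.95
```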

**NOTE**: The Real Statistics supplemental functions and data analysis tools are not yet available. They will be provided in the Real Statistics Resource Pack Release 2.0 which will be available shortly.

Dr. Zaionts, good morning. Excuse me, what do PDF and CDF mean?

Could PDF be the probability density function?

But what about CDF?

Thank you very much

Gerardo,

Yes, PDF = probability density function. CDF = cumulative distribution function.

Charles

Thank you