# Method of Least Squares Detailed

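Theorem 1: The best fit line for the points (x1, y1), …, (xn, yn) is given by

$$y - \bar{y} = b(x - \bar{x})$$

where

$$b = \frac{\sum\nolimits_{i=1}^n (x_i-\bar{x})(y_i-\bar{y})}{\sum\nolimits_{i=1}^n (x_i-\bar{x})^2}$$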

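Proof: Write the candidate line as $\hat{y} = b(x-\bar{x}) + c$. Our objective is to minimize

$$\sum\nolimits_{i=1}^n (y_i - \hat{y}_i)^2 = \sum\nolimits_{i=1}^n \bigl(y_i - b(x_i-\bar{x}) - c\bigr)^2$$

For any given values of (x1, y1), …, (xn, yn), this expression can be viewed as a function of b and c. Calling this function g(b, c), by calculus the minimum value occurs when the partial derivatives are zero.

$$\frac{\partial g}{\partial b} = -2\sum\nolimits_{i=1}^n (x_i-\bar{x})\bigl(y_i - b(x_i-\bar{x}) - c\bigr) = 0$$

$$\frac{\partial g}{\partial c} = -2\sum\nolimits_{i=1}^n \bigl(y_i - b(x_i-\bar{x}) - c\bigr) = 0$$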

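Transposing terms and simplifying,

$$b\sum\nolimits_{i=1}^n (x_i-\bar{x})^2 + c\sum\nolimits_{i=1}^n (x_i-\bar{x}) = \sum\nolimits_{i=1}^n (x_i-\bar{x})\,y_i$$

$$b\sum\nolimits_{i=1}^n (x_i-\bar{x}) + nc = \sum\nolimits_{i=1}^n y_i$$

Since $\sum\nolimits_{i=1}^n (x_i-\bar{x}) = 0$, from the second equation we have c = ȳ, and from the first equation we have

$$b = \frac{\sum\nolimits_{i=1}^n (x_i-\bar{x})\,y_i}{\sum\nolimits_{i=1}^n (x_i-\bar{x})^2}$$

The result follows since

$$\sum\nolimits_{i=1}^n (x_i-\bar{x})(y_i-\bar{y}) = \sum\nolimits_{i=1}^n (x_i-\bar{x})\,y_i - \bar{y}\sum\nolimits_{i=1}^n (x_i-\bar{x}) = \sum\nolimits_{i=1}^n (x_i-\bar{x})\,y_i$$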
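
As a rough numerical illustration of the calculus proof, the sketch below computes the slope as Σ(x − x̄)(y − ȳ) / Σ(x − x̄)² and checks that small changes to b or c never reduce the sum of squared residuals. It assumes the line is parameterized as y = b(x − x̄) + c, which is the form consistent with c = ȳ in the proof. The data values and the names `best_fit_line` and `g` are made up for this example.

```python
# Sketch: closed-form least-squares slope, checked against nearby lines.
# The data and function names are hypothetical, chosen for illustration.

def best_fit_line(xs, ys):
    """Return (b, x_bar, y_bar) for the line y - y_bar = b * (x - x_bar)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx, x_bar, y_bar


def g(b, c, xs, ys, x_bar):
    """Sum of squared residuals for the line y = b * (x - x_bar) + c."""
    return sum((y - b * (x - x_bar) - c) ** 2 for x, y in zip(xs, ys))


xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

b, x_bar, y_bar = best_fit_line(xs, ys)
best = g(b, y_bar, xs, ys, x_bar)

# Nudging b or c away from the closed-form values should never lower g.
for db, dc in [(0.01, 0.0), (-0.01, 0.0), (0.0, 0.01), (0.0, -0.01)]:
    assert g(b + db, y_bar + dc, xs, ys, x_bar) >= best

print(f"slope b = {b:.4f}, intercept term c = {y_bar:.4f}")
```

The perturbation loop is only a spot check, not a proof; the argument above is what establishes the minimum.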

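Alternative Proof: This proof doesn’t require any calculus. We first prove the theorem for the case where both x and y have mean 0 and standard deviation 1. Assume the best fit line is y = bx + a, and so

$$\hat{y}_i = bx_i + a$$

for all i. Our goal is to minimize the following quantity

$$z = \sum\nolimits_{i=1}^n (y_i - \hat{y}_i)^2 = \sum\nolimits_{i=1}^n (y_i - bx_i - a)^2$$

Now minimizing z is equivalent to minimizing z/n, which is

$$\frac{z}{n} = \frac{1}{n}\sum\nolimits_{i=1}^n y_i^2 + \frac{b^2}{n}\sum\nolimits_{i=1}^n x_i^2 - \frac{2b}{n}\sum\nolimits_{i=1}^n x_i y_i + a^2 + 2ab\bar{x} - 2a\bar{y} = \frac{1}{n}\sum\nolimits_{i=1}^n y_i^2 + \frac{b^2}{n}\sum\nolimits_{i=1}^n x_i^2 - \frac{2b}{n}\sum\nolimits_{i=1}^n x_i y_i + a^2$$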

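since x̄ = ȳ = 0. Now since $a^2$ is non-negative, the minimum value is achieved when a = 0. Since we are considering the case where x and y have standard deviation of 1, $s_x^2 = s_y^2 = 1$, and so expanding the above expression further we get

$$\frac{z}{n} = \frac{n-1}{n}\left(s_y^2 + b^2 s_x^2 - 2b\,\frac{\sum\nolimits_{i=1}^n x_i y_i}{n-1}\right) = \frac{n-1}{n}\left(1 + b^2 - 2br\right)$$

since

$$r = \frac{\mathrm{cov}(x, y)}{s_x s_y} = \frac{1}{n-1}\sum\nolimits_{i=1}^n x_i y_i$$

Here $s_x$ and $s_y$ are taken to be sample standard deviations, so that $\sum\nolimits_{i=1}^n x_i^2 = (n-1)s_x^2$. With the population versions the factor (n – 1)/n becomes 1 and the argument is unchanged.

Now suppose b = r – e, then the above expression becomes

$$\frac{n-1}{n}\left(1 + (r-e)^2 - 2(r-e)r\right) = \frac{n-1}{n}\left(1 - r^2 + e^2\right)$$

Now since $e^2$ is non-negative, the minimum value is achieved when e = 0. Thus b = r – e = r. This proves that the best fitting line has the form y = bx + a where b = r and a = 0, i.e. y = rx.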
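
The standardized case can be checked numerically. The sketch below standardizes two made-up samples, computes r, and confirms that z/n for the line y = (r − e)x equals ((n − 1)/n)(1 − r² + e²), so that e = 0 gives the smallest value. It assumes sample (n − 1) standard deviations, which is where the factor (n − 1)/n comes from. The data and the helper names `standardize` and `mean_sq_error` are hypothetical.

```python
from statistics import mean, stdev

# Sketch: for standardized data, z/n depends on b = r - e only through e**2.
# Data and helper names are made up for illustration.

def standardize(vs):
    """Shift to mean 0 and scale to sample standard deviation 1."""
    m, s = mean(vs), stdev(vs)  # stdev uses the n - 1 denominator
    return [(v - m) / s for v in vs]


def mean_sq_error(b, a, xs, ys):
    """z / n for the line y = b * x + a."""
    return sum((y - b * x - a) ** 2 for x, y in zip(xs, ys)) / len(xs)


xs = standardize([1.0, 2.0, 3.0, 4.0, 5.0])
ys = standardize([2.0, 1.0, 4.0, 3.0, 6.0])
n = len(xs)

# With standardized data the correlation reduces to sum(x * y) / (n - 1).
r = sum(x * y for x, y in zip(xs, ys)) / (n - 1)

for e in (0.0, 0.1, -0.25):
    lhs = mean_sq_error(r - e, 0.0, xs, ys)
    rhs = (n - 1) / n * (1 - r ** 2 + e ** 2)
    assert abs(lhs - rhs) < 1e-12

# A non-zero intercept only adds a**2 (up to rounding), so a = 0 is best.
assert mean_sq_error(r, 0.3, xs, ys) > mean_sq_error(r, 0.0, xs, ys)
```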

We now consider the general case where the x and y don’t necessarily have mean of 0 and standard deviation of 1, and set

x′ = (x – x̄)/sx and y′ = (y – ȳ)/sy

Now x′ and y′ do have mean of 0 and standard deviation of 1, and so the line that best fits the data is y′ = rx′, where r is the correlation coefficient between x′ and y′. Thus the best fit line has the form

$$\frac{y - \bar{y}}{s_y} = r\,\frac{x - \bar{x}}{s_x}$$

or equivalently

$$y - \bar{y} = b(x - \bar{x})$$

where b = rsy/sx. Now note that by Property B of Correlation, the correlation coefficient for x and y is the same as that for x′ and y′, namely r.

The result now follows by Property 1. If there is a better fit line for x and y, it would produce a better fit line for x′ and y′, which would be a contradiction.
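
The two expressions for the slope can be compared on a small example. The sketch below computes b = r·sy/sx and the ratio Σ(x − x̄)(y − ȳ) / Σ(x − x̄)², the usual closed-form least-squares slope, and checks that they agree. The data are made up, and sample (n − 1) statistics are assumed throughout; the (n − 1) factors cancel, so the population versions would give the same slope.

```python
from statistics import mean, stdev

# Sketch: b = r * s_y / s_x agrees with the closed-form least-squares slope.
# The data values are hypothetical.
xs = [1.0, 2.0, 4.0, 7.0, 9.0]
ys = [3.0, 5.0, 4.0, 10.0, 12.0]
n = len(xs)

x_bar, y_bar = mean(xs), mean(ys)
s_x, s_y = stdev(xs), stdev(ys)  # sample standard deviations

# Sample covariance and correlation coefficient.
cov = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)
r = cov / (s_x * s_y)

b_from_r = r * s_y / s_x
b_direct = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
            / sum((x - x_bar) ** 2 for x in xs))

assert abs(b_from_r - b_direct) < 1e-12
print(f"b = {b_direct:.4f}, line: y - {y_bar} = b * (x - {x_bar})")
```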
