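The 95% confidence interval for the forecasted value ŷ at *x* is

$$\hat{y} \pm t_{crit} \cdot s.e.$$

where $t_{crit}$ is the two-tailed critical value of the t distribution with $n-2$ degrees of freedom and the standard error is

$$s.e. = s_{y \cdot x}\sqrt{\frac{1}{n} + \frac{(x-\bar{x})^2}{SS_x}}$$

Here $n$ is the sample size, $\bar{x}$ is the sample mean of the *x* values, $SS_x$ is the sum of squared deviations of the *x* values from their mean, and $s_{y \cdot x}$ is the standard error of the estimate.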

This means that there is a 95% probability that the true linear regression line of the population will lie within the confidence interval of the regression line calculated from the sample data.

**Figure 1 – Confidence vs. prediction intervals**

In the graph on the left of Figure 1, a linear regression line is calculated to fit the sample data points. The confidence interval consists of the space between the two curves (dotted lines). Thus there is a 95% probability that the true best-fit line for the population lies within the confidence interval (e.g. any of the lines in the figure on the right above).

There is also a concept called the **prediction interval**. Here we look at any specific value of *x*, *x*_{0}, and find an interval around the predicted value ŷ_{0} for *x*_{0} such that there is a 95% probability that the real value of *y* (in the population) corresponding to *x*_{0} is within this interval (see the graph on the right side of Figure 1).

The 95% prediction interval of the forecasted value ŷ_{0} for *x*_{0} is

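$$\hat{y}_0 \pm t_{crit} \cdot s.e.$$

where the **standard error of the prediction** is

$$s.e. = s_{y \cdot x}\sqrt{1 + \frac{1}{n} + \frac{(x_0-\bar{x})^2}{SS_x}}$$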

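For any specific value *x*_{0}, the prediction interval is more meaningful than the confidence interval: it accounts for the scatter of individual observations around the regression line as well as for the uncertainty in the line itself, so it is always the wider of the two.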
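As a rough illustration of how the two intervals relate, the following Python sketch computes both from raw data using only the standard library. The function name `regression_intervals`, the toy data, and the hard-coded critical value are assumptions made for this example; they are not taken from the worksheet in Figure 2. The critical value is passed in by hand (in Excel it would come from TINV).

```python
import math

def regression_intervals(xs, ys, x0, t_crit):
    """Forecast at x0 with its confidence and prediction intervals."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    ss_x = sum((x - x_bar) ** 2 for x in xs)                 # Excel: DEVSQ
    sp_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sp_xy / ss_x
    intercept = y_bar - slope * x_bar
    y_hat = intercept + slope * x0                           # Excel: FORECAST
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    s_yx = math.sqrt(ss_res / (n - 2))                       # standard error of the estimate
    spread = 1 / n + (x0 - x_bar) ** 2 / ss_x
    se_conf = s_yx * math.sqrt(spread)                       # s.e. for the confidence interval
    se_pred = s_yx * math.sqrt(1 + spread)                   # s.e. of the prediction
    return {
        "y_hat": y_hat,
        "se_conf": se_conf,
        "se_pred": se_pred,
        "confidence": (y_hat - t_crit * se_conf, y_hat + t_crit * se_conf),
        "prediction": (y_hat - t_crit * se_pred, y_hat + t_crit * se_pred),
    }

# Hypothetical toy data (NOT the data of Example 1).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
# 3.1824 is the two-tailed 5% critical value of t for df = n - 2 = 3.
result = regression_intervals(xs, ys, x0=3, t_crit=3.1824)
print(result)
```

The only difference between the two standard errors is the extra 1 under the square root, which is why the prediction interval can never be narrower than the confidence interval.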

**Example 1**: Find the 95% confidence and prediction intervals for the forecasted life expectancy for men who smoke 20 cigarettes in Example 1 of Method of Least Squares.

**Figure 2 – Confidence and prediction intervals for data in Example 1**

Referring to Figure 2, we see that the forecasted value for 20 cigarettes is given by FORECAST(20,B4:B18,A4:A18) = 73.16. The confidence interval, calculated using the standard error 2.06 (found in cell E12), is (68.70, 77.61).

The prediction interval is calculated in a similar way using the prediction standard error of 8.24 (found in cell J12). Thus the life expectancy of a man who smokes 20 cigarettes lies in the interval (55.36, 90.95) with 95% probability.
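The arithmetic of Example 1 can be checked with a few lines of Python. This is only a sketch: the forecast and the two standard errors are the rounded values quoted above, and the critical value 2.1604 is an assumption, namely the two-tailed 5% t value for 13 degrees of freedom (15 data rows in A4:A18, minus 2). Because the inputs are rounded, the results match the published intervals only to within about 0.01.

```python
def two_sided_interval(center, se, t_crit):
    """Return (lower, upper) = center -/+ t_crit * se."""
    half_width = t_crit * se
    return center - half_width, center + half_width

y_hat = 73.16     # FORECAST(20,B4:B18,A4:A18), as quoted in the text
se_conf = 2.06    # standard error from cell E12
se_pred = 8.24    # prediction standard error from cell J12
t_crit = 2.1604   # assumed: two-tailed 5% critical value of t, df = 13

conf_interval = two_sided_interval(y_hat, se_conf, t_crit)
pred_interval = two_sided_interval(y_hat, se_pred, t_crit)

print("confidence interval:", [round(v, 2) for v in conf_interval])
print("prediction interval:", [round(v, 2) for v in pred_interval])
```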

**Example 2**: Test whether the y-intercept is 0.

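We use the same approach as in Example 1 to find the confidence interval of ŷ when *x* = 0 (this is the y-intercept). The result is given in column M of Figure 2. Here the standard error is the confidence-interval standard error with *x* = 0 substituted,

$$s.e. = s_{y \cdot x}\sqrt{\frac{1}{n} + \frac{\bar{x}^2}{SS_x}}$$

and so the confidence interval is

$$\hat{y}_0 \pm t_{crit} \cdot s.e.$$

where ŷ_{0} is the forecast at *x* = 0.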

Since 0 is not in this interval, the null hypothesis that the y-intercept is zero is rejected.
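The decision rule used in Example 2 can be sketched as follows. The helper name `zero_intercept_test` and the numbers passed to it are hypothetical, chosen only to show both outcomes; the actual values for the example are in column M of Figure 2. Checking whether 0 falls outside the interval is equivalent to checking whether the t statistic (intercept divided by its standard error) exceeds the critical value in absolute terms.

```python
def zero_intercept_test(intercept, se, t_crit):
    """Return the interval for the intercept and whether 'intercept = 0' is rejected."""
    lower = intercept - t_crit * se
    upper = intercept + t_crit * se
    rejected = not (lower <= 0 <= upper)       # 0 outside the interval
    t_stat = intercept / se                    # same decision: |t_stat| > t_crit
    return {"interval": (lower, upper), "t_stat": t_stat, "rejected": rejected}

# Hypothetical inputs; 2.1604 is the two-tailed 5% t value for df = 13.
far_from_zero = zero_intercept_test(intercept=10.0, se=2.0, t_crit=2.1604)
near_zero = zero_intercept_test(intercept=3.0, se=2.0, t_crit=2.1604)
print(far_from_zero)
print(near_zero)
```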

Dr. Zaiontz,

Very neat and concise example. I’m particularly interested in a one-sided C.I. (lower bound).

Would you agree to use

\hat{y} – t_{crit} s.e.

where t_{crit} should be calculated in Excel using =TINV(2*\alpha,df),

where \alpha = 1-p?

Regards,

Joaquin

Joaquin,

I believe that what you wrote is correct.

Charles
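For readers working outside Excel, the one-sided lower bound discussed in this exchange can be sketched in Python. Excel's TINV is two-tailed, so TINV(2α, df) is the value that leaves probability α in the upper tail alone. The standard library has no t quantile function, so the sketch below builds a rough one with Simpson's rule and bisection; the function names are invented for this example, the search is capped at t = 100, and df = 13 is inferred from the 15 data rows rather than stated in the article.

```python
import math

def t_pdf(t, df):
    """Density of Student's t distribution."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + t * t / df) ** (-(df + 1) / 2)

def t_cdf(t, df, steps=1000):
    """P(T <= t), integrating the density over [0, |t|] with Simpson's rule."""
    a = abs(t)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if t >= 0 else 0.5 - area

def t_upper_tail_crit(alpha, df):
    """t with P(T > t) = alpha, for alpha < 0.5; plays the role of TINV(2*alpha, df)."""
    lo, hi = 0.0, 100.0
    for _ in range(60):                        # bisection on the upper-tail area
        mid = (lo + hi) / 2
        if 1 - t_cdf(mid, df) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def one_sided_lower_bound(y_hat, se, alpha, df):
    return y_hat - t_upper_tail_crit(alpha, df) * se

# Values quoted in Example 1; df = 13 is an inference (15 rows minus 2).
lower = one_sided_lower_bound(y_hat=73.16, se=2.06, alpha=0.05, df=13)
print(round(lower, 2))
```

Calling `t_upper_tail_crit(0.025, 13)` gives the two-sided 95% critical value instead, which is the doubling of α in the TINV formula seen from the other direction.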

Hi,

What’s the formula in J12? I cannot get the same results…

Thanks

/Kristian

Hi Kristian,

J12 contains the same value as cell E9. The formula in E9 is =FORECAST(E8,B4:B18,A4:A18).

Charles

Hi Charles,

I’m referring to J12, not J11. J12 contains the formula for s.e. (the prediction standard error), and the formula result is 8.236857, which I cannot reproduce using the exact same numbers you do.

What formula is in cell J12?

I think something is wrong in the (x – x̄)^2 term!

Thanks

/Kristian

Hi Kristian,

The formula in cell J12 is =E10*SQRT(1+1/E5+(E8-E7)^2/E11).

Charles

Hi Charles,

Great. Thank you.

/Kristian

Please explain how you got the value of SSx, which I suppose to be:-271.6

Anu,

SSx (cell E11) is calculated by the formula =DEVSQ(A4:A18). It has the value 2171.6.

Charles
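For anyone checking SSx by hand, DEVSQ is simply the sum of squared deviations from the mean, so it can never be negative. A minimal Python equivalent might look like this; the function name and the sample values are illustrative only and are not the data in A4:A18.

```python
def devsq(values):
    """Sum of squared deviations from the mean, like Excel's DEVSQ."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

# Illustrative values only: mean is 3, deviations are -2, -1, 0, 1, 2.
sample = [1, 2, 3, 4, 5]
ss_x = devsq(sample)
print(ss_x)
```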