Another non-linear regression model is the **power regression** model, which is based on the following equation:

$$y = \alpha x^{\beta}$$

Taking the natural log (see Exponentials and Logs) of both sides of the equation, we have the following equivalent equation:

$$\ln y = \ln \alpha + \beta \ln x$$

This equation has the form of a linear regression model (where I have added an error term *ε* and written *δ* = ln *α*):

$$\ln y = \beta \ln x + \delta + \varepsilon$$

**Observation**: A model of the form ln y = *β* ln *x* + *δ* is referred to as a **log-log regression** model. Since if this equation holds, we have

$$y = e^{\beta \ln x + \delta} = e^{\delta} x^{\beta}$$

it follows that any such model can be expressed as a power regression model of the form y = *αx*^{*β*} by setting

$$\alpha = e^{\delta}$$

**Example 1**: Determine whether the data on the left side of Figure 1 is a good fit for a power model.

**Figure 1 – Data for Example 1 and log-log transformation**

The table on the right side of Figure 1 shows y transformed into ln y and *x* transformed into ln *x*. We now use the Regression data analysis tool to model the relationship between ln y and ln *x*.

**Figure 2 – Log-log regression model for Example 1**

Figure 2 shows that the model is a good fit and the relationship between ln *x* and ln y is given by

$$\ln y = 0.234 \ln x + 2.813$$

Applying *e* to both sides of the equation yields

$$y = e^{2.813} x^{0.234} \approx 16.66\, x^{0.234}$$

We can also see the relationship between *x* and y by creating a scatter chart for the original data and choosing **Layout > Analysis|Trendline** in Excel and then selecting the Power Trendline option (after choosing More Trendline Options). We can also create a chart showing the relationship between ln *x* and ln y and use Linear Trendline to show the linear regression line (see Figure 3).

**Figure 3 – Trend lines for Example 1**

As usual we can use the formula described above for prediction. For example, if we want the y value corresponding to *x* = 26, using the above model we get

$$\hat{y} = e^{2.813} \cdot 26^{0.234} \approx 35.7$$

Excel doesn’t provide functions like TREND/GROWTH (nor LINEST/LOGEST) for power/log-log regression, but we can use the TREND formula as follows:

=EXP(TREND(LN(B6:B16),LN(A6:A16),LN(26)))

to get the same result.

**Observation**: Thus the equivalent of the array formula GROWTH(R1, R2, R3) for log-log regression is =EXP(TREND(LN(R1), LN(R2), LN(R3))).
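
The same fit-then-exponentiate idea can be sketched outside Excel. The following is a minimal Python illustration, not the Real Statistics implementation: the function names `fit_power` and `predict_power` are hypothetical, and the data is made up (generated exactly from y = 3x^0.5) because the Figure 1 values are not reproduced in the text.

```python
import math

def fit_power(xs, ys):
    """Fit y = alpha * x**beta by least squares on (ln x, ln y)."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(lx)
    mx = sum(lx) / n
    my = sum(ly) / n
    sxx = sum((u - mx) ** 2 for u in lx)
    sxy = sum((u - mx) * (v - my) for u, v in zip(lx, ly))
    beta = sxy / sxx            # slope of the log-log line
    delta = my - beta * mx      # intercept of the log-log line
    return math.exp(delta), beta  # alpha = e**delta

def predict_power(alpha, beta, x0):
    """Counterpart of =EXP(TREND(LN(y-range), LN(x-range), LN(x0)))."""
    return alpha * x0 ** beta

# Made-up data generated from y = 3 * x**0.5 (NOT the Figure 1 data)
xs = [1, 4, 9, 16, 25]
ys = [3 * math.sqrt(x) for x in xs]

alpha, beta = fit_power(xs, ys)
print(alpha, beta, predict_power(alpha, beta, 36))
```

Because the made-up data follows the power law exactly, the fit should recover α close to 3 and β close to 0.5; with real data the log-log line only approximates the points.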

**Observation**: In the case where there is one independent variable *x*, there are four ways of making log transformations, namely

level-level regression: y = *βx* + *α*

log-level regression: ln y = *βx* + *α*

level-log regression: y = *β* ln *x* + *α*

log-log regression: ln y = *β* ln *x* + *α*

We dealt with the first of these in ordinary linear regression (no log transformation). The second is described in Exponential Regression and the fourth is power regression as described on this webpage. We haven’t studied the level-log regression, but it too can be analyzed using techniques similar to those described here.

Hi Charles,

I used this method for a project at work and got estimates for a and β. so now that I have the estimate for y being y=ax^β I want to put a confidence interval around y.

I can obtain a confidence interval for both a and β, but I am not sure what error propagation technique to use to get a confidence interval for y.

Any help would be greatly appreciated!

Thanks,

Stephen

Ok, I think I need to clarify this a bit.

Using the above example I can get the s.e. of a=exp(2.813)*.206 and I can get the s.e. of β=exp(.234)*.068

How do I combine the s.e. of a and the s.e. of β to get the s.e. of y?

Stephen,

You don’t calculate the standard error of y this way. Instead the s.e. is equal to the square root of MSE. This is explained after Figure 5 of the following webpage: http://www.real-statistics.com/multiple-regression/multiple-regression-analysis/multiple-regression-analysis-excel/

Charles

Thanks for the response! That was definitely helpful but I am still kind of stuck…

In your example under Figure 3 you get the estimate for y when x=26 as y = 35.748.

It makes sense how we get there but I am confused on how to get a confidence interval around y = 35.748 — and also for any other y given x.

Hopefully that makes sense. Thanks for all of the help!!

Stephen,

Essentially a “power” regression is a transformation of variables to obtain an ordinary linear regression model. For an ordinary linear regression model you can obtain confidence or prediction intervals as described on the following webpage:

http://www.real-statistics.com/regression/confidence-and-prediction-intervals/

You just need to perform the inverse transformation on the end points of this interval to obtain (an estimate of) the interval that you are looking for.

Charles
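
The inverse-transformation idea in the reply above can be sketched as follows. This is a minimal illustration under stated assumptions, not the website's method: `power_interval` is a hypothetical helper, the data is made up, and the t critical value is passed in by hand (for example the result of Excel's T.INV.2T(.05, n-2)) rather than computed. The interval is built for ln y with the usual simple-regression formulas and its end points are then exponentiated.

```python
import math

def power_interval(xs, ys, x0, t_crit, prediction=False):
    """Return (lower, fitted, upper) for y at x0 under a log-log fit.

    t_crit is the two-tailed t critical value with n - 2 degrees of
    freedom; it is supplied by the caller, not computed here.
    """
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(lx)
    mx, my = sum(lx) / n, sum(ly) / n
    sxx = sum((u - mx) ** 2 for u in lx)
    beta = sum((u - mx) * (v - my) for u, v in zip(lx, ly)) / sxx
    delta = my - beta * mx
    sse = sum((v - (delta + beta * u)) ** 2 for u, v in zip(lx, ly))
    s = math.sqrt(sse / (n - 2))             # square root of MSE
    z0 = delta + beta * math.log(x0)         # forecast of ln y
    extra = 1.0 if prediction else 0.0       # prediction intervals add 1
    half = t_crit * s * math.sqrt(extra + 1 / n + (math.log(x0) - mx) ** 2 / sxx)
    # Inverse transformation applied to the end points of the interval
    return math.exp(z0 - half), math.exp(z0), math.exp(z0 + half)

# Made-up data (NOT the Figure 1 data): y = 2 * x**0.7 with small wiggles
xs = list(range(1, 11))
ys = [2 * x ** 0.7 * (1.05 if i % 2 == 0 else 0.95) for i, x in enumerate(xs)]
t_crit = 2.306  # T.INV.2T(.05, 8), since n = 10 gives df = 8

ci = power_interval(xs, ys, 5, t_crit)
pi = power_interval(xs, ys, 5, t_crit, prediction=True)
print("confidence interval:", ci)
print("prediction interval:", pi)
```

The back-transformed interval is not symmetric around the fitted value, which is expected: it is symmetric on the log scale only.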

Charles,

To make it easier to interpret the coefficients and predicting, what equation would you use in the example I provided for the ln model vs log model?

I posted the regression outcome of the same data set taking the ln of y and x’s and log of y and x’s.

Finally, I wondered if the log log coefficients represented % changes. So, a +1%y= x%.

Joe,

In comparing a ln model with a log model, note that ln(x) = log(x)/log(e). Thus these models are identical except for a constant multiplier.

Charles
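
A quick numerical check of the ln versus log comparison, as a sketch on made-up data (the helper `ols` is hypothetical). When both y and x are transformed with the same base, the slope should come out the same in either base, the intercepts should differ by the constant factor ln(10), and both fits should lead to the same power model.

```python
import math

def ols(us, vs):
    """Return (slope, intercept) of the least-squares line v = slope*u + intercept."""
    n = len(us)
    mu, mv = sum(us) / n, sum(vs) / n
    slope = (sum((u - mu) * (v - mv) for u, v in zip(us, vs))
             / sum((u - mu) ** 2 for u in us))
    return slope, mv - slope * mu

# Made-up data for illustration only
xs = [1, 2, 3, 5, 8, 13]
ys = [2.1, 3.9, 6.2, 9.8, 16.5, 25.7]

slope_ln, icpt_ln = ols([math.log(x) for x in xs], [math.log(y) for y in ys])
slope_10, icpt_10 = ols([math.log10(x) for x in xs], [math.log10(y) for y in ys])

alpha_ln = math.exp(icpt_ln)   # back-transform the natural-log intercept
alpha_10 = 10 ** icpt_10       # back-transform the base-10 intercept

print(slope_ln, slope_10)      # same exponent in either base
print(alpha_ln, alpha_10)      # same multiplier after back-transforming
```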

Charles,

Love this blog, awesome info!

Question, I’m trying to create a price elasticity model that has other variables (multiple regression) that come into play. When I log or ln transform the y and x’s, both have great fits. My problem is using either set of coefficients to predict. I may be doing it right, but I want to be sure.

LN model

Intercept = -6.4

Discount % = .198

Ad % = .843

Log model

Intercept = .03349

Discount % = .013558

Ad % = .133

How would you deal with these to predict?

Thanks!!!

Joe,

For ordinary linear regression you can do prediction using the TREND function as explained on one of the following webpages:

http://www.real-statistics.com/regression/regression-analysis/

http://www.real-statistics.com/multiple-regression/multiple-regression-analysis/multiple-regression-analysis-excel/

For any value of x (or x’s) this yields a forecasted (or predicted) value of y. Now if you have transformed both y and x using LN, then you need to reverse the process using exp to get the forecasted value you want.

E.g. suppose your original equation is y = ax^b. This becomes ln y = b * ln x + ln a, which can be modeled via linear regression. For any given value x0 of x the regression model will provide a forecast of ln y for ln x0 (using the TREND function). Say this forecasted value is z0. Then exp(z0) would be the forecasted value you are looking for. E.g. suppose that x0 = 2 and so ln x0 is .693. Now suppose that the forecasted value for .693 from your log-log linear regression model is .2; then the forecast for x0 = 2 should be exp(.2) = 1.22.

Also see the following webpage:

http://www.real-statistics.com/multiple-regression/multiple-regression-log-transformations/

Charles

Charles,

Thank you for the quick reply! I want to make sure I’m understanding what you mean using my example above. If my discount % was 10 and ad % was 80, to predict the LN version I would say y = exp(-6.4)*(10^.198)*(80^.843)? How would I deal with the log version? I’ve seen it should be interpreted as for 1% change in y the coefficients represent the % change ie in my log example -1.3 would be the elasticity (at 10% discount) since a 1% change in discount = 1.3% change in demand.

Sorry Joe, but I don’t understand where you get the expression y = exp(-6.4)*(10^.198)*(80^.843). I also don’t understand your question about how to deal with the LN version.

Charles

Hi,

I run cut tests on various materials and input the force used to cut and the distance moved by the blade to cut through the material into a spreadsheet. The old method of assessing the data was to represent the data graphically and then compare different trend line types to see which “looked” the best. The force required to cut through at 20 mm can then be determined and the material categorised.

I am trying to reduce the amount of human error by using just the equations to determine the best kind of trend line for the data. I am no mathematician and am using the R^2 of the trend lines to determine which trend line is best.

Can you help me with formulas that will give me the R^2 for each trend line type without having to actually produce the graph each time?

Thanks!

James,

See the following webpage for how to calculate R^2

Regression analysis in Excel

Charles

That’s great, thanks!

Hi Charles,

Wonderfully informative site I’ve discovered here. I’m asking for advice on a series of straightforward length-mass regressions. I’m using a power model to develop a series of predictive equations. I can find the SE of both the slope and intercept quite easily using log x, log y transformation and LINEST function in Excel. Yet, I really require the SE of slope and intercept for the power model. Any advice on an approach? Is it appropriate to use the log-log approach and simply “back-transform” the SE values I produce for a and b? Thanks so much for your work on the site!

Yes, this is a reasonable approach.

Charles

Thanks for the quick reply. Again, simply need SE of fitted constants a and b in the power model. The SE of the exponent b was simple. However, for one example, using the log-log approach to obtain estimates of a and its SE yielded -2.4253 and 0.1403. Using base 10 and exponent of -2.4253 returns my fitted constant of a = 0.0038 as in the power model. Great! Yet, using base 10 and exponent 0.1403 to obtain the associated SE returns 1.3815. The end result of a = 0.0038, SE 1.3815 for my power model does not seem reasonable to me (seeing similar results for my other regressions too). In all cases I have r2 > 0.94 and thus exceptionally “good” power and log-log models. As a beginner, I must be missing something…Thanks in advance for the assistance.

LTR,

You shouldn’t take the reverse translation of the standard error, but of the lower and upper ends of the confidence interval.

I understand for the log-log regression model (base 10) you have a = -2.4253 and se = 0.1403. Now assuming you have say n = 10 observations, and so df = n-2 = 8, then the lower end of the 95% confidence interval will be a - se * T.INV.2T(.05,8) = -2.4253-.1403*2.306 = -2.74883 and similarly the upper end is a + se * T.INV.2T(.05,8) = -2.10177.

You now need to take the anti-log base 10 of these values to get a’ = 10^(-2.4253) = .003756 and a confidence interval of (10^(-2.74883), 10^(-2.10177)) = (.001783, .007911).

There are other approaches, but this is the simplest. The same approach is used for the slope.

Charles
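
The arithmetic in the reply above can be checked with a few lines of Python. The numbers come from the comment thread; n = 10 (and hence the critical value 2.306) is the assumption made in the reply, not a property of the reader's data.

```python
# Values quoted in the thread (base-10 log-log regression)
a_log10 = -2.4253   # intercept on the log10 scale
se = 0.1403         # its standard error on the log10 scale
t_crit = 2.306      # T.INV.2T(.05, 8), assuming n = 10 observations

# Build the confidence interval on the log scale first
lower = a_log10 - t_crit * se
upper = a_log10 + t_crit * se

# Then take the base-10 anti-log of the estimate and of both end points
a = 10 ** a_log10
ci = (10 ** lower, 10 ** upper)

print(round(lower, 5), round(upper, 5))
print(round(a, 6), tuple(round(v, 6) for v in ci))
```

Note that the standard error itself is never back-transformed; only the interval end points are.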

Hi Charles,

Thank you very much. I found it very helpful for me. I am trying to solve a similar kind of problem. I have an equation as follows.

Y=C*[(x1)^z1]*[(x2)^z2]*[(x3)^z3]*[(x4)^z4]*[(x5)^z5]

I want to find out the values of C, z1, z2, z3, z4 and z5.

It’s an experimental study. I can solve this problem, if I can take readings of Y, by varying one parameter (among x1, x2… x5) at a time, by maintaining other parameters constant.

But my x1 varies with a change in each other parameter.

First I can solve the following equation for finding C1 and z1 using the procedure you suggested.

Y=C1*[(x1)^z1]

So, from the second step onwards, at every step, I will have an equation as follows

Y=C2*[(x1)^z1]*[(x2)^z2]

in each stage, C2 varies among C2, C3, C4 and C5 and x2 varies among x2, x3, x4 and x5.

When I apply LN on both sides, I get

ln Y = ln C2 + z1*ln x1 + z2*ln x2

Here, I noticed that z1*ln x1 is a known value, as I already calculated z1 value in step 1, but varies with each set of readings of x2 and Y.

But I am stuck here and can’t go any further. Please help me.

Sorry, but I don’t completely understand the series of steps that you have outlined, but here is a possible approach. I understood that the step where you get stuck is ln Y = ln C2 + z1*ln x1 + z2*ln x2. Since z1*ln x1 is a known value, this reduces to the form ln y = C3 + z2*ln x2 where C3 = ln C2 + z1*ln x1. Thus you can use regression techniques to find the coefficients C3 and z2 in ln y = C3 + z2*ln x2. Once you know C3 you can solve for C2 using the equation C2 = exp(C3 – z1*ln x1).

Charles

z1*ln x1 is a known value but not a constant; it varies throughout the series of readings. While I was explaining the problem to you, I got an idea. I modified the equation as follows.

ln Y – z1*ln x1 = ln C2 + z2*ln x2

I then treated the complete LHS as ln Y and ran the regression. This gave me the C2 and z2 values. Is this procedure correct?

I can explain my problem in detail with the following example.

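| x2 | Y | x1 |
|---|---|---|
| 2.5 | 22.8 | 0.689 |
| 3 | 23.6 | 0.689 |
| 3 | 24 | 1.379 |
| 3.5 | 24.4 | 2.068 |
| 4 | 24.8 | 4.482 |
| 5 | 25.2 | 6.551 |
| 5.5 | 25.4 | 8.96 |
| 5.5 | 26 | 24.13 |
| 6 | 26.4 | 34.827 |
| 6 | 27.2 | 45.172 |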

Now I calculated ln Y – z1*ln x1 for each row. I then treated this column as ln Y and ran the regression. Tell me if this is wrong. Sorry if I couldn’t explain it well.

It should work as long as z1 is known.

Charles
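
The procedure discussed in this exchange (move the known term z1*ln x1 to the left-hand side, then regress on ln x2) can be sketched as follows. This is an illustration only: the x1 and x2 columns are taken from the reader's table, but Y is regenerated from hypothetical parameters (C2 = 20, z1 = 0.03, z2 = 0.1) so that the recovery can be checked, and `fit_with_known_exponent` is a made-up name.

```python
import math

# x2 and x1 columns from the reader's table
x2 = [2.5, 3, 3, 3.5, 4, 5, 5.5, 5.5, 6, 6]
x1 = [0.689, 0.689, 1.379, 2.068, 4.482, 6.551, 8.96, 24.13, 34.827, 45.172]

# Hypothetical parameters used to generate Y (the reader's measured Y is NOT used)
TRUE_C2, Z1, TRUE_Z2 = 20.0, 0.03, 0.1
Y = [TRUE_C2 * a ** Z1 * b ** TRUE_Z2 for a, b in zip(x1, x2)]

def fit_with_known_exponent(x1, x2, y, z1):
    """Fit ln Y - z1*ln x1 = ln C2 + z2*ln x2 by simple linear regression."""
    adj = [math.log(yi) - z1 * math.log(a) for yi, a in zip(y, x1)]  # adjusted LHS
    lx2 = [math.log(b) for b in x2]
    n = len(adj)
    mx, my = sum(lx2) / n, sum(adj) / n
    z2 = (sum((u - mx) * (v - my) for u, v in zip(lx2, adj))
          / sum((u - mx) ** 2 for u in lx2))
    c2 = math.exp(my - z2 * mx)   # intercept is ln C2
    return c2, z2

c2, z2 = fit_with_known_exponent(x1, x2, Y, Z1)
print(c2, z2)
```

As noted in the reply, this only works as long as z1 really is known; any error in z1 is carried into the adjusted response.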

Okay, thank you very much sir. You helped me a lot.

Thank you very much, that was very informing, but I am stuck with a similar problem (the herschel-bulkley fluid model); how do we solve a problem like this :

y = a + b*x^c

how can we determine a, b, and c?

Maamar,

If c is a positive integer, then you can use the approach described on the following webpage

http://www.real-statistics.com/multiple-regression/polynomial-regression/polynomial-regression-analysis-tool/

If c is not a positive integer, then you can use a non-linear regression approach which is similar to that explained on the following webpage

http://www.real-statistics.com/regression/exponential-regression-models/exponential-regression-using-solver/

Charles
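
For readers without Solver, one simple alternative for y = a + b*x^c is to scan candidate values of c and, for each one, fit a and b by ordinary least squares on x^c. This is a swapped-in technique (a grid search over c), not the Solver approach from the linked page, and the names `linear_fit` and `fit_offset_power`, the grid, and the data below are all assumptions made for illustration.

```python
def linear_fit(us, vs):
    """Return (intercept, slope, sse) for the least-squares line v = a + b*u."""
    n = len(us)
    mu, mv = sum(us) / n, sum(vs) / n
    suu = sum((u - mu) ** 2 for u in us)
    b = sum((u - mu) * (v - mv) for u, v in zip(us, vs)) / suu
    a = mv - b * mu
    sse = sum((v - (a + b * u)) ** 2 for u, v in zip(us, vs))
    return a, b, sse

def fit_offset_power(xs, ys, c_grid):
    """Try each exponent c; keep the (a, b, c, sse) with the smallest SSE."""
    best = None
    for c in c_grid:
        a, b, sse = linear_fit([x ** c for x in xs], ys)
        if best is None or sse < best[3]:
            best = (a, b, c, sse)
    return best

# Made-up data generated exactly from y = 1 + 2 * x**1.5
xs = [0.5, 1, 2, 3, 4, 5, 6]
ys = [1 + 2 * x ** 1.5 for x in xs]

c_grid = [i / 100 for i in range(10, 301)]   # c from 0.10 to 3.00 in steps of 0.01
a, b, c, sse = fit_offset_power(xs, ys, c_grid)
print(a, b, c, sse)
```

The grid resolution limits how precisely c is found; a finer grid around the best value, or a proper non-linear optimizer, would refine the estimate.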

Thank you very much Charles, that was very helpful!

I tried the solver method, and it worked.

again, thank you Charles.

Maamar

Hello, any bibliographic reference that you recommend to me to study the whole theoretical framework of this regression model? Thank you!

Genaro,

Are you looking to understand the mathematics?

Charles

Hi Charles.

I am conducting research on metal fatigue and this regression model best describes the trend of experimental data. Hence my interest in knowing in depth the theoretical framework of it.

Thank you. Best regards!

Genaro,

I don’t know of any books related to the theoretical framework for metal fatigue. The theoretical frameworks that I am familiar with are mathematical in nature.

Charles

Charles

I think I did not explain myself well. I apologize for it. My interest is to know the theoretical framework of the potential regression, since this regression model applied to the experimental data obtained in tests of metal fatigue, allows to obtain a better approximation of the variability of the data.

For this reason the request of some bibliographical reference to know more about the potential regression.

Best regards!

Genaro,

There are hundreds of books which give a theoretical background on regression, but I can’t identify any one book on the subject. The Real Statistics website also includes a lot of information on this topic.

Charles

Charles, you can correct me if I’m wrong, but I am trying to find the standard error of the coefficients and I think it requires an approximation for the intercept that is not shown in the Figure 2. Since we have α = exp(δ), the standard error of α can be calculated with Taylor approximation (https://en.wikipedia.org/wiki/Taylor_expansions_for_the_moments_of_functions_of_random_variables). This results in std(δ) ≈ exp(α) * std(α). So in your case, std(δ) ≈ exp(2.81) * 0.206 ?

Steven,

From Figure 2, we see that δ = ln α = 2.813 with s.e. for δ = .206. Also, as you say, α = exp(δ).

Using a Taylor series approximation, we find in general that if y = g(x), then var(g(x)) = (g'(x))^2 * var(x). This is called the delta method.

In this case g(x) = exp(x) and so g'(x) = exp(x). Thus, the s.e. of α = exp(2.813) * 0.206, which is what you wrote, although I think you mixed up std(δ) with std(α).

Charles
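
The delta-method calculation in the reply above, worked numerically as a short sketch. The values 2.813 and 0.206 are the intercept and its standard error quoted from Figure 2 in the thread; the result is a first-order approximation only.

```python
import math

delta = 2.813      # intercept of the log-log model, delta = ln(alpha)
se_delta = 0.206   # standard error of delta

# alpha = g(delta) with g = exp, so g'(delta) = exp(delta) = alpha
alpha = math.exp(delta)

# Delta method: se(g(delta)) is approximately |g'(delta)| * se(delta)
se_alpha = alpha * se_delta

print(round(alpha, 2), round(se_alpha, 3))
```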


Hello Charles,

Can you please help me with my equation y=a*(b^x)*u.

Adela,

First take the log of both sides of the equation to get log y = log a + x log b + log u. If I let y’ = log y, a’ = log a, b’ = log b and u’ = log u, I get the equation

y’ = a’ + b’x + u’

Assuming u is another independent variable, then this can be analyzed using multiple linear regression. If instead u is a constant, then let c = loga + logu, to get the simple linear regression model y’ = b’x + c.

Charles

Charles,

Sorry for my English, I will try to explain.

The model on which I am working has more or less the shape of the upper part of an aircraft wing.

I used your idea to find the curve from front to back. And the other axes in the model are of the type y=ax+b. These are the prominent dimensions.

I experienced the problem with Excel that I could not bend the surface into an appropriate curve in one dimension since it is all linear, like a flat sheet of metal which you can manipulate.

The result with ln(x) is that the model now has a curve, uses fewer variables, and predicts better.

Rene,

Yes, that is the idea behind using non-linear regression models such as y = b*ln(x) + a. The good news is that if you set z = ln(x) you have a linear model of form y = bz + a and so can use linear regression. You will get a slightly better model if you use a non-linear model, but the linear model usually works pretty well.

Charles

Charles,

Thank you very much, smart solution.

This is also my solution to the problem that Excel Multi Linear Regression gives a flat plate, whereas there is a variable in the collection which has a power function.

Rene,

Sorry, but I don’t understand your question.

Charles

Rene,

Sorry, but I don’t know what a “flat plate” means. I also don’t understand your second sentence. Do you mean, where is the data analysis tool for power regression? You can use the Linear Regression and/or Exponential Regression data analysis tools.

Charles

Hello Charles,

Thank you for your insights here. I happen to have a question on the power law; however, it seems to combine a number of statistical aspects.

I am looking to fit a line on the linear part of a log-log plot of a power law. Unfortunately with excel, the power trendline fitted automatically takes into account the entire data set. I need to ignore the outlying first part. I have tried to look for methods to solve this and somewhere I found a suggestion that to bin my data. Other suggestions were to use maximum likelihood estimation or weighted least squares.

I did try to use Linear regression but it did not help. The biggest problem is where to choose to begin the regression from; what point in the data set?

Do you have any tricks up your sleeve as regards this?

Musa,

Can’t you just restrict your analysis to those points that are on the subset of the curve that you are interested in?

Charles

the power of developed equation is attained when the predicted value are within the range of input data

hi Charles,

Firstly, sorry if my question is not related here. I know one of my IV have no relationship with the DV(corr= 0.07). But I still wanted to put in the equations even though the result of the parameter variable is not significant after regression. The adjusted R square is 0.76 and the whole equation can be trusted. (<0.05). What can I do with the no correlation variables that I want it? Can I transform the particular data? Thank you in advance.

Yuna,

If you want to retain some independent variable in the model for theoretical reasons (based on your domain knowledge), then just keep it in the model and don’t worry about the fact that it is not significant. If you instead want to use some transformation that yields a significant regression coefficient, then make that transformation (I would do this based on some theoretical, not statistical, basis).

Charles

Phew, thank you Charles. However, can we transform a variable if it already has no relationship with the DV? I’ve tried some transformation methods but they made only slight changes. It is still far from significant. Thank you again Charles.

Yuna,

Here is an example where a transformation can make a big difference

| x | y |
|---|---|
| 1 | -0.002004008 |
| 2 | 0.001908397 |
| 3 | 1.70797E-05 |
| 4 | 9.54129E-07 |
| 5 | 1.02405E-07 |
| 6 | 1.65383E-08 |
| 7 | 3.54014E-09 |

The correlation coefficient is .14876. If you use the transformation y –> (1/y + 500)^.1 then the correlation coefficient will be 1.

I don’t know how useful this is, but at least it shows that a transformation can make a difference in the correlation coefficient.

Charles
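
The example in the reply above can be reproduced with a short script. The data and the transformation y –> (1/y + 500)^.1 are taken from the comment; the helper `pearson` is a hand-written correlation function added here for illustration.

```python
import math

def pearson(us, vs):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(us)
    mu, mv = sum(us) / n, sum(vs) / n
    suv = sum((u - mu) * (v - mv) for u, v in zip(us, vs))
    suu = sum((u - mu) ** 2 for u in us)
    svv = sum((v - mv) ** 2 for v in vs)
    return suv / math.sqrt(suu * svv)

x = [1, 2, 3, 4, 5, 6, 7]
y = [-0.002004008, 0.001908397, 1.70797e-05, 9.54129e-07,
     1.02405e-07, 1.65383e-08, 3.54014e-09]

# Transformation from the comment: y -> (1/y + 500)^.1
y_t = [(1 / v + 500) ** 0.1 for v in y]

r_before = pearson(x, y)
r_after = pearson(x, y_t)
print(round(r_before, 5), round(r_after, 5))
```

The transformed values come out very close to 1, 2, ..., 7, which is why the correlation with x becomes essentially 1.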

thank you so much Charles. Wish you are given longevity of health so you can always be here helping us.

In model: ln y = β ln x + α

β is short term elasticity.

How to calculate long term elasticity? I think it is connected with:

ln y = β ln x + β1 ln yt-1 + α

Matija,

I think you are asking me a question about economics, not statistics. It looks like you are looking for a time series model of long term elasticity. The website explains how to model time series and create forecasts based on the resulting model. This part of the website is under construction, but there is already a lot of useful information in the site about this topic.

Charles


Hi,

Near the end of the page, you explained how to get an X, if you know the Y. You did it like this: =EXP(TREND(LN(B6:B16),LN(A6:A16),LN(26))).

Is there any way to find Y, when you know the X?

Thanks in advance,

Kevin

Kevin,

It depends on which power model you are referring to. For the log-log model, you simply perform regression of log x on log y, and so you can use the same Excel formula, exchanging the roles of x and y.

Charles

Are you talking about this?

http://spreadsheetpage.com/index.php/tip/chart_trendline_formulas/

Power Trendline

Equation: y=c*x^b

c: =EXP(INDEX(LINEST(LN(y),LN(x),,),1,2))

b: =INDEX(LINEST(LN(y),LN(x),,),1)

x and y are the data set that you have to generate this formula.

Is it possible to transform a model that has both a power and a linear variable?

My formula is y=a*x^b+z*d, where a*x^b covers what can be considered fixed tasks with improvement over months of time (x) and z*d covers variable support tasks that will scale with the effort z in hours of the people being supported.

I’ve currently set it up using an additional column for y-hat and used Solver to estimate a, b, and d by maximizing the r2. I’m rather pleased with the result; however, I’m wondering if there’s a way to transform this for use with LINEST. Also, since I’m not a mathematician, nor should I ever be considered one, I wonder if there’s anything I’m missing that would cause my results to be in error.

Please note that I also performed multivariable linear and transformed power regressions using linest. The results between my model and the two variable linear model are somewhat close, I just have a conceptual issue with the linear model since it estimates the fixed tasks as being negative if you go far enough in the future. I appreciate any help you can provide.

Thanks,

Jason,

Sorry, but I don’t know any way to use a transformation so that linest can be used.

Charles

Jason,

I may have that same question too, i.e. one predictor variable (x) that has a power relationship with response y, and another predictor (d) that has a linear relationship with y, which I want both together run in same (linear) model.

Probably you can simply run such (linear) model by linearizing (log-transform) all but the d predictor variable:

ln y = ln a + b * ln x + z*d

But, please, anybody confirm that, or correct me if I am wrong.

Jason,

This model looks correct to me. You can address it as a linear model or a non-linear model (e.g. using Solver).

Charles

Hi Charles,

I just wanted some clarification on why do we use a linear trend-line for the log-log transformed data? If we used a power trend-line, would it be less accurate?

Thanks for your help,

Anna

Anna,

The idea of the log-log transformation is to get a linear relationship. For this reason, after the transformation you check for a linear trend. For the data before making the transformation, you won’t see a linear relationship and so you would not use a linear trendline.

Charles

Hi,

Thanks for your answer. But I think the same error has also been done in the following page. I’m referring to figure 2.

http://www.real-statistics.com/regression/exponential-regression/

Thanks again Hamed,

I have corrected the error that you detected. Thanks for catching these errors.

Charles

Hi,

In figure 2 the coefficient for Ln x is .23 and the coefficient for intercept is 2.81 but in your equation it has been shown otherwise (Ln y = .23+2.81 Ln x).

What is going on?

Hi Hamed,

Thanks for finding the error. My dyslexia has caught up with me again. I inadvertently exchanged the two parameters. I have now corrected the webpage. Thanks again for catching the error.

Charles