Power Regression

Another non-linear regression model is the power regression model, which is based on the following equation:

y = αx^β

Taking the natural log (see Exponentials and Logs) of both sides of the equation, we have the following equivalent equation:

ln y = ln α + β ln x

This equation has the form of a linear regression model (where I have added an error term ε):

ln y = ln α + β ln x + ε

Observation: A model of the form ln y = β ln x + δ is referred to as a log-log regression model. If this equation holds, then

y = e^(β ln x + δ) = e^δ · x^β

and so any such model can be expressed as a power regression model of the form y = αx^β by setting α = e^δ.

Example 1: Determine whether the data on the left side of Figure 1 is a good fit for a power model.

Figure 1 – Data for Example 1 and log-log transformation

The table on the right side of Figure 1 shows y transformed into ln y and x transformed into ln x. We now use the Regression data analysis tool to model the relationship between ln y and ln x.

Figure 2 – Log-log regression model for Example 1

Figure 2 shows that the model is a good fit and the relationship between ln x and ln y is given by

ln y = 0.23 ln x + 2.81

Applying e to both sides of the equation yields

y = e^2.81 · x^0.23 ≈ 16.6x^0.23
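
The same kind of fit can be reproduced outside Excel. The following is a minimal Python sketch, not the Regression data analysis tool itself: it runs ordinary least squares on (ln x, ln y) and back-transforms the intercept. The function name `fit_power` and the sample data are hypothetical, chosen only for illustration; they are not the data from Figure 1.

```python
import math

def fit_power(xs, ys):
    """Fit y = alpha * x**beta by least squares on the log-log scale.

    Returns (alpha, beta, r2), where r2 is R-squared of the ln y vs ln x fit.
    All x and y values must be positive, since logs are taken.
    """
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    sxx = sum((u - mean_x) ** 2 for u in lx)
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(lx, ly))
    beta = sxy / sxx                  # slope of ln y on ln x
    delta = mean_y - beta * mean_x    # intercept, delta = ln(alpha)
    sse = sum((v - (delta + beta * u)) ** 2 for u, v in zip(lx, ly))
    sst = sum((v - mean_y) ** 2 for v in ly)
    return math.exp(delta), beta, 1 - sse / sst

# Hypothetical data generated from y = 3 * x**0.5 (NOT the Figure 1 data)
xs = [1, 2, 4, 8, 16]
ys = [3 * x ** 0.5 for x in xs]
alpha, beta, r2 = fit_power(xs, ys)
print(alpha, beta, r2)
```

Note that R-squared here is measured on the log scale, as with any regression run on the transformed data.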

We can also see the relationship between x and y by creating a scatter chart for the original data and choosing Layout > Analysis|Trendline in Excel and then selecting the Power Trendline option (after choosing More Trendline Options). We can also create a chart showing the relationship between ln x and ln y and use Linear Trendline to show the linear regression line (see Figure 3).

Figure 3 – Trend lines for Example 1

As usual, we can use the formula described above for prediction. For example, if we want the y value corresponding to x = 26, using the above model we get

y = e^2.81 · 26^0.23 ≈ 35 (using the rounded coefficients 2.81 and 0.23)

Excel doesn’t provide functions like TREND/GROWTH (nor LINEST/LOGEST) for power/log-log regression, but we can use the TREND formula as follows:

=EXP(TREND(LN(B6:B16),LN(A6:A16),LN(26)))

to get the same result.

Observation: Thus the equivalent of the array formula GROWTH(R1, R2, R3) for log-log regression is =EXP(TREND(LN(R1), LN(R2), LN(R3))).
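
For readers working outside Excel, a rough Python analogue of =EXP(TREND(LN(R1), LN(R2), LN(R3))) is sketched below. The name `power_trend` and the sample data are assumptions made for illustration; the argument order mirrors TREND (known y values, known x values, new x values).

```python
import math

def power_trend(known_ys, known_xs, new_xs):
    """Predict y at each new x from a log-log least-squares fit.

    Mirrors =EXP(TREND(LN(known_ys), LN(known_xs), LN(new_xs))).
    """
    lx = [math.log(x) for x in known_xs]
    ly = [math.log(y) for y in known_ys]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    slope = (sum((u - mean_x) * (v - mean_y) for u, v in zip(lx, ly))
             / sum((u - mean_x) ** 2 for u in lx))
    intercept = mean_y - slope * mean_x
    # Predict on the log scale, then exponentiate back to the y scale
    return [math.exp(intercept + slope * math.log(x)) for x in new_xs]

# Hypothetical data lying exactly on y = 2 * x**1.5 (not the data of Example 1)
known_xs = [1, 3, 5, 10, 20]
known_ys = [2 * x ** 1.5 for x in known_xs]
predictions = power_trend(known_ys, known_xs, [26])
print(predictions)
```

Because the sample data lie exactly on a power curve, the prediction at x = 26 should agree with 2 · 26^1.5 up to floating-point error.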

Observation: In the case where there is one independent variable x, there are four ways of making log transformations, namely

level-level regression: y = βx + α

log-level regression: ln y = βx + α

level-log regression: y = β ln x + α

log-log regression: ln y = β ln x + α

We dealt with the first of these in ordinary linear regression (no log transformation). The second is described in Exponential Regression and the fourth is power regression as described on this webpage. We haven’t studied the level-log regression, but it too can be analyzed using techniques similar to those described here.
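
As a rough illustration of how all four variants reduce to the same least-squares fit once the variables are transformed, here is a Python sketch. The helper `fit_transformed`, its parameters, and the sample data are hypothetical and not part of the Real Statistics tools.

```python
import math

def fit_transformed(xs, ys, fx=lambda v: v, fy=lambda v: v):
    """Least-squares fit of fy(y) = beta * fx(x) + alpha; returns (beta, alpha)."""
    tx = [fx(x) for x in xs]
    ty = [fy(y) for y in ys]
    n = len(tx)
    mean_x = sum(tx) / n
    mean_y = sum(ty) / n
    beta = (sum((u - mean_x) * (v - mean_y) for u, v in zip(tx, ty))
            / sum((u - mean_x) ** 2 for u in tx))
    return beta, mean_y - beta * mean_x

# Hypothetical data lying exactly on y = 2 ln x + 5 (a level-log relationship)
xs = [1, 2, 5, 10, 50]
ys = [2 * math.log(x) + 5 for x in xs]

level_level = fit_transformed(xs, ys)                        # y    = beta x    + alpha
log_level = fit_transformed(xs, ys, fy=math.log)             # ln y = beta x    + alpha
level_log = fit_transformed(xs, ys, fx=math.log)             # y    = beta ln x + alpha
log_log = fit_transformed(xs, ys, fx=math.log, fy=math.log)  # ln y = beta ln x + alpha
print(level_log)
```

For this sample data only the level-log fit recovers the generating coefficients (β = 2, α = 5); the other three are fitted to a relationship they do not match exactly.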

54 Responses to Power Regression

  1. James says:

    Hi,

    I run cut tests on various materials and input the force used to cut and the distance moved by the blade to cut through the material into a spreadsheet. The old method of assessing the data was to represent the data graphically and then compare different trend line types to see which “looked” the best. The force required to cut through at 20 mm can then be determined and the material categorised.

    I am trying to reduce the amount of human error by using just the equations to determine the best kind of trend line for the data. I am no mathematician and am using the R^2 of the trend lines to determine which trend line is best.

    Can you help me with formulas that will give me the R^2 for each trend line type without having to actually produce the graph each time?

    Thanks!

  2. LTR says:

    Hi Charles,

    Wonderfully informative site I’ve discovered here. I’m asking for advice on a series of straightforward length-mass regressions. I’m using a power model to develop a series of predictive equations. I can find the SE of both the slope and intercept quite easily using log x, log y transformation and LINEST function in Excel. Yet, I really require the SE of slope and intercept for the power model. Any advice on an approach? Is it appropriate to use the log-log approach and simply “back-transform” the SE values I produce for a and b? Thanks so much for your work on the site!

    • Charles says:

      Yes, this is a reasonable approach.
      Charles

      • LTR says:

        Thanks for the quick reply. Again, simply need SE of fitted constants a and b in the power model. The SE of the exponent b was simple. However, for one example, using the log-log approach to obtain estimates of a and its SE yielded -2.4253 and 0.1403. Using base 10 and exponent of -2.4253 returns my fitted constant of a = 0.0038 as in the power model. Great! Yet, using base 10 and exponent 0.1403 to obtain the associated SE returns 1.3815. The end result of a = 0.0038, SE 1.3815 for my power model does not seem reasonable to me (seeing similar results for my other regressions too). In all cases I have r2 > 0.94 and thus exceptionally “good” power and log-log models. As a beginner, I must be missing something…Thanks in advance for the assistance.

        • Charles says:

          LTR,

          You shouldn’t take the reverse translation of the standard error, but of the lower and upper ends of the confidence interval.

          I understand that for the log-log regression model (base 10) you have a = -2.4253 and se = 0.1403. Now assuming you have, say, n = 10 observations, and so df = n-2 = 8, the lower end of the 95% confidence interval will be a - se * T.INV.2T(.05,8) = -2.4253-.1403*2.306 = -2.74883 and similarly the upper end is a + se * T.INV.2T(.05,8) = -2.10177.

          You now need to take the anti-log base 10 of these values to get a’ = 10^(-2.4253) = .003756 and a confidence interval of (10^(-2.74883), 10^(-2.10177)) = (.001783, .007911).

          There are other approaches, but this is the simplest. The same approach is used for the slope.

          Charles
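
The arithmetic in this reply can be checked with a short Python sketch. The estimate, standard error, and critical value 2.306 are the ones quoted in the reply (n = 10 is the reply's own assumption); the variable names are illustrative.

```python
# Values quoted in the reply above (base-10 log-log regression)
a_log = -2.4253      # intercept on the log10 scale
se_log = 0.1403      # its standard error on the log10 scale
t_crit = 2.306       # T.INV.2T(.05, 8), assuming n = 10 observations

# 95% confidence interval on the log10 scale
lower_log = a_log - se_log * t_crit
upper_log = a_log + se_log * t_crit

# Back-transform the estimate and the interval endpoints, not the standard error
a = 10 ** a_log
ci = (10 ** lower_log, 10 ** upper_log)
print(a, ci)
```

The resulting interval is not symmetric around the back-transformed estimate, which is expected after taking anti-logs.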

  3. Srikanth says:

    Hi Charles,
    Thank you very much. I found it very helpful for me. I am trying to solve a similar kind of problem. I have an equation as follows.
    Y=C*[(x1)^z1]*[(x2)^z2]*[(x3)^z3]*[(x4)^z4]*[(x5)^z5]
    I want to find out the values of C, z1, z2, z3, z4 and z5.
    It’s an experimental study. I can solve this problem, if I can take readings of Y, by varying one parameter (among x1, x2… x5) at a time, by maintaining other parameters constant.
    But my x1 varies with a change in each other parameter.
    First I can solve the following equation for finding C1 and z1 using the procedure you suggested.
    Y=C1*[(x1)^z1]
    So, from the second step onwards, at every step, I will have an equation as follows
    Y=C2*[(x1)^z1]*[(x2)^z2]
    in each stage, C2 varies among C2, C3, C4 and C5 and x2 varies among x2, x3, x4 and x5.
    When apply LN on both sides, I am getting
    ln Y = ln C2 + z1*ln x1 + z2*ln x2

    Here, I noticed that z1*ln x1 is a known value, as I already calculated z1 value in step 1, but varies with each set of readings of x2 and Y.
    But I am stuck here; I couldn’t go forward to solve this. Please help me.

    • Charles says:

      Sorry, but I don’t completely understand the series of steps that you have outlined, but here is a possible approach. I understood that the step where you get stuck is ln Y = ln C2 + z1*ln x1 + z2*ln x2. Since z1*ln x1 is a known value, this reduces to the form ln y = C3 + z2*ln x2 where C3 = ln C2 + z1*ln x1. Thus you can use regression techniques to find the coefficients C3 and z2 in ln y = C3 + z2*ln x2. Once you know C3 you can solve for C2 using the equation C2 = exp(C3 – z1*ln x1).
      Charles

      • Srikanth says:

        z1*ln x1 is a known value but not a constant; it varies throughout the series of readings. While I was explaining the problem to you, I got an idea. I modified the equation as follows.
        ln Y – z1*ln x1 = ln C2 + z2*ln x2

        Then I treated the complete LHS as ln Y and ran the regression. This gave me the C2 and z2 values. Is this procedure correct?

        I can explain my problem in detail with the following example.

        x2 Y x1
        2.5 22.8 0.689
        3 23.6 0.689
        3 24 1.379
        3.5 24.4 2.068
        4 24.8 4.482
        5 25.2 6.551
        5.5 25.4 8.96
        5.5 26 24.13
        6 26.4 34.827
        6 27.2 45.172
        now I calculated ln Y – z1*ln x1 for each row. Then this column has been treated as ln Y and done the regression. Tell me if this is wrong. Sorry if I couldn’t explain you well.

  4. Maamar Dliouah says:

    Thank you very much, that was very informing, but I am stuck with a similar problem (the herschel-bulkley fluid model); how do we solve a problem like this :
    y = a + b*x^c
    how can we determine a, b, and c?

  5. Genaro Luna Tapia says:

    Hello, any bibliographic reference that you recommend to me to study the whole theoretical framework of this regression model? Thank you!

    • Charles says:

      Genaro,
      Are you looking to understand the mathematics?
      Charles

      • Genaro Luna Tapia says:

        Hi Charles.

        I am conducting research on metal fatigue and this regression model best describes the trend of experimental data. Hence my interest in knowing in depth the theoretical framework of it.

        Thank you. Best regards!

        • Charles says:

          Genaro,
          I don’t know of any books related to the theoretical framework for metal fatigue. The theoretical frameworks that I am familiar with are mathematical in nature.
          Charles

          • Genaro Luna Tapia says:

            Charles

            I think I did not explain myself well. I apologize for it. My interest is to know the theoretical framework of the potential regression, since this regression model applied to the experimental data obtained in tests of metal fatigue, allows to obtain a better approximation of the variability of the data.

            For this reason the request of some bibliographical reference to know more about the potential regression.

            Best regards!

          • Charles says:

            Genaro,
            There are hundreds of books which give a theoretical background on regression, but I can’t identify any one book on the subject. The Real Statistics website also includes a lot of information on this topic.
            Charles

  6. Steven says:

    Charles, you can correct me if I’m wrong, but I am trying to find the standard error of the coefficients and I think it requires an approximation for the intercept that is not shown in the Figure 2. Since we have α = exp(δ), the standard error of α can be calculated with Taylor approximation (https://en.wikipedia.org/wiki/Taylor_expansions_for_the_moments_of_functions_of_random_variables). This results in std(δ) ≈ exp(α) * std(α). So in your case, std(δ) ≈ exp(2.81) * 0.206 ?

    • Charles says:

      Steven,
      From Figure 2, we see that δ = ln α = 2.813 with s.e. for δ = .206. Also, as you say, α = exp(δ).
      Using a Taylor series approximation, we find in general that if y = g(x), then var(g(x)) = (g'(x))^2 * var(x). This is called the delta method.
      In this case g(x) = exp(x) and so g'(x) = exp(x). Thus, the s.e. of α = exp(2.813) * 0.206, which is what you wrote, although I think you mixed up std(δ) with std(α).
      Charles
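
The delta-method calculation in this reply can be sketched in Python as follows. The two input values are the ones quoted from Figure 2 in the reply; the variable names are illustrative, and the result is a first-order approximation only.

```python
import math

# Values quoted in the reply above (from Figure 2)
delta = 2.813        # intercept of the log-log regression, delta = ln(alpha)
se_delta = 0.206     # standard error of delta

# Delta method with g(x) = exp(x), so g'(x) = exp(x)
alpha = math.exp(delta)
se_alpha = math.exp(delta) * se_delta
print(alpha, se_alpha)
```

With these inputs, α comes out at roughly 16.66 and its approximate standard error at roughly 3.43.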

  7. Adela says:

    Hello Charles,

    Can you please help me with my equation y=a*(b^x)*u.

    • Charles says:

      Adela,
      First take the log of both sides of the equation to get log y = log a + x log b + log u. If I let y’ = log y, a’ = log a, b’ = log b and u’ = log u, I get the equation
      y’ = a’ + b’x + u’
      Assuming u is another independent variable, then this can be analyzed using multiple linear regression. If instead u is a constant, then let c = loga + logu, to get the simple linear regression model y’ = b’x + c.
      Charles

  8. rene.s says:

    Charles,
    Sorry for my English; I will try to explain.
    The model on which I am working has more or less the shape of the upper part of an aircraft wing.
    I used your idea to find the curve from front to back. The other axis in the model is of the type y=ax+b. These are the prominent dimensions.

    The problem I experienced with Excel was that I could not bend the surface into an appropriate curve in one dimension, since it is all linear, like a flat sheet of metal which you can manipulate.

    The result with ln(x) is that the model now has a curve, uses fewer variables, and predicts better.

    • Charles says:

      Rene,
      Yes, that is the idea behind using non-linear regression models such as y = b*ln(x) + a. The good news is that if you set z = ln(x) you have a linear model of form y = bz + a and so can use linear regression. You will get a slightly better model if you use a non-linear model, but the linear model usually works pretty well.
      Charles

  9. rene.s says:

    Charles,

    Thank you very much, smart solution.

    This is also my solution to the problem that Excel Multiple Linear Regression gives a flat plate. Whereas there is a variable in the collection which has a power function.

    • Charles says:

      Rene,
      Sorry, but I don’t understand your question.
      Charles

    • Charles says:

      Rene,
      Sorry, but I don’t know what a “flat plate” means. I also don’t understand your second sentence. Do you mean, where is the data analysis tool for power regression? You can use the Linear Regression and/or Exponential Regression data analysis tools.
      Charles

  10. Musa says:

    Hello Charles,
    Thank you for your insights here.I happen to have a question on the power law; however, it seems to combine a number of statistical aspects.

    I am looking to fit a line on the linear part of a log-log plot of a power law. Unfortunately with excel, the power trendline fitted automatically takes into account the entire data set. I need to ignore the outlying first part. I have tried to look for methods to solve this and somewhere I found a suggestion that to bin my data. Other suggestions were to use maximum likelihood estimation or weighted least squares.
    I did try to use Linear regression but it did not help. The biggest problem is where to choose to begin the regression from; what point in the data set?

    Do you have any tricks up your sleeve as regards this?

    • Charles says:

      Musa,
      Can’t you just restrict your analysis to those points that are on the subset of the curve that you are interested in?
      Charles

      • Jamil says:

        the power of developed equation is attained when the predicted value are within the range of input data

  11. Yuna says:

    hi Charles,

    Firstly, sorry if my question is not related here. I know one of my IV have no relationship with the DV(corr= 0.07). But I still wanted to put in the equations even though the result of the parameter variable is not significant after regression. The adjusted R square is 0.76 and the whole equation can be trusted. (<0.05). What can I do with the no correlation variables that I want it? Can I transform the particular data? Thank you in advance.

    • Charles says:

      Yuna,
      If you want to retain some independent variable in the model for theoretical reasons (based on your domain knowledge), then just keep it in the model and don’t worry about the fact that it is not significant. If you instead want to use some transformation that yields a significant regression coefficient, then make that transformation (I would do this based on some theoretical, not statistical, basis).
      Charles

      • Yuna says:

        Pheww thank you Charles. However, can we make transformation to the variables if its already no relationship with the DV? Ive tried some method on transformation but only slight changes. Still far from significant. Thank you again Charles.

        • Charles says:

          Yuna,

          Here is an example where a transformation can make a big difference:

          x y
          1 -0.002004008
          2 0.001908397
          3 1.70797E-05
          4 9.54129E-07
          5 1.02405E-07
          6 1.65383E-08
          7 3.54014E-09

          The correlation coefficient is .14876. If you use the transformation y –> (1/y + 500)^.1 then the correlation coefficient will be 1.

          I don’t know how useful this is, but at least it shows that a transformation can make a difference in the correlation coefficient.

          Charles

          • Yuna says:

            thank you so much Charles. Wish you are given longevity of health so you can always be here helping us.

  12. Matija says:

    In model: ln y = β ln x + α

    β is short term elasticity.

    How to calculate long term elasticity? I think it is connected with:
    ln y = β ln x + β₁ ln yₜ₋₁ + α

    • Charles says:

      Matija,
      I think you are asking me a question about economics, not statistics. It looks like you are looking for a time series model of long term elasticity. The website explains how to model time series and create forecasts based on the resulting model. This part of the website is under construction, but there is already a lot of useful information in the site about this topic.
      Charles

  13. Pingback: How many tickets will be sold before Wednesday? …and other burning Powerball questions | The Final Wager

  14. Kevin says:

    Hi,

    Near the end of the page, you explained how to get an X, if you know the Y. You did it like this: =EXP(TREND(LN(B6:B16),LN(A6:A16),LN(26))).

    Is there any way to find Y, when you know the X?

    Thanks in advance,

    Kevin

  15. Jason says:

    Is it possible to transform a model that has both a power and a linear variable?

    My formula is y=a*x^b+z*d, where a*x^b covers what can be considered fixed tasks with improvement over months of time (x) and z*d covers variable support tasks that will scale with the effort z in hours of the people being supported.

    I’ve currently set it up using an addition column for y-hat and used solver to estimate a, b, and d by maximizing the r2. I’m rather pleased with the result, however I’m wondering if there’s a way to transform this for use with linest. Also, being that I’m not nor should I ever be considered a mathematician I wonder if there’s anything I’m missing that would cause my results to be in error.

    Please note that I also performed multivariable linear and transformed power regressions using linest. The results between my model and the two variable linear model are somewhat close, I just have a conceptual issue with the linear model since it estimates the fixed tasks as being negative if you go far enough in the future. I appreciate any help you can provide.

    Thanks,

    • Charles says:

      Jason,
      Sorry, but I don’t know any way to use a transformation so that linest can be used.
      Charles

    • Goetz says:

      Jason,
      I may have that same question too, i.e. one predictor variable (x) that has a power relationship with response y, and another predictor (d) that has a linear relationship with y, which I want both together run in same (linear) model.
      Probably you can simply run such (linear) model by linearizing (log-transform) all but the d predictor variable:
      ln y = ln a + b * ln x + z*d
      But, please, anybody confirm that, or correct me if I am wrong.

      • Charles says:

        Jason,
        This model looks correct to me. You can address it as a linear model or a non-linear model (e.g. using Solver).
        Charles

  16. Anna says:

    Hi Charles,

    I just wanted some clarification on why do we use a linear trend-line for the log-log transformed data? If we used a power trend-line, would it be less accurate?

    Thanks for your help,
    Anna

    • Charles says:

      Anna,
      The idea of the log-log transformation is to get a linear relationship. For this reason, after the transformation you check for a linear trend. For the data before the transformation, you won’t see a linear relationship and so you would not use a linear trendline.
      Charles

  17. hamed says:

    Hi,
    Thanks for your answer. But I think the same error has also been done in the following page. I’m referring to figure 2.
    http://www.real-statistics.com/regression/exponential-regression/

  18. hamed says:

    Hi,
    In figure 2 the coefficient for Ln x is .23 and the coefficient for intercept is 2.81 but in your equation it has been shown otherwise (Ln y = .23+2.81 Ln x).
    What is going on?

    • Charles says:

      Hi Hamed,
      Thanks for finding the error. My dyslexia has caught up with me again. I inadvertently exchanged the two parameters. I have now corrected the webpage. Thanks again for catching the error.
      Charles
