Multiple Regression with Logarithmic Transformations

In Exponential Regression and Power Regression we reviewed four types of log transformation for regression models with one independent variable. We now briefly examine the multiple regression counterparts to these four types of log transformations:

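Level-level: $y = \beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k$

Log-level: $\ln y = \beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k$

Level-log: $y = \beta_0 + \beta_1 \ln x_1 + \cdots + \beta_k \ln x_k$

Log-log: $\ln y = \beta_0 + \beta_1 \ln x_1 + \cdots + \beta_k \ln x_k$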

Level-level regression is the normal multiple regression we have studied in Least Squares for Multiple Regression and Multiple Regression Analysis. Log-level regression is the multivariate counterpart to exponential regression examined in Exponential Regression. Namely, taking the exponential of each side of the log-level equation gives the equivalent form

$y = e^{\beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k}$

Similarly, the log-log regression model is the multivariate counterpart to the power regression model examined in Power Regression. We see this by taking the exponential of both sides of the log-log equation and simplifying to get

$y = e^{\beta_0} x_1^{\beta_1} x_2^{\beta_2} \cdots x_k^{\beta_k}$

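Since any positive constant c can be expressed as $e^{\ln c}$, we can re-express the power form of the log-log model as

$y = \alpha x_1^{\beta_1} x_2^{\beta_2} \cdots x_k^{\beta_k}$

(where clearly the coefficient $\alpha = e^{\beta_0}$ is not the same as $\beta_0$, and where we allow negative values for $\alpha$ as well).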

We now give an example of where the log-level regression model is a good fit for some data.

Example 1:  Repeat Example 1 of Least Squares for Multiple Regression using the data on the left side of Figure 1.


Figure 1 – Log-level transformation

The right side of the figure shows the log transformation of the price: e.g. cell G6 contains the formula =LN(C6). We next run regression data analysis on the log transformed data. We could use the Excel Regression tool, although here we use the Real Statistics Linear Regression data analysis tool (as described in Multiple Regression Analysis) on the X input in range E5:F16 and Y input in range G5:G16. The output is shown in Figure 2.


Figure 2 – Regression on log-level transformed data

The high value for R-Square shows that the log-level transformed data is a good fit for the linear regression model. Since zero is not in the 95% confidence intervals for Color or Quality, the corresponding coefficients are significantly different from zero.
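The same kind of fit can be sketched outside Excel. The Python example below is a minimal illustration under stated assumptions: the data are made up (they are not the diamond data of Figure 1) and are generated from known coefficients so that the result can be checked, and `ols` is a hypothetical helper written for this sketch that solves the normal equations directly. It does not reproduce the Real Statistics tool or its full output.

```python
import math

def ols(X, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y.

    X is a list of predictor rows without an intercept column.
    Returns the coefficients [b0, b1, ..., bk], intercept first.
    """
    rows = [[1.0] + list(r) for r in X]          # prepend the intercept column
    p = len(rows[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(r[i] * r[j] for r in rows) for j in range(p)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))]
         for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting
    for c in range(p):
        piv = max(range(c, p), key=lambda r: abs(A[r][c]))
        A[c], A[piv] = A[piv], A[c]
        for r in range(p):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][p] / A[i][i] for i in range(p)]

# Hypothetical (color, quality) pairs, NOT the values from Figure 1
X = [(1, 2), (2, 1), (3, 5), (4, 3), (5, 7), (6, 4), (7, 6), (8, 8)]
# Prices generated from a known log-level model: ln(price) = 5 + 0.3*color + 0.2*quality
price = [math.exp(5 + 0.3 * c + 0.2 * q) for c, q in X]

# Log-level regression: regress ln(price) on the untransformed predictors
ln_price = [math.log(p) for p in price]
b0, b1, b2 = ols(X, ln_price)
print(b0, b1, b2)
```

Because the made-up prices follow the log-level model exactly, the fitted coefficients should match the generating values up to rounding error; real data would of course leave residuals.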

We could also use the array formula =LOGEST(C6:C16,A6:B16,TRUE,TRUE) to obtain the following output (the labels have been manually added):


Figure 3 – Use of LOGEST function

Note that the slope/intercept values in row 7 of Figure 3 are the exponential of the linear coefficients calculated in Figure 2: e.g. the value of cell R7 is equal to EXP(J23) and the value of cell T7 is equal to EXP(J21).

We can also use the regression model to predict the price of a given diamond. For example, suppose we want to predict the price of one diamond with Color = 4 and Quality = 5 and of another with Color = 7 and Quality = 7. The following three approaches show how to predict the Price based on the regression model:


Figure 4 – Forecasting using the log-level model
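Two equivalent ways of computing a log-level forecast can be sketched in Python as follows. This is an illustration only: the coefficient values are hypothetical (they are not those of Figure 2), and the function names are introduced here for the sketch. The first approach exponentiates the fitted value of ln(price); the second uses the multiplicative coefficients, which are the exponentials of the linear ones and correspond to the form y = b·m1^x1·m2^x2 reported by LOGEST.

```python
import math

# Hypothetical linear coefficients from a log-level fit (not those of Figure 2)
b0, b1, b2 = 5.0, 0.3, 0.2   # intercept, Color, Quality

def predict_linear(color, quality):
    """Approach 1: exponentiate the fitted value of ln(price)."""
    return math.exp(b0 + b1 * color + b2 * quality)

# Multiplicative coefficients: the exponentials of the linear coefficients
m0, m1, m2 = math.exp(b0), math.exp(b1), math.exp(b2)

def predict_multiplicative(color, quality):
    """Approach 2: price = m0 * m1^color * m2^quality."""
    return m0 * m1 ** color * m2 ** quality

for color, quality in [(4, 5), (7, 7)]:
    p1 = predict_linear(color, quality)
    p2 = predict_multiplicative(color, quality)
    print(color, quality, round(p1, 2), round(p2, 2))
```

Both functions should print the same forecast for each diamond, up to floating-point rounding.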

Example 2:  Repeat Example 1 using the data on the left side of Figure 5.


Figure 5 – Log-log transformation

The right side of the figure shows the log transformation of the color, quality and price. We next run the regression data analysis tool on the log transformed data, i.e. with range E5:F16 as Input X and range G5:G16 as Input Y. The output is shown in Figure 6.


Figure 6 – Regression on log-log transformed data

As in the previous example, we see from Figure 6 that the model is a good fit for the data. We can also use the regression model for forecasting. Note that there are no LOGEST or GROWTH functions for the log-log transformed models, but we still have the following two approaches for forecasting:


Figure 7 – Forecasting using the log-log model
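A log-log forecast can be sketched the same way. The example below assumes hypothetical coefficients (not those of Figure 6) and hypothetical function names. It computes the prediction once by exponentiating the fitted value of ln(price) with logged predictors, and once from the equivalent power form.

```python
import math

# Hypothetical coefficients from a log-log fit (not those of Figure 6)
b0, b1, b2 = 4.0, 0.8, 1.5   # intercept, ln(Color), ln(Quality)

def predict_via_logs(color, quality):
    """Exponentiate the fitted value of ln(price), using logged predictors."""
    return math.exp(b0 + b1 * math.log(color) + b2 * math.log(quality))

def predict_power_form(color, quality):
    """Equivalent power form: e^b0 * color^b1 * quality^b2."""
    return math.exp(b0) * color ** b1 * quality ** b2

for color, quality in [(4, 5), (7, 7)]:
    print(color, quality,
          round(predict_via_logs(color, quality), 2),
          round(predict_power_form(color, quality), 2))
```

Both forms require strictly positive predictor values, since the logarithm is undefined at zero and for negative numbers.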

33 Responses to Multiple Regression with Logarithmic Transformations

  1. ALI ALI says:

    For the log transformation of time series data in Excel, which function should we use: LN or LOG?

    • Charles says:

      LN(x) is the natural log of x and LOG(x,b) is the log of x base b. Note that LN(x) = LOG(x,EXP(1)).
      Generally the natural log is used, although you could really use log to any base.
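A quick numerical check of this identity, sketched in Python rather than Excel (the value of `x` is arbitrary, and the Python functions are stand-ins for the Excel ones named in the comments):

```python
import math

x = 250.0                        # arbitrary positive value
natural = math.log(x)            # counterpart of Excel's LN(x)
base_e = math.log(x, math.e)     # counterpart of LOG(x, EXP(1))
base_10 = math.log10(x)          # counterpart of LOG(x, 10)

# Changing the base only rescales the log by a constant factor,
# so the ratio below is ln(10) regardless of x (for x > 0, x != 1)
ratio = natural / base_10
print(natural, base_e, ratio)
```

Since a change of base multiplies every logged value by the same constant, it rescales the regression coefficients attached to logged variables but does not change the quality of the fit.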

  2. chanda says:

    Is it possible to apply logs to the regressand but not to all of the regressors? Some of my regressors are negative, and the log of a negative number is undefined. For example, log(exchange_rate) = B + log(oilprices) + interest_rates.
    Is the model above correct or not?

    • Charles says:

      Yes, you can do this.

      • Martha Liliana Rodriguez says:

        Hello, Charles.

        Adding to the previous question: in a log-log regression where the logarithm is applied to only two of the independent variables, how should the results be read?

        I mean, are the coefficients of the logged variables in percentages and the coefficients of the unlogged variables in monetary units?

  3. Ivan says:

    Charles, thank you so much for your knowledge sharing!
    I have a question. I have gone over this article and tried to come up with a level-log equation like the ones you gave for log-level, $T$7*$S$7^W14*$R$7^X14 (Fig. 4), and for log-log, EXP($J$51)*EXP($J$52)^LN(W38)*EXP($J$53)^LN(X38) (Fig. 7).
    Any suggestions would be highly appreciated.

  4. Wondering says:

    Thanks for this information.

    Wondering about your cell references in Figure 7. Is it possible that the references to cells J56, J57, J58 in Figure 7 should actually refer to the coefficients in cells J51, J52, J53 in Figure 6?

    • Charles says:

      Yes, you are correct. Actually, these formulas refer to the exponential of the values in cells J51, J52, J53 of Figure 6.
      Thanks very much for catching this mistake. I have now updated the referenced figure to reflect the change.

  5. ruchi says:

    Sir, what if the data is still not normal even after taking the log? How can I make the data normal? I am having a hard time, so please let me know. Can I take the log of an already logged series? Is that OK for making the data normal?

  6. Shampa says:

    I am using two-stage least squares and seemingly unrelated regression, where I have 12 independent variables. I am planning to use log values for the dependent variable and for only two of the 12 independent variables. It is not fully like log-log regression. Would you please tell me whether I can do this and, if so, what this type of model is called?
    I would appreciate any help on this.

    • Charles says:

      I don’t yet support 2 stage least squares, and so I don’t have any advice about this topic at this time.

  7. Gayathri says:

    Hello Charles
    If there is a zero value in the independent variable, how can we go ahead with the log transformation in the log-log model?


    • Charles says:

      Use log(x+a) instead of log(x) where a is a constant big enough so that x+a is always positive (for the values of x that you are considering).
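A minimal Python sketch of this shifted log, assuming made-up values for `x` and an assumed shift `a = 1` (any constant that keeps every `x + a` positive would serve):

```python
import math

x = [0, 3, 10, 25, 0, 7]   # hypothetical values that include zeros
a = 1                      # assumed shift; requires min(x) + a > 0

# Guard against a shift that is too small for the data
if min(x) + a <= 0:
    raise ValueError("shift a is too small: x + a must be positive")

log_x = [math.log(v + a) for v in x]
print(log_x)
```

Note that the choice of `a` affects the fitted coefficients, so forecasts must apply the same shift before taking the log.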

  8. Angela says:

    Can I know whether log-log regressions make more statistical sense or business sense?

    • Charles says:

      I don’t really know how to answer this question. They have practical application and are an interesting subject in statistics.

  9. Gabs says:

    Hello Charles,

    I am running two OLS models. Model 1 is a linear model, and in model 2 I log my outcome variable y.

    When I ran the regression using model 1 it shows that my explanatory variable x has a positive and significant effect on y. Then when I ran the model using the log form (i.e., ln(y)) my explanatory variable becomes negative and insignificant.

    I am having a hard time understanding why, in model 2, my x variable becomes negative and insignificant. I am not sure what the correct interpretation is here. Any suggestions?

    Thank you!

    • Charles says:

      It is difficult for me to answer your question without seeing the data. Perhaps the assumptions for an OLS model are not being met with one of these scenarios. If you send me an Excel file with the data, I will try to see why this is occurring.

  10. loraine says:

    Hi Sir,

    Could I ask for a bit of help? What kind of fitting should I use if I have a log-log plot of two independent variables (x and y have been measured with error)?

    Thanks so much

  11. Christi says:

    This post helped me work through issues I was having with a log regression. Love your site, your posts and examples are detailed and easy to implement. Thank you!

  12. jyotsna says:

    Hello Sir, I am implementing a log transformation on an OLS regression, i.e. a log transformation on multiple regression. But among the three types of log transformation, namely log-level, level-log and log-log, which transformation should I go with? Is log-level similar to the Box-Cox transformation?

    • Charles says:

      There are many types of transformations in addition to the ones you have referenced. The specific transformation depends on your data. Usually you are picking the transformation that achieves some objective (e.g. making the data more linear or making the data better fit the normal distribution).

      The log-linear (i.e. log-level) transformation is one of the transformations in the Box-Cox family of transformations.


  13. Adeeb says:


    If I only had one independent variable I could do a scatter plot against the dependent variable to visually determine whether the relationship is linear and, if not, whether a transformation (log, ln, 1/x, etc.) is appropriate. But when I have multiple independent variables (say 3 or 4) in a multiple regression, what's the best way to test for linearity, and what if some are linear and others are curved (e.g. exponential)? Thanks

    • Charles says:

      I tend to simply perform the multiple regression analysis and see if I have a good fit (based on the value of R-square and the significance of the regression coefficients). You can compare different transformations in this way as well.

  14. Daniel says:

    If I am doing a multivariate regression, but on the right-hand side of the equation some of my independent variables have values in the thousands, which are much higher than the others with absolute values in the range of, say, 0 to 100, should I use the log for all of them or just for the ones with the high values so they can be put on the same level? Also, if some of my variables are in percentages, is it OK if I still apply the log to them?

    Thank you very much!

    • Charles says:

      You can use log for these, but you also might be better off not doing so. The important thing is not that absolute values be on the same scale, but that the assumptions for multiple regression be satisfied (linearity, normality, homogeneity of variances). If using the log contributes to this then using the log can be a good idea, otherwise it is better not to use the log. You can use log for some variables but not others.

  15. Anil says:

    I am trying to build a forecasting model using multiple regression, can you have a look at it and tell me if I am doing it right?.

    I would appreciate any help on this.



    • Charles says:

      I don’t generally do this sort of thing since it can be very time-consuming. If you send me your model I will take a quick look, but I won’t be able to try to decipher things.
