Log-linear Regression

In Linear Regression Models for Comparing Means and ANOVA using Regression we studied regression where some of the independent variables were categorical. In this section we look at log-linear regression, in which all the variables are categorical. In fact, log-linear regression provides a new way of modeling chi-square goodness-of-fit and independence problems (see Independence Testing and Dichotomous Variables and Chi-square Test for Independence).

The model we will use is

$$\ln y_i = \beta_0 + \beta_1 x_{i1} + \cdots + \beta_k x_{ik}$$

where all the xij are dummy variables coded to represent categorical variables and the yi express the frequency of outcomes. We also consider more complicated models that contain interaction terms between these variables, as described in the sections listed below.
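To make the coding concrete, here is a minimal Python sketch of how the cells of a small contingency table could be laid out as rows of dummy variables together with their frequencies. The table, the category names and the helper `dummy_rows` are hypothetical and chosen only for illustration. The sketch assumes that the first level of each variable is the reference category, which is one common convention and not necessarily the coding used in the sections that follow.

```python
# Hypothetical 2 x 3 table of observed frequencies (illustrative data only).
row_levels = ["male", "female"]
col_levels = ["low", "medium", "high"]
counts = [[20, 30, 10],
          [25, 15, 20]]

def dummy_rows(row_levels, col_levels, counts):
    """Return one (x_values, y) pair per cell, using 0/1 reference coding.

    The first level of each variable is the reference category, so a
    variable with m levels contributes m - 1 dummy variables.
    """
    rows = []
    for i in range(len(row_levels)):
        for j in range(len(col_levels)):
            # Dummies for the row variable (levels 1 .. m-1).
            x_row = [1 if i == a else 0 for a in range(1, len(row_levels))]
            # Dummies for the column variable (levels 1 .. m-1).
            x_col = [1 if j == b else 0 for b in range(1, len(col_levels))]
            rows.append((x_row + x_col, counts[i][j]))
    return rows

design = dummy_rows(row_levels, col_levels, counts)
for x, y in design:
    print(x, y)
```

Each printed row holds the dummy values for one cell of the table followed by that cell's frequency, which plays the role of yi.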

We will consider the cases where k = 2 or 3. The case where k = 2 corresponds to the two-way contingency tables studied in Independence Testing and re-examined in Two-way Contingency Tables. The case where k = 3 corresponds to three-way contingency tables, which are examined in Three-way Contingency Tables.
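For a two-way table, the log-linear model without an interaction term is the independence model: the log of each expected frequency is the sum of a row effect and a column effect. In that special case the fitted frequencies have a closed form, so the connection to the chi-square test of independence can be sketched in a few lines of Python. The data and the function names `independence_fit` and `fit_statistics` are hypothetical, and this is only an illustration under those assumptions, not the procedure used in the referenced sections.

```python
import math

# Hypothetical 2 x 3 table of observed frequencies (illustrative data only).
observed = [[20, 30, 10],
            [25, 15, 20]]

def independence_fit(observed):
    """Fitted frequencies under the independence log-linear model.

    For this model the maximum likelihood estimates are
    (row total * column total) / grand total, so no iteration is needed.
    """
    row_tot = [sum(r) for r in observed]
    col_tot = [sum(c) for c in zip(*observed)]
    n = sum(row_tot)
    return [[r * c / n for c in col_tot] for r in row_tot]

def fit_statistics(observed, expected):
    """Return (Pearson chi-square, likelihood-ratio G^2) for the fit."""
    pearson = 0.0
    g2 = 0.0
    for obs_row, exp_row in zip(observed, expected):
        for o, e in zip(obs_row, exp_row):
            pearson += (o - e) ** 2 / e
            if o > 0:  # a zero cell contributes nothing to G^2
                g2 += 2 * o * math.log(o / e)
    return pearson, g2

expected = independence_fit(observed)
# On the log scale the fitted values are additive in row and column effects.
log_expected = [[math.log(e) for e in row] for row in expected]
pearson, g2 = fit_statistics(observed, expected)
df = (len(observed) - 1) * (len(observed[0]) - 1)
print("expected:", expected)
print("Pearson chi-square:", pearson, "G^2:", g2, "df:", df)
```

Both statistics are referred to a chi-square distribution with `df` degrees of freedom. A large value suggests that the independence model fits poorly and that an interaction term is needed.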


11 Responses to Log-linear Regression

  1. mamuye says:

    My name is Mamuye, from Ethiopia. I need to evaluate the community’s satisfaction with their irrigation service using a logit model with five independent variables. I understand what I am going to do and how, and I have seen in the literature that the model parameters need to be estimated, but one gap remains: how are these parameters (the constant and the coefficients) estimated?

  2. Samuel says:

    Dear Sir,
    I have learnt a lot from your literature on, and examples of, using logistic regression. I am analyzing a community’s satisfaction with the performance of a project (dependent variable), with 5 independent variables that are categorical; in fact, they are rated on a scale of 1 to 4 (1 = highly unsatisfactory; 4 = highly satisfactory). Plus, I have 3 dummy variables to throw into the model. How should I model this using logistic regression? I have read your notes on log-linear regression, but I still can’t figure out how to apply it.

    Looking forward to your reply.

    • Charles says:


      I understand that you are using a 1 to 4 rating for “community’s satisfaction on the performance of a project (dependent variable).” If so then you probably want to use an ordinal logistic regression model.

      You can’t use the binary logistic regression model since you have 4 (and not 2) values for the dependent variable. You could use a multinomial model, but this wouldn’t take the order of the ratings into account.

      You can’t use a log-linear regression model since the dependent variable doesn’t take continuous values. Think of the log-linear regression model as an extension of chi-square independence testing.


  3. Quasia says:


    I’m currently taking a biostatistics course online and, of course, it’s very difficult. What makes it difficult is that data is missing for some variables. I also have to figure out which statistical analysis is best suited. I know I can either run the calculation with the missing data or calculate the mean and use that in place of the missing data; I’m just not sure which route to go. I have to select the appropriate statistical test that allows me to conduct an analysis of the factors affecting hospital costs. Can I send the workbook for better clarification?

    • Charles says:

      Please look at the following webpage regarding handling missing data:
      Handling Missing Data

      You can send me an Excel file with your data (see Contact Us for my email address). I only have limited time to look at such files, but I’ll see whether I can give you some suggestions.


  4. Good afternoon, Doctor. Excuse the bother: which model would you suggest when the independent variables consist of one continuous variable and several dummy variables, and the dependent variable is continuous?

  5. Ramon says:

    I was told to use a log-linear relationship to solve a multiple regression problem involving 10 different y values ranging from 10 to 30 million and 10 different x values, where x1 ranges from 1 to 2, x2 ranges from 30 to 71 and x3 ranges from 444 to 687. How do I do this with Excel?
