Multinomial and Ordinal Logistic Regression

In this section we extend the concepts from Logistic Regression, where we described how to build and use binary logistic regression models, to cases where the dependent variable can take more than two outcomes. With such models, the value of the categorical dependent variable can be predicted from the values of the independent variables.

We first address the categorical case where there is no order to these outcomes (multinomial logistic regression). We then turn our attention to the situation where the outcomes are ordered (ordinal logistic regression).
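To make the distinction concrete, here is a minimal Python sketch of how the two kinds of model turn a set of coefficients into outcome probabilities. It assumes a reference-category parameterization for the multinomial model and a proportional-odds (cumulative logit) form for the ordinal model; the function names, sign conventions, and coefficient values are illustrative assumptions, not the conventions of any particular software package.

```python
import math

def multinomial_probs(x, coefs):
    """Outcome probabilities for a multinomial logit model.

    coefs holds one (intercept, slopes) pair per non-reference outcome.
    Outcome 0 is the reference category, whose linear predictor is 0.
    """
    scores = [0.0] + [b0 + sum(b * xi for b, xi in zip(bs, x)) for b0, bs in coefs]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def ordinal_probs(x, cutpoints, slopes):
    """Outcome probabilities for a proportional-odds model.

    Assumes P(y <= j) = logistic(cutpoints[j] - slopes . x), with
    increasing cutpoints and a single set of slopes for all outcomes.
    """
    eta = sum(b * xi for b, xi in zip(slopes, x))
    cum = [1.0 / (1.0 + math.exp(-(c - eta))) for c in cutpoints] + [1.0]
    return [cum[0]] + [cum[j] - cum[j - 1] for j in range(1, len(cum))]

# Hypothetical coefficients for a three-outcome problem with two predictors.
x = [1.0, 2.0]
print(multinomial_probs(x, [(0.5, [0.1, -0.2]), (-0.3, [0.4, 0.0])]))
print(ordinal_probs(x, [-0.5, 1.0], [0.3, 0.2]))
```

The multinomial model needs a separate set of coefficients for each non-reference outcome, while the ordinal sketch shares one set of slopes and only shifts the cutpoints, which is what exploiting the order buys.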

35 Responses to Multinomial and Ordinal Logistic Regression

  1. Mike says:

    Dear Professor Zaiontz,
    I encountered a major problem with the Binary Logistic Regression. Context: I have a file with 312 rows and 35 columns (all binary data) that represent certain business conditions and an outcome – employee engagement (engaged vs. not engaged). BUT it seems that I can’t run a proper BLR on that data now. (I was able to do so 2 weeks ago, using Newton’s method.)

    There are 2 specific issues: while using Newton’s method I get a diagonal line for the ROC curve and p-Pred at 0.5 for all observations, also Coeff = 0 for all observations. Second issue: after switching to Solver I get various p-Pred and Coeff’s but the Covariance Matrix returns a “#NUM!” error for which there seems to be no explanation. As you can imagine this stops the whole analysis half-way through. I checked the data formats and tried numbers, general, others – no change. I also changed the representation for decimal places – both commas and dots yield no improvement.

    I am writing this inquiry since I think the problem could be an issue with Excel / Real Stats versions. I am using Excel at Version 1702 (Build 7870.2020) and Real Stats at 4.13 Excel 2010/2013/2016.

    Also, a short additional question – what is your opinion on interpreting BLR ratios in such a case? The way I was working 2 weeks ago was re-running the analysis with 12 and then 8 variables to get a significant model for estimating engagement based on a limited number of variables, which adds the ability to derive a company-wide improvement strategy in 8 key areas instead of 35. I am focusing only on statistically significant ratios, but I wonder what impact having so many variables has on the validity of the results. On the other hand, all those variables are there (with dozens of others); the only thing that I can change is the number of variables I collect and do math with.

    Thanks for all the great knowledge here and have a nice day,
    M.

    • Charles says:

      Mike,
      I have used binary logistic regression in the past few days on Excel 2013 and had no problems.
      If you send me an Excel file with your data and analysis I can check to see whether something I changed in the latest logistic regression release is causing the problem that you are seeing. You can find my email address at Contact us.
      Re BLR ratios, which ratios are you referring to?
      Charles

      • Mike says:

        Re BLR ratios, I was referring to the odds ratios (exp(b)). I would like to offer some deeper understanding to my presentation’s audience.
        (But to give one you got to have one. 🙂 )
        So, I am wondering how I can relate this in more understandable terms. One way to go is to “translate” odds ratios to probabilities, BUT this helps only slightly. What I am actually after is a way to show the cumulative impact of manipulating several variables as a sum. I am operating with binary variables all the way, so something is either done or not. How can I show what the outcome will be if we change some specific 8 variables? Is showing the difference (increase) in the p-Pred a way?
        My earlier question still stands – with so many variables, and only 312 observations – how seriously should I take the odds ratios? Is the p-value enough to actually infer a relationship?

        Best,
        M.
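One way to picture the cumulative effect asked about here: in a binary logistic model the coefficients add on the log-odds scale, so the combined odds ratio for switching several binary variables from 0 to 1 is the product of their individual odds ratios, and the change in predicted probability is obtained by evaluating the model before and after. The sketch below assumes a model with no interaction terms; the intercept and coefficients are made-up numbers for illustration, not estimates from the data discussed in this thread.

```python
import math

def predicted_prob(intercept, coefs, x):
    """Predicted probability from a binary logistic regression model."""
    z = intercept + sum(b * xi for b, xi in zip(coefs, x))
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical fitted values for three binary predictors.
intercept = -1.0
coefs = [0.4, 0.7, 0.3]

before = [0, 0, 0]  # none of the three conditions in place
after = [1, 1, 1]   # all three conditions in place

p_before = predicted_prob(intercept, coefs, before)
p_after = predicted_prob(intercept, coefs, after)

# Coefficients add on the log-odds scale, so odds ratios multiply.
combined_odds_ratio = math.exp(
    sum(b * (a - c) for b, a, c in zip(coefs, after, before))
)

print(f"p-Pred before: {p_before:.3f}, after: {p_after:.3f}")
print(f"combined odds ratio: {combined_odds_ratio:.3f}")
```

The difference in predicted probability depends on the starting values of all the other variables, whereas the combined odds ratio does not, so it can help to report both for a few representative starting profiles.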

        • Charles says:

          Mike,
          In looking at your data, I see that var20 and var26 have identical values, and so the algorithm won’t converge due to collinearity. If you remove var26, everything works fine.
          Charles
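A quick check for the situation described in this reply (two predictor columns with identical values) can be run before fitting. The following is a minimal sketch with a hypothetical helper name and a tiny made-up data set; it only detects exact duplicates, not every form of collinearity, such as one column being the sum of two others.

```python
def find_identical_columns(rows, names):
    """Return (first, duplicate) name pairs for columns with identical values."""
    columns = list(zip(*rows))  # transpose rows into column tuples
    first_seen = {}
    duplicates = []
    for name, column in zip(names, columns):
        if column in first_seen:
            duplicates.append((first_seen[column], name))
        else:
            first_seen[column] = name
    return duplicates

# Made-up data: the first and third columns are identical.
names = ["var20", "var25", "var26"]
rows = [
    (1, 0, 1),
    (0, 1, 0),
    (1, 1, 1),
]
print(find_identical_columns(rows, names))
```

Dropping one column from each reported pair removes that source of singularity, which is the fix suggested above.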

  2. Kathleen Kerwin says:

    I am trying to use the binary logistic regression function. I added Solver and the Real Statistics add-in. When I select the logistic regression function, I get a runtime error 424. I repaired my Microsoft Office 2010 software and rebooted. Same errors.

    Every once in a while I get an error with solver. In any case I’m stuck.

    Any ideas?

    • Charles says:

      Kathleen,
      When you press Alt-TI, do you see both RealStats and Solver on the list of add-ins with check marks next to them? If not, you need to either add these add-ins or make sure that there are check marks next to them.
      When you first use Real Statistics, what do you see when you enter the =VER() formula?
      Charles

      • Kathleen Kerwin says:

        Charles: yes both addins are there and checked. The first time I placed the =ver() in the cell, 2007 showed up. Interesting because I am using 2010 Office. This time #NAME shows up.

        Kathleen

        • Charles says:

          Kathleen,
          The fact that 2007 showed up might mean that you have installed the wrong version of the software. I suggest that you reinstall the Real Statistics add-in. I plan to issue a new release in a couple of days.
          Charles

  3. Fatimah says:

    Dear Charles,

    If the model fitting is not significant, should I proceed?
    If yes, what does it mean for the model fitting to be not significant while the parameter estimates are significant?

    Model Fitting Information

    Model            -2 Log Likelihood   Chi-Square   df   Sig.
    Intercept Only   95.673
    Final            90.756              4.917        2    .086
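For readers wondering how the figures in this output relate to one another, the chi-square statistic of the likelihood ratio test is the difference between the two -2 log likelihood values, and Sig. is the upper-tail chi-square probability for that statistic. The sketch below reproduces the reported numbers using only the Python standard library. The helper is an illustrative assumption: it uses a closed form that is exact only for even degrees of freedom, which happens to cover df = 2 here.

```python
import math

def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability, exact for positive even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form only applies to positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total

# Values taken from the output quoted in the comment above.
neg2ll_intercept_only = 95.673
neg2ll_final = 90.756
df = 2

chi_square = neg2ll_intercept_only - neg2ll_final
p_value = chi2_sf_even_df(chi_square, df)

print(f"chi-square = {chi_square:.3f}, df = {df}, p = {p_value:.3f}")
```

This test compares the full model against the intercept-only model as a whole, whereas the significance of each parameter estimate is assessed one coefficient at a time, so the two can disagree.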

  4. Sam says:

    Dear Sir,
    Please help me, I’m a newbie about this problem.
    Well, I’m now completing a research study about the relationship between narcissism (IV) and cyberbullying (DV) among Instagram users. My independent variable has the levels low-mid-high (interval data), and my dependent variable is categorical, consisting of cyberbullying perpetrator, cyberbullying victim, and unidentified.
    Yesterday, I tried a multinomial logistic regression analysis in SPSS, and it gave me a warning:

    “There are 1 (11,1%) cells (i.e., dependent variable levels by subpopulations) with zero frequencies.
    Unexpected singularities in the Hessian matrix are encountered. This indicates that either some predictor variables should be excluded or some categories should be merged.
    The NOMREG procedure continues despite the above warning(s). Subsequent results shown are based on the last iteration. Validity of the model fit is uncertain.”

    What does the warning mean? I don’t understand it.
    And is multinomial logistic regression the right analysis to use in my research?

    Thank you, Sir
    Sam

    • Charles says:

      Sam,
      From your description, multinomial logistic regression analysis seems to be a good choice, except for the warning. You should pay attention to the warning “There are 1 (11,1%) cells (i.e., dependent variable levels by subpopulations) with zero frequencies.” You can ignore the rest of the warning.
      I don’t use SPSS and so I can’t comment further about the warning message, but I suspect that your sample is very small with not enough data to find a fit for the logistic regression model.
      Charles
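The zero-frequency warning discussed above can be inspected by cross-tabulating the predictor levels against the outcome levels and listing the combinations that never occur. Below is a minimal sketch using made-up records and a hypothetical helper name. With three predictor levels and three outcome levels there are nine cells, so one empty cell corresponds to the 11.1% quoted in the warning.

```python
from collections import Counter

def empty_cells(records, predictor_levels, outcome_levels):
    """Return the unobserved (predictor, outcome) cells and their share of all cells."""
    counts = Counter(records)
    cells = [(p, o) for p in predictor_levels for o in outcome_levels]
    empty = [cell for cell in cells if counts[cell] == 0]
    return empty, len(empty) / len(cells)

predictor_levels = ["low", "mid", "high"]
outcome_levels = ["perpetrator", "victim", "unidentified"]

# Made-up sample: every combination appears twice except (low, perpetrator).
records = [
    (p, o)
    for p in predictor_levels
    for o in outcome_levels
    if (p, o) != ("low", "perpetrator")
] * 2

empty, share = empty_cells(records, predictor_levels, outcome_levels)
print(empty, f"{share:.1%}")
```

If a cell is empty, collecting more data or merging sparse categories are the usual options, in line with the suggestion in the SPSS message itself.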

  5. Dennis says:

    Is it possible to use your resource pack for conditional logistic regression? Think of analyzing which horse will win a given horse race relative to the other horses. Thanks!

  6. Ashik says:

    Sir
    Please help me with this notification. I am very new to the Real Statistics package. While I am trying to perform multinomial logistic regression, it says “last column of input range must contain all the values 0,1,2,…,r and only these values where r=max value in the last column of input range (r must be <25)”. How can I solve this problem?

    • Charles says:

      Ashik,
      If you send me an Excel file with your data, I will explain what you need to do. You can find my email address at Contact Us.
      Charles

      • Amar says:

        Hi Charles,

        I am facing a similar problem. I am trying to fit a logistic regression model with which I can predict the attrition probability of an employee. I have other independent variables like tenure, performance, etc.

        I am a bit confused on how to use the tool.

        First, I was facing the same problem as Ashik. However, I moved the attrition column (0 – not attrited, 1 – attrited) to the end which removed the error.

        Now the output is not making sense to me. I think it would be helpful if you could include some steps or instructions on how to use the tool within the workbook itself.

        Thanks.

        BTW your website is a great resource.

        • Charles says:

          Amar,
          I’ll look into adding some additional information. In the meantime, if you send me an Excel file with your data, I will explain what you need to do. You can find my email address at Contact Us.
          Charles
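The error message quoted at the start of this thread requires the last column of the input range to contain every integer from 0 to r, and nothing else, with r below 25. As a rough illustration of what that means, the sketch below recodes arbitrary outcome labels to 0, 1, ..., r and checks a column against the rule. The function names and labels are hypothetical, and the check is only a reading of the quoted message, not the add-in's own validation code.

```python
def recode_outcome(labels):
    """Map distinct outcome labels to consecutive integer codes 0, 1, ..., r."""
    mapping = {label: code for code, label in enumerate(sorted(set(labels)))}
    return [mapping[label] for label in labels], mapping

def is_valid_outcome_column(values, max_r=24):
    """Check that values are exactly the integers 0..r with r <= max_r."""
    if not values or any(not isinstance(v, int) for v in values):
        return False
    r = max(values)
    return r <= max_r and set(values) == set(range(r + 1))

# Made-up outcome labels for illustration.
labels = ["stay", "leave", "stay", "transfer"]
codes, mapping = recode_outcome(labels)
print(codes, mapping)
print(is_valid_outcome_column(codes))      # every code from 0 to r is present
print(is_valid_outcome_column([0, 2, 2]))  # code 1 is missing
```

As noted in the reply above, the coded outcome also has to sit in the last column of the input range.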

  7. sarah kolshuk says:

    Dear Dr. Zaiontz,
    I am completing a research study looking to see if there is an association between rates of hypotension (yes/no) during surgery (primary outcome) and use of a certain blood pressure medication (given/held prior to surgery). I have multiple regressors / confounding variables that I am trying to account for. Some are binary in nature (0,1) and some are continuous (e.g. blood pressure readings). Someone had suggested I split my regression analysis: 1) do a multinomial analysis for comparing my independent variable and nominal data, 2) do a multivariate linear regression for comparing the independent variable with continuous regressors. What is your opinion on the above advice? What type of test do you feel would be most appropriate?
    Thanks,
    Sarah

    • Charles says:

      Sarah,
      These approaches could be useful, but I would need to have a more complete picture of the situation before I could definitively answer your question.
      Charles

  8. AGHA says:

    My independent variables are gender and academic achievement in terms of CGPA, while my DV is emotional intelligence (EI). What type of tests should I use to show that gender is related to EI and that academic achievement predicts EI?

  9. Hamad says:

    Dear Dr. Zaiontz,

    I am planning on using Conjoint Analysis to measure preference for new products. As you know, it uses a multinomial logit model. I have found specialized software packages to conduct such an analysis, but they are very expensive. Do you know if Conjoint Analysis could be performed using Excel, or are there other ways of doing it? (I have been told that I could find free code to do this in R, but I got lost when I saw it.) Any help is greatly appreciated.
    Sincerely,
    Hamad

  10. Hossein Jamaly says:

    Hi Prof. Zaiontz
    I would appreciate it if you could help me with multinomial logistic regression between my categorical phenotypic data (as dependent variables) and genotypic data (both binary and allelic states as independent variables).
    FYI, I am analysing data from a panel of 143 barley genotypes for association mapping in barley. I have used GLM and MLM models for my quantitative and ordinal phenotypic data in the TASSEL software (http://www.maizegenetics.net/index.php?option=com_content&task=view&id=89&Itemid=119).
    Regards,
    Hossein

  11. Iresha says:

    Dear sir,
    Can you tell me how to do the regression analysis when both the dependent and independent variables are categorical?
