Finding Logistic Regression Coefficients using Excel’s Solver

We now show how to find the coefficients for the logistic regression model using Excel’s Solver capability (see also Goal Seeking and Solver). We start with Example 1 from Basic Concepts of Logistic Regression.

Example 1 (Example 1 from Basic Concepts of Logistic Regression continued): From Definition 1 of Basic Concepts of Logistic Regression, the predicted values pi for the probability of survival for each interval i is given by the following formula where xi represents the number of rems for interval i.

image2174

The log-likelihood statistic as defined in Definition 5 of Basic Concepts of Logistic Regression is given by

Log-likelihood statistic

where yi is the observed value for survival in the ith interval (i.e. yi = the fraction of subjects in the ith interval that survived). Since we are aggregating the sample elements into intervals, we use the modified version of the formula, namely

Modified Log-likelihood statistic

where yi is the observed value of survival in the ith of r intervals and

image7012

We capture this information in the worksheet in Figure 1 (based on the data in Figure 2 of Basic Concepts of Logistic Regression).

Logistic regression initial Excel

Figure 1 – Calculation of LL based on initial guess of coefficients

Column I contains the rem values for each interval (copy of column A and E). Column J contains the observed probability of survival for each interval (copy of column F). Column K contains the values of each pi. E.g. cell K4 contains the formula =1/(1+EXP(-O5–O6*I4)) and initially has value 0.5 based on the initial guess of the coefficients a and b given in cells O5 and O6 (which we arbitrarily set to zero). Cell L14 contains the value of LL using the formula =SUM(L4:L13); where L4 contains the formula =(B4+C4)*(J4*LN(K4)+(1-J4)*LN(1-K4)), and similarly for the other cells in column L.

We now use Excel’s Solver tool by selecting Data > Analysis|Solver and filling in the dialog box that appears as described in Figure 2 (see Goal Seeking and Solver for more details).

Excel Solver logistic regression

Figure 2 – Excel Solver dialog box

Our objective is to maximize the value of LL (in cell L14) by changing the coefficients (in cells O5 and O6). It is important, however, to make sure that the Make Unconstrained Variables Non-Negative checkbox is not checked. When we click on the Solve button we get a message that Solver has successfully found a solution, i.e. it has found values for a and b which maximize LL.

We elect to keep the solution found and Solver automatically updates the worksheet from Figure 1 based on the values it found for a and b. The resulting worksheet is shown in Figure 3.

Logistic regression Solver

Figure 3 – Revised version of Figure 1 based on Solver’s solution

We see that a = 4.476711 and b = -0.00721. Thus the logistics regression model is given by the formula

image7020

For example, the predicted probability of survival when exposed to 380 rems of radiation is given by

image7021

Note that

image7022

Thus, the odds that a person exposed  to 180 rems survives is 15.5% greater than a person exposed to 200 rems.

Real Statistics Data Analysis Tool: The Real Statistics Resource Pack provides the Logistic Regression supplemental data analysis tool. This tool takes as input a range which lists the sample data followed the number of occurrences of success and failure. E.g. for Example 1 this is the data in range A3:C13 of Figure 1. For this problem there was only one independent variable (number of rems). If additional independent variables are used then the input will contain additional columns, one for each independent variable.

We show how to use this tool to create a spreadsheet similar to the one in Figure 3. First press Ctrl-m to bring up the menu of Real Statistics data analysis tools and choose the Regression option. This in turn will bring up another dialog box. Choose the Logistic Regression option and press the OK button. This brings up the dialog box shown in Figure 4.

Logistic regression dialog Solver

Figure 4 – Dialog Box for Logistic Regression data analysis tool

Now select A3:C13 as the Input Range (see Figure 5) and since this data is in summary form with column headings, select the Summary data option for the Input Format and check Headings included with data. Next select the Solver as the Analysis Type and keep the default Alpha and Classification Cutoff values of .05 and .5 respectively.

Finally press the OK button to obtain the output displayed in Figure 5.

Logistic regression Solver initial

 Figure 5 – Output from Logistic Regression tool

This tool takes as input a range which lists the sample data followed the number of occurrences of success and failure (this is considered to be the summary form). E.g. for Example 1 this is the data in range A3:C13 of Figure 1 (repeated in Figure 5 in the same cells). For this problem there was only one independent variable (number of rems). If additional independent variables are used then the input will contain additional columns, one for each independent variable.

Note that the coefficients (range Q7:Q8) are set initially to zero and  (cell M16) is calculated to be -526.792 (exactly as in Figure 1). The output from the Logistic Regression data analysis tool also contains many fields which will be explained later. As described in Figure 2, we can now use Excel’s Solver tool to find the logistic regression coefficient. The result is shown in Figure 6. We obtain the same values for the regression coefficients as we obtained previously in Figure 3, but also all the other cells are updated with the correct values as well.

Logistic regression Solver final

Figure 6 – Revised output from Solver

50 Responses to Finding Logistic Regression Coefficients using Excel’s Solver

  1. Bob Francis says:

    Hi Charles,

    I just downloaded real stats and placed it into excel. In the past, I have used solver to determine power ratings for NFL teams with the purpose of determining a true point spread and total for betting on sports. However, solver uses linear regression and while it does a good job, I believe that a logistic regression markov chain may be a more dynamic option. I know that major sports betting syndicates use logistic regression for these purposes but of course they will not reveal how they do it. What I want to accomplish is get close to that. Can this be done in excel with the real stats package? I have years of data and statistics and what I want to accomplish is to power rate these NFL teams, determine each teams home field advantage, and ultimately forecast a final score. Also, you must realize, I am not one of these MIT statisticians. I never went to school for statistics. But I do know sports statistics and how they are valued when it comes to betting on sports. How would I go about determining the above in excel using the realstats package. If you could help me I would be forever grateful. If this is a major project, I understand your time is valuable. Just need to be pointed in the right direction.

    • Charles says:

      Bob,
      I have not tried to implement this sort of approach myself using logistic regression, but it is pretty easy to use the Real Statistics Logistic Regression data analysis tool, and so I suggest that you start by just playing around with it to see how it works. I am available to answer questions.
      Charles

  2. Marc says:

    Thank you very much for your statistical tools and providing helpful hands-on examples. You are in a league of your own. Combining complex statistical knowledge with the creation of simple to use tools is no small feat!
    I am working on a problem for which I would require your guidance. I am trying to determine how performance rating is linked to gender and position level in our organization. I am assuming that I working with an ordinal model since the ratings range from 1 (did not meet, i.e., bad performance) to 5 (surpassed, i.e., excellent performance). Gender is male=0, female=1 and, level is 0 to 4. What regression approach should I use, binary or multinomial? The data looks like this:
    Gender Level 1 2 3 4 5 Total
    00 16 167 790 377 78 1,428
    0 1 5 69 366 220 61 721
    0 2 0 19 225 186 78 508
    0 3 0 3 57 47 24 131
    0 4 0 0 13 21 19 53
    1 0 3 109 762 364 84 1,322
    1 1 2 39 273 195 58 567
    1 2 1 17 155 143 67 383
    1 3 0 2 39 28 20 89
    1 4 0 0 10 10 8 28

    • Charles says:

      Marc,
      If the dependent variable takes only two values (e.g. Male or Female), then use binary logistic regression. If the dependent variable takes a small number of values more than two (e.g. North America, Europe, Asia), then use multinomial logistic regression.
      Charles

  3. Ryan says:

    Hi Charles,

    Is there a place where all the information that is given in the output shown in Figure 6 is explained?

    Thanks,

    Ryan

  4. Lauren says:

    Hello,

    I apologize, as this has probably already been asked, but do you know of the way to solve for a and b by hand, rather than in a statistics software? Or could you possibly direct me to a place which shows it? I’ve tried looking, but it just seems as if every site either relies on Stata, R, or Excel to find a and b rather than calculating them out.

    Best Regards

  5. TJ Gacott says:

    Hi, Charles!

    First off, I would like to thank you for this insightful discussion that you gave. It helped me a lot! I would just like to ask though, what if I would like to determine the coefficients for a logistic regression model that I’m working on. I have several independent variables, would it be advisable that I determine their coefficients individually or is there another method which I could use to determine them simultaneously?

    Your reply will be very much appreciated! Thanks in advance!

  6. satish says:

    Dear Charles

    I am glad to find this site about logistic regression, I have a data dependent variable is binary(1,0), and 28 independent variables are both metric and non metric variable, once I run the logistic regression in Excel and SPSS, most of the coefficients getting negative and zero. is this affect show on prediction?, how do i resolve this issue? also give hint of method to significance test for logistic regression.

    Thanks

    Satish

    • Charles says:

      Dear Satish,
      Without seeing your data I have no way of commenting about why the coefficients are negative or zero. Also, please note that there is nothing wrong with negative coefficients, and so there may be nothing to resolve.
      Regarding your second request, please see the following webpage:
      Testing the Fit of a Logistic Regression Model
      Charles

  7. Lee says:

    Hi Dr Charles,
    I’m wondering if you can help – do I always need to zero out the coefficients I’m trying to solve for before hitting the solve button? I’m normalizing energy consumption using the logit-3, logit-4 or gompertz models, and when I zero out the coefficients before solving I get one answer, but if I leave in the coefficients that were the results from last years model and then hit solve, the answer differs…

    Any comments or advice would be more than welcomed!

  8. Luke says:

    Hi Charles, this provides a great introduction, thanks for putting in the time to elaborate, and from all the comments your blog is off assistance to many readers. I have a questions when we introduce a secondary variable, lets say age in your example. I have done so by creating a secondary table of categorical values, and followed the same method. I have created ‘a’, ‘b’, and ‘c’ and set them to zero. So now I have two tables similiar to Fig 1 (ie. all pi values are 0.5 – because a=b=c=0 currently)
    Now for pi_table1 i use the following equation 1/(1+exp(-a-b*t1x)
    For pi_table2 i use the following equation 1/(1+exp(-a-c*t2x)
    When I go to the solver, and tell it I want to maximise the two LL values, by changing ‘a’, ‘b’ and ‘c’, i get an error that says “Objective Cell must be an objective cell on the datasheet”.
    in my “Set Objective” box I have LL_table1 cell ; LL_table2 cell
    I have put in a formula to solve pi in the

    • Charles says:

      Sorry Luke, but I don’t completely follow what you are trying to do. Also you comment is incomplete and ends with “I have put in a formula to solve pi in the”. It seems like you are trying to maximize two cells at once. Solver doesn’t do this, although you can probably come up with a way to combine the maximization of two values into one.
      Charles

  9. Dear Dr. Charles,

    I have been trying out your Logistic Regression tool using the data set below. This data set is part of the famous Fisher data set for irises. The binary outcome is called Type and appears in the last column. The first four columns are iris properties.

    I decided to use the Logistic Regression tool with just one independent variable at a time. For SL and Type, the output coefficients are fine. Same is the case for SW and Type. However, if I try PL and Type or PW and Type, the program complains #VALUE is all the cells including p-Pred. My suspicion is that as the computer searches the parameter space to determine coefficients, the logit sometimes get large, and Excel does not know how to handle numbers beyond approx. exp(710). Please let me know what to do. Thanks.

    PW PL SW SL Type
    2 14 33 50 0
    2 10 36 46 0
    2 16 31 48 0
    1 14 36 49 0
    2 13 32 44 0
    2 16 38 51 0
    2 16 30 50 0
    4 19 38 51 0
    2 14 30 49 0
    2 14 36 50 0
    4 15 34 54 0
    2 14 42 55 0
    2 14 29 44 0
    1 14 30 48 0
    3 17 38 57 0
    4 15 37 51 0
    2 13 35 55 0
    2 13 30 44 0
    2 16 32 47 0
    2 12 32 50 0
    1 11 30 43 0
    2 14 35 51 0
    4 16 34 50 0
    1 15 41 52 0
    2 15 31 49 0
    4 17 39 54 0
    2 13 32 47 0
    2 15 34 51 0
    1 15 31 49 0
    2 15 37 54 0
    4 13 39 54 0
    3 13 23 45 0
    3 15 38 51 0
    2 15 35 52 0
    3 14 34 46 0
    5 17 33 51 0
    2 14 34 52 0
    6 16 35 50 0
    3 14 30 48 0
    2 19 34 48 0
    2 12 40 58 0
    2 14 32 46 0
    4 15 44 57 0
    2 15 34 52 0
    2 15 31 46 0
    3 13 35 50 0
    3 14 35 51 0
    2 16 34 48 0
    2 17 34 54 0
    2 15 37 53 0
    24 56 31 67 1
    23 51 31 69 1
    20 52 30 65 1
    19 51 27 58 1
    17 45 25 49 1
    19 50 25 63 1
    18 49 27 63 1
    21 56 28 64 1
    19 51 27 58 1
    18 55 31 64 1
    15 50 22 60 1
    23 57 32 69 1
    20 49 28 56 1
    18 58 25 67 1
    21 54 31 69 1
    25 61 36 72 1
    21 55 30 68 1
    22 56 28 64 1
    15 51 28 63 1
    23 59 32 68 1
    23 54 34 62 1
    25 57 33 67 1
    18 51 30 59 1
    23 53 32 64 1
    21 57 33 67 1
    18 60 32 72 1
    18 49 30 61 1
    23 61 30 77 1
    18 48 30 60 1
    20 51 32 65 1
    25 60 33 63 1
    18 55 30 65 1
    22 67 38 77 1
    21 66 30 76 1
    13 52 30 67 1
    20 64 38 79 1
    20 67 28 77 1
    14 56 26 61 1
    18 48 28 62 1
    24 56 34 63 1
    16 58 30 72 1
    21 59 30 71 1
    18 56 29 63 1
    23 69 26 77 1
    19 61 28 74 1
    18 63 29 73 1
    22 58 30 65 1
    19 53 27 64 1
    20 50 25 57 1
    24 51 28 58 1

    • Charles says:

      Uday,

      The problem seems to be different. For the case of PW and Type, if PW is < = 6 then the outcome is always a failure, while if PW >= 13 then outcome is always a success. There is no data where PW is between 6 and 13. This trivial situation prevents the model from converging to a solution. In any case, the correct model is not given by a logistic regression model, but by the rule success is equivalent to PW >= 13, failure to PW <= 6 and undetermined for values in between. The situation is similar for PL and Type. One question I have for you, is why aren't you using all the variables instead of just one? Charles

      • Uday says:

        Thanks for your prompt response. Just like you suggest, I had started by using all the four variables at one time, and received all the #VALUE! responses. In order to identify why, I gradually reduced the number of independent variables. Just like you, I found that PW and PL are step functions, and that this is the source of why the Logistic Regression tool (as it currently stands) is not able to find a solution.

        From a deeper viewpoint, a step function is the limiting case of the logistic s-curve, so I looked into why Excel cannot get a solution. I think the problem is that the logit sometimes get large, and Excel does not know how to handle numbers beyond approx. exp(710). So when calculating the probability, 1/(EXP(-Logit)+1, I was thinking that an IF statement like =IF(E2>-700, 1/(EXP(-Logit)+1),1) may work.

        In any case, it would be nice to have a tool which works for data which happen to be step functions.

        • Charles says:

          Uday,
          Have you tried making the change on one of the spreadsheets? Does it solve the problem? If so, I will make the change you suggest.
          Charles

          • Uday Sukhatme says:

            Dear Charles,
            I am sorry I did not get back to you sooner – got sidetracked into other problems! In any case, I did make the change referred to above and tested it. It vastly improves the usage of the logistical regression tool, in particular for data which may happen to be close to step functions. In order to keep the Logit value from becoming too large or too small (both of which are problems for Excel) in the Solver process, a well-chosen IF statement works really well. Suppose we want to keep Logit in the range -30 to 30. Then, basically where you have the statement =1/(EXP(-Logit)+1) for computing the output on the spreadsheet, I changed it to =IF(Logit>30, EXP(-30), IF(Logit>-30,1/(EXP(-Logit)+1),1-EXP(-30))). This change works very well for fitting the Fisher iris data.
            Uday

  10. Sydney says:

    Thanks for posting this. Do you know how to take this and make an s-curve? Is there a way on excel to make a graph for binary logistic regressions?

  11. Erika says:

    Dear Charles,

    I try to follow this example, but I have raw data instead of summarized data. I understad that I should download the Resource Pack, and I tried that, but it did not work on my version of excel (office 365). So my question is how to do this without the resource pack?

    Regards,
    Erika

  12. Annabel says:

    Dear Dr. Zaiontz,

    Thank you for your wonderful website and very useful add-in! I am a senior medical student in the process of analyzing data for a student-initiated study of the individual effects of six, binary independent variables on a binary outcome, which happens to be hospital readmission vs. no hospital readmission.

    I was able to do the logistic regression and used Solver to find a coefficient and intercept for each of the variables. I have also found information that will allow me to calculate an odds ratio estimate for each variable using each coefficient.

    I am struggling with figuring out how to figure out an upper and lower confidence interval for the coefficient and how to test the null hypotheses (that the independent variables have no impact). I can see how the non-binary data above (rems) and outcomes can be plugged in the Logistic Regression tool to figure out the values in Figure 6 (which I think will then allow me to tackle the challenge of figuring out how to evaluate the significance of our findings), but I do not understand how to input my binary data.

    Do you have an example that shows how to use the Logistic Regression tool with a binary independent variable?

    Thank you!
    Annabel

    • Charles says:

      Dear Annabel,
      The approach with categorical independent variables is the same. An example with a binary variable is given on the webpage http://www.real-statistics.com/logistic-regression/finding-logistic-regression-coefficients-using-newtons-method/
      Charles

      • Annabel says:

        Thank you!!!

        • Annabel says:

          Dear Dr. Zaiontz:

          Your comments and the add-in worked very well for our project!

          I have one further question: For one of our independent variables, the coefficient was -0.2987, while the 95% CI for the coefficient was calculated as (0.39613, 1.38896). It has been a very long time since I studied statistics (so this project has been very engaging!), and I am struggling with the fact that the lower limit of the CI for our negative coefficient is not negative. The CI does not appear to include 0, but if the lower limit were negative, we would accept the null hypothesis.

          Can you help me understand?

          Thank you!
          Annabel

          • Charles says:

            Dear Annabel,
            It sounds like something isn’t quite right. The CI for a coefficient should contain the value of the coefficient. If you send me the spreadsheet where you got this result, I can try to figure what happened.
            Charles

  13. bgkt sih says:

    Standard Solver can only generate result from a set of data less than 200. How about want to use it for more sets of data ?

  14. svpam says:

    Hye,

    How do you calculate the std error ?

  15. smurt says:

    Dear Charles,

    I’m quite new to all this logistic regression thing. My question, is there any method where we didn’t have to download the ‘Real Statistic Resource Pack’ to solve the last part of this Logistic Regression. Which means that, try to solve using normal Excel Regression tools.

    Sorry if my question does not make any sense

    • Charles says:

      I don’t know any way of doing this with the standard Excel regression capabilities. It can be done using Excel’s Solver as described in the website. Charles.

  16. WIlliam Agurto says:

    Dear Charles:

    Thanks for your answer.

    Really, there’s no signicant difference between Solver method results and Newton’s method results (obviously, they are not equal). So, in my case, using Newton’s method is enough.

    I will review the information in the webpage, with the purpose of detect redundancy (I suspect that two of my six independent variables bring the same information to my model, and both have significant beta values; in case of doubt, I will preserve both variables).

    Thanks a lot for your suggestion.

    William Agurto.

  17. William Agurto says:

    Dear Charles:

    Logistic regression automatically gests good results when using Newton’s method, but that’s not the case when using Excel’s Solver: the user have to enter parameters in Solver to finish the regression and to obtain results. It will be better to modify VBA code with the purpose of automatically run Excel’s Solver without user’s intervention, using the following VBA functions: SolverOK (for setting all the parameters) and SolverOptions (for indicating that values don’t need to be non-negative) . VBA help shows how you can use those functions. Bellow there’s an example of how to use them (it was extracted from VBA help, Spanish language version; English version uses the same example).
    ‘**************************************************************************
    Ejemplo
    En este ejemplo se establece la opción Precisión en 0,001.

    Worksheets(“Sheet1”).Activate
    SolverReset
    SolverOptions Precision:=0.001
    SolverOK SetCell:=Range(“TotalProfit”), _
    MaxMinVal:=1, _
    ByChange:=Range(“C4:E6”)
    SolverAdd CellRef:=Range(“F4:F6”), _
    Relation:=1, _
    FormulaText:=100
    SolverAdd CellRef:=Range(“C4:E6”), _
    Relation:=3, _
    FormulaText:=0
    SolverAdd CellRef:=Range(“C4:E6”), _
    Relation:=4
    SolverSolve UserFinish:=False
    SolverSave SaveArea:=Range(“A33”)
    ‘*************************************************************************

    I hope this suggestion be helpful.

    Regards.

    William Agurto.
    PD: I ran logistic regression, checking “Raw parameters” option, and with 6 variables. In both Newton’s method case and Solver’s case I couldn’t see the column “HL Stat” next to the column “%Correct”. I wonder if it happens because my data have six variables and not only one as your example, or because I used “raw data” option instead of “summary data”. I hope you can answer that questions. Thank you.

    • Charles says:

      Hi William,

      Thanks for all your thoughtful comments.

      Regarding Solver, are you suggesting that in order to use Solver successfully the user (or VBA) needs to fill in parameters beyond those that I have indicated on the website, in particular those in the Subject to the constraints… fields?

      Regarding HL Stats, this is something I just added in Release 2.4. Anyone using an earlier version won’t see this column. You can check the release by entering the function =VER(). In any case I just ran Logistics Regression with 6 independent variables, checking “Raw data”, and did see the HL Stats column.

      Charles

      • William Agurto says:

        Dear Charles:

        Thank you for you answer.
        Regarding your first comment (Solver):
        I mean that when I use Newton method (in my case: 6 variables, “raw data”), filling all parameters: (a) Input range, (b) check in the option “headings included in data”, (c) check in the option “raw data”, (d) check in the option “Use Newton method”, (e) number of iterations=100, (f) output range, an (g) press OK, I directly obtain the regression results (betas, odds ratios, and so on). But that is not the case when I use Solver method: The result set all betas at zero value (beta i =0), and a message appears (a text bellow all the tables), with and indication to finish the regression:
        ***********************************************************
        Now use Excel’s Solver as follows to complete the analysis:
        – Set Objectives: sum of LL; To: Max
        – By Changing Variable Cells: Coeff range
        – Make Unconstrained Variabes Non-Negative: uncheck this option
        – Select a Solving Method: GRG Nonlinear
        **********************************************************
        Instead of that, is possible to automatically run Solver and directly obtain beta values only introducing the correct rutine in your VBA code, using “SoverOk” and “SolverOptions” functions. In that case, the user doesn’t need to introduced parameters to Solver, because VBA automatically calls “SolverOK” and “SolverOptions” to obtain beta values.

        If you do that, user can choose the method (Newton’s method or Solver method) and obtain results without any additional intervention (that’s not currently the case for Solver method: user have to introduce parametes to Solver to obtain the final result).

        Regarding HLStats:
        Thank you very much. I’ll prove the release 2.4.

        Additional question:
        Are you thinking about add a routine in logistics regression to detect redundant data? In my case, I suppose that one of my six variables is redundant, but I have no statistical method to prove that.

        Thanks a lot.

        William Agurto.

        • William Agurto says:

          Dear Charles:

          I had to correct some sentences in my prior message. Here the correct message.
          Regards.

          William Agurto.

          Thank you for you answer.
          Regarding your first comment (Solver):
          I mean that when I use Newton method (in my case: 6 variables, “raw data”), filling all parameters: (a) Input range, (b) check in the option “headings included in data”, (c) check in the option “raw data”, (d) check in the option “Use Newton method”, (e) number of iterations=100, (f) output range, and (g) press OK, I directly obtain the regression results (betas, odds ratios, and so on). But that is not the case when I use Solver method: The result set all betas at zero value (beta i =0), and a message appears (a text bellow all the tables), with and indication to finish the regression:
          ***********************************************************
          Now use Excel’s Solver as follows to complete the analysis:
          – Set Objectives: sum of LL; To: Max
          – By Changing Variable Cells: Coeff range
          – Make Unconstrained Variabes Non-Negative: uncheck this option
          – Select a Solving Method: GRG Nonlinear
          **********************************************************
          Instead of that, is possible to automatically run Solver and directly obtain beta values only entering the correct rutine in your VBA code, using “SoverOk” and “SolverOptions” functions. In that case, the user doesn’t need to enter parameters to Solver, because VBA automatically calls “SolverOK” and “SolverOptions” to obtain beta values.

          If you do that, user can choose the method (Newton’s method or Solver method) and obtain results without any additional intervention (that’s not currently the case for Solver method: user have to enter parametes to Solver to obtain the final result).

          Regarding HLStats:
          Thank you very much. I’ll prove the release 2.4.

          Additional question:
          Are you thinking about add a routine in logistics regression to detect redundant data? In my case, I suppose that one of my six variables is redundant, but I have no statistical method to prove that.

          Thanks a lot.

          William Agurto.

          • Charles says:

            Dear William,

            Thanks for your clarification.

            The reason why I provide the Solver approach is for people who don’t want to use VBA software. These people can copy one of the logistic regression worksheets from the Real Statistics Examples Worksheet and use Solver to find the coefficients. For everyone else I provided the Newton approach, which is completely automated. Is there any advantage of the Solver VBA solution given that I already offer a Newton VBA solution?

            The objective of webpage http://www.real-statistics.com/logistic-regression/comparing-logistic-regression-models/ is to test whether any of a model’s independent variables is “redundant”, i.e. doesn’t make a significant contribution to the model. I have not created software for doing this yet, but I will likely do so shortly. For now you could copy the approach used on that webpage.

            Charles

  18. Dr Dave says:

    Can you give an example where the data are not grouped in the source data?

  19. Richmond says:

    Hi Charles,
    Nice article! Do you happen to have examples of finding the coefficients with more than 1 variable. Using Example 1, instead of looking just at rems, how do we modify the spreadsheet and formula if we add in age as a variable?

    Thanks!
    Richmond

  20. So glad I searched for logistic regression. So glad I found this site. I can’t thank you enough for the fantastic, clean, direct guide to using Excel for this analysis. I was lost. and am now found!

    • Charles says:

      Glenn,
      Thank you very much for your kind words. I am very pleased that you found the site useful and I hope that you will continue to use the Real Statistics website in the future. The site continues to grow: I will shortly be adding support for multivariate statistical analyses including MANOVA and Factor Analysis.
      Charles

Leave a Reply

Your email address will not be published. Required fields are marked *