Logistic Regression using Newton’s Method Detailed

Property 1: The maximum of the log-likelihood statistic occurs when

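$$\sum_{i=1}^{n} (y_i - p_i)\, x_{ij} = 0 \quad \text{for all } j = 0, 1, \ldots, k$$

where $x_{i0} = 1$ for all $i$.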

Proof: Let

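$$\ln L = \sum_{i=1}^{n} \left[ y_i \ln p_i + (1 - y_i) \ln (1 - p_i) \right]$$

where the $y_i$ are considered constants from the sample and the $p_i$ are defined as follows:

$$p_i = \frac{1}{1 + e^{-\left(b_0 + \sum_{j=1}^{k} b_j x_{ij}\right)}}$$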

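Here

$$\frac{p_i}{1 - p_i} = e^{\,b_0 + \sum_{j=1}^{k} b_j x_{ij}}$$

which is the odds of success for observation $i$ (see Definition 3 of Basic Concepts of Logistic Regression). Now let

$$z_i = b_0 + \sum_{j=1}^{k} b_j x_{ij}$$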

To make our notation simpler we will define $x_{i0} = 1$ for all $i$, and so we have

$$z_i = \sum_{j=0}^{k} b_j x_{ij}$$

Thus

$$p_i = \frac{1}{1 + e^{-z_i}} \qquad\qquad 1 - p_i = \frac{e^{-z_i}}{1 + e^{-z_i}}$$

Also note that

$$\frac{d p_i}{d z_i} = \frac{e^{-z_i}}{\left(1 + e^{-z_i}\right)^2} = p_i (1 - p_i)$$

The maximum value of ln L occurs where the partial derivatives are equal to 0. We first note that

$$\frac{\partial z_i}{\partial b_j} = x_{ij} \qquad\qquad \frac{\partial p_i}{\partial b_j} = p_i (1 - p_i)\, x_{ij}$$

Thus

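$$\frac{\partial \ln L}{\partial b_j} = \sum_{i=1}^{n} \left[ \frac{y_i}{p_i} - \frac{1 - y_i}{1 - p_i} \right] \frac{\partial p_i}{\partial b_j} = \sum_{i=1}^{n} \frac{y_i - p_i}{p_i (1 - p_i)} \, p_i (1 - p_i)\, x_{ij}$$

$$= \sum_{i=1}^{n} (y_i - p_i)\, x_{ij}$$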

The maximum of ln L occurs when

$$\sum_{i=1}^{n} (y_i - p_i)\, x_{ij} = 0$$

for all j, completing the proof.
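The derivative formula used in this proof can be checked numerically. The following is a minimal sketch, not part of the original page: the data values, the coefficient vector and the function names are made up for illustration. It compares the analytic partial derivatives of ln L with central finite differences.

```python
import math

def log_likelihood(b, X, y):
    """ln L = sum_i [ y_i ln p_i + (1 - y_i) ln(1 - p_i) ]; each row of X starts with x_i0 = 1."""
    total = 0.0
    for xi, yi in zip(X, y):
        z = sum(bj * xij for bj, xij in zip(b, xi))
        p = 1.0 / (1.0 + math.exp(-z))
        total += yi * math.log(p) + (1 - yi) * math.log(1 - p)
    return total

def score(b, X, y):
    """Analytic partial derivatives: d ln L / d b_j = sum_i (y_i - p_i) x_ij."""
    grad = [0.0] * len(b)
    for xi, yi in zip(X, y):
        z = sum(bj * xij for bj, xij in zip(b, xi))
        p = 1.0 / (1.0 + math.exp(-z))
        for j, xij in enumerate(xi):
            grad[j] += (yi - p) * xij
    return grad

def numeric_gradient(b, X, y, h=1e-6):
    """Central finite-difference approximation of the same partial derivatives."""
    grad = []
    for j in range(len(b)):
        up = list(b)
        up[j] += h
        down = list(b)
        down[j] -= h
        grad.append((log_likelihood(up, X, y) - log_likelihood(down, X, y)) / (2 * h))
    return grad

# Hypothetical sample: one independent variable plus the constant column x_i0 = 1.
X = [[1.0, 0.5], [1.0, 1.5], [1.0, 2.0], [1.0, 3.0], [1.0, 3.5]]
y = [0, 0, 1, 0, 1]
b = [0.2, -0.1]  # arbitrary coefficients, not the maximum

analytic = score(b, X, y)
numeric = numeric_gradient(b, X, y)
print(analytic)
print(numeric)
```

The two printed lists should agree to several decimal places at any choice of coefficients, not only at the maximum.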

Observation: To find the values of the coefficients $b_j$ we need to solve the equations of Property 1.

We do this iteratively using Newton’s method (see Definition 2 and Property 2 of Newton’s Method), as described in the following property.

Property 2: Let $B = [b_j]$ be the (k+1) × 1 column vector of logistic regression coefficients, let $Y = [y_i]$ be the n × 1 column vector of observed outcomes of the dependent variable, let X be the n × (k+1) design matrix, let $P = [p_i]$ be the n × 1 column vector of predicted probabilities of success and let V be the n × n diagonal matrix whose main diagonal contains the elements $v_i = p_i (1 - p_i)$. Then if $B_0$ is an initial guess of B and for all m we define the following iteration

$$B_{m+1} = B_m + \left(X^T V_m X\right)^{-1} X^T (Y - P_m)$$

where $P_m$ and $V_m$ are the values of P and V calculated from $B_m$,

then for m sufficiently large $B_{m+1} \approx B_m$, and so $B_m$ is a reasonable estimate of the coefficient vector.
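The iteration of Property 2 can be sketched in a few lines of NumPy. This is an illustrative sketch under stated assumptions, not the site's own implementation: the function name `newton_logistic`, the zero initial guess, the stopping tolerance and the sample data are all made up here, and the data are chosen so that the outcomes are not perfectly separated (otherwise no finite maximum exists).

```python
import numpy as np

def newton_logistic(X, y, tol=1e-10, max_iter=25):
    """Iterate B_{m+1} = B_m + (X^T V_m X)^{-1} X^T (Y - P_m) from B_0 = 0."""
    B = np.zeros(X.shape[1])                 # assumed initial guess B_0 = 0
    for _ in range(max_iter):
        P = 1.0 / (1.0 + np.exp(-X @ B))     # P_m: predicted probabilities at B_m
        v = P * (1.0 - P)                    # diagonal elements of V_m
        XtVX = X.T @ (v[:, None] * X)        # X^T V_m X without forming the n x n matrix
        step = np.linalg.solve(XtVX, X.T @ (y - P))
        B = B + step
        if np.max(np.abs(step)) < tol:       # B_{m+1} is approximately B_m
            break
    return B

# Hypothetical sample: constant column x_i0 = 1 plus one independent variable.
x = np.array([0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0])
y = np.array([0, 0, 1, 0, 1, 0, 1, 1], dtype=float)
X = np.column_stack([np.ones_like(x), x])

B = newton_logistic(X, y)
P = 1.0 / (1.0 + np.exp(-X @ B))
print("coefficients:", B)
print("X^T (Y - P):", X.T @ (y - P))   # should be close to the zero vector
```

Solving the linear system with `np.linalg.solve` avoids computing the matrix inverse explicitly; the result is the same step as in the formula.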

Proof: Define

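$$f_j(B) = \sum_{i=1}^{n} (y_i - p_i)\, x_{ij} \qquad j = 0, 1, \ldots, k$$

where $x_{i0} = 1$. We now calculate the partial derivatives of the $f_j$.

$$\frac{\partial f_j}{\partial b_h} = -\sum_{i=1}^{n} \frac{\partial p_i}{\partial b_h}\, x_{ij} = -\sum_{i=1}^{n} p_i (1 - p_i)\, x_{ij}\, x_{ih}$$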

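Let $v_i = p_i (1 - p_i)$ and, using the terminology of Definition 2 of Newton’s Method, define

$$F = [f_j] \qquad\qquad J = \left[ \frac{\partial f_j}{\partial b_h} \right]$$

so that F is a (k+1) × 1 column vector and J is a (k+1) × (k+1) matrix.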

Now

$$F = X^T (Y - P)$$

where X is the design matrix (see Definition 3 of Multiple Regression Least Squares), Y is the column matrix with elements $y_i$ and P is the column matrix with elements $p_i$. Let V be the n × n diagonal matrix with the elements $v_i$ on the main diagonal. Then

$$J = -X^T V X$$
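As a sanity check on the matrix forms of F and J, the Jacobian can be compared with finite differences of F. The sketch below is an illustration only: the function names, the small data set with two independent variables and the test point B are assumptions introduced here.

```python
import numpy as np

def F(B, X, Y):
    """F(B) = X^T (Y - P), with P the predicted probabilities at B."""
    P = 1.0 / (1.0 + np.exp(-X @ B))
    return X.T @ (Y - P)

def J_analytic(B, X):
    """J(B) = -X^T V X, with V = diag(p_i (1 - p_i))."""
    P = 1.0 / (1.0 + np.exp(-X @ B))
    V = np.diag(P * (1.0 - P))
    return -X.T @ V @ X

def J_numeric(B, X, Y, h=1e-6):
    """Column h of J approximated by central differences of F in b_h."""
    size = B.size
    J = np.zeros((size, size))
    for col in range(size):
        e = np.zeros(size)
        e[col] = h
        J[:, col] = (F(B + e, X, Y) - F(B - e, X, Y)) / (2 * h)
    return J

# Hypothetical design matrix: constant column plus two independent variables.
X = np.array([
    [1.0, 0.2, 1.5],
    [1.0, 1.1, 0.3],
    [1.0, 2.3, 2.2],
    [1.0, 0.7, 1.9],
    [1.0, 1.8, 0.4],
    [1.0, 2.9, 1.0],
])
Y = np.array([0, 1, 1, 0, 0, 1], dtype=float)
B = np.array([0.1, -0.3, 0.2])  # arbitrary test point

Ja = J_analytic(B, X)
Jn = J_numeric(B, X, Y)
print(np.max(np.abs(Ja - Jn)))  # small if the two Jacobians agree
```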

We can now use Newton’s method to find B. Namely, define the n × 1 column vector $P_m$, the (k+1) × 1 column vector $B_m$, the n × n diagonal matrix $V_m$ and the (k+1) × (k+1) square matrix $J_m$ as follows, based on the values of P, F, V and J described above.

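$$P_m = P(B_m) \qquad V_m = V(B_m) \qquad J_m = J(B_m) = -X^T V_m X$$

$$F(B_m) = X^T (Y - P_m)$$

$$B_{m+1} = B_m - J_m^{-1} F(B_m) = B_m + \left(X^T V_m X\right)^{-1} X^T (Y - P_m)$$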

Then for sufficiently large m, $F(B_m) \approx 0$, which is equivalent to the statement of the property.

3 Responses to Logistic Regression using Newton’s Method Detailed

  1. Mike says:

    I wish there was even a simpler step by step explanation of this. I get lost with all the variable substitutions. I.e. newtons method for solving logistic regression for dummies.

    • Charles says:

      Mike,
      Good to see that some people are looking at the more mathematical part of the site. I agree that the proof given is a bit complicated. I will look at this again shortly and see if I can find a simpler approach.
      Charles

    • Charles says:

      Mike,
      I have just updated this page on the website. I hope that you find the new explanation clearer.
      Charles
