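Property 1: The maximum of the log-likelihood statistic occurs when
$$\sum_{i=1}^n (y_i - p_i)\, x_{ij} = 0 \qquad \text{for all } j = 0, 1, \ldots, k$$
Proof: Let
$$LL = \ln L = \sum_{i=1}^n \left[ y_i \ln p_i + (1 - y_i) \ln (1 - p_i) \right]$$
where the yi are considered constants from the sample and the pi are defined as follows:
$$p_i = \frac{1}{1 + e^{-\left(b_0 + \sum_{j=1}^k b_j x_{ij}\right)}}$$
Equivalently,
$$\frac{p_i}{1 - p_i} = e^{\,b_0 + \sum_{j=1}^k b_j x_{ij}}$$
which is the odds ratio (see Definition 3 of Basic Concepts of Logistic Regression). Now let
$$z_i = \ln \frac{p_i}{1 - p_i} = b_0 + \sum_{j=1}^k b_j x_{ij}$$
To make our notation simpler we will define xi0 = 1 for all i, and so we have
$$z_i = \sum_{j=0}^k b_j x_{ij}, \qquad p_i = \frac{1}{1 + e^{-z_i}}$$
Also note that
$$1 - p_i = \frac{e^{-z_i}}{1 + e^{-z_i}} = \frac{1}{1 + e^{z_i}}$$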
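The maximum value of LL = ln L occurs where the partial derivatives with respect to the bj are equal to 0. We first note that, since pi = 1/(1 + e^(-zi)) and the partial derivative of zi with respect to bj is xij,
$$\frac{\partial p_i}{\partial b_j} = p_i (1 - p_i)\, x_{ij}$$
and therefore
$$\frac{\partial LL}{\partial b_j} = \sum_{i=1}^n \left[ \frac{y_i}{p_i} - \frac{1 - y_i}{1 - p_i} \right] p_i (1 - p_i)\, x_{ij} = \sum_{i=1}^n (y_i - p_i)\, x_{ij}$$
The maximum of LL occurs when
$$\sum_{i=1}^n (y_i - p_i)\, x_{ij} = 0$$
for all j, completing the proof.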
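
The derivative used in this proof can be checked numerically. The following is a minimal sketch, assuming the log-likelihood is LL = Σ [yi ln pi + (1 – yi) ln(1 – pi)] with pi = 1/(1 + e^(-zi)) and that its partial derivative with respect to bj is Σ (yi – pi) xij. The data values and the function names (`log_likelihood`, `gradient`, `numeric_gradient`) are made up for illustration and are not from the original text.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def log_likelihood(b, X, y):
    """LL = sum_i [ y_i ln p_i + (1 - y_i) ln(1 - p_i) ]."""
    total = 0.0
    for row, yi in zip(X, y):
        p = sigmoid(sum(bj * xj for bj, xj in zip(b, row)))
        total += yi * math.log(p) + (1 - yi) * math.log(1 - p)
    return total

def gradient(b, X, y):
    """Analytic partial derivatives: dLL/db_j = sum_i (y_i - p_i) x_ij."""
    g = [0.0] * len(b)
    for row, yi in zip(X, y):
        p = sigmoid(sum(bj * xj for bj, xj in zip(b, row)))
        for j, xj in enumerate(row):
            g[j] += (yi - p) * xj
    return g

def numeric_gradient(b, X, y, h=1e-6):
    """Central finite differences of LL, one coefficient at a time."""
    g = []
    for j in range(len(b)):
        up = list(b)
        dn = list(b)
        up[j] += h
        dn[j] -= h
        g.append((log_likelihood(up, X, y) - log_likelihood(dn, X, y)) / (2 * h))
    return g

# Illustrative data only; the first column is x_i0 = 1.
X = [[1, 0.5], [1, 1.5], [1, 2.0], [1, 3.0], [1, 3.5]]
y = [0, 0, 1, 0, 1]
b = [0.2, -0.1]  # arbitrary trial coefficients, not the maximum

analytic = gradient(b, X, y)
numeric = numeric_gradient(b, X, y)
print(analytic)
print(numeric)
```

The two printed vectors should agree to several decimal places at any trial value of b, not only at the maximum.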
Observation: To find the values of the coefficients bj we need to solve the equations of Property 1.
Since these equations are nonlinear in the bj, we do this iteratively using Newton’s method (see Definition 2 and Property 2 of Newton’s Method), as described in the following property.
Property 2: Let B = [bj] be the (k+1) × 1 column vector of logistic regression coefficients, let Y = [yi] be the n × 1 column vector of observed outcomes of the dependent variable, let X be the n × (k+1) design matrix, let P = [pi] be the n × 1 column vector of predicted values of success and let V = [vi] be the n × n diagonal matrix where vi = pi (1 – pi). Then if B0 is an initial guess of B and for all m we define the following iteration
$$B_{m+1} = B_m + (X^T V_m X)^{-1} X^T (Y - P_m)$$
where Pm and Vm denote P and V evaluated at B = Bm, then for m sufficiently large Bm+1 ≈ Bm, and so Bm is a reasonable estimate of the coefficient vector.
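Proof: Define
$$f_j(B) = \sum_{i=1}^n (y_i - p_i)\, x_{ij} \qquad \text{for } j = 0, 1, \ldots, k$$
where xi0 = 1. We now calculate the partial derivatives of the fj.
$$\frac{\partial f_j}{\partial b_h} = -\sum_{i=1}^n p_i (1 - p_i)\, x_{ij}\, x_{ih}$$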
Let vi = pi (1 – pi) and, using the terminology of Definition 2 of Newton’s Method, define
$$F = [f_j] = X^T (Y - P), \qquad J = \left[ \frac{\partial f_j}{\partial b_h} \right]$$
where X is the design matrix (see Definition 3 of Multiple Regression Least Squares), Y is the column matrix with elements yi and P is the column matrix with elements pi. Let V be the diagonal matrix with the elements vi on the main diagonal. Then
$$J = -X^T V X$$
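
As a sanity check on this step, the sketch below compares a finite-difference Jacobian of F with the closed form, assuming F = Xᵀ(Y – P) and J = –XᵀVX with V the diagonal matrix of vi = pi (1 – pi). The data and the helper names (`probs`, `F`, `jacobian`) are illustrative assumptions, not part of the original derivation.

```python
import math

def probs(b, X):
    """p_i = 1 / (1 + exp(-z_i)) with z_i = sum_j b_j x_ij."""
    return [1.0 / (1.0 + math.exp(-sum(bj * xj for bj, xj in zip(b, row))))
            for row in X]

def F(b, X, y):
    """F = X^T (Y - P); entry j is f_j = sum_i (y_i - p_i) x_ij."""
    p = probs(b, X)
    return [sum((yi - pi) * row[j] for row, yi, pi in zip(X, y, p))
            for j in range(len(b))]

def jacobian(b, X):
    """Closed form J = -X^T V X with V = diag(p_i (1 - p_i))."""
    p = probs(b, X)
    n = len(b)
    return [[-sum(pi * (1 - pi) * row[j] * row[h] for row, pi in zip(X, p))
             for h in range(n)]
            for j in range(n)]

# Illustrative data only; the first column is x_i0 = 1.
X = [[1, 0.5], [1, 1.5], [1, 2.0], [1, 3.0], [1, 3.5]]
y = [0, 0, 1, 0, 1]
b = [0.2, -0.1]  # arbitrary trial coefficients

J = jacobian(b, X)

# Finite-difference Jacobian: column `col` holds dF/db_col.
step = 1e-6
J_num = [[0.0] * len(b) for _ in range(len(b))]
for col in range(len(b)):
    up = list(b)
    dn = list(b)
    up[col] += step
    dn[col] -= step
    f_up, f_dn = F(up, X, y), F(dn, X, y)
    for j in range(len(b)):
        J_num[j][col] = (f_up[j] - f_dn[j]) / (2 * step)

print(J)
print(J_num)
```

Both matrices should agree closely, and J is symmetric because XᵀVX is.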
We can now use Newton’s method to find B, namely define the n × 1 column vector Pm, the (k+1) × 1 column vector Bm, the n × n diagonal matrix Vm and the (k+1) × (k+1) square matrix Jm as follows, based on the values of P, F, V and J described above, each evaluated at B = Bm.
$$B_{m+1} = B_m - J_m^{-1} F(B_m) = B_m + (X^T V_m X)^{-1} X^T (Y - P_m)$$
Then for sufficiently large m, F(Bm) ≈ 0, which is equivalent to the statement of the property.
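
The iteration can be sketched in a few lines of code. This is a minimal illustration, assuming the update is B(m+1) = B(m) + (XᵀVmX)⁻¹ Xᵀ(Y – Pm) with B0 = 0. The data set, the function names (`solve`, `newton_logistic`), the tolerance and the iteration cap are assumptions made for the example. It uses plain Python lists and a small Gaussian elimination routine to avoid external dependencies; a linear algebra library would normally do this job.

```python
import math

def solve(A, rhs):
    """Solve A x = rhs by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [list(A[i]) + [rhs[i]] for i in range(n)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            factor = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= factor * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = s / M[r][r]
    return x

def newton_logistic(X, y, tol=1e-10, max_iter=50):
    """Iterate B <- B + (X^T V X)^{-1} X^T (Y - P) until the step is tiny."""
    k1 = len(X[0])
    b = [0.0] * k1  # initial guess B_0 (an assumption of this sketch)
    for _ in range(max_iter):
        p = [1.0 / (1.0 + math.exp(-sum(bj * xj for bj, xj in zip(b, row))))
             for row in X]
        # X^T (Y - P): the left-hand sides of the equations of Property 1
        score = [sum((yi - pi) * row[j] for row, yi, pi in zip(X, y, p))
                 for j in range(k1)]
        # X^T V X with V = diag(p_i (1 - p_i))
        xtvx = [[sum(pi * (1 - pi) * row[j] * row[h] for row, pi in zip(X, p))
                 for h in range(k1)]
                for j in range(k1)]
        step = solve(xtvx, score)
        b = [bj + sj for bj, sj in zip(b, step)]
        if max(abs(s) for s in step) < tol:
            break
    return b

# Illustrative data only; the first column is x_i0 = 1.
# The outcomes overlap in x, so the maximum likelihood estimate is finite.
X = [[1, 0.5], [1, 1.5], [1, 2.0], [1, 3.0], [1, 3.5]]
y = [0, 0, 1, 0, 1]

b_hat = newton_logistic(X, y)
print(b_hat)
```

Solving the linear system XᵀVX · step = Xᵀ(Y – P) rather than forming the inverse explicitly gives the same update. If the data are perfectly separable, the coefficients grow without bound and this sketch will not converge.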