The Hosmer-Lemeshow test is used to assess the goodness of fit of a logistic regression model. Essentially it is a chi-square goodness of fit test (as described in Goodness of Fit) for grouped data, usually where the data is divided into 10 equal-sized subgroups. The initial version of the test presented here uses the groupings that we have used elsewhere rather than ten equal-sized subgroups.
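Since this is a chi-square goodness of fit test, we need to calculate the HL statistic

$$HL = \sum_{i=1}^{g} \sum_{j=1}^{2} \frac{(obs_{ij} - exp_{ij})^2}{exp_{ij}}$$

where g = the number of groups, and obs_ij and exp_ij are the observed and expected counts for outcome j (success or failure) in group i. The test used is chi-square with g – 2 degrees of freedom. A significant test indicates that the model is not a good fit and a non-significant test indicates a good fit.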
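As a rough illustration of the HL statistic, the following Python sketch sums the chi-square contributions of the two outcomes over all groups. The function name hl_statistic and its argument layout (four parallel lists of observed and expected counts) are assumptions made for this example; they are not part of the Real Statistics software.

```python
def hl_statistic(obs_success, obs_failure, exp_success, exp_failure):
    """Sum (observed - expected)^2 / expected over both outcomes in every group.

    Hypothetical helper: each argument is a list with one entry per group.
    """
    lengths = {len(obs_success), len(obs_failure), len(exp_success), len(exp_failure)}
    if len(lengths) != 1:
        raise ValueError("all four lists must have the same length")

    total = 0.0
    for o_s, o_f, e_s, e_f in zip(obs_success, obs_failure, exp_success, exp_failure):
        # Contribution of one group: one term for successes, one for failures
        total += (o_s - e_s) ** 2 / e_s + (o_f - e_f) ** 2 / e_f
    return total
```

The resulting value would then be compared against a chi-square distribution with g – 2 degrees of freedom, where g is the length of each list.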
Example 1: Use the Hosmer-Lemeshow test to determine whether the logistic regression model is a good fit for the data in Example 1 in Comparing Logistic Regression Models.
Figure 1 – Hosmer-Lemeshow Test
In our example the sum is taken over the 12 Male groups and the 12 Female groups. The observed values are given in columns H and I (duplicates of the input data columns C and D), while the expected values are given in columns L and M. E.g. cell L4 contains the formula =K4*J4 and cell M4 contains the formula =J4-L4 or equivalently =(1-K4)*J4.
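The expected-value formulas can be mirrored outside the spreadsheet. The sketch below assumes that column J holds the group totals and column K the predicted probabilities (p-Pred), as the formulas =K4*J4 and =J4-L4 suggest; the function name expected_counts is hypothetical.

```python
def expected_counts(totals, p_pred):
    """Return (expected successes, expected failures), one entry per group.

    totals -- group sizes (assumed to correspond to column J)
    p_pred -- predicted success probabilities (assumed to correspond to column K)
    """
    # Expected successes: total * predicted probability (like =K4*J4)
    exp_success = [n * p for n, p in zip(totals, p_pred)]
    # Expected failures: total - expected successes (like =J4-L4)
    exp_failure = [n - e for n, e in zip(totals, exp_success)]
    return exp_success, exp_failure
```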
The HL statistic is calculated in cell N16 via the formula =SUM(N4:N15). E.g. cell N4 contains the formula =(H4-L4)^2/L4+(I4-M4)^2/M4.
The Hosmer-Lemeshow test results are shown in range Q12:Q16. The HL stat is 24.40567 (as calculated in cell N16), df = g – 2 = 12 – 2 = 10 and p-value = CHIDIST(24.40567, 10) = .006593 < .05 = α, and so the test is significant, which indicates that the model is not a good fit.
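For readers working outside Excel, the upper-tail probability returned by CHIDIST can be approximated with the Python standard library alone. The sketch below uses the power series for the regularized lower incomplete gamma function; the function name chi2_sf is an assumption for this example, and the series is one of several possible numerical approaches, not necessarily the one Excel uses.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with df degrees of freedom.

    Computed as 1 - P(a, z), where a = df/2, z = x/2 and P is the regularized
    lower incomplete gamma function, evaluated by its power series.
    """
    if x <= 0:
        return 1.0
    a, z = df / 2.0, x / 2.0
    term = 1.0 / a          # n = 0 term of the series
    total = term
    n = 1
    while abs(term) > 1e-15 * abs(total):
        term *= z / (a + n)  # ratio of consecutive series terms
        total += term
        n += 1
    lower = total * math.exp(-z + a * math.log(z) - math.lgamma(a))
    return max(0.0, 1.0 - lower)

# HL statistic and degrees of freedom from Example 1
p_value = chi2_sf(24.40567, 10)  # approximately 0.0066, in line with the CHIDIST value quoted above
print(p_value < 0.05)            # True, so the test is significant at alpha = .05
```

If SciPy is available, scipy.stats.chi2.sf(24.40567, 10) should give the same tail probability.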
Observation: The Hosmer-Lemeshow test needs to be used with caution. It tends to be highly dependent on the groupings chosen, i.e. one selection of groups can give a negative result while another gives a positive result. Also, when there are too few groups (5 or fewer), the test will usually indicate that the model fits.
As a chi-square goodness of fit test, the expected values used should generally be at least 5. In Example 1 the cells L9, L15, M4 and M10 all have values less than 5, with cells M4 and M10 especially troubling with values less than 1. We now address the problems of cells M4 and M10.
We can eliminate the first of these by combining the first two rows, as shown in Figure 2. Here p-Pred for the first row (cell K23) is calculated as a weighted average of the first two values from Figure 1 using the formula =(J4*K4+J5*K5)/(J4+J5). In a similar manner we combine the 7th and 8th rows from Figure 1.
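The row-combining step can be sketched in Python as follows. This assumes each row is represented as a tuple of observed successes, observed failures and p-Pred, and that the group total (column J) is the sum of the two observed counts; the name merge_rows is hypothetical.

```python
def merge_rows(row_a, row_b):
    """Combine two groups into one.

    Each row is (observed successes, observed failures, p_pred).
    The merged p_pred is the size-weighted average of the two predicted
    probabilities, mirroring =(J4*K4+J5*K5)/(J4+J5).
    """
    s_a, f_a, p_a = row_a
    s_b, f_b, p_b = row_b
    n_a, n_b = s_a + f_a, s_b + f_b   # group totals (assumed to equal column J)
    p_merged = (n_a * p_a + n_b * p_b) / (n_a + n_b)
    return (s_a + s_b, f_a + f_b, p_merged)
```

Because the weights are the group sizes, the expected number of successes in the merged row equals the sum of the expected successes of the two original rows, so merging changes only how the counts are grouped, not the fitted model.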
Figure 2 – Revised Hosmer-Lemeshow Test
The revised version shows a non-significant result, indicating that the model is a good fit.
Observation: The Real Statistics Logistic Regression data analysis tool automatically performs the Hosmer-Lemeshow test. For Example 1 of Finding Logistic Regression Coefficients using Solver, Figure 5 of that page shows that the logistic regression model is a good fit. For Example 1 of Comparing Logistic Regression Models, Figure 2 of that page shows that the model is not a good fit, at least until we combine rows as we did above.
Observation: The following supplemental functions can be used to perform the Hosmer-Lemeshow test with exactly 10 equal-sized data ranges.
Real Statistics Functions: The Real Statistics Resource Pack provides the following two supplemental functions.
HOSMER(R1, lab, raw, iter) – returns a table with 10 equal-sized data ranges based on the data in range R1 (without headings)
HLTEST(R1, lab, raw, iter) – returns the Hosmer statistic (based on the table described above) and the p-value.
When lab = True the output includes column headings, and when lab = False (the default) only the data is output. When raw = True the data in R1 is in raw form, and when raw = False (the default) the data in R1 is in summary form. The parameter iter determines the number of iterations used in the Newton method for calculating the logistic regression coefficients; the default value is 20.
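To give a feel for what a table of 10 equal-sized groups involves, here is a minimal Python sketch. It assumes raw data (one 0/1 outcome per observation) and predicted probabilities that have already been fitted, and it forms the groups by sorting on predicted probability, which is the usual "deciles of risk" approach. Whether HOSMER partitions the data in exactly this way is not stated here, so this should be read as an illustration only; the name decile_table is hypothetical.

```python
def decile_table(outcomes, p_pred, groups=10):
    """Group raw observations into nearly equal-sized bins by predicted probability.

    outcomes -- list of 0/1 observed outcomes
    p_pred   -- list of fitted success probabilities, same length as outcomes
    Returns one tuple per non-empty group:
    (size, observed successes, observed failures, expected successes, expected failures)
    """
    order = sorted(range(len(p_pred)), key=lambda i: p_pred[i])
    n = len(order)
    table = []
    for g in range(groups):
        # Integer arithmetic spreads any remainder across the groups
        idx = order[g * n // groups:(g + 1) * n // groups]
        if not idx:
            continue
        size = len(idx)
        obs_s = sum(outcomes[i] for i in idx)
        exp_s = sum(p_pred[i] for i in idx)   # expected successes = sum of probabilities
        table.append((size, obs_s, size - obs_s, exp_s, size - exp_s))
    return table
```

The observed and expected columns of such a table are the inputs needed for the HL statistic, which would then be tested against a chi-square distribution with g – 2 degrees of freedom.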
Observation: We repeat Example 1 using these two functions, obtaining the results shown in Figure 3.
Figure 3 – Hosmer-Lemeshow Test
Referring to Figure 1, the output shown in range F40:K50 of Figure 3 is calculated using the formula =HOSMER(A3:D15, TRUE) and the output shown in range O40:P42 of Figure 3 is calculated using the formula =HLTEST(A3:D15, TRUE). Since the p-value > .05 (assuming α = .05) we conclude that the logistic regression model is a good fit.