Hosmer-Lemeshow Test

The Hosmer-Lemeshow test is used to determine the goodness of fit of the logistic regression model. Essentially it is a chi-square goodness of fit test (as described in Goodness of Fit) for grouped data, usually with the data divided into 10 equal-sized subgroups (deciles). The initial version of the test presented here uses the groupings we have used elsewhere rather than 10 equal-sized subgroups.

Since this is a chi-square goodness of fit test, we need to calculate the HL statistic

$$HL=\sum_{i=1}^{g}\left[\frac{(O_{1i}-E_{1i})^2}{E_{1i}}+\frac{(O_{0i}-E_{0i})^2}{E_{0i}}\right]$$

where g = the number of groups, O1i and O0i = the observed numbers of successes and failures in the ith group, and E1i and E0i = the corresponding expected numbers. The test used is chi-square with g – 2 degrees of freedom. A significant result indicates that the model is not a good fit, while a non-significant result indicates a good fit.
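The computation above can be sketched in a few lines of Python (a minimal sketch, not part of the original article; the group counts below are illustrative):

```python
# Hosmer-Lemeshow statistic from per-group observed and expected counts of
# successes and failures. Inputs are parallel lists, one entry per group.

def hl_statistic(obs_suc, obs_fail, exp_suc, exp_fail):
    """Return the HL chi-square statistic and its degrees of freedom (g - 2)."""
    g = len(obs_suc)
    hl = sum(
        (o1 - e1) ** 2 / e1 + (o0 - e0) ** 2 / e0
        for o1, o0, e1, e0 in zip(obs_suc, obs_fail, exp_suc, exp_fail)
    )
    return hl, g - 2

# Made-up counts for two groups: hl ≈ 0.6187, df = 0
hl, df = hl_statistic([5, 10], [5, 10], [4, 11], [6, 9])
```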

Example 1: Use the Hosmer-Lemeshow test to determine whether the logistic regression model is a good fit for the data in Example 1 in Comparing Logistic Regression Models.


Figure 1 – Hosmer-Lemeshow Test

In our example the sum is taken over the 12 Male groups and the 12 Female groups. The observed values are given in columns H and I (duplicates of the input data columns C and D), while the expected values are given in columns L and M. E.g. cell L4 contains the formula =K4*J4 and cell M4 contains the formula =J4-L4 or equivalently =(1-K4)*J4.

The HL statistic is calculated in cell N16 via the formula =SUM(N4:N15). E.g. cell N4 contains the formula =(H4-L4)^2/L4+(I4-M4)^2/M4.

The Hosmer-Lemeshow test results are shown in range Q12:Q16. The HL stat is 24.40567 (as calculated in cell N16), df = g – 2 = 12 – 2 = 10 and p-value = CHIDIST(24.40567, 10) = .006593 < .05 = α, and so the test is significant, which indicates that the model is not a good fit.
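The p-value can be checked without Excel. For even degrees of freedom, the right-tail chi-square probability that CHIDIST returns has a closed-form series, so a short pure-Python check is possible (a sketch, not part of the original workbook):

```python
import math

def chidist(x, df):
    """Right-tail chi-square probability (Excel's CHIDIST) for even df."""
    assert df % 2 == 0 and df > 0
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):   # partial sum of the Poisson series
        term *= half / i
        total += term
    return math.exp(-half) * total

# Reproduces the article's value: CHIDIST(24.40567, 10) = .006593
p_value = chidist(24.40567, 10)
```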

Observation: The Hosmer-Lemeshow test needs to be used with caution. It tends to be highly dependent on the groupings chosen, i.e. one choice of groups can give a significant result while another gives a non-significant result. Also, when there are too few groups (5 or fewer), the test will usually indicate a good fit.

As a chi-square goodness of fit test, the expected values used should generally be at least 5. In Example 1 the cells L9, L15, M4 and M10 all have values less than 5, with cells M4 and M10 especially troubling with values less than 1. We now address the problems of cells M4 and M10.

We can eliminate the first of these by combining the first two rows, as shown in Figure 2. Here p-Pred for the first row (cell K23) is calculated as a weighted average of the first two values from Figure 1 using the formula =(J4*K4+J5*K5)/(J4+J5). In a similar manner we combine the 7th and 8th rows from Figure 1.
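The merging step can be expressed generally: when two groups are combined, the counts add and p-Pred becomes the size-weighted average of the two predicted probabilities. A sketch with illustrative numbers (not the workbook's values):

```python
def merge_groups(n1, p1, n2, p2):
    """Combine two groups: total count and size-weighted average of p-Pred."""
    n = n1 + n2
    p = (n1 * p1 + n2 * p2) / n
    return n, p

# Illustrative: groups of 4 and 6 observations with p-Pred 0.1 and 0.2
n, p = merge_groups(4, 0.1, 6, 0.2)   # n = 10, p = 0.16
```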


Figure 2 – Revised Hosmer-Lemeshow Test

The revised version shows a non-significant result, indicating that the model is a good fit.

Observation: The Real Statistics Logistic Regression data analysis tool automatically performs the Hosmer-Lemeshow test. For Example 1 of Finding Logistic Regression Coefficients using Solver, we can see from Figure 5 of that webpage that the logistic regression model is a good fit. For Example 1 above, Figure 2 of Comparing Logistic Regression Models shows that the model is not a good fit, at least until we combine rows as we did above.

Observation: The following supplemental functions can be used to perform the Hosmer-Lemeshow test with exactly 10 equal-sized data ranges.

Real Statistics Functions: The Real Statistics Resource Pack provides the following two supplemental functions.

HOSMER(R1, lab, raw, iter) – returns a table with 10 equal-sized data ranges based on the data in range R1 (without headings)

HLTEST(R1, lab, raw, iter) – returns the Hosmer statistic (based on the table described above) and the p-value.

When lab = True the output includes column headings, and when lab = False (the default) only the data is output. When raw = True the data in R1 is in raw form, and when raw = False (the default) the data in R1 is in summary form. The parameter iter determines the number of iterations used in Newton's method for calculating the logistic regression coefficients; the default value is 20.
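The decile-based approach these functions implement can be sketched as follows (the function and variable names here are illustrative, not the Real Statistics implementation): sort the observations by predicted probability, split them into g nearly equal groups, and compute the HL statistic from observed versus expected counts in each group.

```python
def hl_deciles(y, p, g=10):
    """Hosmer-Lemeshow statistic over g equal-sized groups sorted by p.

    y: 0/1 outcomes; p: predicted probabilities from the fitted model.
    Returns the HL statistic and df = g - 2.
    """
    pairs = sorted(zip(p, y))              # order observations by p-Pred
    n = len(pairs)
    hl = 0.0
    for i in range(g):
        chunk = pairs[i * n // g : (i + 1) * n // g]
        exp1 = sum(pv for pv, _ in chunk)  # expected successes
        obs1 = sum(yv for _, yv in chunk)  # observed successes
        exp0 = len(chunk) - exp1           # expected failures
        obs0 = len(chunk) - obs1           # observed failures
        hl += (obs1 - exp1) ** 2 / exp1 + (obs0 - exp0) ** 2 / exp0
    return hl, g - 2
```

With raw data, the p values would come from the fitted logistic regression model; the sketch takes them as given.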

Observation: We repeat Example 1 using these two functions, obtaining the results shown in Figure 3.


Figure 3 – Hosmer-Lemeshow Test

Referring to Figure 1, the output shown in range F40:K50 of Figure 3 is calculated using the formula =HOSMER(A3:D15, TRUE) and the output shown in range O40:P42 of Figure 3 is calculated using the formula =HLTEST(A3:D15, TRUE). Since the p-value > .05 (assuming α = .05) we conclude that the logistic regression model is a good fit.

9 Responses to Hosmer-Lemeshow Test

  1. Colin says:


    The HOSMER(R1, lab, raw, iter) function fails to calculate the last columns (HL-Suc and HL-Fail). I am using version 2.12 of the add-in.


    • Charles says:


      That is correct. As you can see from the comments following Figure 3, the HOSMER function does not calculate these last two columns. They are easy enough to calculate, however. E.g. Cell L41 can be calculated by the formula =(H41-I41)^2/I41 and cell M41 by =(K41-J41)^2/K41.

      I will consider adding these columns to the output of the function in the next release.


  2. Shirley says:

    Dear Sir:
    I’m really curious how we can get the p-Pred values in column K of Figure 1.

    Thank you very much.
    With regards

  3. jessica says:

    Dear Sir,

    I have calculated the HL statistic using your example. It shows that my model is not a good fit: p-value = 0.000016 and alpha = 0.05. I would like to figure out in which decile the test performs badly. Can I just calculate the p-value for each decile using the CHIDIST function?

    With kind regards,


    • Charles says:

      I am not using the true Hosmer-Lemeshow test and so there aren’t any deciles. I would look at other indicators; if they look good then I wouldn’t worry too much about the Hosmer-Lemeshow result.

  4. liana says:

    Dear Sir,
    I have calculated the statistics as in your example, but I am confused about the case where there are 3 independent variables. How do I find the expected value for the third variable? In the example you gave in Figure 1, the expected values for women are found as (1 − p-Pred) × total.
    thank you,
