# Comparing the slopes for two independent samples

In this section we test whether the slopes for two independent populations are equal, i.e. we test the following null hypothesis:

H0:  β1 = β2 i.e. β1 – β2 = 0

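The test statistic is

$$t = \frac{b_1 - b_2}{s_{b_1 - b_2}}$$

where $b_1$ and $b_2$ are the sample slopes. If the null hypothesis is true then

$$t \sim T(n_1 + n_2 - 4)$$

where

$$s_{b_1 - b_2} = \sqrt{s_{b_1}^2 + s_{b_2}^2}$$

If the two error variances are equal, then, as for the test for the difference in means, we can pool the estimates of the error variances, weighting each by its degrees of freedom, and so

$$s_{Res}^2 = \frac{(n_1 - 2)\, s_{Res_1}^2 + (n_2 - 2)\, s_{Res_2}^2}{n_1 + n_2 - 4}$$

Now

$$s_{b_1}^2 = \frac{s_{Res_1}^2}{s_{x_1}^2 (n_1 - 1)} \qquad s_{b_2}^2 = \frac{s_{Res_2}^2}{s_{x_2}^2 (n_2 - 1)}$$

Since we can replace the numerators of each by the pooled value $s_{Res}^2$, we have

$$s_{b_1 - b_2} = s_{Res} \sqrt{\frac{1}{s_{x_1}^2 (n_1 - 1)} + \frac{1}{s_{x_2}^2 (n_2 - 1)}}$$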

Note that while the null hypothesis that β = 0 is equivalent to ρ = 0, the null hypothesis that β1 = β2 is not equivalent to ρ1 = ρ2.
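The test described above can be sketched in Python using only the standard library. This is an illustrative sketch, not the Real Statistics implementation: the names `fit_line`, `t_two_tailed_p` and `slopes_test` are hypothetical, the data are made up, and the p-value is obtained by numerically integrating the t density. Using df = n1 + n2 – 4 for the unpooled variant as well is an assumption of this sketch.

```python
import math

def fit_line(x, y):
    """Return slope b, residual std error s_res (df = n-2), std dev of x, and n."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    sse = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))
    return b, math.sqrt(sse / (n - 2)), math.sqrt(sxx / (n - 1)), n

def t_two_tailed_p(t, df, steps=2000):
    """Two-tailed p-value for Student's t via Simpson's rule on [0, |t|]."""
    t = abs(t)
    if t == 0:
        return 1.0
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))

    def pdf(u):
        return math.exp(log_c - (df + 1) / 2 * math.log1p(u * u / df))

    h = t / steps  # steps must be even for Simpson's rule
    total = pdf(0.0) + pdf(t)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    return max(0.0, 1.0 - 2.0 * total * h / 3.0)

def slopes_test(x1, y1, x2, y2, pooled=True):
    """Return (std error of b1 - b2, t, df, two-tailed p-value)."""
    b1, s_res1, s_x1, n1 = fit_line(x1, y1)
    b2, s_res2, s_x2, n2 = fit_line(x2, y2)
    df = n1 + n2 - 4
    d1 = s_x1 ** 2 * (n1 - 1)
    d2 = s_x2 ** 2 * (n2 - 1)
    if pooled:
        # Pool the error variances, weighting each by its degrees of freedom
        s_res_sq = ((n1 - 2) * s_res1 ** 2 + (n2 - 2) * s_res2 ** 2) / df
        se = math.sqrt(s_res_sq * (1 / d1 + 1 / d2))
    else:
        se = math.sqrt(s_res1 ** 2 / d1 + s_res2 ** 2 / d2)
    t = (b1 - b2) / se
    return se, t, df, t_two_tailed_p(t, df)

# Hypothetical data (not the values from Figure 1)
x1 = [5, 10, 15, 20, 25, 30, 35, 40]
y1 = [80, 78, 73, 71, 66, 65, 60, 58]
x2 = [5, 10, 15, 20, 25, 30, 35]
y2 = [84, 82, 80, 76, 75, 72, 71]

se, t, df, p = slopes_test(x1, y1, x2, y2)
print(se, t, df, p)
```

Passing `pooled=False` switches to the unpooled standard error, mirroring the choice between the two calculations in Figure 2.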

Example 1: We have two samples, each comparing life expectancy vs. smoking. The first sample is for males and the second for females. We want to determine whether there is any significant difference in the slopes for these two populations. We assume that the two samples have the values in Figure 1 (for men the data is the same as that in Example 1 of Regression Analysis):

Figure 1 – Data for Example 1

As can be seen from the scatter diagrams in Figure 1, the slope for women appears to be less steep than that for men. In fact, as can be seen from Figure 2, the slope of the regression line for men is -0.6282 and the slope for women is -0.4679, but is this difference significant?

As can be seen from the calculations in Figure 2, whether we use the pooled or the unpooled value of sRes, the null hypothesis H0 (the slopes are equal) cannot be rejected. And so we cannot conclude that the change in life expectancy per incremental amount of smoking differs significantly between males and females.

Figure 2 – t-test to compare slopes of regression lines

Real Statistics Function: The following supplemental array function is provided by the Real Statistics Resource Pack. Here Rx1, Ry1 are ranges containing the X and Y values for one sample and Rx2, Ry2 are the ranges containing the X and Y values for a second sample.

SlopesTest(Rx1, Ry1, Rx2, Ry2, b, lab): outputs the standard error of the difference in slopes $s_{b_1-b_2}$, t, df and p-value for the test described above for comparing the slopes of the regression lines for the two samples.

If b = True (the default) then the pooled standard error $s_{b_1-b_2}$ is used (as in cell T10 of Figure 2); otherwise the non-pooled standard error is used (as in cell N10 of Figure 2).

If lab = True then the output is a 4 × 2 range where the first column contains labels and the second column contains the values described above. If lab = False (the default) then only the data is output (in the form of a 4 × 1 range).

The SlopesTest function only produces the correct results if there are no missing data elements in Rx1, Ry1, Rx2, Ry2.

Observation: For Example 1, the formula

=SlopesTest(A5:A19,B5:B19,D5:D20,E5:E20,FALSE,TRUE)

generates the output in range N29:N32 of Figure 3, while the formula

=SlopesTest(A5:A19,B5:B19,D5:D20,E5:E20)

generates the output in range O29:O32.

Figure 3 – Comparing slopes using Real Statistics function

### 47 Responses to Comparing the slopes for two independent samples

1. Aravindh says:

Hi,

You have a cell wrong in the excel sheet. Please fix it if you can. In Figure 2, cell M11 should be equal to (b1-b2)/(sb1-b2), NOT (b1-b2)/(sb1-sb2)

Thanks Aravindh,
That was a great catch. I have made the change that you suggested. Thanks for your help.
Charles

Dear Dr. Zaiontz
thank you for your useful example. I have a question:
As I understand from your example, this method applies when linear regression is used. If a power regression fits our data, what should we do? Can we use the data directly, or should we first convert to a linear regression (by a log transformation, for example)? For example, if we want to compare the regression lines of height vs. weight for male and female fish (which is a power regression), can we use the data directly?

• Charles says:

I believe that what you have suggested should work since you are only using a transformation. The formulas for comparing the slopes need to be applied after the transformation so that you are comparing the slopes of two (straight) lines.
Charles
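The transformation step mentioned in this reply can be sketched in Python. For a power relationship y = a·x^b, taking logs gives log y = log a + b·log x, so the exponent b becomes the slope of a straight line, and it is these slopes that would be compared. The helper name `loglog_slope` and the data (exact power laws with made-up coefficients) are hypothetical.

```python
import math

def loglog_slope(x, y):
    """Slope of the least-squares line through the points (log x, log y)."""
    lx = [math.log(v) for v in x]
    ly = [math.log(v) for v in y]
    n = len(lx)
    mx, my = sum(lx) / n, sum(ly) / n
    sxx = sum((u - mx) ** 2 for u in lx)
    sxy = sum((u - mx) * (v - my) for u, v in zip(lx, ly))
    return sxy / sxx

# Hypothetical length/weight data that follow exact power laws
length = [10.0, 12.0, 15.0, 18.0, 22.0]
weight_m = [0.02 * v ** 3.0 for v in length]  # exponent 3.0
weight_f = [0.03 * v ** 2.8 for v in length]  # exponent 2.8

slope_m = loglog_slope(length, weight_m)
slope_f = loglog_slope(length, weight_f)
print(slope_m, slope_f)  # the fitted slopes recover the exponents
```

With real (noisy) data, the log-transformed x and y values for each group would then be the inputs to the slope comparison.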

Thank you for your laconic comments. What about “a” (Y intercepts) between two lines? Is this factor important if we want to compare the slopes between two lines or only the “b” (slope) should be compared?

3. Andrew Tilley says:

Do you have a textbook or a paper you can cite to justify these equations? I’m fairly certain the test you present here is incorrect. The test statistic should be (b1 – b2) / sqrt(SE(b1)^2 + SE(b2)^2), where SE(b1) is given not by steyx (this is the standard error of the predicted y value) nor by the equation you give for s_b1. See, e.g., this link: (http://stats.stackexchange.com/questions/44838/how-are-the-standard-errors-of-coefficients-calculated-in-a-regression) on accurate calculation of these standard errors.

• Charles says:

Andrew,
I used David Howell’s textbook entitled “Statistical Methods for Psychology”, Wadsworth CENGAGE Learning, 2010. Shortly I will recheck my test samples using your approach and the approach I used on the website.
Charles

4. Andrew Tilley says:

Thanks for your quick reply, Charles. I actually caught the source of my confusion, and I’m now convinced that your approach is actually correct! Sorry about that.

5. Lauri says:

Thank you for this, your slopestest function saved me a lot of trouble!

6. Colin says:

Sir

I am a little confused about “pool the estimates of the error variances”. The formula for $s_{Res}^2$ that you use on this website is different from the formula you use in the Excel workbook.

Colin

• Colin says:

Sir

Please ignore my question, you are right.

Colin

7. Johnathan Clayborn says:

Hi Dr. Zaiontz,

This is exactly the type of information that I was looking for to complete a study that I am working on. I have two questions:

1st) Can you explain more about how I would go about finding the X/Y values of the lines in order to perform these calculations? I’m using a trendline in a time-series line graph and I can see that there is definitely statistical significance, but I need to express it mathematically.

2nd) Do you know if this method is possible using SPSS?

• Charles says:

Hi Johnathan,

1) I am not sure what you mean. The X/Y values are the data that you are testing.

2) I don’t use SPSS, but I believe that the answer is yes. E.g. the following webpage references doing this in SPSS: http://core.ecu.edu/psyc/wuenschk/MV/multReg/Potthoff.pdf

Charles

8. Carl says:

Hi, I tried testing your SlopesTest using your Example1 data. When I input it I only get the following “result”: std err. This appears to be only the label as in your Figure 3. When I repeat the formula by excluding the “false,true” part I get the result 0.23271.
Any ideas?

• Charles says:

Carl,
SlopesTest is an array formula. Try entering the formula and then pressing Ctrl-Shift-Enter. The full results should be displayed. If you press Enter instead, then only the first cell in the output will appear.
Charles

• Laura says:

I have the same problem as Carl, except I have tried ‘Ctrl-Shift-Enter’ and it makes no difference to the result. It’s either ‘std err’ or a number (in my case 28.16). Please let me know if you are aware of any other factors that might be stopping this formula from working.

• Charles says:

Laura,

Since this is an array function, you need to first highlight a 4 x 1 column range, then enter a formula of form SlopesTest(R1, R2, R3, R4, b) where R1, R2, R3 and R4 are ranges and b is either TRUE or FALSE, and finally press Ctrl-Shift-Enter. This will fill the highlighted range with the following values: std err, t, df, p-value.

Alternatively you can first highlight a 4 x 2 range, then enter a formula of form SlopesTest(R1, R2, R3, R4, b, TRUE) and finally press Ctrl-Shift-Enter. This will fill the second highlighted column with the same values as described above and fill the first column with the appropriate labels.

The key is that you must first highlight an output range of sufficient size to contain all the output. It can even be larger than necessary (the extra cells will be filled with #N/A).

Charles

• Laura says:

Thank you for the advice Charles. The problem is now fixed thanks to your suggestion!

Cheers

9. Ramiro says:

Hi Charles,

I noticed that if there are holes in the data the result of SlopesTest is different. Below are the numbers I tried. They are the same, but some points are missing one or the other piece of data. Since only points with both x and y should count, I thought SlopesTest would give me the same result. Should I always remove missing data before doing the SlopesTest?
Thank you,
Ramiro

This gave me p=0.131431
1 3 1 1
2 3 2 2
3 3 3 5
4 4 6
5 4 8
4 5 9
6 5 6
7 6 7 13

and this gave me p=0.14889

1 3 1 1
2 3 2 2
3 3 3 5
5 4 4 6
6 5 5 9
7 6 7 13

• Charles says:

Hi Ramiro,
In the current implementation of the SlopesTest function the correct values are generated only if there is no missing data. You need to remove any missing data before using the function.
Charles
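A minimal Python sketch of this clean-up step: drop every pair in which x or y is missing, then compute the pooled standard error of the difference in slopes. The helper names are hypothetical and this is not the SlopesTest code. The complete pairs are taken from the second table above, reading its columns as x1, y1, x2, y2 (an assumption); the two incomplete rows marked with `None` are invented for illustration.

```python
import math

def drop_incomplete(xs, ys):
    """Keep only the pairs where both x and y are present (not None)."""
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    return [p[0] for p in pairs], [p[1] for p in pairs]

def slope_parts(x, y):
    """Return n, Sxx = s_x^2 (n-1), and SSE = s_Res^2 (n-2) for one sample."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    syy = sum((yi - my) ** 2 for yi in y)
    return n, sxx, syy - sxy ** 2 / sxx

def pooled_se(x1, y1, x2, y2):
    """Pooled standard error of b1 - b2."""
    n1, sxx1, sse1 = slope_parts(x1, y1)
    n2, sxx2, sse2 = slope_parts(x2, y2)
    s_res_sq = (sse1 + sse2) / (n1 + n2 - 4)
    return math.sqrt(s_res_sq * (1 / sxx1 + 1 / sxx2))

# Raw columns; the rows containing None are hypothetical incomplete rows
x1_raw = [1, 2, 3, 4, 5, 6, 7]
y1_raw = [3, 3, 3, None, 4, 5, 6]
x2_raw = [1, 2, 3, 4, 5, None, 7]
y2_raw = [1, 2, 5, 6, 9, 8, 13]

x1, y1 = drop_incomplete(x1_raw, y1_raw)
x2, y2 = drop_incomplete(x2_raw, y2_raw)
se = pooled_se(x1, y1, x2, y2)
print(round(se, 5))  # approximately 0.14889
```

The result agrees with the 0.14889 quoted above for the complete data, which suggests that figure is the first output cell (the standard error) rather than the p-value.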

Hi.

How can I do this on excel 2010? I was trying TDIST formula but this function is available only with Excel 2007 or earlier versions and I am unable to understand the 2010 version. Or post a picture using Excel 2010 please.

Cheers

• Charles says:

I am using Excel 2010 and have no problem using the Excel 2007 functions such as TDIST. In any case here are substitutions for Excel 2010:

Replace TDIST(x,df,2) by T.DIST.2T(x,df)
Replace TINV(p,df) by T.INV.2T(p,df)

Charles

11. David says:

Hi Charles,
Thank you very much for this great post!
I have a small question. What if each one of my data points (y) is actually a mean over a larger data set? How can I account for this? Should I expect a different result?
Thanks.

• Charles says:

David,
I’m not sure how you would account for this (or if you could account for this). I would think that this would change things considerably.
I suggest that you try a few examples where you create some data (i.e. the larger data sets) and have the y values be the means over the larger data set that you have created. Then run the test using the means and run it again using the larger data set and see what sort of differences there are.
Charles

• David says:

David

12. Patricia Olson says:

Hi Charles
Your RealStatistics Resource Pack for Excel is great. Thank you for providing it. I have been using R but am still learning the language. Your tool is much more time saving for some statistical analyses than R. However, I am having some problems accessing some of the functions such as SlopesTest. I’m using Excel 7. I tried to access it through the example worksheet and still just get the #VALUE! message.
Thank you for your help on this.
Cheers
Patricia

13. Patricia Olson says:

Charles
Never mind… I figured it out finally. I was entering the array data incorrectly.

Thanks
Patricia

14. Gina W says:

Hey,

I’ve got a question: Does this comparison also work if I have 3 samples and not only 2?

• Charles says:

Gina,
No, you need to run ANCOVA. See Chapter 18 of J Zar, Biostatistical Analysis, 2nd edition, Prentice-Hall, 1984 for more details.
Charles

• Gina W says:

Thanks!

I used a Mediationmodel.

Is it right, that in ANCOVA I use “Group” as fixed, Outcome as dependent and Mediator as Covariate?

Greetings

• Charles says:

Gina,
Sorry that it has taken me so long to respond to your question. I seemed to have missed your response. I am not very familiar with Mediation models and so I am reluctant to answer your question. I plan to look into these sorts of models later this year.
Charles

15. Mattt says:

Sir,
in Figure 2, cell V10 the code cites: “= Sres…” does that refer to the sqrt of cell V9? Because cell S9 refers to the parameter “Sres^2” and elsewhere small details such as these are called out quite explicitly, I don’t know for certain which way to go [i.e., use “Sres^2” or sqrt(Sres^2)].

Matt

• Charles says:

Matt,
The formula in cell T10 (which corresponds to V10) is =SQRT(T9*(1/(N7^2*(N4-1))+1/(O7^2*(O4-1))))
The formula in cell T9 is =((N4-2)*N6^2+(O4-2)*O6^2)/(N4+O4-4)
Charles

16. MJ says:

Any chance of providing the t-test to compare two intercepts? Thanks.

17. Rebecca says:

Hi,
I was just wondering why the degrees of freedom are n-4?
Thanks,
Rebecca

• Charles says:

Rebecca,
As you can see from the webpage http://www.real-statistics.com/regression/hypothesis-testing-significance-regression-line-slope/ df = n-2 for a one slope test. With two slopes the n-2 becomes n-4 (minus 2 for each slope).
Charles

• Rebecca says:

Thank you very much.
Also I was wondering what are the assumptions that are made about the data to conduct the t-test? That it was normally distributed for both groups?
I was a bit confused about whether to use the pooled or non-pooled standard error, could you explain what you meant by “if b = true”?
And finally, would you report these results in a lab report the same as you would report the results of any t-test, except reporting the values of b and standard deviations of b instead of means?
Sorry about all the questions, I’m very new to stats and using excel!
Best wishes,
Rebecca

• Charles says:

Rebecca,
The assumptions are similar to those for the usual t test, including both samples are normally distributed (although such tests are usually pretty robust to violations of this assumption)
Use the pooled standard error if the variances are approximately equal; otherwise use the non-pooled standard errors.
b = True means that the fifth argument in the function takes the value True.
Yes, you should report the results in a manner similar to the usual t test.
Charles

18. Jordan Chill says:

Hi Dr. Zaiontz,

Thanks for the illuminating example.
What happens if the two functions are NOT linear and CANNOT be transformed easily into a linear function?
In my case, I am fitting two sets of time-dependent data to the function f(t), and f(t) is a1*cos(pi*a2*t)cos(pi*a3*t)*exp(-a4*t), where a1…a4 are the fitted parameters. I am interested in whether the a3 parameter obtained for two different fits is indeed significantly different. We have 18 measurements in each fit (so 14 degrees of freedom).
Jordan

• Charles says:

Jordan,
You need to perform non-linear regression. Probably the easiest way to do this is with Excel’s Solver. I give some examples on the following webpages:
Exponential Regression
Logistic Regression
Charles

19. Greg says:

Hi Charles,

Can you give a little more information on the citation for this work? I see you cited “Statistical Methods for Psychology”, Wadsworth CENGAGE Learning, 2010, but which chapter and/or pages were you using?

Thanks,

Greg

• Charles says:

Greg,
This is a good suggestion. I’ll try to do this in the future, since it would be quite difficult for me to do this for all the previous references.
Charles

20. Max K says:

Quick question for a non-expert, how do you calculate Sb1-b2 (as in the subscript = (b1-b2)). This is different than Sb1-Sb2 yes?

• Charles says:

Max,

Yes, the standard error of b1-b2 is different from the standard error of b1 minus the standard error of b2.

I give two formulas for how to calculate the standard error of b1-b2 at the very beginning of the referenced webpage. The first is that the standard error of b1-b2 is the square root of the sum of the squares of the standard errors of b1 and b2.

Charles

21. JJ says:

Hihi,
I have a question, also for a non-expert.
I have 10 values for x and y plotted as a scatter with a trend line (which gave me an intercept and slope).
I want to test if the slope is sig. different from slope x=y.
can the test =SlopesTest be used in this example?
( i have the dutch excel and for some reason cannot find the corresponding dutch command, to try for myself)
thanks, JJ

• Charles says:

JJ,

SlopesTest is not a standard Excel function. It is part of the Real Statistics Resource Pack. You need to download and install the resource pack to use this function. This is free.

The Slopes Test requires two samples of xy points. You only have one such sample. Here are a few possible options for how to conduct the test you want. I prefer choice 3. It is the easiest to implement.

1. You could try to create a second sample consisting of points whose x and y values are equal and then use the SlopesTest function. I’m not sure this approach is completely sound, though.

2. You can use the Testing significance of slope approach. This method tests whether the slope equals zero based on testing the correlation coefficient. You want to test whether the slope equals 1 (the slope of y = x). You would probably need to modify this test using the Fisher transformation in some way (see testing correlation coefficient).

3. You take all the xy points in your sample and create a new sample consisting of the points xy’ where y’ = y-x. If the points xy in your original sample have slope which is not significantly different from 1 (the slope of y=x) then the points xy’ in the new sample should have slope which is not significantly different from zero. The converse is also true. Fortunately there is a test to see whether the slope of a regression line is significantly different from zero, namely the test described on the webpage Testing significance of slope.

In any case, since your sample is so small, the power of any of these tests will likely be low.

Charles
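Choice 3 can be sketched in Python as follows. Subtracting x from each y lowers the fitted slope by exactly 1 and leaves the residuals unchanged, so testing whether the slope of (x, y − x) is zero is the same as testing whether the slope of (x, y) is 1. The data and the helper name `slope_and_se` are hypothetical.

```python
import math

def slope_and_se(x, y):
    """Least-squares slope and its standard error (residual df = n - 2)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    sse = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))
    return b, math.sqrt(sse / (n - 2) / sxx)

# Hypothetical sample of 10 points lying close to the line y = x
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
y = [1.2, 1.9, 3.4, 3.8, 5.3, 5.9, 7.4, 7.8, 9.1, 10.3]

y_shift = [yi - xi for xi, yi in zip(x, y)]  # y' = y - x

b, se = slope_and_se(x, y)
b_shift, se_shift = slope_and_se(x, y_shift)

# t statistic for H0: slope of y' is 0, i.e. slope of y is 1
t = b_shift / se_shift
print(b, b_shift, t)
```

The statistic `t` would then be compared with a t distribution with n – 2 = 8 degrees of freedom, as in the test for the significance of a single slope.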