Sometimes the dependent variable depends not only on the independent variables but also on the interaction between them. With two independent variables, the model to use in this case is:
y = β0 + β1 x1 + β2 x2 + β3 x1 x2 + ε
This is equivalent to the usual multiple regression model
y = β0 + β1 x1 + β2 x2 + β3 x3 + ε
studied in Multiple Regression Analysis where x3 = x1 · x2.
Example 1: We postulate that the number of votes a candidate gets depends on the amount of money they spend and their quality (position on issues, ability to debate, charisma, organizational abilities, etc.). The table on the left of Figure 1 shows the percentage of votes 10 candidates received in different elections along with the amount of money spent and their quality. Determine the relationship between votes, money and quality.
Figure 1 – Data for Example 1 plus interaction model
To capture the interaction between money and quality, we add an independent variable called “Interaction” (as described in the table on the right of Figure 1). Interaction is simply the product of the money and quality values. We now use the Regression data analysis tool on the interaction model. The resulting output is shown in Figure 2.
Figure 2 – Regression with interaction
This model is almost a perfect fit for the data (99.7% Adjusted R Square), and shows that we can predict the percentage of votes a candidate will get via the formula:
Votes = -12.22 – 0.86 * Money + 4.86 * Quality + 1.56 * Money * Quality
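To make the role of the interaction term concrete, the fitted equation can be evaluated directly. The following Python sketch uses the coefficients reported above; the function names are hypothetical, and any Money and Quality values passed in must use the same units as the data in Figure 1. It also shows that the change in predicted votes per additional unit of Money is -0.86 + 1.56 * Quality, so the effect of money depends on the candidate's quality.

```python
# Coefficients taken from the fitted equation reported in the text (Figure 2).
B0 = -12.22
B_MONEY = -0.86
B_QUALITY = 4.86
B_INTERACTION = 1.56


def predict_votes(money, quality):
    """Predicted vote percentage from the interaction model."""
    return (B0
            + B_MONEY * money
            + B_QUALITY * quality
            + B_INTERACTION * money * quality)


def money_effect(quality):
    """Change in predicted votes per extra unit of Money at a fixed Quality."""
    return B_MONEY + B_INTERACTION * quality
```

Because the model is linear in Money for any fixed Quality, predict_votes(m + 1, q) - predict_votes(m, q) equals money_effect(q) for every m.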
We can also run the Regression data analysis tool on the original data without the interaction variable, obtaining the output in Figure 3.
Figure 3 – Regression without interaction
This model is also a good fit for the data (p-value = 0.000499 < .05 = α), but with an Adjusted R Square value of 77.4%, not quite as good as the model with interaction.
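The same kind of comparison can be sketched outside Excel. The Python code below is a minimal illustration, not the Regression data analysis tool: it fits ordinary least squares via the normal equations, uses synthetic data invented for this sketch (not the data in Figure 1), and computes Adjusted R Square as 1 - (1 - R²)(n - 1)/(n - k - 1) for n observations and k independent variables. All function and variable names are assumptions made for the example.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


def fit_ols(rows, y):
    """Least-squares coefficients (intercept first) from (X'X) b = X'y."""
    X = [[1.0] + list(r) for r in rows]  # prepend the intercept column
    k = len(X[0])
    xtx = [[sum(xr[i] * xr[j] for xr in X) for j in range(k)] for i in range(k)]
    xty = [sum(xr[i] * yi for xr, yi in zip(X, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def adjusted_r2(rows, y):
    """Adjusted R Square of the OLS fit of y on the given predictor rows."""
    coeffs = fit_ols(rows, y)
    n, k = len(y), len(rows[0])
    fitted = [coeffs[0] + sum(c * v for c, v in zip(coeffs[1:], r)) for r in rows]
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r2 = 1 - ss_res / ss_tot
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


# Synthetic data (hypothetical): y is generated with a genuine interaction
# effect plus a small fixed disturbance, so the two models can be compared.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0, 8.0, 7.0]
noise = [0.1, -0.2, 0.15, -0.05, 0.05, -0.1, 0.2, -0.15]
y = [3.0 + 1.5 * a - 2.0 * b + 0.5 * a * b + e
     for a, b, e in zip(x1, x2, noise)]

without_interaction = list(zip(x1, x2))
with_interaction = [(a, b, a * b) for a, b in zip(x1, x2)]  # x3 = x1 * x2

adj_without = adjusted_r2(without_interaction, y)
adj_with = adjusted_r2(with_interaction, y)
print(f"Adjusted R Square without interaction: {adj_without:.4f}")
print(f"Adjusted R Square with interaction:    {adj_with:.4f}")
```

On this synthetic data the model with the product column fits better, mirroring the pattern seen in Figures 2 and 3. When the data contain no real interaction effect, Adjusted R Square can instead fall after the extra term is added, since it penalizes the additional independent variable.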
We can use the Real Statistics Extract Columns from a Data Range data analysis tool to automate the process of creating the interaction between two variables.
For example, to create the interaction between Money and Quality in Example 1, press Ctrl-m and select Extract Columns from a Data Range from the menu. Now enter A3:D19 into the Input Range of the dialog box that appears (as described in Figure 4 of Categorical Coding in Regression) and press the OK button.
Now, select both Money and Quality from the list box in the dialog box that appears as shown on the right side of Figure 4 (by clicking on Money and, while holding down the Ctrl key, clicking on Quality) and press the Add Inter button. Since neither Money nor Quality has yet been added to the output, these too are copied over along with the interaction. The result is as shown in range E4:G16 of Figure 4.
Figure 4 – Adding interaction using the Extract Columns data analysis tool
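For readers working outside Excel, the step performed by the Add Inter button can be approximated in a few lines of Python. This is only a sketch of the idea (copy the two selected columns and append their elementwise product), not the Real Statistics implementation; the helper name add_interaction, the default column label and the sample values are all hypothetical.

```python
def add_interaction(table, first, second, name=None):
    """Return a new table with the product of two columns appended.

    `table` maps column names to equal-length lists of numbers.
    The input table is left unchanged.
    """
    if len(table[first]) != len(table[second]):
        raise ValueError("columns must have the same length")
    if name is None:
        name = f"{first}*{second}"  # assumed label for the new column
    product = [a * b for a, b in zip(table[first], table[second])]
    return {**table, name: product}


# Made-up values for illustration only (not the data in Figure 1).
data = {"Money": [1.0, 2.5, 4.0], "Quality": [3, 5, 2]}
extended = add_interaction(data, "Money", "Quality")
```

The returned table keeps the Money and Quality columns and adds a third column holding their product, which can then be passed to any regression routine as an ordinary independent variable.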