99We now turn our attention to the situation where we use regression with seasonal data: hourly, weekly, monthly, quarterly, etc. For hours we have 24 periods in a day, for months we have 12 periods in a year, etc. In particular, we are concerned with cases where the seasons influence the trend of the data (e.g. annual sales revenues are increasing, but revenues in June are lower than in September).
Example 1: Company XYZ’s quarterly revenues for 2012 through 2015 are shown in column C of Figure 1. We would like to forecast the quarterly revenues for 2016 based on a linear regression model.
As we see from the blue curve in Figure 2, although the annual trend of the revenues may be linear, the graph is certainly not linear due to seasonal fluctuations. We need a way to handle these seasonal fluctuations.
Figure 1 – Seasonal forecasting
The approach we use is to add categorical variables to represent the four seasons (Q1, Q2, Q3, Q4). Three dummy variables are required (one fewer than the number of periods). The coding based on these variables are shown in columns E, F and G of Figure 1. Column E contains a 1 for revenue data in Q1 and a 0 for revenue data not in Q1. Similarly column F contains a 1 for data in Q2 and a 0 for data not in Q2. Column G codes for data in Q3. Revenue data in Q4 will have a 0 in columns E, F and G.
We also include a variable t in column D which simply lists the time periods sequentially ignoring the quarter.
Figure 2 – Seasonal Trends
We now construct a multiple linear regression model using the data in range D3:G19 as our X values and range C3:C19 as our Y values. This analysis is shown in Figure 3.
Figure 3 – Regression Analysis with Seasonality
We can use this model to create predictions for the historical data in 2012-2015 as well as for 2016 (future forecast). These predictions are shown in column H of Figure 1 using the array formula =TREND(C4:C19,D4:G19,D4:G23). This is the red curve in Figure 2. E.g. the prediction for Q1 of 2012 is $10,812,500 (cell H4), which is fairly close to the actual revenue of $10,500,000 (cell C4).
The forecasted values for each quarter in 2016 is shown range H20:H23 of Figure 1.