In Dickey-Fuller Test we describe the Dickey-Fuller test, which determines whether an AR(1) process has a unit root, i.e. whether it is non-stationary. We now extend this test to AR(p) processes.
For the AR(1) process
yi = φ yi-1 + εi
we take the first difference to obtain the equivalent form
Δyi = β yi-1 + εi
where Δyi = yi – yi-1 and β = φ – 1, and test the hypothesis
H0: β = 0 (equivalent to φ = 1)
H1: β < 0 (equivalent to φ < 1)
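The equivalence of the two forms can be checked numerically. The following is a minimal sketch, assuming a simulated AR(1) series with φ = 0.6 and least-squares estimates without a constant; the variable names are illustrative and not part of the Real Statistics software.

```python
import numpy as np

# Simulate an AR(1) process y_i = phi * y_{i-1} + eps_i (phi = 0.6 is an assumed value)
rng = np.random.default_rng(0)
eps = rng.normal(size=200)
y = np.zeros(200)
for i in range(1, 200):
    y[i] = 0.6 * y[i - 1] + eps[i]

ylag, ycur = y[:-1], y[1:]

# Least-squares slope of y_i on y_{i-1} (original form)
phi_hat = (ylag @ ycur) / (ylag @ ylag)

# Least-squares slope of the first difference on y_{i-1} (differenced form)
beta_hat = (ylag @ (ycur - ylag)) / (ylag @ ylag)

print(phi_hat, beta_hat)  # beta_hat equals phi_hat - 1 up to rounding error
```

Since the two slopes differ by exactly 1, testing β = 0 in the differenced form is the same as testing φ = 1 in the original form.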
If |φ| = 1, we have what is called a unit root (i.e. the time series is not stationary). We have three versions of the test.
|Type 0||No constant, no trend||Δyi = β1 yi-1 + εi|
|Type 1||Constant, no trend||Δyi = β0 + β1 yi-1 + εi|
|Type 2||Constant and trend||Δyi = β0 + β1 yi-1 + β2 i + εi|
The extension to AR(p) processes has the following three versions.
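|Type 0||No constant, no trend||Δyi = β1 yi-1 + δ1 Δyi-1 + ⋯ + δp Δyi-p + εi|
|Type 1||Constant, no trend||Δyi = β0 + β1 yi-1 + δ1 Δyi-1 + ⋯ + δp Δyi-p + εi|
|Type 2||Constant and trend||Δyi = β0 + β1 yi-1 + β2 i + δ1 Δyi-1 + ⋯ + δp Δyi-p + εi|
Here p is the number of lags and δ1, …, δp are the coefficients of the lagged differences.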
Once you know how many lags to use, the augmented test is carried out in the same way as the simple Dickey-Fuller test. We can use the Akaike Information Criterion (AIC) or Bayesian Information Criterion (BIC) to determine how many lags to consider, as described in Comparing ARIMA Models.
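The idea of fitting the augmented regression for each candidate number of lags and keeping the one with the smallest AIC can be sketched as follows. This is only an illustration under stated assumptions: the function names adf_fit and choose_lag_by_aic are hypothetical, the AIC is computed in one common form, n·ln(SSE/n) + 2k, and each fit simply uses all the rows available for its own lag count. The ADFTEST function may compute these quantities differently.

```python
import numpy as np

def adf_fit(y, p, test_type=1):
    """Fit the augmented regression with p lagged differences.

    Returns (tau, aic), where tau is the t-ratio of the coefficient of the
    lagged level and aic uses the assumed form n*ln(SSE/n) + 2k.
    """
    y = np.asarray(y, dtype=float)
    dy = np.diff(y)                       # dy[k] = y[k+1] - y[k]
    n_obs = len(dy) - p                   # rows left after dropping p lags
    cols = [y[p:-1]]                      # lagged level
    cols += [dy[p - j:len(dy) - j] for j in range(1, p + 1)]  # lagged differences
    if test_type >= 1:
        cols.append(np.ones(n_obs))       # constant (types 1 and 2)
    if test_type == 2:
        cols.append(np.arange(1, n_obs + 1, dtype=float))     # trend (type 2)
    X = np.column_stack(cols)
    target = dy[p:]
    coef, *_ = np.linalg.lstsq(X, target, rcond=None)
    resid = target - X @ coef
    sse = float(resid @ resid)
    k = X.shape[1]
    se = np.sqrt(sse / (n_obs - k) * np.linalg.inv(X.T @ X)[0, 0])
    tau = coef[0] / se
    aic = n_obs * np.log(sse / n_obs) + 2 * k
    return tau, aic

def choose_lag_by_aic(y, max_lag, test_type=1):
    """Return (p, tau) for the lag count p <= max_lag with the smallest AIC."""
    fits = {p: adf_fit(y, p, test_type) for p in range(max_lag + 1)}
    best_p = min(fits, key=lambda p: fits[p][1])
    return best_p, fits[best_p][0]
```

The resulting tau value would then be compared with the Dickey-Fuller critical value for the chosen test type, exactly as in the simple test.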
Thus we can now use the full version of the ADFTEST function which was introduced in Dickey-Fuller Test.
Real Statistics Function: The Real Statistics Resource Pack provides the following array function, where R1 contains a column of time series data.
ADFTEST(R1, lab, lag, criteria, type, alpha): returns an 8 × 1 range which contains the following values: tau-statistic, tau-critical, yes/no (stationary or not), AIC value, BIC value, # of lags (p), the first-order autoregression coefficient and estimated p-value.
If lab = TRUE (default is FALSE), the output consists of an 8 × 2 range whose first column contains labels. type = the test type (0, 1, 2, default is 1). The default value for alpha is .05.
The arguments lag and criteria, which were not used for the Dickey-Fuller Test, are defined as follows:
- lag = the maximum number of lags to use in the test (default 0)
- criteria = “none” : no criterion is used, and so p is set to the value of lag
- criteria = “aic” : the AIC is used to determine the number of lags p (where p ≤ lag)
- criteria = “bic” : the BIC is used to determine the number of lags p (where p ≤ lag)
To specify the criterion, you can use “AIC” or 1 instead of “aic”, “BIC” or 2 instead of “bic”, and “” or 0 instead of “none”.
If lag < 0, then lag is automatically set to the value =Round(12*(n/100)^.25,0), as proposed by Schwert, where n = the number of elements in the time series.
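The Schwert rule is easy to reproduce outside Excel. Here is a minimal sketch; the function name schwert_max_lag is hypothetical. Because Python's built-in round rounds halves to the nearest even number, the sketch uses floor(x + 0.5), which matches Excel's ROUND for positive values.

```python
import math

def schwert_max_lag(n):
    """Maximum lag from the Schwert rule: 12 * (n / 100) ** 0.25, rounded."""
    x = 12 * (n / 100) ** 0.25
    # Round half up, as Excel's ROUND does for positive numbers
    return math.floor(x + 0.5)

print(schwert_max_lag(20))   # series of 20 elements
print(schwert_max_lag(100))  # series of 100 elements
```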
To specify the test type, you can use “” or “none” instead of 0, you can use “drift” or “constant” instead of 1 and you can use “trend” or “both” instead of 2.
Example 1: Determine whether the data in column A of Figure 1 has a unit root, using a model without trend, the Schwert estimate for the maximum number of lags and the AIC criterion. Also determine whether there is a unit root using a model with trend, a maximum number of lags equal to 7 and the AIC criterion.
Figure 1 – Time Series
Here range J4:K8 contains the array formula =DescStats(A3:A22,TRUE). We see that the mean value of the time series is 2.376, and so we conclude that the time series likely has a non-zero mean. We could confirm this by using a t test to see whether the population mean is significantly different from zero.
We now use the array formula =ADFTEST(A3:A22,TRUE,-1) to show the results of the ADF test without trend. The -1 means that we are using the Schwert estimate for the maximum number of lags. We are also using the default type = 1, which results in the test with a constant but without trend. As we can see from range P4:P11 in Figure 2, since tau-stat > tau-crit, we cannot reject the null hypothesis of a unit root, and so the time series is not stationary.
Figure 2 – ADF Test
Note that the above formula is effectively using a maximum lag count of 8, which we can see using the formula =ROUND(12*(K4/100)^0.25,0) in cell K10 of Figure 1.
Looking at the chart in Figure 1, it appears that the time series has a trend, and so we repeat the ADF test with constant and trend to get the results shown in range S4:T11 of Figure 2 using the array formula =ADFTEST(A3:A22,TRUE,7,"aic",2). Here type = 2 (constant and trend) and maximum number of lags = 7. Note that we didn’t use 8 as the maximum number of lags since that would produce error values (based on insufficient degrees of freedom in the underlying regression analysis).
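The degrees-of-freedom problem can be illustrated with a little arithmetic. The sketch below assumes that a series of n values with p lags leaves n − 1 − p rows for the regression, and that the regression estimates one coefficient for the lagged level, p coefficients for the lagged differences, plus a constant and trend term as required by the test type. The function name adf_residual_dof is hypothetical, and whether ADFTEST counts degrees of freedom in exactly this way is an assumption.

```python
def adf_residual_dof(n, p, test_type):
    """Residual degrees of freedom of the augmented regression (assumed counting)."""
    rows = n - 1 - p               # one row lost to differencing, p rows lost to lags
    params = 1 + p + test_type     # lagged level + p lagged differences + 0, 1 or 2 deterministic terms
    return rows - params

# The series in Example 1 has 20 elements
print(adf_residual_dof(20, 8, 2))  # constant and trend, 8 lags
print(adf_residual_dof(20, 7, 2))  # constant and trend, 7 lags
print(adf_residual_dof(20, 8, 1))  # constant only, 8 lags
```

Under these assumptions, the model with constant and trend and 8 lags has no residual degrees of freedom left, so the standard error needed for the tau statistic cannot be computed, while 7 lags still leaves some.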
Real Statistics Data Analysis Tool: As explained in Time Series Testing Tools, the Time Series Testing data analysis tool can be used to perform the Dickey-Fuller Test. In fact, it can also be used to perform the Augmented Dickey-Fuller Test.