Definition 1: The autocorrelation function (ACF) at lag k, denoted ρk, of a stationary stochastic process is defined as ρk = γk/γ0 where γk = cov(yi, yi+k) for any i.
Note that γ0 is the variance of the stochastic process.
Definition 2: The mean of a time series y1, …, yn is
$$\bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i$$
The autocovariance function at lag k, for k ≥ 0, of the time series is defined by
$$s_k = \frac{1}{n}\sum_{i=1}^{n-k} (y_i - \bar{y})(y_{i+k} - \bar{y})$$
The autocorrelation function (ACF) at lag k, for k ≥ 0, of the time series is defined by
$$r_k = \frac{s_k}{s_0}$$
The variance of the time series is s0. A plot of rk against k is known as a correlogram.
Observation: The definition of autocovariance given above differs from the usual definition of covariance between {y1, …, yn-k} and {yk+1, …, yn} in two respects: (1) we divide by n instead of n–k and (2) we subtract the overall mean instead of the means of {y1, …, yn-k} and {yk+1, …, yn} respectively. For values of n which are large with respect to k, the difference will be small.
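The sample definitions can be sketched in a few lines of Python. This is a minimal illustration, not the Real Statistics implementation. It assumes the autocovariance is the sum of (yi − mean)(yi+k − mean) over i = 1, …, n–k divided by n, with the overall mean subtracted, as the observation above describes. The function names and the toy series are hypothetical and are not the data from Figure 1.

```python
def acvf(y, k):
    """Sample autocovariance s_k: overall mean, divisor n (not n - k)."""
    n = len(y)
    mean = sum(y) / n
    return sum((y[i] - mean) * (y[i + k] - mean) for i in range(n - k)) / n


def acf(y, k):
    """Sample autocorrelation r_k = s_k / s_0."""
    return acvf(y, k) / acvf(y, 0)


# Hypothetical toy series, not the data from Figure 1.
series = [1, 2, 3, 4, 5]
print(acvf(series, 0))  # s0 = 2.0
print(acf(series, 1))   # r1 = 0.4
```

Because the divisor n appears in both s_k and s_0, it cancels in r_k, so the ratio depends only on the sums of products of deviations.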
Example 1: Calculate s2 and r2 for the data in range B4:B19 of Figure 1.
Figure 1 – ACF at lag 2
The formulas for calculating s2 and r2 using the usual COVARIANCE.S and CORREL functions are shown in cells G4 and G5.
The formulas for s0, s2 and r2 from Definition 2 are shown in cells G8, G11 and G12 (along with an alternative formula in G13). Note that the values for s2 in cells E4 and E11 are not too different, nor are the values for r2 shown in cells E5 and E12; the larger the sample, the closer these values tend to be.
Real Statistics Function: The Real Statistics Resource Pack supplies the following functions:
ACF(R1, k) = the ACF value at lag k for the time series in range R1
ACVF(R1, k) = the autocovariance at lag k for the time series in range R1
Note that ACF(R1, k) is equivalent to
=SUMPRODUCT(OFFSET(R1,0,0,COUNT(R1)-k)-AVERAGE(R1),OFFSET(R1,k,0,COUNT(R1)-k)-AVERAGE(R1))/DEVSQ(R1)
Observation: There are theoretical advantages to dividing by n instead of n–k in the definition of sk, namely that the covariance and correlation matrices will always be non-negative definite (see Positive Definite Matrices).
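The non-negative definiteness claim can be spot-checked numerically. The sketch below is an illustration under stated assumptions, not a proof. It builds the autocovariance matrix whose (i, j) entry is the lag |i − j| autocovariance, then evaluates the quadratic form x'Gx for random vectors x. The helper names, the divisor_n switch and the random series are all hypothetical.

```python
import random


def acvf(y, k, divisor_n=True):
    """Lag-k autocovariance about the overall mean; divisor n or n - k."""
    n = len(y)
    mean = sum(y) / n
    total = sum((y[i] - mean) * (y[i + k] - mean) for i in range(n - k))
    return total / (n if divisor_n else n - k)


def quadratic_form(y, x, divisor_n=True):
    """x' G x where G[i][j] is the autocovariance at lag |i - j|."""
    m = len(x)
    s = [acvf(y, k, divisor_n) for k in range(m)]
    return sum(x[i] * x[j] * s[abs(i - j)] for i in range(m) for j in range(m))


random.seed(0)
series = [random.gauss(0, 1) for _ in range(12)]  # hypothetical data
smallest = min(
    quadratic_form(series, [random.uniform(-1, 1) for _ in range(12)])
    for _ in range(200)
)
print(smallest >= -1e-12)  # divisor n: no negative quadratic forms expected
```

Passing divisor_n=False lets you try the n–k variant, for which no such guarantee holds.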
Observation: Even though the definition of autocorrelation is slightly different from that of correlation, ρk (or rk) still takes a value between -1 and 1, as we see in Property 2.
Property 1: For any stationary process, γ0 ≥ |γi| for any i
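Proof: By the Cauchy-Schwarz inequality, |γi| = |cov(yj, yj+i)| ≤ √(var(yj) · var(yj+i)) = γ0, since stationarity gives var(yj) = var(yj+i) = γ0.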
Property 2: For any stationary process, |ρi| ≤ 1 (i.e. -1 ≤ ρi ≤ 1) for any i > 0
Proof: By Property 1, γ0 ≥ |γi| for any i. Since ρi = γi /γ0 and γ0 ≥ 0 (actually γ0 > 0 since we are assuming that ρi is well-defined), it follows that |ρi| = |γi| /γ0 ≤ 1.
Example 2: Determine the ACF for lag = 1 to 10 for the Dow Jones closing averages for the month of October 2015, as shown in columns A and B of Figure 2 and construct the corresponding correlogram.
The results are shown in Figure 2. The values in column E are computed by placing the formula =ACF(B$4:B$25, D5) in cell E5, highlighting range E5:E14 and pressing Ctrl-D.
Figure 2 – ACF and Correlogram
As can be seen from the values in column E or the chart, the ACF values descend slowly towards zero. This is typical of an autoregressive process.
Observation: A rule of thumb is to carry out the above process for lag = 1 to n/3 or n/4, which for the above data is 22/4 ≈ 6 or 22/3 ≈ 7. Our goal is to see whether by this time the ACF is significant (i.e. statistically different from zero). We can do this by using the following property.
Property 3 (Bartlett): In large samples, if a time series of size n is purely random then for all k
$$r_k \sim N\left(0, \frac{1}{n}\right)$$
Example 3: Determine whether the ACF at lag 7 is significant for the data from Example 2.
As we can see from Figure 3, the critical value for the test in Property 3 is .417866. Since r7 = .303809 < .417866, we conclude that ρ7 is not significantly different from zero.
Figure 3 – Bartlett’s Test
Note that the ACF values for lags k up to 5 are significant and those for lags higher than 5 are not significant.
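The critical value in Example 3 can be reproduced with Python's standard library. This sketch assumes Property 3 takes the usual form in which rk is approximately N(0, 1/n) for a purely random series, and that the test is two-sided at α = .05. The function names are hypothetical, and the inputs n = 22 and r7 = .303809 are the values quoted in Example 3.

```python
from math import sqrt
from statistics import NormalDist


def bartlett_critical(n, alpha=0.05):
    """Two-sided critical value for r_k, assuming r_k ~ N(0, 1/n)."""
    return NormalDist().inv_cdf(1 - alpha / 2) / sqrt(n)


def bartlett_upper_tail(r, n):
    """Upper-tail probability P(Z > |r| * sqrt(n)) for standard normal Z."""
    return 1 - NormalDist().cdf(abs(r) * sqrt(n))


n, r7 = 22, 0.303809  # values quoted in Example 3
print(round(bartlett_critical(n), 6))  # 0.417866
print(bartlett_upper_tail(r7, n))
```

The upper-tail probability printed here appears to agree with the .07708 quoted for BARTEST(.303809,22,7), which would suggest a one-sided p-value; that reading is an inference and not something stated in the text.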
Property 4 (Box-Pierce): In large samples, if ρk = 0 for all k ≤ m, then
$$Q_{BP} = n\sum_{k=1}^{m} r_k^2 \sim \chi^2(m)$$
A more statistically powerful version of Property 4, especially for smaller samples, is given by the next property.
Property 5 (Ljung-Box): If ρk = 0 for all k ≤ m, then
$$Q_{LB} = n(n+2)\sum_{k=1}^{m} \frac{r_k^2}{n-k} \sim \chi^2(m)$$
Example 4: Use the Box-Pierce and Ljung-Box statistics to determine whether the ACF values in Example 2 are statistically equal to zero for all lags less than or equal to 5 (the null hypothesis).
The results are shown in Figure 4.
Figure 4 – Box-Pierce and Ljung-Box Tests
We see from these tests that ACF(k) is significantly different from zero for at least one k ≤ 5, which is consistent with the correlogram in Figure 2.
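Both statistics are straightforward to compute by hand. The sketch below assumes the standard forms Q = n·Σ rk² (Box-Pierce) and Q = n(n+2)·Σ rk²/(n–k) (Ljung-Box), summed over k = 1, …, m and compared with a chi-square distribution with m degrees of freedom. The function names and the short trending series are hypothetical, since the Dow Jones data from Figure 2 is not reproduced here. chi2_sf is a small helper for the chi-square upper tail with integer degrees of freedom, written so that only the standard library is needed.

```python
import math


def acf(y, k):
    """Sample autocorrelation r_k using the overall mean (divisor n cancels)."""
    n = len(y)
    mean = sum(y) / n
    dev = [v - mean for v in y]
    return sum(dev[i] * dev[i + k] for i in range(n - k)) / sum(d * d for d in dev)


def box_pierce(y, m):
    """Box-Pierce statistic for lags 1..m."""
    n = len(y)
    return n * sum(acf(y, k) ** 2 for k in range(1, m + 1))


def ljung_box(y, m):
    """Ljung-Box statistic for lags 1..m."""
    n = len(y)
    return n * (n + 2) * sum(acf(y, k) ** 2 / (n - k) for k in range(1, m + 1))


def chi2_sf(x, df):
    """Upper-tail chi-square probability for integer df >= 1."""
    t = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-t)             # df = 2 case
    else:
        a, q = 0.5, math.erfc(math.sqrt(t))  # df = 1 case
    while a < df / 2.0:                      # raise df by 2 per step
        q += t ** a * math.exp(-t) / math.gamma(a + 1.0)
        a += 1.0
    return q


# Hypothetical trending series, not the Dow Jones data from Figure 2.
series = [3.0, 4.0, 4.0, 5.0, 7.0, 8.0, 8.0, 9.0, 11.0, 12.0, 12.0, 14.0]
m = 5
for name, q in (("Box-Pierce", box_pierce(series, m)),
                ("Ljung-Box", ljung_box(series, m))):
    print(name, "Q =", q, "p-value =", chi2_sf(q, m))
```

Since (n+2)/(n–k) > 1 for every lag, the Ljung-Box statistic is always at least as large as the Box-Pierce statistic on the same data, so its p-value is never larger.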
Real Statistics Functions: The Real Statistics Resource Pack provides the following functions to perform the tests described by the above properties.
BARTEST(r, n, lag) = p-value of Bartlett’s test for correlation coefficient r based on a time series of size n for the specified lag.
BARTEST(R1,, lag) = BARTEST(r, n, lag) where n = the number of elements in range R1 and r = ACF(R1,lag)
PIERCE(R1,,lag) = Box-Pierce statistic Q for range R1 and the specified lag
BPTEST(R1,,lag) = p-value for the Box-Pierce test for range R1 and the specified lag
LJUNG(R1,,lag) = Ljung-Box statistic Q for range R1 and the specified lag
LBTEST(R1,,lag) = p-value for the Ljung-Box test for range R1 and the specified lag
In the above functions where the second argument is missing, the test is performed using the autocorrelation coefficient (ACF). If the second argument is 1 or “pacf”, the test is performed using the partial autocorrelation coefficient (PACF), as described in the next section. Any other value for the second argument results in the ACF being used.
E.g. BARTEST(.303809,22,7) = .07708 for Example 3 and LBTEST(B4:B25,"acf",5) = 1.81E-06 for Example 4.