I am pleased to announce Release 4.6 of the Real Statistics Resource Pack. The new release is now available for free download (Download Resource Pack) for Excel 2010, 2013 and 2016 (Windows version) environments. It will be available tomorrow for Excel 2007 users.
The spreadsheets for all the examples used on the Real Statistics website, including those related to the new Release 4.6 features) are available for free download (Download Examples Workbooks). These are contained in three Excel files (i.e. workbooks): Examples Workbook Part 1, Examples Workbook Part 2 and Multivariate Examples. See Workbook Examples for a description of which examples are contained in which files.
The Real Statistics website is in the process of being updated to reflect the new features. These changes will be made over the next several days. In addition, I will be adding a new examples workbook, Time Series Examples, which contains the examples for the time series capabilities.
Release 4.6 contains the following new features:
Time Series Data Analysis Tools and Functions
Three new data analysis tools have been added:
Basic Forecasting: Simple Moving Average, Weighted Moving Average, Exponential Smoothing, Exponential Trend, Holt’s Linear Trend, Holt-Winter’s model
ARIMA Model: Builds autoregressive integrated moving average (aka Box-Jenkins) time series models and produces forecasts based on these models
Time Series Tests: Calculates ACF, ACVF and PACF, as well as performing the following tests: Bartlett’s, Box-Pierce, Ljung-Box and Augmented Dickey-Fuller (ADF) test.
The following new functions have also been added:
ADFCRIT(n, alpha, type) = critical value for the Augmented Dickey-Fuller test, where n = size of the time series and type = 0 (no trend, no drift), 1 (no trend, drift), 2 (trend and drift)
ADFTEST(R1, lab, lag, criteria, type, alpha): an array function which returns a column range for the ADF test consisting of tau-stat, tau-crit, stationary (yes/no), AIC, BIC and lags, where R1 is the column range which contains the time series, criteria = “aic”, “bic” or “none” and type is as in ADFCRIT. If criteria = “none” then lag = the lag being tested; otherwise it is the largest lag being tested. If lab = True then a column of descriptive labels is appended to the output.
ARROOTS(R1): array function which outputs one row for each of the p phi coefficients in the autoregressive process (AR) listed in range R1: real part of the root, imaginary part of the root, absolute value of the root
MAROOTS(R1): array function which outputs one row for each of the q theta coefficients in an moving average process (MA) listed in range R1: real part of the root, imaginary part of the root, absolute value of the root
ADIFF(R1, d): array function which outputs the differenced version of the time series contained in range R1 based on differencing d times.
PSICoeff(R1,R2,k): array function which returns the first k coefficients of the psi representation of the ARMA process whose phi coefficients are contained in range R1 and whose theta coefficients are contained in range R2.
ARMA_SSE(R1, R2, avg, p, q) = the SSE (residual sum of squares) value for the ARMA process with mean avg whose phi coefficients are contained in range R1 and whose theta coefficients are contained in range R2, where pth phi coefficient is set to zero and/or the qth theta coefficient is set to zero.
A big thanks to Milos Cipovic, who programmed the algorithms for the Augmented Dickey-Fuller (ADF) test.
Fisher Exact Test
The Fisher Exact Test for 2 × 2 contingency tables has been expanded to support the two-tailed test for 2 × 3, 2 × 4, 2 × 5, 2 × 6, 3 × 3 and 3 × 4 contingency tables.
The FISHERTEST function now supports these new table sizes and a Fisher Exact Test option has been added to the Chi-square Test for Independence data analysis tool.
Because these new Fisher Exact Tests are resource intensive, limits have been placed on the sum of all the cells in the contingency tables that are supported. These limits are currently set at 2,000 for a 2 × 3 table, 1,200 for a 2 × 4 table, 440 for a 2 × 5 table , 200 for a 2 × 6 table, 280 for a 3 × 3 table and 100 for a 3 × 4 contingency table. Even close to these limits the processing can be quite slow, taking about 45 seconds on my computer.
A big thanks to Paolo Cadringher, who programmed the algorithms for these new Fisher Exact Tests.
Data Analysis Tools
A small change has been made to the dialog boxes of the Real Statistics data analysis tools. If you click on any cell in the current worksheet prior to using any of the data analysis tools, that cell is used as the default location of the output. This remains as before.
Henceforth, if you highlight more than one cell, then the resulting range is used as the default Input Range value (or if there is more than one input range, then the first input range defaults to the highlighted range). In this case, the Output Range defaults to blank, which means that the output will be written to a new worksheet.
You can override any of these defaults.
Biserial Correlation Coefficient
The new BCORREL(Rx, Ry) function can be used to compute the biserial correlation function between the data in ranges Rx and Ry where Rx s a column range consisting of 0’s and 1’s.
Roots of a Polynomial
The new ROOTS array function is used to calculate the roots of any polynomial (with real coefficients). E.g. the roots of x2 – 1 are x = 1 and x = -1, while the roots of x2 + 1 are x = i and x = –i where i = the square root of -1 (an imaginary number).
Weighted Random Values
A new weighted random value function WRAND has been added. E.g. if the range A1:A4 contains the four weights 5,1,1,3, then WRAND(A1:A4) generates a random integer value between 1 and 4 where the probability of generating a 1 is 5/(5+1+1+3) = 50%, the probability of generating a 2 is 1/(5+1+1+3) = 10%, etc.
WRANDOM is a new array function. If, for example, you want to generate two integer random values between 1 and 4 using the above weights, then you would highlight a range consisting of two cells, enter the formula =WRANDOM(A1:A4, TRUE) and press Ctrl-Shft-Enter. If you want to make sure that the two values generated are not equal (i.e. sampling without repetition), then you would enter the formula =WRANDOM(A1:A4, FALSE).
Fixes a bug that occurs when choosing the Interaction Contrast option of the Two Factor Anova Follow Up data analysis tool
Fixes a bug which occurs when choosing the Two Independent Samples option of the T Test and Nonparametric Equivalents data analysis. This bug only occurs for the non-equal variances version of the test when a non-zero Hypothetical Mean difference value is entered.
Fixes a bug when using the Royston version of the Shapiro-Wilk test for samples of size 4.