# Real Statistics Regression/ANOVA Functions

Linear Regression

The following are ordinary, non-array functions where R1 contains the X data and R2 contains the Y data:

A second R2 parameter can be used with each of the df functions above, although this parameter is not used. Similarly you can use SSRegTot(R1, R2) and its value will be equivalent to SSRegTot(R2). All these functions can optionally take a third argument con, where con = TRUE (default) means that the regression model takes a constant term and con = FALSE means that the regression model doesn’t have a constant term.

There is also a second form of the RSquare function in which RSquare(R1, k) = R² where the X data consist of all the columns in R1 except the kth column and the Y data consist of the kth column of R1.

The following are array functions where R1 contains the X data and R2 contains the Y data.

| Function | Description |
| --- | --- |
| DESIGN(R1) | design matrix for the data in R1 |
| HAT(R1, con) | hat matrix for the data in R1 |
| DIAGHAT(R1, con) | diagonal of the hat matrix for the data in R1 |
| CORE(R1) | core of the hat matrix for the data in R1 |
| LEVERAGE(R1, con) | leverage vector = diagonal of hat matrix for the data in R1 |
| RegCov(R1, R2) | covariance matrix for the regression coefficients of the regression line |
| RegCoeff(R1, R2, con) | two-column range with the regression coefficients for the regression line in the first column and the corresponding standard errors in the second column |
| RegPred(R0, R1, R2, lab, alpha) | 7 × 1 column range containing the predicted y value for the data in R0, the standard error for the confidence interval, the lower and upper ends of the 1 − alpha confidence interval, the standard error for the prediction interval, and the lower and upper ends of the 1 − alpha prediction interval (alpha defaults to .05) |
| RegPredCC(R0, Rc) | predicted y values for x values in range R0 based on the regression coefficients in column range Rc; handles models with and without an intercept |
| StdRegCoeff(R1, R2, Rc, ystd) | column range with the standardized regression coefficients for R1 and R2 based on the regression coefficients in the column range Rc (including an intercept coefficient). If ystd = TRUE (default), then the standardized regression coefficients are based on the y data being standardized |
| UnStdRegCoeff(R1, R2, Rc, ystd) | column range with the unstandardized regression coefficients for R1 and R2 based on the standardized regression coefficients in the column range Rc (including an intercept coefficient). If ystd = TRUE (default), then the standardized regression coefficients are based on the y data being standardized |
| RRegCoeff(R1, R2, hc, con) | two-column range with the regression coefficients for the regression line in the first column and the corresponding robust standard errors in the second column, where hc = 0 through 4 corresponding to HC0 through HC4 |
| WRegCoeff(R1, R2, R3) | two-column range with the regression coefficients for the weighted regression line in the first column and the corresponding standard errors in the second column, where R3 contains the weights |
| RegCoeffSE(R1, R2) | vector with the standard errors of the coefficients for the regression line |
| RegY(R1, R2, con) | vector of predicted values for Y based on the regression line = TREND(R2,R1) |
| RegE(R1, R2, con) | vector of residuals based on the regression line |
| RegStudE(R1, R2) | vector of studentized residuals based on the regression line |
| SHAPLEY(R1, R2) | vector with the Shapley-Owen decomposition of R² |
| SlopesTest(R1, R2, R3, R4, b, lab) | vector containing the s.e. of the difference between slopes, t, df and p-value, where R3 and R4 are the X and Y values for a second regression line; if b = TRUE (default) the pooled s.e. is used; if lab = TRUE then a column of labels is added to the output (default = FALSE) |
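
To give a feel for what LEVERAGE and DIAGHAT return, here is a minimal Python sketch for the special case of a single predictor with a constant term, where the diagonal of the hat matrix reduces to h_i = 1/n + (x_i − x̄)²/Σ(x_j − x̄)². The name `leverage_simple` is hypothetical and not part of the add-in; the general multi-predictor case needs matrix algebra that is left out here.

```python
def leverage_simple(x):
    """Leverage values for simple linear regression with an intercept.

    Illustrative helper (not a Real Statistics function): diagonal of the
    hat matrix when there is exactly one predictor plus a constant term.
    """
    n = len(x)
    mean_x = sum(x) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return [1.0 / n + (xi - mean_x) ** 2 / sxx for xi in x]


h = leverage_simple([1.0, 2.0, 3.0, 4.0, 10.0])
# The leverages sum to the number of fitted coefficients (2 here),
# and the point far from the mean of x gets the largest leverage.
print(h, sum(h))
```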

The following are non-array functions where R1 contains the X data and R2 contains the Y data.

| Function | Description |
| --- | --- |
| RegPredC(R0, Rc) | predicted y values for x values in range R0 based on the regression coefficients in range Rc. R0 and Rc can be column or row ranges |
| RSquareTest(R1, R3, R2) | p-value of the test of the significance of X data in R3 (reduced model) vs. X data in R1 (full model) |
| RegAIC(R1, R2, con) | Akaike’s Information Criterion (AIC) for the regression model |
| RegAICc(R1, R2, con) | corrected AIC (AICc) for the regression model |
| RegSBC(R1, R2, con) | Schwarz Bayesian Criterion (SBC) for the regression model |
| TOLERANCE(R1, j) | tolerance of the jth variable for the data in range R1 |
| VIF(R1, j) | VIF of the jth variable for the data in range R1 |
| DURBIN(R1) | Durbin-Watson statistic d where R1 is a column vector containing residuals |
| DURBIN(R1, R2) | Durbin-Watson statistic d where R1 contains X data and R2 contains Y data |

The following array functions are also supported:

| Function | Description |
| --- | --- |
| DURBIN(R1, lab, alpha) | outputs the Durbin-Watson statistic d, the lower and upper bounds of the 1 − alpha confidence interval and the test significance, where R1 is a column vector containing residuals |
| DURBIN(R1, R2, lab, alpha) | outputs the Durbin-Watson statistic d, the lower and upper bounds of the 1 − alpha confidence interval and the test significance, where R1 contains X data and R2 contains Y data |

If lab = FALSE (default) then the output is a 4 × 1 column range, while if lab = TRUE then the output is a 4 × 2 range with an extra column of labels.
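
The Durbin-Watson statistic itself is simple to compute from a column of residuals: d = Σ(e_t − e_{t−1})² / Σ e_t². The following Python sketch shows that calculation only; the function name `durbin_watson` is illustrative, and the bounds and significance reported by the array form of DURBIN are not reproduced.

```python
def durbin_watson(residuals):
    """Durbin-Watson d: squared successive differences over squared residuals."""
    num = sum(
        (residuals[t] - residuals[t - 1]) ** 2
        for t in range(1, len(residuals))
    )
    den = sum(e * e for e in residuals)
    return num / den


# Strongly alternating residuals give a value well above 2,
# which points to negative autocorrelation.
print(durbin_watson([1.0, -1.0, 1.0, -1.0]))
```

Values near 2 suggest no first-order autocorrelation, values near 0 positive autocorrelation and values near 4 negative autocorrelation.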

Regression Power and Sample Size

| Function | Description |
| --- | --- |
| REG_POWER(e, n, k, type, α, m, prec) | power of multiple regression where k = # of predictors, e = Cohen’s effect size f² if type = 1 (default), e = R² effect size if type = 2 and e = noncentrality parameter if type = 0 |
| REG_SIZE(e, k, 1−β, type, α, m, prec) | minimum sample size required to obtain power of at least 1−β for multiple regression where k = # of predictors, e = Cohen’s effect size f² if type = 1 (default) and e = R² effect size if type = 2 |

n = the sample size, tails = # of tails: 1 or 2 (default), α = significance level (default = .05) and m and prec as for the noncentral distribution functions.

Stepwise Regression

The following array functions are used to create a stepwise regression model. R1 is an n × k array containing x data values, R2 is an n × 1 array containing y data values and R3 is a 1 × k array containing a non-blank symbol if the corresponding variable is in the regression model and an empty string otherwise.

| Function | Description |
| --- | --- |
| RegRank(R1, R2, R3) | returns a 1 × k array containing the p-value of each x coefficient that can be added to the regression model defined by R1, R2, R3 |
| RegCoeffP(R1, R2, R3) | returns a 1 × k array containing the p-value of each x coefficient in the regression model defined by R1, R2, R3 |
| RegStepwise(R1, R2) | returns a 1 × k array R where each non-blank element in R corresponds to an x variable that should be retained in the stepwise regression model. The output is actually a 1 × (k+1) array where the last element is a positive integer equal to the number of steps performed in creating the stepwise regression model |

Exponential Regression

The following are array functions that support a nonlinear exponential regression model:

| Function | Description |
| --- | --- |
| ExpCoeff(R1, R2, iter, lab) | outputs a 2 × 4 range whose 1st column contains the coefficients α and β for the regression, whose 2nd column contains the corresponding standard errors for these coefficients, whose 3rd column contains SSE and MSE and whose 4th column contains MSReg and dfT. If lab = TRUE then an extra row is added with labels (default = FALSE) |
| ExpPred(R, R1, R2, iter) | outputs an m × 1 column range with the values predicted by the exponential model for R1 and R2 based on the data in the m × 1 column vector R of x values |
| ExpPredC(R, α, β) | outputs an m × 1 column range with the values predicted by the exponential model with coefficients α and β based on the data in the m × 1 column vector R of x values |

Here iter = the number of iterations (default 20). The last two functions can also be used as non-array functions of the following form:

| Function | Description |
| --- | --- |
| ExpPred(x, R1, R2, iter) | value predicted by the exponential model for x based on the data in R1 and R2 |
| ExpPredC(x, α, β) | value predicted by an exponential model with coefficients α and β for x |
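
As a rough illustration of the two calling forms of ExpPredC, the Python sketch below accepts either a single x value or a list of x values. It assumes the exponential model has the form y = α·e^(βx), which the text above does not state explicitly; the function name `exp_pred_c` is illustrative.

```python
import math


def exp_pred_c(x, alpha, beta):
    """Predicted value(s) under the ASSUMED model y = alpha * exp(beta * x).

    A number in gives a number out (non-array form); a list in gives a
    list out (array form).
    """
    if isinstance(x, (int, float)):
        return alpha * math.exp(beta * x)
    return [alpha * math.exp(beta * xi) for xi in x]


print(exp_pred_c(0, 2.0, 0.5))          # single value
print(exp_pred_c([0, 1, 2], 2.0, 0.5))  # column of values
```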

Polynomial Regression

The following are functions that support polynomial regression. The first two functions are array functions. R1 and R2 are column arrays containing x and y data values respectively and deg is the degree/order of the polynomial.

| Function | Description |
| --- | --- |
| PolyDesign(R1, deg, ones) | returns an array consisting of the columns x, x², …, x^deg. If ones = TRUE, then the output is 1, x, x², …, x^deg |
| PolyCoeff(R1, R2, deg) | returns a two-column array consisting of the polynomial regression coefficients and their standard errors |
| PolyRSquare(R1, R2, deg) | R-square value for the polynomial regression |
| PolyDegree(R1, R2, deg) | the highest degree polynomial ≤ deg which produces a significantly different R-square value |
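
The layout produced by PolyDesign can be pictured with a few lines of Python. This is only a sketch of the described output (one row per x value, one column per power); `poly_design` is an illustrative name.

```python
def poly_design(xs, deg, ones=False):
    """Rows [x, x^2, ..., x^deg], with a leading 1 when ones is True."""
    start = 0 if ones else 1
    return [[x ** p for p in range(start, deg + 1)] for x in xs]


print(poly_design([2, 3], 3))          # [[2, 4, 8], [3, 9, 27]]
print(poly_design([2], 2, ones=True))  # [[1, 2, 4]]
```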

The following are array functions which support LAD regression. R1 is an n × k array containing x data values, R2 is an n × 1 array containing y data values, con takes the value TRUE if the regression includes a constant term and iter is the number of iterations used in the iteratively reweighted least squares algorithm (default = 25).

| Function | Description |
| --- | --- |
| LADRegCoeff(R1, R2, con, iter) | column array containing the LAD regression coefficients; (k+1) × 1 array if con = TRUE and k × 1 array if con = FALSE |
| LADRegWeights(R1, R2, con, iter) | n × 1 column array consisting of the weights calculated from the iteratively reweighted least squares algorithm |
| LADRegCoeffSE(R1, R2, con, iter, nboots) | column array consisting of the standard errors of the LAD regression coefficients based on bootstrapping nboots times; (k+1) × 1 array if con = TRUE and k × 1 array if con = FALSE |
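
The idea behind iteratively reweighted least squares for LAD regression can be sketched in Python for the simplest case of one predictor with a constant term. The weighting rule used here, 1/max(|residual|, eps), is the textbook choice and is an assumption; the add-in's exact weights and stopping rule may differ. The names `weighted_line` and `lad_line` are illustrative.

```python
def weighted_line(x, y, w):
    """Weighted least-squares line y = a + b*x; returns (a, b)."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    b = sxy / sxx
    return my - b * mx, b


def lad_line(x, y, iters=25, eps=1e-8):
    """Approximate LAD fit by IRLS (assumed weights: 1 / max(|e|, eps))."""
    w = [1.0] * len(x)  # the first pass is ordinary least squares
    for _ in range(iters):
        a, b = weighted_line(x, y, w)
        # Down-weight points with large absolute residuals.
        w = [1.0 / max(abs(yi - (a + b * xi)), eps) for xi, yi in zip(x, y)]
    return a, b, w


# Four points lie on y = 1 + 2x; the middle point is an outlier.
a, b, w = lad_line([0, 1, 2, 3, 4], [1, 3, 25, 7, 9])
print(a, b)
```

Ordinary least squares on the same data is pulled toward the outlier (intercept 5), whereas the reweighted fit settles on the line through the other four points.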

Deming Regression

The following are functions which support Deming regression; all but the last function are array functions. R1 is an array containing x data values, R2 is an array containing y data values and λ is the lambda value (in which case R1 and R2 contain one column) or omitted (in which case lambda is calculated from R1 and R2). If lab = TRUE (default is FALSE) then the output contains an extra column of labels, except for DRegResiduals where lab = TRUE means that the output contains an extra row of labels.

| Function | Description |
| --- | --- |
| DRegCoeff(R1, R2, λ, lab) | 2 × 2 array containing the Deming regression intercept and slope coefficients along with the standard errors of these coefficients |
| DRegResiduals(R1, R2, λ, lab) | n × 7 array consisting of predicted y, x-hat, y-hat, raw residual, x-residual, y-residual and optimized residual for each pair of data elements in R1 and R2, where n = the number of rows in R1 (or R2) |
| DRegIdentity(R1, R2, λ, lab) | 2 × 1 array consisting of x̄ − ȳ and se(x̄ − ȳ) for Deming regression |
| DRegPred(x0, R1, R2, λ, alpha, lab) | 4 × 1 array consisting of the predicted value of y for x0, the standard error of the prediction and the confidence interval for this prediction based on Deming regression |
| DRegLambda(R1, R2) | lambda value for Deming regression calculated from R1 and R2 |
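
For readers who want to see the arithmetic behind a Deming fit, here is a Python sketch of the usual closed-form slope for single-column x and y data. It assumes that lambda is the ratio of the y-error variance to the x-error variance; this convention is not confirmed by the text above, and the add-in may define lambda the other way round. The name `deming_line` is illustrative, and standard errors are not computed.

```python
import math


def deming_line(x, y, lam=1.0):
    """Deming intercept and slope.

    ASSUMPTION: lam = (variance of y errors) / (variance of x errors).
    Requires a nonzero sample covariance between x and y.
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x) / (n - 1)
    syy = sum((yi - my) ** 2 for yi in y) / (n - 1)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / (n - 1)
    d = syy - lam * sxx
    slope = (d + math.sqrt(d * d + 4.0 * lam * sxy * sxy)) / (2.0 * sxy)
    return my - slope * mx, slope


# Perfectly collinear data recover the line for any lambda.
print(deming_line([1, 2, 3, 4], [3, 5, 7, 9], lam=1.0))
```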

Total Least Squares (TLS) Regression

The following are array functions which support TLS regression.

| Function | Description |
| --- | --- |
| TRegCoeff0(R1, R2, lab) | 2 × 1 column array consisting of the intercept and slope TLS regression coefficients; if lab = TRUE then an extra column of labels is appended (default FALSE) |
| TRegCoeff(R1, R2, iter) | (k+1) × 1 column array consisting of the TLS regression coefficients, where k = # of columns in R1 and iter = # of iterations in the SVD algorithm |

Logistic Regression

If lab = TRUE then the output includes headings and if lab = FALSE (the default) only the data is output. Except as indicated above, if head = TRUE (default) then R1 and the output include column headings, while if head = FALSE then R1 and the output contain only data.

If raw = TRUE then the data in R1 is in raw form and if raw = FALSE (the default) then the data in R1 is in summary form. The parameter alpha is used to calculate a confidence interval and takes a value between 0 and 1 with a default value of .05. The parameter iter determines the number of iterations used in the Newton method for calculating the logistic regression coefficients; the default value is 20. The default value of head is FALSE.

Probit Regression

| Function | Description |
| --- | --- |
| ProbitCoeff(R1, lab, raw, head, alpha, iter) | array function that returns probit regression coefficients and other parameters (s.e., Wald, confidence intervals, etc.) for data in range R1. If head = TRUE, then R1 contains column headings |
| ProbitCoeff2(R1, R2, lab, head, alpha, iter) | array function like ProbitCoeff, except that R1 contains the data for the independent variables and R2 contains the data for the dependent variable. If R2 has one column the data is in raw format, while if it has two columns the data is in summary format. R1 and R2 can contain non-numeric data; such data is ignored in the analysis |
| ProbitCoeffs(R1, iter) | array function like ProbitCoeff that outputs the coefficients plus the number of iterations made |
| ProbitTest(R1, lab, raw, iter) | array function that returns LL, LL0, chi-square and p-value for data in R1 |
| ProbitRSquare(R1, lab, raw, iter) | array function that returns LL, LL0, pseudo R-square, AIC, BIC for data in R1 |
| ProbitPred(R0, R1, raw, iter) | outputs the probability of success for the values of each row of independent variables contained in the range R0 based on the probit regression model calculated from the data in R1 (without headings) |
| ProbitPredC(R0, R2) | outputs the probability of success for the values of each row of independent variables contained in the range R0 based on the probit regression coefficients contained in R2 (in the form of a column vector) |

The arguments lab, raw, iter, head and alpha are as for the corresponding logistic regression function.

Multinomial and Ordinal Logistic Regression

The following are array functions where R1 is the data used to create the multinomial logistic regression model. When r = 0 (default) then the data in R1 is in raw form, whereas if r ≠ 0 the data is in summary form where the dependent variable takes values 0, 1, …, r.

| Function | Description |
| --- | --- |
| MLogitCoeff(R1, r, lab, head, iter) | returns the coefficients for data in range R1 |
| MLogitParam(R1, r, lab, head, alpha, iter) | returns the coefficients and other parameters (s.e., Wald, confidence intervals, etc.) for data in range R1. If head = TRUE, then R1 contains column headings |
| MLogitTest(R1, r, lab, iter) | returns LL, LL0, chi-square and p-value for data in R1 |
| MLogitRSquare(R1, r, lab, iter) | returns LL, LL0, pseudo R-square, AIC, BIC for data in R1 |
| MLogitPred(R0, R1, r, iter) | returns a row vector with the probabilities of the outcomes of the dependent variable for the values of the independent variables contained in the range R0 (row or column vector) based on the logistic regression model calculated from the data in R1 (without headings) |
| MLogitPredC(R0, R2) | returns a row vector with the probabilities of the outcomes of the dependent variable for the values of the independent variables contained in the range R0 (row or column vector) based on the logistic regression coefficients in R2 |
| MLogitSummary(R1, r, head) | returns a summary of the raw data in range R1 |
| MLogitSelect(R1, s, head) | array function which takes the summary data in range R1 and outputs an array in summary form based on s. The string s is a comma-delimited list of independent variables in R1 and/or interactions between such variables. E.g. if s = “2,3,2*3” then the data for the independent variables in columns 2 and 3 of R1 plus the interaction between these variables are output |
| MLogitExtract(R1, r, s, head) | fills the highlighted range with the columns defined by string s from the data in R1. The string s takes the form of a comma-delimited list of numbers 0, …, r |
| MLogit_Accuracy(R1, r, lab, head, iter) | fills the highlighted range with a column array with the accuracy of the multinomial logistic regression model defined from the data in R1 for each independent variable and the total accuracy of the model. If R1 contains k independent variables, then the output is a (k+1) × 1 column array (or a (k+1) × 2 array if lab = TRUE) |

Here lab, head, alpha and iter are as for the logistic regression functions described above. The following array function is used with ordinal logistic regression models:

| Function | Description |
| --- | --- |
| OLogitPredC(R0, R2) | returns a row vector with the probabilities of the outcomes of the dependent variable for the values of the independent variables contained in the range R0 (row or column vector) based on the logistic regression coefficients in R2 |

Poisson Regression

| Function | Description |
| --- | --- |
| PoissonCoeff(Rx, Ry, lab, phi, Rt, head, alpha, iter, guess) | array function that returns Poisson regression coefficients and other parameters (s.e., Wald stat, p-value and confidence interval) for data in range Rx, Ry and Rt. If head = TRUE, then Rx, Ry and Rt contain column headings (default FALSE). If phi = TRUE (default FALSE) then the phi correction is applied to the standard errors |
| PoissonCov(Rx, Ry, Rt, iter, guess) | array function that contains the coefficient covariance matrix |
| PoissonPred(Rx0, Rx, Ry, lab, Rt0, Rt, alpha, iter, guess) | array function that returns a column array containing the predictions for Rx0 and Rt0 based on the Poisson model derived from the data in Rx, Ry and Rt |
| PoissonPredC(Rx0, Rc, Rt) | array function that returns a column array with the predictions for Rx0 and Rt0 based on the Poisson coefficients in Rc |
| PoissonPredCC(Rx0, Rc, Rv, lab, Rt0, alpha) | array function that returns an array with 4 columns containing the predictions, standard errors and confidence intervals for Rx0 and Rt0 based on the Poisson coefficients in Rc and coefficient covariance array Rv |

Rx contains the X range data, Ry contains the Y range data and Rt contains the frequency range data. If Rt or Rt0 is omitted it defaults to a column array of ones. Rt or Rt0 can also be a numeric value, in which case it is treated as a column array containing this numeric value.

If lab = TRUE, then an extra column is appended to the output containing labels, except for PoissonPredCC where an extra row of labels is appended. alpha is the significance level (default .05). iter is the number of iterations used in calculating the coefficients using Newton’s method (default 20). guess is the initial guess of these coefficients (if missing then all the coefficients are initially set to one).

Ridge and LASSO Regression

The following are array functions where Rx contains x values, Ry is a column range containing y values and Rc is a column range containing coefficients.

| Function | Description |
| --- | --- |
| RidgeRegCoeff(Rx, Ry, lambda, std) | array with standardized Ridge regression coefficients and their standard errors for the Ridge regression model based on Rx, Ry and lambda. If std = TRUE, then the values in Rx and Ry have already been standardized; if std = FALSE (default) then the values have not been standardized |
| RidgeCoeff(Rx, Ry, lambda) | array with unstandardized Ridge regression coefficients and their standard errors for the Ridge regression model based on Rx, Ry and lambda; the values in Rx and Ry are not standardized |
| RidgePred(Rx0, Rx, Ry, lambda) | column array of predicted y values for the x data in range Rx0 based on the Ridge regression model defined by Rx, Ry and lambda |
| RidgeVIF(Rx, lambda) | column array with the VIF values using a Ridge regression model based on Rx, Ry and lambda |
| RidgeCVError(Rx, Ry, lambda, map) | column array whose first element contains the k-fold cross-validation error for lambda based on the Ridge regression for the standardized data in Rx and Ry, where the partition is as defined by map, a column array with the same number of rows as Ry containing the values 1, 2, …, k; the other elements in the output are the CV errors for each of the k partition elements |
| LASSOCoeff(Rx, Ry, lambda, iter, guess) | column array with standardized LASSO regression coefficients based on Rx, Ry and lambda using the cyclical coordinate descent algorithm with iter iterations (default 10000) and with the initial guesses for each coefficient specified in the column array guess; alternatively guess can specify a single initial value for all the coefficients (default .2) |

There are also the following non-array functions:

| Function | Description |
| --- | --- |
| RidgeRSQ(Rx, Rc, std) | R-square value for the Ridge regression model based on Rx and Rc; if std = TRUE, then the values in Rx have already been standardized; if std = FALSE (default) then the values have not |
| RidgeLambda(Rx, vif, iter) | the lowest lambda value for Ridge regression on the x values in Rx that generates a maximum VIF value less than vif; iter = the number of iterations in the search (default 25) |
| RidgeMSE(Rx, Ry, lambda) | MSE of the Ridge regression defined by Rx, Ry and lambda |

Finally, there are the following array functions which define partitions used by RidgeCVError.

| Function | Description |
| --- | --- |
| RandPart(n, k) | column array with n rows with the values 1, 2, …, k randomly distributed, where the number of times each integer appears is approximately equal |
| OrderedPart(n, k) | column array with n rows with the values 1, 2, …, k repeated as many times as necessary in that order |
| SortedPart(Rx, k) | column array with the same number of rows as Rx and containing the values 1, 2, …, k, where the number of times each integer appears is approximately equal. The order of the values 1, 2, …, k is determined by the sort order in Rx |
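
The partitions described for OrderedPart and RandPart can be pictured with the Python sketch below. It follows the descriptions above; how the add-in actually randomizes is not stated, so shuffling a balanced list of fold labels is an assumption. The function names and the `seed` parameter are illustrative.

```python
import random


def ordered_part(n, k):
    """Labels 1, 2, ..., k repeated in order until n rows are filled."""
    return [i % k + 1 for i in range(n)]


def rand_part(n, k, seed=None):
    """Same balanced labels as ordered_part, in random order (assumed method)."""
    labels = ordered_part(n, k)
    random.Random(seed).shuffle(labels)
    return labels


print(ordered_part(7, 3))  # [1, 2, 3, 1, 2, 3, 1]
print(rand_part(7, 3, seed=1))
```

Either list can serve as the map argument of a k-fold cross-validation: rows labeled j form the jth hold-out set.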

Survival Analysis

| Function | Description |
| --- | --- |
| LOGRANK(R1, R2, lab) | array function which returns the following statistics along with their p-values: log-rank 1, log-rank 2, Wilcoxon, Tarone-Ware |
| COXEST(R1, approx, iter) | array function which returns Cox regression coefficients, their standard errors, convergence values, LL1 and LL0 values and covariance matrix |
| COXPRED(R1, R2, R0, lab, approx, iter, alpha) | array function which predicts the hazard ratio between the two subject profiles in R1 and R2 (plus standard error and 1 − alpha confidence interval) based on a Cox regression model derived from the input data in R0 |

If lab = TRUE then the output includes a column of labels, while if lab = FALSE (the default) only the data is output. The approx parameter takes the value 0 if the continuum approximation is used and 1 (default) if the Breslow approximation is used. The output for COXEST and COXPRED is calculated using Newton’s method with iter iterations (default = 20).

Analysis of Variance (ANOVA)

The following functions are used for one factor ANOVA with replication where R1 = the input data in Excel format

Here b is an optional argument. When b = TRUE (default) then the columns denote the groups, while when b = FALSE, the rows denote the groups.

If R1 is in standard (i.e. stacked) format, then you can use the following formulas, where the colth column contains the data for the one-way ANOVA (default col = 2):

SSWStd(R1, col) = SSW         SSBetStd(R1, col) = SSBet        SSTotStd(R1, col) = SSTot
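
The three sums of squares are tied together by SSTot = SSBet + SSW. A short Python sketch of the standard definitions follows; the function name `anova1_ss` is illustrative and the input is a list of groups rather than a worksheet range.

```python
def anova1_ss(groups):
    """Return (SSW, SSBet, SSTot) for a one-way ANOVA on a list of groups."""
    values = [v for g in groups for v in g]
    grand = sum(values) / len(values)
    means = [sum(g) / len(g) for g in groups]
    # Within groups: deviations of each value from its own group mean.
    ssw = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    # Between groups: deviations of group means from the grand mean.
    ssbet = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
    sstot = sum((v - grand) ** 2 for v in values)
    return ssw, ssbet, sstot


print(anova1_ss([[1, 2, 3], [4, 5, 6]]))  # (2.0 + 2.0, 13.5, 17.5)
```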

The following functions are used for two factor ANOVA where R1 = the input data in Excel format and r = the number of rows in R1 that make up an A factor level.

The second argument for the column and interaction terms is optional and can be dropped. Note that the column and total terms are identical to the between groups and total terms, respectively, for one factor ANOVA.

| Function | Value |
| --- | --- |
| ANOVARow(R1, r) | MSA/MSW |
| ATESTRow(R1, r) | p-value of A factor |
| ANOVACol(R1, r) | MSB/MSW |
| ATESTCol(R1, r) | p-value of B factor |
| ANOVAInt(R1, r) | MSAB/MSW |
| ATESTInt(R1, r) | p-value of AB factor |

The following array functions are used to convert data between Excel’s formatting for Anova and standard format:

| Function | Description |
| --- | --- |
| StdAnova1(R1) | converts data in R1 in standard format into Excel Single Factor Anova format |
| Anova1Std(R1) | converts data in R1 in Excel Single Factor Anova format into standard format |
| StdAnova2(R1) | converts data in R1 in standard format into Excel Two Factor Anova format |
| Anova2Std(R1, r) | converts data in R1 in Excel Two Factor Anova format with r rows per group into standard format |
| Anova3Rows(R1) | converts data in R1 in standard Three Factor Anova columns format into Three Factor Anova rows format |
| Anova3Cols(R1) | converts data in R1 in standard Three Factor Anova rows format into Three Factor Anova columns format |
| StdNested(R1) | converts data in R1 in standard format into Excel Nested Anova format |

The following array functions are used to perform ANOVA via regression:

| Function | Description |
| --- | --- |
| SSAnova2(R1, r) | returns a column array with SSRow, SSCol, SSInt and SSW for a two factor ANOVA for the data in R1 using a regression model; if r > 0 then R1 is assumed to be in Excel Anova format with r rows per sample, while if r = 0 or is omitted then R1 is assumed to be in standard format; data is without headings |
| SSAnova3(R1) | returns a column array with SSA, SSB, SSC, SSAB, SSAC, SSBC, SSABC and SSW for a three factor ANOVA for the data in R1 using a regression model, where the data in R1 is assumed to be in standard format by columns without column headings |

ANOVA-related Functions

| Function | Description |
| --- | --- |
| LEVENE(R1, type) | p-value of Levene’s test for the data in range R1 (organized by columns) where type = 0 for deviations from group means, type = 1 for deviations from group medians and type = -1 for deviations from 10% trimmed group means |
| FKTEST(R1) | p-value of Fligner-Killeen test for data in range R1 (organized by columns) |
| DunnSidak(α, k) | 1 − (1 − α)^(1/k) |
| ADJK(nrows, ncols) | adjusted k value for Tukey HSD after two factor ANOVA |
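
The Dunn-Sidak formula is the per-comparison significance level that keeps the familywise level at α across k independent comparisons. A one-line Python sketch of the formula (the function name is illustrative):

```python
def dunn_sidak(alpha, k):
    """Per-comparison significance level: 1 - (1 - alpha)^(1/k)."""
    return 1.0 - (1.0 - alpha) ** (1.0 / k)


# With three comparisons at a familywise level of .05, each test is run
# at a level slightly above the Bonferroni value of .05 / 3.
print(dunn_sidak(0.05, 3))
```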

Sphericity

| Function | Description |
| --- | --- |
| GGEpsilon(R1, ngroups, raw) | Greenhouse and Geisser epsilon value for the data in range R1 where ngroups = the number of groups; if raw = TRUE then R1 contains raw data, otherwise it contains a covariance matrix |
| HFEpsilon(R1, ngroups, nsubj) | Huynh and Feldt epsilon value for the data in range R1 where ngroups = the number of groups; if nsubj = 0 then R1 contains raw data, otherwise it contains a covariance matrix which is derived from raw data with nsubj subjects (corresponding to rows) |
| GG_Epsilon(R1) | Greenhouse and Geisser epsilon value for the data in range R1 which includes a column of labels |
| HF_Epsilon(R1) | Huynh and Feldt epsilon value for the data in range R1 which includes a column of labels |
| MauchlyTest(R1) | p-value of Mauchly’s test for sphericity on the data in range R1 |
| JNSTest(R1) | p-value of the John-Nagao-Sugiura test for sphericity on the data in range R1 |

Non-parametric Tests

| Function | Description |
| --- | --- |
| KRUSKAL(R1) | Kruskal-Wallis test statistic for the data in R1 |
| KTEST(R1) | p-value for Kruskal-Wallis Test for the data in R1 |
| FRIEDMAN(R1) | Friedman test statistic for the data in R1 |
| FrTEST(R1) | p-value for Friedman’s Test for the data in R1 |
| MOODS_STAT(R1) | chi-square statistic for Mood’s Median Test for the data in R1 |
| MOODS_TEST(R1) | p-value for Mood’s Median Test for the data in R1 |
| COCHRAN(R1, raw, cont) | Cochran’s Q statistic for the data in R1 |
| QTEST(R1, raw, cont) | p-value for Cochran’s Q Test for the data in R1 |
| FSTAR(R1) | Brown-Forsythe’s test statistic F* on the data in range R1 |
| DFSTAR(R1) | df* for Brown-Forsythe’s test on the data in range R1 |
| BFTEST(R1) | p-value of the Brown-Forsythe’s test statistic on the data in range R1 |
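
As an illustration of the kind of statistic KRUSKAL returns, the Python sketch below computes the Kruskal-Wallis H statistic, H = 12/(N(N+1)) · Σ R_j²/n_j − 3(N+1), using average ranks for tied values. No ties correction is applied, and whether KRUSKAL applies one is not stated above. The function names are illustrative.

```python
def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0  # mean of ranks i+1 .. j+1
        for m in range(i, j + 1):
            ranks[order[m]] = avg
        i = j + 1
    return ranks


def kruskal_h(groups):
    """Kruskal-Wallis H statistic (uncorrected for ties)."""
    pooled = [v for g in groups for v in g]
    ranks = average_ranks(pooled)
    n = len(pooled)
    total, pos = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[pos:pos + len(g)])
        total += rank_sum ** 2 / len(g)
        pos += len(g)
    return 12.0 / (n * (n + 1)) * total - 3.0 * (n + 1)


print(kruskal_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]]))
```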

There are also the following array functions about non-parametric ANOVA-like tests:

| Function | Description |
| --- | --- |
| WELCH_TEST(R1, lab) | returns the column range: F, df1, df2, and p-value for Welch’s test for the data in range R1 |
| FSTAR_TEST(R1, lab) | returns the column range: F, df1, df2, and p-value for Brown-Forsythe’s test for the data in range R1 |
| KW_TEST(R1, lab, ties) | returns the column range: H, H-ties, df and p-value for the Kruskal-Wallis test for the data in range R1; if ties = TRUE (default) a ties correction is applied |

If lab = TRUE then the output includes a column of labels, while if lab = FALSE (the default) only the data is output.

2^k Factorial Design

| Function | Description |
| --- | --- |
| DESIGN2k(k, lab, d) | returns the design matrix for a 2^k factorial design |
| ExpandDesign2k(R1, d) | returns the design matrix for a 2^k factorial design augmented by R1 |
| Effect2k(R1, R2, lab, d) | returns a column array consisting of the effect size values for the design described by R1 and R2, where R1 contains the +1 and −1 values and R2 contains the data values |
| SS2k(R1, R2, lab, d) | returns a column array consisting of SS values for the design described by R1 and R2, where R1 contains the +1 and −1 values and R2 contains the data values |

If lab = TRUE then the output includes a column of labels (although for DESIGN2k this is a row of labels), while if lab = FALSE (the default) only the data is output. The output includes d-way interactions (d = 0, 2, 3; default = 2).
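
To make the design matrix concrete, here is a Python sketch that builds the main-effect columns of a 2^k design and computes each main effect as the mean response at +1 minus the mean response at −1. The row order (first factor varying fastest) and this definition of effect are common conventions and are assumptions here; interaction columns are omitted. The function names are illustrative.

```python
from itertools import product


def design_2k(k):
    """2^k rows of -1/+1 levels; the first factor varies fastest (assumed order)."""
    return [list(reversed(levels)) for levels in product((-1, 1), repeat=k)]


def main_effects(design, y):
    """Mean of y at +1 minus mean of y at -1, for each factor column."""
    half = len(y) / 2  # each level appears in half of the runs
    effects = []
    for j in range(len(design[0])):
        high = sum(yi for row, yi in zip(design, y) if row[j] == 1)
        low = sum(yi for row, yi in zip(design, y) if row[j] == -1)
        effects.append((high - low) / half)
    return effects


d = design_2k(2)
print(d)                              # [[-1, -1], [1, -1], [-1, 1], [1, 1]]
print(main_effects(d, [1, 3, 2, 4]))  # [2.0, 1.0]
```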

ANOVA Power and Sample Size

| Function | Description |
| --- | --- |
| ANOVA1_POWER(f, n, k, type, α, m, prec) | power of a one-way ANOVA where k = # of groups, f = Cohen’s effect size if type = 1 (default), f = RMSSE effect size if type = 2 and f = noncentrality parameter if type = 0 |
| ANOVA1_SIZE(f, k, 1−β, type, α, m, prec) | minimum sample size required to obtain power of at least 1−β for a one-way ANOVA where k = # of groups, f = Cohen’s effect size if type = 1 (default) and f = RMSSE effect size if type = 2 |

Intraclass Correlation

| Function | Description |
| --- | --- |
| ICC(R1, class, type, lab, α) | array function which outputs the intraclass correlation coefficient ICC(class, type) plus the lower and upper bound of the 1−α confidence interval for the data in R1; default values are class = 2, type = 1, α = .05. If lab = TRUE, then an extra column of labels is appended |
| ICC_POWER(ρ0, ρ1, n, k, α) | power of ICC(1,1) test where ρ0 = ICC(1,1) under the null hypothesis, ρ1 = ICC(1,1) under the alternative hypothesis and k = # of items |
| ICC_SIZE(ρ0, ρ1, k, 1−β, α) | minimum sample size required to obtain power of at least 1−β for ICC(1,1) test where ρ0 = ICC(1,1) under the null hypothesis, ρ1 = ICC(1,1) under the alternative hypothesis and k = # of items |

Distribution Functions

| Function | Description |
| --- | --- |
| QDIST(q, k, df) | studentized q cumulative distribution value for q with k independent variables and df degrees of freedom |
| QINV(p, k, df, tails) | inverse of the studentized q distribution, i.e. the critical value for the studentized q range; tails = 1 or 2 (default) |

Table Lookup

| Function | Description |
| --- | --- |
| QCRIT(k, df, α, tails, h) | critical value in the Studentized Range Q table |
| DCRIT(k, df, α, tails, h) | critical value in the Dunnett’s test table |
| DLowerCRIT(n, k, α, h) | lower critical value in the Durbin-Watson Table |
| DUpperCRIT(n, k, α, h) | upper critical value in the Durbin-Watson Table |

If h = TRUE (default), then harmonic interpolation is used; otherwise linear interpolation is used.
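
The difference between the two interpolation modes can be sketched in Python. Harmonic interpolation is taken here to mean linear interpolation in the reciprocal of the table argument (for example 1/df); that this is exactly the formula behind the h argument is an assumption. The function name, arguments and the table values in the example are hypothetical.

```python
def interpolate(x, x0, x1, v0, v1, harmonic=True):
    """Interpolate a table value at x between entries (x0, v0) and (x1, v1).

    harmonic=True interpolates linearly in 1/x (assumed meaning of
    harmonic interpolation); harmonic=False interpolates linearly in x.
    Requires x0 != x1 and, for the harmonic case, nonzero arguments.
    """
    if harmonic:
        t = (1.0 / x - 1.0 / x0) / (1.0 / x1 - 1.0 / x0)
    else:
        t = (x - x0) / (x1 - x0)
    return v0 + t * (v1 - v0)


# Hypothetical table entries at df = 20 and df = 40, looked up at df = 30.
print(interpolate(30, 20, 40, 3.0, 2.7, harmonic=True))
print(interpolate(30, 20, 40, 3.0, 2.7, harmonic=False))
```

Because critical values typically flatten out as df grows, interpolating in 1/df places the result closer to the larger-df entry than straight linear interpolation does.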

Categorical Coding

| Function | Description |
| --- | --- |
| CATCODE(R1) | array function which fills the highlighted array with simple coding of values in range R1; if R1 is an m × n range, highlight an m × n range |
| TAGCODE(R1, b) | array function which fills the highlighted array with dummy coding of values in column range R1; if R1 has m rows and k unique values then highlight an m × (k−1) range; if b = TRUE (default) normal dummy coding is used, if b = FALSE alternative dummy coding is used |

Iterative Proportional Fitting Procedure

| Function | Description |
| --- | --- |
| IPFP2(R1) | array function which fills the highlighted array with the output from the IPF procedure on the two-way contingency table with targeted marginal totals in R1; if R1 is an m × n range, highlight an (m−1) × (n−1) range |
| IPFP3(R1, R2) | array function which fills the highlighted array with the output from the IPF procedure on the three-way contingency table in R1 with targets in R2. If R1 is an m × n range, highlight an m × n range |
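
The iterative proportional fitting procedure for a two-way table alternately rescales rows and columns until both sets of marginal totals match their targets. The Python sketch below passes the targets as separate lists rather than as the last row and column of a range, and uses a fixed number of iterations; both are simplifications, and the name `ipf2` is illustrative.

```python
def ipf2(table, row_targets, col_targets, iters=100):
    """Iterative proportional fitting on a two-way table of positive cells.

    Assumes the row targets and column targets have the same grand total.
    """
    fit = [list(row) for row in table]
    for _ in range(iters):
        # Scale each row to its target total.
        for i, row in enumerate(fit):
            s = sum(row)
            fit[i] = [v * row_targets[i] / s for v in row]
        # Scale each column to its target total.
        for j in range(len(col_targets)):
            s = sum(row[j] for row in fit)
            for row in fit:
                row[j] *= col_targets[j] / s
    return fit


fitted = ipf2([[1.0, 2.0], [3.0, 4.0]], [40.0, 60.0], [30.0, 70.0])
for row in fitted:
    print(row, sum(row))
```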