Although all the statistical analyses described on this website can be carried out with standard Excel capabilities (see Built-in Excel Functions and Built-in Statistical Functions), it is often easier to use the supplemental functions and data analysis tools provided in the Real Statistics Resource Pack. You use the supplemental functions exactly as you would the standard Excel functions (see Worksheet Functions).

We now briefly describe the various Real Statistics supplemental functions. More complete descriptions of each of these functions can be obtained by clicking on the link for that function.

Also note that if you enter =*function-name* into any cell and press **Ctrl-Shift-A**, the arguments for the named function will be displayed. E.g. entering =LogitCoeff and then pressing **Ctrl-Shift-A** will display

=LogitCoeff(rg,lab,raw,head,alpha,iter)

In addition to the Real Statistics functions described on this webpage, you can find descriptions of other Real Statistics functions by clicking on the following links.

See Real Statistics Regression/ANOVA Functions for a summary of the functions available in the Real Statistics Resource Pack for Regression and ANOVA.

See Real Statistics Multivariate Functions for a summary of the multivariate statistics functions available in the Real Statistics Resource Pack.

See Real Statistics Time Series Analysis Functions for a summary of the time series functions available in the Real Statistics Resource Pack.

See Real Statistics Missing Data Functions for a summary of the functions available in the Real Statistics Resource Pack that deal with missing data.

See Real Statistics Data Analysis Tools for a list of the statistical data analysis tools provided by the Real Statistics Resource Pack.

**Basic descriptive statistics**

| Function | Description |
|---|---|
| MEAN(R1) | same as AVERAGE(R1) |
| IQR(R1, b) | interquartile range of the data in range R1; if b = TRUE then QUARTILE is used, while if b = FALSE then QUARTILE.EXC or QUARTILE_EXC is used |
| PERCENTILE_EXC(R1, p, m) | *p*th percentile of the data in R1, where m = the Hyndman-Fan method (the default of 6 is equivalent to PERCENTILE.EXC) |
| QUARTILE_EXC(R1, quart, m) | *quart*th quartile of the data in R1, where m = the Hyndman-Fan method (the default of 6 is equivalent to QUARTILE.EXC) |
| MAD(R1) | median absolute deviation of the data in range R1 |
| RNG(R1) | range of the data in R1, i.e. MAX(R1) − MIN(R1) |
| SKEWP(R1) | population skewness of the data in R1 |
| KURTP(R1, excess) | population kurtosis of the data in R1; if excess = TRUE (default) then 3 is subtracted |
| SUMCOL(R1) | array function which returns a row range with the sum of each column in R1 |
| MEANCOL(R1) | array function which returns a row range with the mean of each column in R1 |
| VARCOL(R1) | array function which returns a row range with the sample variance of each column in R1 |
| STDEVCOL(R1) | array function which returns a row range with the sample standard deviation of each column in R1 |
| STDCOL(R1) | array function which returns an array with the same dimensions as R1 in which the values in each column are standardized |
| SUMROW(R1) | array function which returns a column range with the sum of each row in R1 |
| MEANROW(R1) | array function which returns a column range with the mean of each row in R1 |
| VARROW(R1) | array function which returns a column range with the sample variance of each row in R1 |
| STDEVROW(R1) | array function which returns a column range with the sample standard deviation of each row in R1 |
| STDERR(R1) | standard error of the data in range R1, i.e. STDEV(R1)/SQRT(COUNT(R1)) |
| RANK_AVG(x, R1, order, num_digits) | rank of x in range R1 taking ties into account; order = 0 (default) indicates decreasing order and a non-zero value indicates increasing order; equivalent to Excel’s RANK.AVG(x, R1, order) except that the data is first rounded to num_digits decimal places |
| Ties_Correction(R1, R2, type) | ties correction value for the data in range R1 and optionally range R2, where type = 0: one sample, type = 1: paired samples, type = 2: independent samples |
| TRIMDATA(R1, p) | array function which returns a column range equivalent to R1 after removing the lowest and highest p/2 % of the data values |
| WINSORIZE(R1, p) | array function which returns a column range containing the Winsorized version of R1, in which the lowest and highest p/2 % of the data values are replaced (rather than removed, as in TRIMDATA) |
| WINMEAN(R1, p) | Winsorized mean of the data in range R1, i.e. the mean after the lowest and highest p/2 % of the data values are replaced |
| STANDARD(x, R1, type, exc) | = STANDARDIZE(x, AVERAGE(R1), STDEV(R1)) if type = 0 (default); = (x−Q3)/IQR if type = 1 and x > Q3; = (x−Q1)/IQR if type = 1 and x < Q1; = 0 otherwise. Here IQR = IQR(R1, exc), and Q1 = QUARTILE(R1, 1) and Q3 = QUARTILE(R1, 3) if exc = FALSE (default), while Q1 = QUARTILE_EXC(R1, 1) and Q3 = QUARTILE_EXC(R1, 3) if exc = TRUE |
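As an illustration of what WINSORIZE and WINMEAN compute, here is a short Python sketch. This is my own re-implementation, not the add-in’s code, and it assumes p is supplied as a fraction (e.g. .2 for 20%):

```python
from statistics import mean

def winsorize(data, p):
    """Replace the lowest and highest p/2 fraction of the sorted values
    with the nearest remaining values (p = total fraction replaced)."""
    xs = sorted(data)
    n = len(xs)
    k = round(n * p / 2)  # number of values replaced at each end
    for i in range(k):
        xs[i] = xs[k]
        xs[n - 1 - i] = xs[n - 1 - k]
    return xs

def winmean(data, p):
    """Winsorized mean, analogous to WINMEAN(R1, p)."""
    return mean(winsorize(data, p))
```

For example, with p = .2 the single smallest and largest of ten values are replaced by their nearest neighbors before averaging, which blunts the influence of an outlier such as 100.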

**Count measurements**

| Function | Description |
|---|---|
| COUNTU(R1) | count of the unique numeric values in R1 |
| COUNTAU(R1) | count of the unique non-empty cells in R1 |
| COUNTCOL(R1) | array function which returns a row range with the number of numeric values in each column of R1 |
| COUNTROW(R1) | array function which returns a column range with the number of numeric values in each row of R1 |
| CountRowsUnique(R1, head, ncols) | count of the unique rows in R1; if head = TRUE (default FALSE) then the first row of R1 (presumably a heading) is not counted; the last ncols columns of R1 are not considered when determining uniqueness (default ncols = 0) |
| CountFullRows(R1, b) | number of rows in range R1 which don’t have any empty cells if b = TRUE (default), and the number of rows in R1 which don’t have any non-numeric cells if b = FALSE |
| CountPairs(R1, R2, b) | number of pairs of cells from R1 and R2 in which neither cell is empty if b = TRUE (default), and the number of pairs in which neither cell is non-numeric if b = FALSE |
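The row-counting logic behind CountFullRows can be sketched in Python as follows. This is an illustrative re-implementation (the list-of-lists representation of a range, with None/"" for empty cells, is my own assumption):

```python
def count_full_rows(rows, b=True):
    """Count rows with no empty cells (b=True),
    or rows with no non-numeric cells (b=False)."""
    if b:
        return sum(1 for row in rows
                   if all(cell is not None and cell != "" for cell in row))
    return sum(1 for row in rows
               if all(isinstance(cell, (int, float)) for cell in row))
```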

**T distribution**

| Function | Description |
|---|---|
| CONFIDENCE_T(alpha, s, size) | the value k such that (x̄ − k, x̄ + k) is the confidence interval for the population mean based on the stated alpha value, standard deviation s and sample size, assuming a t distribution |
| T_CONF(R1, alpha) | the value k such that (x̄ − k, x̄ + k) is the confidence interval for the population mean based on the stated alpha value and the sample data in range R1, assuming a t distribution |
| T_LOWER(R1, alpha) | the value x̄ − k such that (x̄ − k, x̄ + k) is the confidence interval for the population mean based on the stated alpha value and the sample data in range R1, assuming a t distribution |
| T_UPPER(R1, alpha) | the value x̄ + k such that (x̄ − k, x̄ + k) is the confidence interval for the population mean based on the stated alpha value and the sample data in range R1, assuming a t distribution |
| DF_POOLED(R1, R2) | degrees of freedom for the two-sample t test for the samples in ranges R1 and R2, especially when the two samples have unequal variances |
| STDERR_POOLED(R1, R2, b) | pooled standard error for the two-sample t test for the samples in ranges R1 and R2; if b = TRUE equal variances are assumed, while if b = FALSE unequal variances are assumed |
| STDEV_POOLED(R1, R2) | pooled standard deviation for the two-sample t test for the samples in ranges R1 and R2 when the two samples have equal variances |
| VAR_POOLED(R1, R2) | pooled variance for the two-sample t test for the samples in ranges R1 and R2 when the two samples have equal variances |
| T_DIST(x, df, cum) | value of the cumulative distribution function of the t distribution with df degrees of freedom at x when cum = TRUE, and value of the pdf at x when cum = FALSE |

T_DIST is like Excel’s T.DIST function except that *df* doesn’t have to be an integer. The following functions have the same advantage over their Excel counterparts: **T_DIST_RT**, **T_DIST_2T**, **T_INV**, **T_INV_2T**.
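VAR_POOLED and DF_POOLED correspond to the standard pooled-variance and Welch–Satterthwaite formulas; the following Python sketch shows those calculations (my own re-implementation, not the add-in’s code):

```python
from statistics import variance

def var_pooled(s1, s2):
    """Pooled variance (equal-variance case): weighted average
    of the two sample variances by their degrees of freedom."""
    n1, n2 = len(s1), len(s2)
    return ((n1 - 1) * variance(s1) + (n2 - 1) * variance(s2)) / (n1 + n2 - 2)

def df_pooled(s1, s2):
    """Welch-Satterthwaite degrees of freedom (unequal variances)."""
    n1, n2 = len(s1), len(s2)
    q1, q2 = variance(s1) / n1, variance(s2) / n2
    return (q1 + q2) ** 2 / (q1 ** 2 / (n1 - 1) + q2 ** 2 / (n2 - 1))
```

Note that df_pooled generally returns a non-integer value, which is why the T_DIST family above accepts non-integer *df*.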

There are also the following array formulas for identifying outliers:

| Function | Description |
|---|---|
| GRUBBS(R1, lab, alpha) | outputs a 4 × 1 column range with the following entries: one potential outlier, G-stat, G-crit and test significance |
| ESD(R1, lab, alpha) | outputs a 4 × k range where k = the maximum number of potential outliers; each column contains: a potential outlier, G-stat, G-crit and test significance (based on the ESD test) |
| OUTLIERS(R1, alpha) | outputs a column range with up to k outliers (based on the ESD test) |

If *lab* = TRUE (default FALSE) then an extra column with labels is added.
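The G statistic reported by GRUBBS is the largest absolute deviation from the mean divided by the sample standard deviation; here is a minimal Python sketch (illustrative only, omitting the critical value and significance calculation):

```python
from statistics import mean, stdev

def grubbs_stat(data):
    """Return (G, candidate) where G = max |x - mean| / s and
    candidate is the value attaining the maximum (the potential outlier)."""
    m, s = mean(data), stdev(data)
    candidate = max(data, key=lambda x: abs(x - m))
    return abs(candidate - m) / s, candidate
```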

**Normal distribution**

| Function | Description |
|---|---|
| NORM_CONF(R1, alpha) | the value k such that (x̄ − k, x̄ + k) is the confidence interval for the population mean based on the stated alpha value and the sample data in range R1, assuming a normal distribution |
| NORM_LOWER(R1, alpha) | the value x̄ − k such that (x̄ − k, x̄ + k) is the confidence interval for the population mean based on the stated alpha value and the sample data in range R1, assuming a normal distribution |
| NORM_UPPER(R1, alpha) | the value x̄ + k such that (x̄ − k, x̄ + k) is the confidence interval for the population mean based on the stated alpha value and the sample data in range R1, assuming a normal distribution |
| SHAPIRO(R1, b) | the Shapiro-Wilk test statistic W for the data in range R1 |
| SWTEST(R1, b, h) | the p-value of the Shapiro-Wilk test on the data in range R1 |
| SWCoeff(n, k, b) | the kth Shapiro-Wilk coefficient for samples of size n |
| SWCoeff(R1, C1, b) | the Shapiro-Wilk coefficient for cell C1 within the sorted sample in range R1 |
| JARQUE(R1, pop) | the Jarque-Bera statistic for the data in range R1 |
| JBTEST(R1, pop) | the p-value of the Jarque-Bera test on the data in range R1 |
| SKEWTEST(R1, lab, alpha) | array function to test whether the skewness of the data in range R1 is zero (consistent with a normal distribution); the output contains the sample skewness, standard error, test statistic, p-value and 1−alpha confidence interval |
| KURTTEST(R1, lab, alpha) | array function to test whether the (excess) kurtosis of the data in range R1 is zero (consistent with a normal distribution); the output contains the sample kurtosis, standard error, test statistic, p-value and 1−alpha confidence interval |
| DAGOSTINO(R1) | the D’Agostino-Pearson statistic for the data in range R1 |
| DPTEST(R1) | the p-value of the D’Agostino-Pearson test on the data in range R1 |

In the above functions, when *b* = TRUE (default) the Royston algorithm is used, and when *b* = FALSE the original Shapiro-Wilk algorithm is used (employing harmonic interpolation unless *h* = FALSE, in which case linear interpolation is used).
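The Jarque-Bera statistic computed by JARQUE combines population skewness and excess kurtosis via the textbook formula JB = (n/6)(S² + K²/4); here is a Python sketch (illustrative, not the add-in’s code):

```python
from statistics import mean

def jarque_bera(data):
    """JB = n/6 * (S^2 + K^2/4) using population (biased) central moments,
    where S = skewness and K = excess kurtosis."""
    n = len(data)
    m = mean(data)
    m2 = sum((x - m) ** 2 for x in data) / n
    m3 = sum((x - m) ** 3 for x in data) / n
    m4 = sum((x - m) ** 4 for x in data) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2 - 3  # excess kurtosis
    return n / 6 * (skew ** 2 + kurt ** 2 / 4)
```

For a perfectly symmetric sample the skewness term vanishes and only the kurtosis term contributes.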

**Chi-square distribution**

| Function | Description |
|---|---|
| CHI_STAT2(R1, R2) | Pearson’s chi-square statistic for the observed values in range R1 and the expected values in range R2 |
| CHI_MAX2(R1, R2) | maximum likelihood chi-square statistic for the observed values in range R1 and the expected values in range R2 |
| CHI_STAT(R1) | Pearson’s chi-square statistic for the observed values in range R1 |
| CHI_MAX(R1) | maximum likelihood chi-square statistic for the observed values in range R1 |
| CHI_TEST(R1) | p-value for Pearson’s chi-square statistic for the observed values in range R1 |
| CHI_MAX_TEST(R1) | p-value for the maximum likelihood chi-square statistic for the observed values in range R1 |
| CHISQ_DIST(x, df, cum) | value of the cdf of the chi-square distribution with df degrees of freedom at x when cum = TRUE, and value of the pdf when cum = FALSE (df does not need to be an integer) |
| CHISQ_INV(p, df) | inverse of the chi-square distribution with df degrees of freedom at p (df does not need to be an integer) |
| FISHERTEST(R1, tails, slow) | probability calculated by the Fisher exact test for the contingency table in range R1; for 2 × 2 tables tails = the number of tails: 1 (one-tailed) or 2 (two-tailed, default); tails = 2 for larger tables; slow is used to increase the acceptable sum of the counts in R1 |
| FISHER_TEST(R1, lab, slow) | array function that returns a column array containing: the p-value for the two-tailed Fisher exact test for the data in R1, sample size, df, chi-square statistic, Cramer’s V and the w effect size; slow is used to increase the acceptable size of R1 |
| CMHTest(R1, lab, yates, alpha) | array function that returns a column range with the values M, p-value, r, lower and upper for the CMH test, where alpha is the significance level (default .05) and yates = TRUE (default) if the continuity correction is used |
| WoolfTest(R1, lab) | array function that returns a column range with the values W, df and p-value for Woolf’s heterogeneity test |

If *lab* = TRUE (default FALSE), then an extra column of labels is appended.

CHISQ_DIST and CHISQ_INV are like Excel’s CHISQ.DIST and CHISQ.INV functions except that *df* doesn’t have to be an integer. The following functions have the same advantage over their Excel counterparts: **CHISQ_DIST_RT** and **CHISQ_INV_RT**.
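CHI_STAT2 implements the familiar Pearson formula, summing (O − E)²/E over corresponding cells; in Python terms (an illustrative sketch, with the two ranges flattened to sequences):

```python
def chi_stat2(obs, exp):
    """Pearson chi-square statistic: sum of (O - E)^2 / E
    over corresponding observed/expected cells."""
    return sum((o - e) ** 2 / e for o, e in zip(obs, exp))
```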

**Covariance and Correlation**

| Function | Description |
|---|---|
| COVARP(R1, R2) | population covariance of the populations defined by ranges R1 and R2; equivalent to COVAR(R1, R2) |
| COVARS(R1, R2) | sample covariance of the samples defined by ranges R1 and R2; equivalent to COVARIANCE.S(R1, R2) |
| CORREL_ADJ(R1, R2) | estimated population correlation coefficient corresponding to the sample data in ranges R1 and R2 |
| MCORREL(R, R1, R2) | multiple correlation of the dependent variable z with x and y, where the samples for z, x and y are the ranges R, R1 and R2 respectively |
| PCORREL(R1, i, j) | partial correlation coefficient between the ith and jth columns in R1 |
| SCORREL(R1, R2, lab, tails, α) | array function which returns Spearman’s rank correlation (rho), t-stat and p-value for the data in R1 and R2 |
| KCORREL(R1, R2, lab, tails, α, ties) | array function which returns Kendall’s tau, standard error, z-stat, z-crit, p-value, and lower and upper bounds of the 1−α confidence interval of Kendall’s tau for the data in R1 and R2 |
| COV(R1, b) | array function which returns the sample covariance matrix for the data in R1 |
| COVP(R1, b) | array function which returns the population covariance matrix for the data in range R1 |
| CORR(R1, b) | array function which returns the correlation matrix for the data in range R1 |
| PCORR(R1) | array function which returns the partial correlation matrix for the data in range R1 |

For CORR, COV and COVP if *b* = TRUE (default) then any row in R1 which contains a non-numeric cell is not used, while if *b* = FALSE then correlation/covariance coefficients are calculated pairwise (by columns) and so any row which contains non-numeric data for either column in the pair is not used to calculate that coefficient value.

For KCORREL, if *lab* = TRUE then the output takes the form of a 7 × 2 range with the first column consisting of labels, while if *lab* = FALSE (default) the output takes the form of a 7 × 1 range without labels. If *ties* = TRUE (default) then a ties correction is applied.

Similarly, SCORREL returns a 3 × 2 range if *lab* = TRUE and a 3 × 1 range if *lab* = FALSE.

The following are functions for correlation coefficient hypothesis testing:

| Function | Description |
|---|---|
| CorrTTest(ρ, r, n, tails) | p-value for the one-sample t test of the correlation coefficient with ρ = 0 |
| CorrTLower(r, n, α) | lower bound of the 1−α confidence interval of the population correlation coefficient |
| CorrTUpper(r, n, α) | upper bound of the 1−α confidence interval of the population correlation coefficient |
| CorrelTTest(r, n, α, lab, tails) | array function which returns the test statistic, p-value and lower and upper bounds of the 1−α confidence interval of the population correlation coefficient based on a hypothesized correlation ρ = 0 |
| CorrelTTest(R1, R2, α, lab, tails) | = CorrelTTest(r, n, α, lab, tails) where r = CORREL(R1, R2) and n = CountPairs(R1, R2) |
| CorrTest(ρ, r, n, tails) | p-value for the one-sample test of the correlation coefficient |
| CorrLower(r, n, α) | lower bound of the 1−α confidence interval of the population correlation coefficient |
| CorrUpper(r, n, α) | upper bound of the 1−α confidence interval of the population correlation coefficient |
| CorrelTest(r, n, ρ, α, lab, tails) | array function which returns the test statistic z, p-value and lower and upper bounds of the 1−α confidence interval of the population correlation coefficient |
| CorrelTest(R1, R2, ρ, α, lab, tails) | = CorrelTest(r, n, ρ, α, lab, tails) where r = CORREL(R1, R2) and n = CountPairs(R1, R2) |
| CorrETest(r, n, ρ, tails) | p-value for the one-sample exact test of the correlation coefficient ρ |
| CorrETest(R1, R2, ρ, tails) | = CorrETest(r, n, ρ, tails) where r = CORREL(R1, R2) and n = CountPairs(R1, R2) |
| CorrelETest(r, n, ρ, lab, α, tails) | array function which returns the test statistic, p-value and lower and upper bounds of the 1−α confidence interval of the population correlation coefficient based on the hypothesized correlation ρ |
| CorrelETest(R1, R2, ρ, lab, α, tails) | = CorrelETest(r, n, ρ, lab, α, tails) where r = CORREL(R1, R2) and n = CountPairs(R1, R2) |
| Correl2Test(r1, n1, r2, n2, α, lab) | array function which returns the two independent-sample test statistic z, p-value (two-tailed test) and lower and upper bounds of the 1−α confidence interval of the difference of the population correlation coefficients ρ1 − ρ2, given sample correlations r1 and r2 and sample sizes n1 and n2 |
| Correl2Test(R1, R2, R3, R4, α, lab) | = Correl2Test(r1, n1, r2, n2, α, lab) where r1 = CORREL(R1, R2), r2 = CORREL(R3, R4), n1 = CountPairs(R1, R2) and n2 = CountPairs(R3, R4) |
| Correl2OverlapTTest(r12, r13, r23, n, α, lab) | array function for the two overlapping dependent-sample test which returns \|r12 − r13\|, the t test statistic, p-value (two-tailed test) and lower and upper bounds of the 1−α confidence interval for the difference of the population correlation coefficients; n = the size of all three samples |
| Corr2OverlapTTest(R1, R3, R2, α, lab) | = Correl2OverlapTTest(r12, r13, r23, n, α, lab) where r12 = CORREL(R1, R2), r13 = CORREL(R1, R3), r23 = CORREL(R2, R3) and n = COUNT(R1) |
| Correl2OverlapTest(r12, r13, r23, n, α, lab) | array function which is just like Correl2OverlapTTest except that the Fisher transformation is used; the output consists of \|r12 − r13\| and the lower and upper bounds of the 1−α confidence interval for the difference of the population correlation coefficients |
| Corr2OverlapTest(R1, R3, R2, α, lab) | = Correl2OverlapTest(r12, r13, r23, n, α, lab) where r12 = CORREL(R1, R2), r13 = CORREL(R1, R3), r23 = CORREL(R2, R3) and n = COUNT(R1) |
| Correl2NonOverlapTest(R1, n1, n2, α, lab) | array function which returns \|r12 − r34\| and the lower and upper bounds of the 1−α confidence interval for the difference of the population correlation coefficients, where R1 contains the correlation matrix for the 4 samples, n1 = the size of sample 1 (or 2) and n2 = the size of sample 3 (or 4) |
| Corr2NonOverlapTest(R1, α, lab) | = Correl2NonOverlapTest(R2, n1, n2, α, lab) where R2 = CORR(R1), n1 = the number of elements in the 1st (or 2nd) column of R1 and n2 = the number of elements in the 3rd (or 4th) column of R1 |
| PCRIT(n, α, tails) | critical value for Pearson’s correlation coefficient based on two samples of size n with alpha of α (default .05) and tails = 1 or 2 (default) |

For CorrelTTest, CorrelTest and Correl2Test, if *lab* = TRUE then the output takes the form of a 4 × 2 range with the first column consisting of labels, while if *lab* = FALSE (default) the output takes the form of a 4 × 1 range without labels. Similarly, a column of labels is appended to the output of all the other functions when *lab* = TRUE (default FALSE).

The CorrTest, CorrLower, CorrUpper, CorrelTest and Correl2Test functions are based on a normal test using the Fisher transformation. The CorrTTest, CorrTLower, CorrTUpper and CorrelTTest functions are based on a t test in which the population correlation is assumed to be zero.

Throughout, *r* = the sample correlation, *ρ* = the hypothesized population correlation, *n* = the sample size and *tails* = the number of tails: 1 or 2 (default).
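The Fisher-transformation confidence interval underlying CorrLower and CorrUpper can be sketched in Python as follows (the function name corr_conf is my own; this is an illustrative re-implementation, not the add-in’s code):

```python
from math import atanh, tanh, sqrt
from statistics import NormalDist

def corr_conf(r, n, alpha=0.05):
    """1-alpha confidence interval for the population correlation:
    z' = atanh(r) is approximately normal with se = 1/sqrt(n-3),
    so the interval is tanh(z' +/- z_crit * se)."""
    z = atanh(r)
    se = 1 / sqrt(n - 3)
    zcrit = NormalDist().inv_cdf(1 - alpha / 2)
    return tanh(z - zcrit * se), tanh(z + zcrit * se)
```

Because tanh maps back into (−1, 1), the interval always stays inside the valid range for a correlation, unlike a naive r ± k interval.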

**Box-Cox Transformation**

| Function | Description |
|---|---|
| BOXCOX(R1, λ) | outputs a range containing the Box-Cox transformation of the data in range R1 using the given λ value; if λ is omitted, then the transformation which best normalizes the data in R1 is used, based on maximizing the log-likelihood function |
| BOXCOXLL(R1, λ) | log-likelihood function of the Box-Cox transformation of the data in R1 using the given lambda value |
| BOXCOXLambda(R1) | value of lambda which maximizes the log-likelihood function of the Box-Cox transformation of the data in R1 |
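The Box-Cox transformation applied by BOXCOX has a simple closed form; here is a Python sketch for a fixed λ (illustrative only; the λ-search used when λ is omitted is not shown):

```python
from math import log

def boxcox(data, lam):
    """Box-Cox transform: (x^lam - 1)/lam for lam != 0,
    and ln(x) for lam = 0 (requires all x > 0)."""
    if lam == 0:
        return [log(x) for x in data]
    return [(x ** lam - 1) / lam for x in data]
```

For example, λ = .5 maps 1, 4, 9 to 0, 2, 4 (a square-root-style compression of large values).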

**Non-parametric tests**

| Function | Description |
|---|---|
| SignTest(R1, m, tails) | p-value for the sign test, where R1 contains the sample data and m = the hypothesized median |
| RANK_COMBINED(x, R1, R2, order) | rank of x in the combined range R1 and R2 taking ties into account; order is as in RANK_AVG |
| RANK_SUM(R1, R2, order) | sum of the ranks of all the elements in R1 based on the combined range R1 and R2, taking ties into account; order is as in RANK_AVG |
| RANK_SUM(R1, k, order) | sum of the ranks of all the elements in the kth column of R1 taking ties into account; order is as in RANK_AVG |
| WILCOXON(R1, R2) | minimum of W and W′ for the samples contained in ranges R1 and R2 |
| WILCOXON(R1, n) | minimum of W and W′ for the samples contained in the first n columns of range R1 and the remaining columns of R1; if the second argument is omitted it defaults to 1 |
| WTEST(R1, R2, tails) | p-value of the Wilcoxon rank-sum test for the samples contained in ranges R1 and R2 |
| WTEST(R1, n, tails) | p-value of the Wilcoxon rank-sum test for the samples contained in the first n columns of range R1 and the remaining columns of R1; if the second argument is omitted it defaults to 1 |
| MANN(R1, R2) | U for the samples contained in ranges R1 and R2 |
| MANN(R1, n) | U for the samples contained in the first n columns of range R1 and the remaining columns of R1; if the second argument is omitted it defaults to 1 |
| MTEST(R1, R2, tails) | p-value of the Mann-Whitney U test for the samples contained in ranges R1 and R2 |
| MTEST(R1, n, tails) | p-value of the Mann-Whitney U test for the samples contained in the first n columns of range R1 and the remaining columns of R1; if the second argument is omitted it defaults to 1 |
| MANN_EXACT(R1, R2, tails, slow) | p-value of the Mann-Whitney exact test for the samples contained in ranges R1 and R2; tails = 1 or 2 (default) |
| SRank(R1, n) | T for a single sample contained in range R1 minus n; if the second argument is omitted it defaults to zero |
| SRTEST(R1, n, tails) | p-value for the signed-ranks test using the normal distribution approximation for the sample contained in range R1 minus n; if the second argument is omitted it defaults to zero |
| SRANK_EXACT(R1, med, tails, slow) | p-value for the signed-ranks exact test for the sample in range R1 based on a hypothesized median med (default 0) |
| SRankPair(R1, R2) | T for the paired samples contained in the one-column ranges R1 and R2 |
| SRankPair(R1) | T for the paired samples contained in range R1, where R1 consists of two columns, one for each sample |
| SRTESTPair(R1, R2, tails) | p-value for the signed-ranks test using the normal distribution approximation for the paired samples contained in the one-column ranges R1 and R2 |
| SRANKPair_EXACT(R1, R2, tails, slow) | p-value for the signed-ranks exact test for the samples in ranges R1 and R2 |
| SRTESTPair(R1, , tails) | p-value for the signed-ranks test using the normal distribution approximation for the paired samples contained in range R1, where R1 consists of two columns, one for each sample |
| KRUSKAL(R1) | Kruskal-Wallis test statistic for the data in R1 |
| KTEST(R1) | p-value for the Kruskal-Wallis test for the data in R1 |
| FRIEDMAN(R1) | Friedman test statistic for the data in R1 |
| FrTEST(R1) | p-value for Friedman’s test for the data in R1 |
| MOODS_STAT(R1) | chi-square statistic for Mood’s median test for the data in R1 |
| MOODS_TEST(R1) | p-value for Mood’s median test for the data in R1 |
| COCHRAN(R1, raw, cont) | Cochran’s Q statistic for the data in R1 |
| QTEST(R1, raw, cont) | p-value for Cochran’s Q test for the data in R1 |
| RLower_CRIT(n1, n2, α, tails) | lower critical value for the runs test with n1 T’s and n2 F’s, where α is the significance level (default .05) and tails = 1 or 2 (default) |
| RUpper_CRIT(n1, n2, α, tails) | upper critical value for the runs test with n1 T’s and n2 F’s, where α is the significance level (default .05) and tails = 1 or 2 (default) |

The argument *tails* = the number of tails: 1 (default) or 2, except for MANN_EXACT, SRANK_EXACT and SRANKPair_EXACT, where the default is 2. These exact test functions are limited in the size of the samples they can handle; the *slow* argument is used to increase these limits.

Two data input formats are supported for the COCHRAN, QTEST and QRATIO functions: raw data (*raw* = TRUE) and summarized data, i.e. a multi-variable frequency table (*raw* = FALSE, default). If *cont* = TRUE (default), then a continuity correction is used with two samples (equivalent to McNemar’s Test).
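The U statistic returned by MANN can be computed from rank sums over the combined sample, with ties given their average rank; the following Python sketch shows that calculation (my own re-implementation, not the add-in’s code):

```python
def ranks_with_ties(values):
    """1-based ranks of the values, with tied values assigned
    the average of the ranks they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j + 2) / 2  # average of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def mann_whitney_u(s1, s2):
    """U = min(U1, U2), where U1 = R1 - n1(n1+1)/2 and
    R1 = rank sum of sample 1 within the combined sample."""
    combined = list(s1) + list(s2)
    ranks = ranks_with_ties(combined)
    n1, n2 = len(s1), len(s2)
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    return min(u1, n1 * n2 - u1)
```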

There are also the following array functions:

| Function | Description |
|---|---|
| MANN_TEST(R1, R2, lab, alpha, tails, ties, cont) | returns the column range: U, alpha, tails, z, r, U-crit, p-value for the Mann-Whitney test for the data in ranges R1 and R2 |
| MANN_CONF(R1, R2, lab, ttype, alpha) | returns the column range: U-crit, alpha, lower, upper, median, U-crit+1, alpha, lower, upper for the Mann-Whitney test for the data in ranges R1 and R2 |
| SRANK_TEST(R1, R2, lab, alpha, tails, ties, cont) | returns the column range: T, alpha, tails, z, r, T-crit, p-value for the paired Wilcoxon signed-ranks test for the data in ranges R1 and R2 |
| SRANK_TEST(R1, hyp, lab, alpha, tails, ties, cont) | returns the column range: T, alpha, tails, z, r, T-crit, p-value for the one-sample signed-ranks test for the data in range R1 and hypothesized median hyp (default 0) |
| SR_CONF(R1, R2, lab, ttype, alpha, nzero) | returns the column range: T-crit, alpha, lower, upper, median, T-crit+1, alpha, lower, upper for the signed-ranks test for the data in ranges R1 and R2; if nzero = TRUE (default), then differences between values in R1 and R2 that are zero are eliminated |
| SR_CONF(R1, hyp, lab, ttype, alpha, nzero) | returns the column range: T-crit, alpha, lower, upper, median, T-crit+1, alpha, lower, upper for the one-sample signed-ranks test for the data in range R1 with hypothesized median hyp; if nzero = TRUE (default), then differences between the values in R1 and hyp that are zero are eliminated |
| RUNSTEST(s, lab, tails) | returns the column range: n1, n2, mean, std dev, runs, tails, z-stat, p-value (normal approximation), p-value (exact test) for the one-sample runs test on the data in string s |
| RUNSTEST(R1, lab, tails) | returns the column range: n1, n2, mean, std dev, runs, tails, z-stat, p-value (normal approximation), p-value (exact test) for the one-sample runs test on the numeric data in range R1 |
| RUNS2TEST(R1, R2, lab, iter) | returns the column range: n1, n2, mean, std dev, runs, tails, z-stat, p-value (normal approximation), p-value (exact test) for the two-sample runs test on the data in ranges R1 and R2 when iter = 0 (default); if iter > 0, the test is run iter times to deal with ties, with one column of output for each unique value of the number of runs |
| QRATIO(R1, raw) | returns a row range with the percentage for each independent variable in Cochran’s Q test |

If *lab* = TRUE then the output includes a column of labels, while if *lab* = FALSE (default) only the data is output; *tails* = the number of tails: 1 or 2 (default); *alpha* is the significance level (default .05); if *ties* = TRUE (default) then a correction for ties is applied; and if *cont* = TRUE (default) then a continuity correction is applied. For MANN_TEST, if COUNT(R1)+COUNT(R2) ≤ 26, and for SRANK_TEST, if COUNT(R1) ≤ 26, then an exact test is also performed, in which case the critical value and p-value for this test are also output.

For small samples, the p-value of the one-tailed Wilcoxon signed-ranks test for test statistic *T* and sample size *n* is given by PERMDIST(*T*, *n*, TRUE), and the critical value by PERMINV(*α*, *n*). For the two-tailed test use 2 × PERMDIST(*T*, *n*, TRUE) and PERMINV(*α*/2, *n*). See below for PERMDIST and PERMINV.

For MANN_CONF, if *ttype* = 0 (default) then the normal approximation is used; if *ttype* = 1 then MCRIT and MPROB are used with harmonic interpolation; if *ttype* = 2 then MCRIT and MPROB are used with linear interpolation; and if *ttype* = 3 then MANNINV and MANNDIST are used. The situation is similar for SR_CONF.

**Kolmogorov-Smirnov and Lilliefors functions**

| Function | Description |
|---|---|
| KDIST(x, m) | value of the Kolmogorov distribution function at x |
| KINV(p, m) | inverse of KDIST; i.e. KINV(p, m) = x where 1 − KDIST(x, m) = p |
| KSDIST(x, n) | p-value of the one-sample Kolmogorov-Smirnov test at x for samples of size n |
| KSINV(p, n) | critical value of the one-sample Kolmogorov-Smirnov test at p for samples of size n |
| LDIST(x, n) | p-value of the Lilliefors test for normality at x for samples of size n |
| LINV(p, n) | critical value of the Lilliefors test at p for samples of size n |
| LTEST(R1) | D-max for the data in range R1 based on the KS test or Lilliefors test |
| KSDIST(x, n1, n2, b, m) | p-value of the two-sample Kolmogorov-Smirnov test at x (i.e. the D statistic) for samples of size n1 and n2 |
| KSINV(p, n1, n2, b, iter, m) | critical value of the two-sample Kolmogorov-Smirnov test at p for samples of size n1 and n2 |
| KS2TEST(R1, R2, lab, α, b, iter, m) | array function which outputs a column vector with the values D-stat, p-value, D-crit, n1, n2 from the two-sample KS test for the samples in ranges R1 and R2, where α is the significance level (default .05) and b, iter and m are as in KSINV |

If R2 is omitted (the default) then R1 is treated as a frequency table. If *lab* = TRUE then an extra column of labels is included in the output; thus the output is a 5 × 2 range instead of the 5 × 1 range produced when *lab* = FALSE (default). *m* = the number of iterations used in calculating an infinite sum (default 10).
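The D-stat reported by KS2TEST is the maximum distance between the two empirical cdfs; here is a Python sketch of that calculation (illustrative only; the p-value and critical-value computations are omitted):

```python
from bisect import bisect_right

def ks2_dstat(s1, s2):
    """Two-sample KS statistic: max |F1(x) - F2(x)| over the
    empirical cdfs F1, F2 of the two samples."""
    n1, n2 = len(s1), len(s2)
    s1s, s2s = sorted(s1), sorted(s2)
    xs = sorted(set(s1) | set(s2))  # cdfs can only change at data points
    return max(abs(bisect_right(s1s, x) / n1 - bisect_right(s2s, x) / n2)
               for x in xs)
```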

**Other Goodness of Fit functions**

| Function | Description |
|---|---|
| ANDERSON(R1, dist, k) | the Anderson-Darling D statistic for the theoretical cdf values in range R1 (assumed to be sorted in ascending order) based on the distribution defined by dist |
| ADTEST(R1, dist, lab, iter, alpha) | range containing the Anderson-Darling D statistic, the critical value and the estimated p-value for the Anderson-Darling test on the data in range R1 (not necessarily sorted) based on the distribution defined by dist |
| GOFTEST(R1, dist, lab, iter) | range containing the parameters of the distribution specified by dist, the p-value of the chi-square goodness-of-fit test, and the data value in R1 which has the lowest expected frequency, followed by this expected frequency |
| GOFTESTExact(R1, dist, lab, param1, param2) | range containing param1 and param2, the p-value of the chi-square goodness-of-fit test for the distribution specified by dist and these parameter values, and the data value in R1 which has the lowest expected frequency, followed by this expected frequency |
| FIT_TEST(R1, R2, npar) | p-value for the chi-square goodness-of-fit test, where R1 = the array of observed data, R2 = the array of expected values and npar = the number of unknown parameters |

*dist* = 0 (default) for a generic distribution with no unknown parameters, *dist* = 1 for the normal distribution, *dist* = 2 for the exponential distribution, *dist* = 3 for the Weibull distribution and *dist* = 4 for the gamma distribution. For GOFTEST and GOFTESTExact, we also have *dist* = 5 for beta distribution and *dist* = 6 for uniform distribution.

*k* = the shape parameter (alpha) of the gamma distribution (default 1) and is ignored for the other distributions. *alpha* is the significance level (default .05). If *h* = TRUE then harmonic interpolation is used in calculating the critical value; otherwise linear interpolation is used. If *lab* = TRUE (default FALSE), then an extra column of labels is appended to the output from ADTEST.

For ADTEST, *iter* is the number of iterations used in the calculation of an approximate p-value (default 20). For GOFTEST, if *iter* = 0, then the method of moments is used to estimate the unknown parameters; if *iter* = -1, then the pure method of moments is used; if *iter* = -2, then the regression approach is used (valid only for the Weibull distribution); and if *iter* > 0 (default 20), then an iterative approach is used with *iter* iterations (except that no iteration is actually performed for the normal or exponential distribution; instead the result is the same as for *iter* = -1).
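For reference, the statistic that ANDERSON computes from the sorted cdf values u_1 ≤ … ≤ u_n is the standard Anderson-Darling formula A² = −n − (1/n) Σ (2i−1)[ln u_i + ln(1−u_{n+1−i})]. The Real Statistics functions run inside Excel, so the following Python sketch is only an illustration of that formula, not the library code:

```python
import math

def anderson_darling(u):
    """Anderson-Darling A^2 statistic from theoretical cdf values
    u[0] <= ... <= u[n-1] (ANDERSON assumes R1 is sorted this way)."""
    n = len(u)
    s = sum((2 * i - 1) * (math.log(u[i - 1]) + math.log(1 - u[n - i]))
            for i in range(1, n + 1))
    return -n - s / n
```

For example, a single observation with theoretical cdf value 0.5 gives A² = −1 − 2 ln 0.5 ≈ 0.386.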

**Interrater Reliability functions**

KAPPA(R1, k, lab, α, orig) |
array function which returns Cohen’s (Fleiss’) kappa for the data in R1 when k = 0 (default) and kappa for category k when k > 0, plus standard error, z-stat, p-value and lower and upper bound of 1−α confidence interval (α defaults to .05) |

WKAPPA(R1, R2, lab, α) |
array function which returns weighted kappa for the data in R1 using the weights in R2, plus standard error and lower and upper bound of 1−α confidence interval (α defaults to .05). |

ICC(R1, class, type, lab, α) |
array function which outputs the intraclass correlation coefficient ICC(class, type) plus the lower and upper bound of the 1−α confidence interval for the data in R1; default values are class = 2, type = 1, α = .05. |

KENDALLW(R1, lab, ties) |
array function which returns a column range consisting of Kendall’s W, r, χ^{2}, df and p-value; if ties = TRUE then a ties correction is applied |

KALPHA(R1, weights, ratings) |
Krippendorff’s alpha for the agreement table in range R1 based on the weights and ratings in the second and third arguments. |

KTRANS(R1, col) |
array with the agreement table for Krippendorff’s alpha that corresponds to the rating table in R1 when column col is removed; no column is removed if col = 0 (default) |

KRIP_SES(R1, lab, weights, ratings, alpha, scorrection) |
column array that contains: Krippendorff’s alpha for the data in the agreement table in range R1, the corresponding standard error for subjects and the lower and upper ends of the 1 – alpha confidence interval for Krippendorff’s alpha. |

KRIP(R1, lab, weights, ratings, alpha, scorrection, rcorrection) |
column array that contains: Krippendorff’s alpha for the data in the rating table in range R1, the corresponding standard error for subjects and the lower and upper ends of the 1 – alpha confidence interval for Krippendorff’s alpha, followed by the total s.e. (including subjects and raters) and the lower and upper ends of the 1 – alpha confidence interval corresponding to this s.e. |

KRIP_SER(R1, lab, weights, ratings, alpha, rcorrection) |
the standard error for raters, where range R1 contains data from a rating table for Krippendorff’s alpha. |

GWET_AC2(R1, weights, ratings) |
Gwet’s AC2 for the agreement table in range R1 based on the weights and ratings in the second and third arguments. |

GTRANS(R1, col) |
array with agreement table for Gwet’s AC2 that corresponds to the rating table in R1 when column col is removed; no column is removed if col = 0 (default) |

GWET_SES(R1, lab, weights, ratings, alpha, scorrection) |
column array that contains: Gwet’s AC2 for the data in the agreement table in range R1, the corresponding standard error for subjects and the lower and upper ends of the 1 – alpha confidence interval for Gwet’s AC2. |

GWET(R1, lab, weights, ratings, alpha, scorrection, rcorrection) |
column array that contains: Gwet’s AC2 for the data in the rating table in range R1, the corresponding standard error for subjects and the lower and upper ends of the 1 – alpha confidence interval for Gwet’s AC2, followed by the total s.e. (including subjects and raters) and the lower and upper ends of the 1 – alpha confidence interval corresponding to this s.e. |

GWET_SER(R1, lab, weights, ratings, alpha, rcorrection) |
the standard error for raters, where range R1 contains data from a rating table for Gwet’s AC2. |

If *lab* = TRUE, then an extra column of labels is appended to the output (default = FALSE).

If range R2 is omitted in WKAPPA it defaults to the unweighted measure where the weights on the main diagonal are all zeros and the other weights are ones. Range R2 can also be replaced by a number *r*. A value of *r* = 1 means the weights are linear (as in Figure 1), while a value of 2 means the weights are quadratic. In general this means that the equivalent weights range would contain zeros on the main diagonal and values (|*i−j*|)^{r} in the *i*th row and *j*th column when *i ≠ j*.

The *weights* argument is a square range containing the weights or the value 0 (default) if categorical weights are used, 1 if ordinal weights are used, 2 for interval weights and 3 for ratio weights.

The *ratings* argument is a row or column range containing the rating values. If omitted then the ratings 1, …, *q* are used where *q* is the size of the range.

*alpha* is the significance level (default .05). *scorrection* and *rcorrection* are the subject and rater correction factors (referred to as the *f* parameters).
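The equivalent weights range described above is easy to build directly. A minimal Python sketch of that construction (illustration only; `wkappa_weights` is a hypothetical helper, not part of the Resource Pack):

```python
def wkappa_weights(q, r):
    """Disagreement weights used when WKAPPA's R2 is replaced by a number r:
    zeros on the main diagonal and |i - j|**r elsewhere
    (r = 1 linear, r = 2 quadratic), for q rating categories."""
    return [[abs(i - j) ** r for j in range(q)] for i in range(q)]
```

For q = 3 categories with linear weights this produces rows (0, 1, 2), (1, 0, 1) and (2, 1, 0).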

**Other Reliability functions**

SPLIT_HALF(R1, R2) |
split-half coefficient for the data in ranges R1 and R2 |

SPLITHALF(R1, type) |
split-half coefficient for the scores in the first half of the items in R1 vs. the second half of the items if type = 0 and the odd items in R1 vs. the even items if type = 1. |

GUTTMAN_SPLIT(R1, s) |
Guttman’s lambda for the data in range R1 based on the split described by string s consisting of 0’s and 1’s |

GUTTMAN(R1, iter) |
Guttman’s reliability measure for the data in range R1, i.e. the maximum Guttman’s lambda over all possible splits; when the number of splits is too large, the optional argument iter can be used to find an approximate maximum Guttman’s lambda based on iter randomly generated splits. |

SB_SPLIT(R1, s) |
split-half coefficient (after Spearman-Brown correction) for data in R1 based on the split described by string s consisting of 0’s and 1’s |

SB_CORRECTION(r, n, m) |
Spearman-Brown correction when the split-half correlation based on an m vs. n–m split is r. If n is omitted, then it is assumed that there is a 50-50 split. If n is present but m is omitted, then it is assumed that m = n/2. |

SB_PRED(m, rho, n) |
Spearman-Brown predicted reliability based on m items when Spearman-Brown for n items is rho. |

SB_SIZE(rho1, rho, n) |
the number of items necessary to bring the Spearman-Brown predicted reliability up (or down) to rho1 from n items with Spearman-Brown of rho. |

KUDER(R1) |
Kuder-Richardson Formula 20 coefficient for the data in range R1 |

CRONALPHA(R1, k) |
Cronbach’s alpha for the data in range R1 if k = 0 and Cronbach’s alpha with kth item (i.e. column) removed if k > 0 |

CALPHA(R1) |
array function which returns a row of Cronbach’s alpha for R1 with each item removed |

ITEMDIFF(R1, mx) |
item difficulty index for the scores in R1 where mx is the maximum score for the item |

ITEMDISC(R1, R2, p, mx) |
item discrimination index where R1 contains the scores for each subject for a single item and R2 contains the corresponding scores for all items based on the top/bottom p% of total scores and mx is the maximum score for the item whose scores are contained in R1 |
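Two of the formulas behind these functions are simple enough to state directly: the Spearman-Brown prophecy formula used by SB_PRED, and the usual definition of Cronbach’s alpha used by CRONALPHA. A Python sketch of both (illustration only; the choice of sample variance with denominator n−1 is an assumption):

```python
def sb_pred(m, rho, n):
    """Spearman-Brown predicted reliability for m items, given reliability
    rho for n items: k*rho / (1 + (k - 1)*rho) where k = m/n."""
    k = m / n
    return k * rho / (1 + (k - 1) * rho)

def cronbach_alpha(scores):
    """Cronbach's alpha for rows = subjects, columns = items,
    using sample variances (denominator n - 1)."""
    k, n = len(scores[0]), len(scores)

    def var(xs):
        mu = sum(xs) / len(xs)
        return sum((x - mu) ** 2 for x in xs) / (len(xs) - 1)

    item_vars = [var([row[j] for row in scores]) for j in range(k)]
    total_var = var([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)
```

For example, doubling a test with reliability .5 predicts a reliability of 2(.5)/(1 + .5) = 2/3, and perfectly correlated items yield an alpha of 1.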

**Partitioning**

INIT_SPLIT(n, m) |
a string of length n consisting of m 0’s followed by n–m 1’s; if m is omitted it defaults to n/2. |

NEXT_SPLIT(s) |
the string representing the next split after the split represented by s. |

RAND_SPLIT(n, m) |
a random string of length n consisting of m 0’s and n–m 1’s; if m is omitted it defaults to n/2. |

COV_SPLIT(R1, s) |
sample covariance for the data in range R1 based on the split described by string s. |

CORR_SPLIT(R1, s) |
correlation for the data in range R1 based on the split described by string s. |

INIT_PARTITION(n) |
a string consisting of n 0’s |

NEXT_PARTITION(s) |
the string representing the next partition after the partition represented by s. |

RAND_PARTITION(n) |
a random string of length n |

These strings consist only of 0’s and 1’s.
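Since INIT_SPLIT produces the lexicographically smallest arrangement (all 0’s before all 1’s), NEXT_SPLIT can be understood as the standard next-permutation step on a 0/1 string. A Python sketch of that idea (the end-of-sequence convention here, returning None after the last split, is a guess; the actual function may behave differently):

```python
def next_split(s):
    """Next 0/1 string with the same number of 0's and 1's in
    lexicographic order, or None when s is already the last one."""
    a = list(s)
    # find the rightmost position that can still be increased
    i = len(a) - 2
    while i >= 0 and a[i] >= a[i + 1]:
        i -= 1
    if i < 0:
        return None
    # swap it with the smallest larger element to its right, then reverse the tail
    j = len(a) - 1
    while a[j] <= a[i]:
        j -= 1
    a[i], a[j] = a[j], a[i]
    a[i + 1:] = reversed(a[i + 1:])
    return "".join(a)
```

Starting from "0011" this enumerates "0101", "0110", "1001", "1010", "1100" and then stops.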

**Frequency table**

FREQTABLE(R1) |
array function which returns an n × 3 array which contains the frequency table for the data in range R1, where n = the number of unique values in R1 (i.e. the number of data elements in R1 without duplicates) |

FREQTABLE(R1, bsize, bmax) |
an array function which produces the frequency table for the data in range R1, assuming equally sized bins of size bsize; bmax = largest bin value, if omitted maximum value in R1 is used |

FREQ2RAW(R1, R2) |
array function which takes a frequency table described by R1 (data items) and R2 (frequencies) and outputs the equivalent raw data (in the form of a column vector) |

FREQ2RAW(R1) |
array function which is equivalent to FREQ2RAW(R2, R3) where R2 is the first column in R1 and R3 is the second column in R1 |
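The round trip between a frequency table and raw data is straightforward. A Python sketch of the idea (illustration only; FREQTABLE’s third output column is not reproduced here):

```python
def freqtable(data):
    """First two columns of FREQTABLE: each unique value with its count,
    in ascending order."""
    return [(v, data.count(v)) for v in sorted(set(data))]

def freq2raw(values, freqs):
    """FREQ2RAW: expand a frequency table back into raw data (one column)."""
    out = []
    for v, f in zip(values, freqs):
        out.extend([v] * int(f))
    return out
```

For example, the data 1, 2, 2, 3 yields the table (1, 1), (2, 2), (3, 1), and expanding values 1, 2 with frequencies 2, 3 gives the raw column 1, 1, 2, 2, 2.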

**Noncentral distribution functions**

NT_DIST (t, df, δ, cum, iter, prec) |
value of the noncentral t distribution T(df,δ) at t if cum = TRUE and pdf at t if cum = FALSE |

NCHISQ_DIST (x, df, λ, cum, iter, prec) |
value of the noncentral chi-square distribution χ^{2}(df,λ) at x if cum = TRUE and pdf at x if cum = FALSE |

NF_DIST(x, df1, df2, λ, cum, iter, prec) |
value of the noncentral F distribution F(df1,df2,λ) at x if cum = TRUE and pdf at x if cum = FALSE |

NT_INV (p, df, δ, iter, iter0) |
inverse of the noncentral t distribution T(df,δ) at p |

NCHISQ_INV (p, df, λ, iter, iter0, prec) |
inverse of the noncentral chi-square distribution χ^{2}(df,λ) at p |

NF_INV(p, df1, df2, λ, iter, iter0, prec) |
inverse of the noncentral F distribution F(df1,df2,λ) at p |

NT_NCP (p, df, x, iter, iter0, prec) |
ncp value δ such that NT_DIST(x, df, δ, TRUE, m) = p |

NCHISQ_NCP (p, df, x, iter, iter0, prec) |
ncp value λ such that NCHISQ_DIST(x, df, λ, TRUE, m) = p |

NF_NCP(p, df1, df2, x, iter, iter0, prec) |
ncp value λ such that NF_DIST(x, df1, df2, λ, TRUE, m) = p |

*iter* = maximum number of terms in the infinite sum that will be calculated (default 1000), *prec* = desired level of accuracy for the calculation (default 0.000000001 for the χ^{2} and F distributions and 0.000000000001 for the t distribution). *iter0* = the number of iterations used to calculate the inverse or noncentrality parameter (default 40).
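The “infinite sum” referred to above is, for the noncentral chi-square, the standard Poisson-mixture representation: the cdf is a Poisson(λ/2)-weighted sum of central chi-square cdfs with df + 2j degrees of freedom. A self-contained Python sketch of this representation (illustration only, not the library’s implementation):

```python
import math

def gammainc_lower(a, x, terms=200):
    """Regularized lower incomplete gamma P(a, x) via its power series."""
    if x <= 0:
        return 0.0
    total, term = 0.0, 1.0 / a
    for k in range(1, terms + 1):
        total += term
        term *= x / (a + k)
    return total * math.exp(-x + a * math.log(x) - math.lgamma(a))

def chisq_cdf(x, df):
    """Central chi-square cdf."""
    return gammainc_lower(df / 2, x / 2)

def nchisq_dist(x, df, lam, terms=1000):
    """Noncentral chi-square cdf (NCHISQ_DIST with cum = TRUE) as a
    Poisson(lam/2)-weighted mixture of central chi-square cdfs."""
    half = lam / 2
    w = math.exp(-half)            # Poisson weight for j = 0
    total = 0.0
    for j in range(terms):
        total += w * chisq_cdf(x, df + 2 * j)
        w *= half / (j + 1)
    return total
```

With λ = 0 the sum collapses to the central chi-square cdf, which is a useful sanity check.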

**Statistical power functions**

NORM1_POWER(d, n, tails, α) |
power of a one sample normal test when d = Cohen’s effect size |

NORM2_POWER(mm, s1, s2, n1, n2, tails, α) |
power of a two sample normal test when mm = |μ_{2}−μ_{1}| and s1 and s2 are the corresponding population standard deviations |

BINOM_POWER(p0, p1, n, tails, α) |
power of a one sample binomial test when p0 = probability of success on a single trial based on the null hypothesis, p1 = expected probability of success on a single trial |

CORREL1_POWER(r0, r1, n, tails, α) |
power of a one sample correlation test using the Fisher transformation when r0 = population correlation (based on the null-hypothesis), r1 = effect size (observed correlation) |

T1_POWER(d, n, tails, α, iter, prec) |
power of a one sample t test when d = Cohen’s effect size |

T2_POWER(d, n1, n2, tails, α, iter, prec) |
power of a two sample t test when d = Cohen’s effect size; if n2 is omitted or set to 0 then n2 is considered to be equal to n1 |

VAR1_POWER(ratio, n, tails, α) |
power of a one sample variance test where ratio = (effect size) |

VAR2_POWER(ratio, n1, n2, tails, α) |
power of a two sample variance test where ratio = (effect size) |

CHISQ_POWER(w, n, df, α, iter, prec) |
power of a chi-square goodness of fit or independence test where w = Cohen’s effect size and df = degrees of freedom |

ANOVA1_POWER(f, n, k, type, α, iter, prec) |
power of a one-way ANOVA where k = # of groups, f = Cohen’s effect size if type = 1 (default), f = RMSSE effect size if type = 2 and f = noncentrality parameter if type = 0 |

REG_POWER(e, n, k, type, α, iter, prec) |
power of multiple regression where k = # of predictors, e = Cohen’s f^{2} effect size if type = 1 (default), e = R^{2} effect size if type = 2 and e = noncentrality parameter if type = 0 |

CALPHA_POWER(ca0, ca1, n, k, tails, α) |
power of Cronbach’s alpha test where ca0 = Cronbach’s alpha under the null hypothesis, ca1 = Cronbach’s alpha under the alternative hypothesis and k = # of items |

ICC_POWER(ρ0, ρ1, n, k, α) |
power of ICC(1,1) test where ρ0 = ICC(1,1) under the null hypothesis, ρ1 = ICC(1,1) under the alternative hypothesis and k = # of items |

*n*, *n1*, *n2* = the sample sizes, *tails* = # of tails: 1 or 2 (default), *α* = alpha (default = .05), and *iter* and *prec* are as for the noncentral distribution functions.
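For the simplest case, the one-sample normal test, the power calculation reduces to a closed-form expression in the standard normal cdf. A Python sketch of the textbook formula that NORM1_POWER implements (an illustration under that assumption, not the library code):

```python
from statistics import NormalDist

def norm1_power(d, n, tails=2, alpha=0.05):
    """Power of a one-sample normal (z) test with Cohen's effect size
    d = (mu1 - mu0) / sigma, sample size n."""
    nd = NormalDist()
    z_crit = nd.inv_cdf(1 - alpha / tails)
    shift = d * n ** 0.5
    power = 1 - nd.cdf(z_crit - shift)
    if tails == 2:
        power += nd.cdf(-z_crit - shift)   # usually tiny second-tail term
    return power
```

As a check, with d = 0 the power equals α, and a medium effect d = 0.5 with n = 32 gives power just above .80.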

**Sample size functions**

NORM1_SIZE(d, 1−β, tails, α) |
minimum sample size required to obtain power of at least 1−β in a one sample normal test when d = Cohen’s effect size |

NORM2_SIZE(mm, s1, s2, 1−β, tails, α, nratio) |
minimum sample size required to obtain power of at least 1−β in a two sample normal test when mm = |μ_{2}−μ_{1}| and s1 and s2 are the corresponding population standard deviations |

BINOM_SIZE(p0, p1, 1−β, tails, α) |
minimum sample size required to obtain power of at least 1−β in a one sample binomial test when p0 = probability of success on a single trial based on the null hypothesis, p1 = expected probability of success on a single trial |

CORREL1_SIZE(r0, r1, 1−β, tails, α) |
minimum sample size required to obtain power of at least 1−β in a one sample correlation test using the Fisher transformation when r0 = population correlation (based on the null-hypothesis), r1 = effect size (observed correlation) |

T1_SIZE(d, 1−β, tails, α, iter, prec) |
minimum sample size required to obtain power of at least 1−β in a one sample t test when d = Cohen’s effect size |

T2_SIZE(d, 1−β, tails, α, nratio, iter, prec) |
minimum sample size required to obtain power of at least 1−β in a two sample t test when d = Cohen’s effect size |

VAR1_SIZE(ratio, 1−β, tails, α) |
minimum sample size required to obtain power of at least 1−β in a one sample variance test where ratio = (effect size) |

VAR2_SIZE(ratio, 1−β, tails, α, nratio) |
minimum sample size required to obtain power of at least 1−β in a two sample variance test where ratio = (effect size) |

CHISQ_SIZE(w, df, 1−β, α, iter, prec) |
minimum sample size required to obtain power of at least 1−β in a chi-square goodness of fit or independence test where w = Cohen’s effect size and df = degrees of freedom |

ANOVA1_SIZE(f, k, 1−β, type, α, iter, prec) |
minimum sample size required to obtain power of at least 1−β for a one-way ANOVA where k = # of groups, f = Cohen’s effect size if type = 1 (default) and f = RMSSE effect size if type = 2 |

REG_SIZE(e, k, 1−β, type, α, iter, prec) |
minimum sample size required to obtain power of at least 1−β for multiple regression where k = # of predictors, e = Cohen’s f^{2} effect size if type = 1 (default) and e = R^{2} effect size if type = 2 |

CALPHA_SIZE(ca0, ca1, k, 1−β, tail, α) |
minimum sample size required to obtain power of at least 1−β for Cronbach’s alpha test where ca0 = Cronbach’s alpha under the null hypothesis, ca1 = Cronbach’s alpha under the alternative hypothesis and k = # of items |

ICC_SIZE(ρ0, ρ1, k, 1−β, α) |
minimum sample size required to obtain power of at least 1−β for ICC(1,1) test where ρ0 = ICC(1,1) under the null hypothesis, ρ1 = ICC(1,1) under the alternative hypothesis and k = # of items |

*tails* = # of tails: 1 or 2 (default), *α* = alpha (default = .05), *iter* and *prec* are as for the noncentral distribution functions, *nratio* = the size of the second sample divided by the size of the first sample (default = 1) or the negative of a specific value for the size of the second sample; the default for 1−*β* is .80.
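A sample-size function is just the power calculation run in reverse: search for the smallest n whose power reaches the target. A self-contained Python sketch for the one-sample normal case (an illustration of the idea behind NORM1_SIZE; a linear search is used here for clarity, the library may solve it differently):

```python
from statistics import NormalDist

def norm1_size(d, power=0.80, tails=2, alpha=0.05):
    """Smallest n giving at least the target power in a one-sample
    normal test with Cohen's effect size d."""
    nd = NormalDist()
    z_crit = nd.inv_cdf(1 - alpha / tails)
    n = 1
    while True:
        shift = abs(d) * n ** 0.5
        p = 1 - nd.cdf(z_crit - shift)
        if tails == 2:
            p += nd.cdf(-z_crit - shift)
        if p >= power:
            return n
        n += 1
```

For example, detecting d = 0.5 with 80% power in a two-tailed test at α = .05 requires n = 32, matching the closed-form approximation ((z_{1−α/2} + z_{1−β})/d)².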

**Weibull distribution functions**

WEIBULL_INV(p, α, β) |
inverse at p of the Weibull distribution with parameters α and β |

WEIBULL_MRL(x, α, β) |
the mean residual life of an element that follows a Weibull distribution with parameters α and β that has not yet failed at time x |

WEIBULL_CMEAN(R1, α, β, ncensor, censor) |
expected mean of data in R1 that follows a Weibull distribution with parameters α and β when augmented by ncensor censored elements at time censor. |

WEIBULL_CVAR(R1, α, β, ncensor, censor, iter) |
expected variance of data in R1 that follows a Weibull distribution with parameters α and β when augmented by ncensor censored elements at time censor based on a Monte Carlo simulation of iter random draws. |
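The Weibull inverse has a closed form: with cdf F(x) = 1 − exp(−(x/β)^α), solving F(x) = p gives x = β(−ln(1−p))^{1/α}. A Python sketch of WEIBULL_INV under that parameterization (α = shape, β = scale):

```python
import math

def weibull_inv(p, alpha, beta):
    """Inverse of the Weibull cdf F(x) = 1 - exp(-(x/beta)**alpha):
    the x with F(x) = p, for 0 <= p < 1."""
    return beta * (-math.log(1 - p)) ** (1 / alpha)
```

With α = 1 this reduces to the inverse of the exponential distribution with mean β.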

**Other distribution functions**

MULTINOMDIST(R1, R2) |
value of multinomial distribution where R1 contains the number of successes and R2 contains the corresponding probabilities of success |

UNIFORM_DIST(x, α, β, cum) |
pdf value at x for uniform distribution with end points α and β when cum = FALSE and corresponding cdf value when cum = TRUE |

UNIFORM_INV(p, α, β) |
inverse at p of the uniform distribution with end points α and β |

POISSON_INV(p, λ) |
inverse at p of the Poisson distribution with mean λ |

NEGBINOM_INV(p, k, pp) |
inverse at p of the negative binomial distribution with parameters k and pp |

EXPON_INV(p, lambda) |
inverse at p of the exponential distribution with parameter lambda |

HYPGEOM_INV(p, n, k, m) |
inverse at p of the hypergeometric distribution with parameters n, k and m |

BETA(x, y) |
value of the beta function at x and y |

QDIST(q, k, df) |
studentized q cumulative distribution value for q with k independent variables and df degrees of freedom |

QINV(p, k, df, tails) |
inverse of the studentized q distribution, i.e. the critical value for the studentized q range; tails = 1 or 2 (default) |

PERMDIST(x, n, cum, slow) |
pdf of permutation distribution with n elements at x if cum = FALSE and the cdf value if cum = TRUE |

PERMINV(p, n, slow) |
inverse of the permutation distribution at p; i.e. the least value of x such that PERMDIST(x, n, TRUE) ≥ p |

PERM2DIST(x, n1, n2, cum, slow) |
pdf of two sample permutation distribution with n1 and n2 elements at x if cum = FALSE and the cdf value if cum = TRUE |

PERM2INV(p, n1, n2, slow) |
inverse of the two sample permutation distribution at p; i.e. the least value of x such that PERM2DIST(x, n1, n2, TRUE) ≥ p |

MANNDIST(x, n1, n2, tails, slow) |
p-value of the Mann-Whitney exact distribution with n1, n2 elements at x, where tails = 1 (default) or 2 |

MANNINV(p, n1, n2, tails, slow) |
inverse of the Mann-Whitney exact distribution at p; i.e. the least value of x such that MANNDIST(x, n1, n2) ≥ p |

RUNSDIST(r, n1, n2, cum) |
probability of getting r runs from a string with n1 T’s and n2 F’s if cum = FALSE (i.e. the pdf at r) and the probability of getting at most r runs from a string of n1 T’s and n2 F’s if cum = TRUE (i.e. the cdf at r) |

RUNSINV(p, n1, n2) |
inverse at p of the runs distribution with n1 T’s and n2 F’s |

F_DIST(x, df1, df2, cum) |
equivalent to F.DIST, except that df1 and df2 can be non-integer |

F_INV(p, df1, df2) |
inverse at p of the F distribution based on F_DIST |

IGAMMA_DIST(x, alpha, beta, cum) |
inverse gamma distribution at x |

IGAMMA_INV(p, alpha, beta) |
inverse of the inverse gamma distribution at p |

ICHISQ_DIST(x, df, cum) |
inverse chi-square distribution at x |

ICHISQ_INV(p, df) |
inverse at p of the inverse chi-square cdf |

DIRICHLET_DIST(pvector, avector) |
Dirichlet distribution at pvector (whose values add up to 1) based on alpha parameters in avector |

DIRICHLET_RAND(avector) |
an array pvector consisting of random values for the Dirichlet distribution with alpha values in avector |

F_DIST and F_INV are like Excel’s F.DIST and F.INV functions except that *df1* and *df2* don’t have to be integers. In addition, the following functions have the same advantage over their Excel counterparts: **F_DIST_RT** and **F_INV_RT**.

A default limit of *n1* + *n2* = 28 (sum of the two sample sizes) has been set for **Perm2Dist**, **Perm2Inv**, **MannDist** and **MannInv**, and a limit of *n* = 25 has been set for **PermDist** and **PermInv**. The *slow* argument is used to increase these limits.
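Some of the inverses listed above have simple closed forms. For instance, the uniform and exponential inverses are one-line formulas, sketched here in Python as an illustration of what UNIFORM_INV and EXPON_INV compute:

```python
import math

def uniform_inv(p, a, b):
    """Inverse of the uniform cdf on [a, b]: the x with (x - a)/(b - a) = p."""
    return a + p * (b - a)

def expon_inv(p, lam):
    """Inverse of the exponential cdf F(x) = 1 - exp(-lam * x)."""
    return -math.log(1 - p) / lam
```

So the first quartile of a uniform distribution on [2, 6] is 3, and the median of an exponential distribution with rate λ is ln(2)/λ.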

**Distribution fitting functions**

BETA_FITM(R1, lab, pure) |
an array consisting of the estimated beta distribution alpha and beta values based on the method of moments, plus MLE. |

GAMMA_FITM(R1, lab, pure) |
an array consisting of the estimated gamma distribution alpha and beta values based on the method of moments, plus MLE. |

UNIFORM_FITM(R1, lab, pure) |
an array consisting of the estimated uniform distribution alpha and beta values based on the method of moments, plus MLE. |

WEIBULL_FITM(R1, lab, pure, iter, bguess) |
an array consisting of the estimated Weibull distribution alpha and beta values based on the method of moments, plus MLE. |

BETA_FIT(R1, lab, iter, aguess, bguess) |
an array consisting of the estimated beta distribution alpha and beta values based on MLE approach, plus actual and estimated mean and variance, and MLE. |

GAMMA_FIT(R1, lab, iter, aguess) |
an array consisting of the estimated gamma distribution alpha and beta values based on MLE approach, plus actual and estimated mean and variance, and MLE. |

EXPON_FIT(R1, lab) |
an array consisting of the estimated exponential distribution lambda value, plus actual variance (sample and population), estimated variance and MLE. |

UNIFORM_FIT(R1, lab, iter) |
an array consisting of the estimated uniform distribution alpha and beta values based on MLE approach, plus actual and estimated mean and variance, and MLE. |

WEIBULL_FIT(R1, lab, iter, bguess, ncensor, censor) |
an array consisting of the estimated Weibull distribution alpha and beta values based on MLE approach, plus actual and estimated mean and variance, and MLE; ncensor = the number of censored data elements (default 0) and censor = time of censoring |

WEIBULL_FITR(R1, lab, benard) |
an array consisting of the estimated Weibull distribution alpha and beta values based on regression, plus actual and estimated mean and variance, and MLE and R-square; if benard = TRUE (default) then the Benard approximation is used. |

POLYGAMMA(z, k) |
digamma function at z if k = 0 (default) and trigamma function at z if k = 1. |

If *lab* = TRUE, then an extra column of labels is appended to the output (default is FALSE). If *pure* = TRUE, then the pure method of moments is used (default is FALSE). MLE is the likelihood estimate for the parameters that are output.

*iter* is the number of iterations used in calculating the solution (default 20); for UNIFORM_FIT, if *iter* = 0 then alpha and beta are set to the minimum and maximum data values (biased MLE). *aguess* is the initial guess for alpha (use 0 for the default initial guess) and *bguess* is the initial guess for beta (use 0 for the default initial guess).
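As an example of the method-of-moments approach, for the gamma distribution (mean αβ, variance αβ²) matching the sample mean and variance gives α = mean²/var and β = var/mean. A Python sketch of what GAMMA_FITM computes (illustration only; the use of the sample variance with denominator n−1 is an assumption):

```python
def gamma_fitm(data):
    """Method-of-moments estimates (alpha, beta) for a gamma distribution:
    alpha = mean**2 / var, beta = var / mean, with sample variance (n - 1)."""
    n = len(data)
    mean = sum(data) / n
    var = sum((x - mean) ** 2 for x in data) / (n - 1)
    return mean ** 2 / var, var / mean
```

For example, data with sample mean 2 and sample variance 1 yields α = 4 and β = 0.5.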

**Other functions**

FText(R1) |
formula in cell R1 as text |

VER() |
current version of the Real Statistics Resource Pack (e.g. 5.4 Excel 2013/2016) |

ExcelVer() |
Excel version (e.g. Excel 2010 is 14.0) |

ColLabel(n) |
label of the nth worksheet column (e.g. ColLabel(50) = “AX”) |

FirstBlank(R1) |
address (e.g. “B7”) of the first empty cell in range R1. If R1 contains no empty cells then the value “None” is returned |

FirstNonNum(R1) |
address of the first empty or non-numeric cell in range R1. If R1 contains no non-numeric cells then the value “None” is returned. |

RANDOM(a, b, type) |
non-volatile random number between a and b; if type = TRUE (default) then a decimal number is returned, if type = FALSE an integer is returned |

XGAMMA(x) |
gamma function at x, even when x is a non-integer negative number |

LowerGamma(x, a) |
lower incomplete gamma function at x, a |

UpperGamma(x, a) |
upper incomplete gamma function at x, a |

**Matrix functions**

DET(R1) |
same as MDETERM(R1). |

DIAG(R1) |
array function that returns a column vector with the values on the diagonal of the matrix in range R1 (esp. useful when R1 is a square matrix) |

DIAGONAL(R1) |
array function that returns a square matrix whose main diagonal consists of the elements in range R1 and has zeros elsewhere |

IDENTITY(k) |
array function that returns a k × k identity matrix; if k is omitted then the size of the highlighted range is used instead |

ISCELL(R1) |
TRUE if R1 is a single cell and FALSE otherwise |

ISSQUARE(R1) |
TRUE if R1 is a square range and FALSE otherwise |

LENGTH(R1) |
length of matrix in range R1 = the square root of the sum of the squares of all the elements in R1 (esp. useful for column or row vectors) |

NORM(R1) |
array function that returns the normalized version of the matrix in range R1 |

MERGE(R1, R2) |
array function which merges ranges R1 and R2, i.e. it returns a range containing the values in R2 to the right of those in R1 |

TRACE(R1) |
trace of matrix in range R1 |

SSCP(R1) |
equivalent to MMULT(TRANSPOSE(R1),R1) |

MPOWER(R1, k) |
array function that returns the square matrix in R1 raised to the kth power |
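Two of these definitions are worth spelling out: SSCP is the matrix product transpose(R1) × R1, and LENGTH is the Frobenius norm of the range. A Python sketch of both (illustration only; Excel would use MMULT and SUMSQ for the same results):

```python
import math

def sscp(m):
    """SSCP matrix of m: transpose(m) times m."""
    rows, cols = len(m), len(m[0])
    return [[sum(m[r][i] * m[r][j] for r in range(rows)) for j in range(cols)]
            for i in range(cols)]

def length(m):
    """LENGTH: square root of the sum of squares of all elements of m."""
    return math.sqrt(sum(x * x for row in m for x in row))
```

For the 2 × 2 matrix with rows (1, 2) and (3, 4), the SSCP matrix has rows (10, 14) and (14, 20).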

**Simultaneous linear equations**

ELIM(R1) |
array function which outputs the results of Gaussian Elimination on the augmented matrix found in the array R1. The shape of the output is the same as the shape of R1 |

LINEQU(R1) |
array function which returns a column vector with solution to linear equations defined by R1; returns an error if no solution or the solution is not unique |

By default, each of these functions assumes that an entry with absolute value less than 0.0001 is equivalent to zero. This value can be changed by using the optional *prec* argument in ELIM(R1, *prec*) and LINEQU(R1, *prec*).
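The behavior of LINEQU on an augmented matrix can be sketched as Gaussian elimination with a zero threshold, along the lines of its *prec* argument (illustration only; whether the library uses partial pivoting, as done here, is an assumption):

```python
def linequ(aug, prec=0.0001):
    """Solve the square linear system given as an augmented matrix [A | b]
    by Gauss-Jordan elimination with partial pivoting; pivots smaller than
    prec in absolute value are treated as zero (no unique solution)."""
    m = [row[:] for row in aug]
    n = len(m)
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[piv][col]) < prec:
            raise ValueError("no solution or solution is not unique")
        m[col], m[piv] = m[piv], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [a - f * b for a, b in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]
```

For the system x + y = 3, x − y = 1, the augmented matrix [[1, 1, 3], [1, −1, 1]] yields the solution x = 2, y = 1.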

**Table lookup**

Here R1 defines a table, including both the data and row/column headings.

INTERPOLATE(r, r1, r2, v1, v2, h) |
the value between v1 and v2 that is proportional to the distance of r between r1 and r2, where v1 corresponds to r1 and v2 corresponds to r2, based on the interpolation defined by h: h = 0 means linear interpolation, h = 1 (default) means log interpolation and h = 2 means harmonic interpolation. |

MLookup(R1, r, c) |
the value in the table defined by range R1 in the row headed by r and the column headed by c. |

ILookup(R1, r, c, hc, hr) |
the value in the table defined by range R1 corresponding to row r and column c. r or c can refer to a value that must be interpolated between row or column headings (using the INTERPOLATE function with h = hr for rows and h = hc for columns), provided those headings are numbers. If the first row (or column) heading is preceded by “<” it refers to values smaller than the next row (or column) heading. If the last row (or column) heading is preceded by “>” it refers to values bigger than the previous row (or column) heading. |
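The three interpolation modes of INTERPOLATE correspond to the standard linear, logarithmic and harmonic formulas, sketched below in Python (an illustration of those formulas; the library’s exact conventions may differ in edge cases):

```python
import math

def interpolate(r, r1, r2, v1, v2, h=1):
    """Value between v1 (at r1) and v2 (at r2) for r between r1 and r2:
    h = 0 linear, h = 1 log, h = 2 harmonic interpolation."""
    if h == 0:
        t = (r - r1) / (r2 - r1)
    elif h == 1:
        t = (math.log(r) - math.log(r1)) / (math.log(r2) - math.log(r1))
    else:
        t = (1 / r1 - 1 / r) / (1 / r1 - 1 / r2)
    return v1 + t * (v2 - v1)
```

All three modes agree at the endpoints; between them they interpolate at the arithmetic, geometric and harmonic mean of r1 and r2 respectively to the midpoint of v1 and v2.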

There are also the following lookup functions for specific tables. In the following, *n*, *n1*, *n2*, *k* and *df* are generally positive integers, *α* is a decimal between 0 and 1 (default .05) and *tails* takes the value 1 or 2 (default). If *α* exceeds the largest value for alpha in the table or is smaller than the smallest value for alpha in the table, then #N/A is returned. For values between those in the associated table, if *interp* = TRUE (default) the recommended interpolations are used, while if *interp* = FALSE a linear interpolation is used.

TauCRIT(n, α, tails, interp) |
critical value in the Kendall’s Tau table |

RhoCRIT(n, α, tails, interp) |
critical value in the Spearman’s Rho table |

SRankCRIT(n, α, tails, interp) |
critical value in the Wilcoxon Signed Rank table |

KSCRIT(n, α, tails, interp) |
critical value in the One Sample Kolmogorov-Smirnov table |

LCRIT(n, α, tails, interp) |
critical value in the Lilliefors table |

WCRIT(n1, n2, α, tails, interp) |
critical value in the Wilcoxon Rank-Sum table |

MCRIT(n1, n2, α, tails, interp) |
critical value in the Mann-Whitney table |

QCRIT(k, df, α, tails, interp) |
critical value in the Studentized Range Q table |

DCRIT(k, df, α, tails, interp) |
critical value in the Dunnett’s test table |

DLowerCRIT(n, k, α, interp) |
lower critical value in the Durbin-Watson Table |

DUpperCRIT(n, k, α, interp) |
upper critical value in the Durbin-Watson Table |

ADCRIT(n, α, dist, k, interp) |
critical value in the Anderson-Darling Tables where dist and k are as for ADTEST. |

KS2CRIT(n, α, tails, interp) |
critical value in the Two Sample Kolmogorov-Smirnov table |

Related to the critical values above are the following functions, which estimate p-values by interpolating, if necessary, between the values in these statistics tables. The parameter *iter* = the # of iterations used to arrive at the p-value. If the p-value exceeds the largest value for alpha in the table then a value of 1 is returned; if the p-value is less than the smallest value for alpha in the table then 0 is returned. If *interp* = TRUE (default), then the recommended interpolations are used; otherwise linear interpolation is used.

WPROB(W, n1, n2, tails, iter, interp) |
p-value for the Wilcoxon Rank-Sum test |

MPROB(U, n1, n2, tails, iter, interp) |
p-value for the Mann-Whitney test |

SRankPROB(T, n, tails, iter, interp) |
p-value for the Wilcoxon Signed Rank test |

KSPROB(x, n, tails, iter, interp) |
p-value for the One Sample Kolmogorov-Smirnov test |

LPROB(x, n, tails, iter, interp) |
p-value for the Lilliefors test |

SWPROB(n, W, b, interp) |
p-value for the Shapiro-Wilk test for a sample of size n and statistic W; if b = TRUE then the Royston algorithm is used, while if b = FALSE a table look-up (table 2) is used |

ADPROB(x, dist, k, iter, interp) |
p-value for the Anderson-Darling test where dist and k are as in ADCRIT or ADTEST. |

KS2PROB(x, n1, n2, iter, interp) |
p-value for the Two Sample Kolmogorov-Smirnov test |

**Sorting and eliminating duplicates**

QSORT(R1, b) |
array function which fills highlighted range with data from R1 in sorted order (by columns); b is an optional parameter (default = TRUE); if b is TRUE then sort is in ascending order and if b is FALSE (or 0) sort is in descending order. |

QSORTRows(R1, col, ascend, head) |
array function which fills highlighted range with rows from R1 in sorted order based on the elements in the col^{th} column of R1; if ascend is TRUE (default) then sort is in ascending order and if ascend is FALSE (or 0) sort is in descending order; if head = TRUE (default is FALSE) then first row is presumed to contain column headings and is not sorted |

QSORT2Rows(R1, col1, col2, ascend, head) |
array function which fills highlighted range with rows from R1 in sorted order using two sort keys col1 and col2; ascend and head are as for QSORTRows. |

QSORT2RowsMixed(R1, col1, col2, ascend, head) |
array function which sorts R1 (containing only numeric data) as for QSORT2Rows, except that if ascend is TRUE (default) then primary sort on col1 is in ascending order and secondary sort on col2 is in descending order, while if ascend is FALSE then primary sort is descending and secondary sort is ascending |

SortUnique(R1, s) |
array function which fills highlighted range with unique elements from R1 in ascending sorted order; output is a single column |

ExtractUnique(R1, s) |
array function which fills highlighted range with unique elements from R1 in the order they appear first in R1; output is a single column |

NODUPES(R1, s, b) |
array function which fills highlighted range with data from R1 eliminating any duplicates (by columns); if b = TRUE then the data in R1 is sorted first (by column) and if b = FALSE then it is assumed that range R1 is in sorted order. |

SortRowsUnique(R1, s) |
array function which fills highlighted range with unique rows from R1 in ascending sorted order; output has same number of columns as R1. |

SortRowsCount(R1, s) |
array function which fills highlighted range with unique rows from R1 in ascending sorted order plus a count of the number of occurrences of that row in R1; output has one more column than R1. |

SortRowsSum(R1, s) |
array function which fills highlighted range with the unique rows from R1 (excluding the last column) in ascending sorted order, plus the sum of the values in the last column of R1 over the matching rows; output has the same number of columns as R1. |

The string *s* is used as a filler in case the output range has more rows/cells than needed. This argument is optional and defaults to the error value #N/A.
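As a rough illustration of what SortRowsCount and SortRowsSum compute, here is a Python sketch (Python is used purely as a stand-in for the Excel array functions; the sample rows are made up):

```python
# Rough Python equivalents of SortRowsCount and SortRowsSum
# (illustration only; not the Real Statistics implementation).
from collections import Counter, defaultdict

rows = [("a", 1), ("b", 2), ("a", 1), ("a", 3)]

# SortRowsCount: unique rows in ascending order plus an occurrence count
counts = sorted(Counter(rows).items())

# SortRowsSum: unique rows excluding the last column, plus the sum of
# the last-column values over the matching rows
sums = defaultdict(int)
for *key, value in rows:
    sums[tuple(key)] += value
summed = sorted(sums.items())

print(counts)
print(summed)
```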

**Sampling and reshaping**

REVERSE(R1, s) |
array function which fills highlighted range with data from R1 in reverse order (by columns) |

REV(R1) |
array function which reverses the elements in R1; resulting range should have the same size and shape as R1. |

RESHAPE(R1, s) |
array function which fills highlighted range with data from R1 (by columns) |

RESHAPE(R1, s, nrows, ncols) |
array function which fills an nrows × ncols range with data from R1 (by columns) |

SHUFFLE(R1, s) |
array function which fills highlighted range with a permutation of the data from R1 (sampling without replacement) |

RANDOMIZE(R1, s) |
array function which fills highlighted range with a random selection of data from R1 (sampling with replacement) |

DELBLANK(R1, s) |
array function which fills highlighted range with data from R1 (by columns) omitting any empty cells |

DELNonNum(R1, s) |
array function which fills highlighted range with data from R1 (by columns) omitting any non-numeric cells |

DELROWBLANK(R1, b, s) |
array function which fills highlighted range with data from R1 (by columns) omitting any row with one or more empty cells; if b is TRUE then the first row of R1 (presumably containing column headings) is always copied (even if it contains a blank cell); the second argument is optional and defaults to FALSE. |

DELROWNonNum(R1, b, s) |
array function which fills highlighted range with data from R1 (by columns) omitting any row with one or more non-numeric cells; if b is TRUE then the first row of R1 (presumably containing column headings) is always copied (even if it contains a non-numeric cell); the second argument is optional and defaults to b = FALSE. |

DELErr(R1) |
array function which fills highlighted range with data from R1 except that any cells with an error value are replaced by the empty value “” |

SUBMATRIX(R1, row1, col1, nrows, ncols) |
array function that outputs an nrows × ncols subrange of R1 (nrows defaults to m – row1 + 1 and ncols defaults to n – col1 + 1) starting at the cell in the row1^{th} row (default 1) and col1^{th} column (default 1) of the m × n range R1. |

SelectCols(R1, ss, c) |
array function which fills highlighted range with the columns from R1 defined by the string ss, where ss contains a list of the column numbers of the columns to be selected from R1; if c > 0 then the rows in the output are also sorted based on the elements of the c^{th} column of the output |

The string *s* is used as a filler in case the output range has more cells/rows than needed. This argument is optional and defaults to the error value #N/A.
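The distinction between SHUFFLE (sampling without replacement) and RANDOMIZE (sampling with replacement) can be sketched with Python's standard library (a stand-in for the Excel functions, not their implementation):

```python
# SHUFFLE vs RANDOMIZE, illustrated with Python's random module.
import random

data = [10, 20, 30, 40, 50]

shuffled = random.sample(data, len(data))      # permutation: each value appears once
resampled = random.choices(data, k=len(data))  # with replacement: values may repeat

print(shuffled)
print(resampled)
```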

**Categorical Coding**

CATCODE(R1) |
array function which fills highlighted array with simple coding of values in range R1; if R1 is an m × n range, highlight an m × n range |

TAGCODE(R1, b) |
array function which fills highlighted array with dummy coding of values in column range R1; if R1 has m rows and k unique values then highlight an m × (k–1) range; if b = TRUE (default) normal dummy coding is used, if b = FALSE alternative dummy coding is used. |
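To illustrate the shape of TAGCODE's output, here is a Python sketch of standard dummy coding, in which k unique categories become k–1 indicator columns (the exact column ordering and reference level TAGCODE uses may differ; this is only the usual convention):

```python
# Standard dummy coding sketch: k categories -> k-1 indicator columns,
# with the last (sorted) category carried implicitly as the reference.
# Illustration of the convention, not the TAGCODE implementation.
def dummy_code(values):
    levels = sorted(set(values))   # k unique categories
    columns = levels[:-1]          # k-1 indicator columns; last level is the reference
    return [[1 if v == c else 0 for c in columns] for v in values]

codes = dummy_code(["red", "green", "blue", "green"])
print(codes)
```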

Hi,

I use a lot of your functions in excel.

However, I find that while your two-factor ANOVA supplemental function in Excel gives the correct F value, the p-value, calculated using the ATestInt function, is inconsistent with results from other statistical packages. Can you please check into this?

regards

Ram

Hi Ramkumar,

Thanks for bringing this issue to my attention. I have checked the software and found that the ATESTInt function just returns FDIST(F, dfInt, dfW). If the value of F is correct then the only places where the formula can give the wrong answer are if dfInt or dfW is incorrect or Excel’s FDIST function gives the wrong answer. These don’t seem so likely.

Can you check whether the value of the supplemental function dfInt(R1, r) or dfWF(R1, r) gives an answer different from the one obtained from other statistical packages? Alternatively can you send me a specific example where you obtained a value from ATESTInt(R1,r) which was different from other statistical packages? In this way I can check where the problem is.

Charles

Hi,

I’ve already downloaded RealStats.xlam, and I can activate the statistical package. However, I do not know how to operate the supplemental functions. They seem unavailable in the Function list, and nothing happens if I type a function into a cell. Can someone give me a clue?

Thanks

Hi Hilman,

Once you have downloaded and installed RealStats.xlam you should be able to type any of the functions in the statistical package in a cell (or range) in an Excel worksheet. You will know whether you have successfully installed the software by typing the formula =VER() into any cell. You should see the release value 2.8.1 (or something similar). If you don’t see this you need to review the instructions on how to install the software at http://www.real-statistics.com/free-download/real-statistics-resource-pack/.

The supplemental functions operate exactly as the standard Excel functions, except that when you enter a supplemental function in a cell you won’t be prompted for its arguments. To get an idea of what the arguments are you can type = followed by the name of the function and then Ctrl-Shift-A. You can also see a list of the supplemental functions in Excel’s Function List under the user defined functions category.

A clearer list of the supplemental functions is available on the website — e.g. http://www.real-statistics.com/excel-capabilities/supplemental-functions/ and http://www.real-statistics.com/excel-capabilities/real-statistics-multivariate-functions/.

Note that to use the data analysis tools in RealStats.xlam you need to press Ctrl-m.

Charles

Your package looks great.

I managed to make the vba Tools (like Regression) work, however I cannot use the supplemental functions like slopestest.

=RealStats.xlam!SlopesTest(A5:A19;B5:B19;D5:D20;E5:E20;FALSCH)

(German Excel 2013, Win 8.1)

RealStats.xlam is shown in the VBA code

=ver() Shows 2.8.1

but slopestest does not show up.

can you please help

best regards

Manfred

Manfred,

SlopesTest was introduced in Rel 2.10. If VER() works then probably all the other functions work too. You just need to download the latest release to get SlopesTest. I have tested that release with the Italian version of Excel and it works; let’s hope that the German version works as well.

Charles

This is a wonderful software resource! The 2 questions I have are (1) If you could specify what the specific formula is for the non-parametric p values that are generated using the supplemental MTEST or WTEST commands? The p values do not match what I obtain using other software – appearing about half as great (i.e., more significant). I’m trying to understand why.

Frank,

The MTEST and WTEST functions return the one-tail version of the p-value. You need to double these values to get the two-tail version of the p-value. I have just revised these functions today so that you can specify whether you want the one-tail or two-tail test. To use this capability you need to download the latest release of the software (Release 2.14.1).

Charles
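The one-tail/two-tail doubling that Charles describes holds for any symmetric null distribution. A small Python illustration using the normal approximation often applied to large-sample Mann-Whitney and Wilcoxon tests (this is not the exact MTEST/WTEST computation):

```python
# For a symmetric null distribution, the two-tailed p-value is twice
# the one-tailed value. Illustrated with the standard normal; the
# z value below is a hypothetical test statistic.
from scipy.stats import norm

z = 1.96
p_one_tail = norm.sf(z)       # upper-tail probability, about 0.025
p_two_tail = 2 * p_one_tail   # about 0.05
print(p_one_tail, p_two_tail)
```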

The second question is whether you have any separate tutorial/explanatory material prepared for the Examples worksheets? It is difficult to follow without some explanation of the origins of the data sets. Perhaps I am just not seeing this on the website.

Frank,

Everything in the Examples worksheets is on the website somewhere. The first two worksheets (TOC0 and TOC) try to link the Examples worksheets to the website. I will try to see if I can improve on this in the future.

Charles

This is great work. Many thanks, I look forward to sharing this with my students.

COV() function returns just a value, how do I get the whole matrix as the result ?

Charan,

COV() is an array function, and so you need to highlight the appropriate range and then press Ctrl-Shift-Enter (i.e. while holding down the Control and Shift keys, press the Enter key). E.g. suppose your data is in the range A1:C10. Since this represents 3 variables (3 columns), you need a 3 x 3 matrix to hold the output. You could therefore highlight the range E1:G3, enter the formula =COV(A1:C10) and press Ctrl-Shift-Enter. The whole matrix will now be visible.

Charles
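For readers who want to check the 3 x 3 output against an independent implementation, here is a NumPy sketch (the data values are made up, and whether a sample or population divisor matches COV's default is an assumption worth verifying):

```python
# Covariance matrix of 3 variables (columns) over 4 observations (rows),
# computed with NumPy as a cross-check. np.cov with ddof=1 uses the
# sample (n-1) divisor; COV's divisor convention should be verified.
import numpy as np

data = np.array([[1.0, 2.0, 3.0],
                 [2.0, 1.0, 4.0],
                 [3.0, 3.0, 2.0],
                 [4.0, 2.5, 5.0]])

cov = np.cov(data, rowvar=False, ddof=1)  # columns are variables
print(cov.shape)
```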

Thanks Charles,

I figured it out from other posts, selecting Matrix Operations after pressing Ctrl-m. I have just started using other functionalities too, especially PCA. They look promising; keep up the good work…

thanks for the wonderful resource!

putting the list of functions in as images makes it impossible to search. i was searching to find something about spearman’s rank correlation test and was coming up empty. i was going to give up until i noticed the images. it would be very helpful to change them to text so they can be searched.

Kirk,

Originally I had intended these lists to be in the form of tables, but they didn’t display very well, and so I replaced them by images, but I agree with you that this makes it hard to search. Over the next few days I will be modifying the webpages to support a new release of the software, which should be out later today. I will also modify the list of functions as part of this process.

Charles

Hi Charles,

I tried the Wilcoxon rank-sum test with Excel (the function WTEST), but the parameter “t” doesn’t work. If I just write “=WTEST(R1;R2)” it’s ok, but if I try to add the parameter “t” (1 or 2) at the end it doesn’t work. It says “#VALUE!”. Do you know why?

Thanks,

Julien

Julien,

I just retested the WTEST function and it works fine when the t parameter is added. What version of Excel are you using? Is it for English language or some other language?

Charles

Hi Charles,

I use Microsoft Excel for Mac 2011 in English

Here is my command: =WTEST(L2:L35;M2:M35;2)

Any idea?

Charles,

Thank you very much for making this tool available. After weeks of search, I found yours to get eigenvectors of square symmetric matrix.

My matrix is quite large (200 by 200). I am using Excel 2010 with a 2.60GHz processor and 12GB RAM. When I run the eVectors command, it takes over 20 minutes to execute and display the results.

Are there limitations in the code for size or otherwise? Is there anything you are aware of that might cause slowness in the calculations?

Thank you

Sam,

The code that I use is not optimized for performance. There are various techniques for doing this that I have not yet implemented. I will be re-evaluating this procedure shortly. I would like to modify the code so that it works with non-symmetric matrices. I will also see if I can improve the performance.

Charles

Hi Charles,

Thank you for sharing this great resource. I have installed Real Statistics and am using the tool successfully. For some reason, though, I couldn’t use the Real Statistics supplemental functions. For example, I am trying to calculate AIC using “LogitRSquare(R1, lab, raw, iter)”, but it gives nothing or sometimes “LL”.

When I click on the cell where you calculated AIC in “Sheet: Logit 6a” in the example workbook that I downloaded, it shows “=’http://www.real-statistics.com/Users/C/AppData/Roaming/Microsoft/AddIns/RealStats.xlam’!LogitRSquare(A4:D15,TRUE)”. I thought it should show the local directory where I installed it. Is this a problem?

If you have any information to use Real Statistics Supplemental Functions, that would be great.

Thanks,

Rben

Rben,

The reason why you are only getting a strange result when you use LogitRSquare(R1, lab, raw, iter) is that this is an array function. To use this function, you need to highlight a range of cells (a 7 x 2 range for this function if lab is set to TRUE) and then press Ctrl-Shft-Enter (instead of just Enter). See the webpage http://www.real-statistics.com/excel-environment/array-formulas-functions/ for more details about array functions in Excel.

When using a spreadsheet that someone else has prepared which refers to an Excel add-in, you need to tell Excel where that add-in is on your computer. Fortunately you only need to do this once. This is the reason why you see the string =’http://www.real-statistics.com/Users/C/AppData/Roaming/Microsoft/AddIns/RealStats.xlam’!LogitRSquare(A4:D15,TRUE).

Please refer to webpage http://www.real-statistics.com/free-download/examples-installation/ for how to do this.

Charles

Thanks for this, it is really amazing and helpful. Cheers!

Hi,

Congratulations for your superb job that made me feel both surprised and gratified for having my name in the references section.

I found that, like me, you document some sheets with the formulas used in adjacent cells. I don’t know which method you use. I consider that using the following function is very useful. In models you can toggle between showing and hiding the result of the function.

Function GetFormula(cell As Range, Optional shw As Integer = 0) As String
    If shw <> 0 Then
        GetFormula = cell.Formula   ' show the formula
    Else
        GetFormula = ""             ' hide it
    End If
End Function

Regards

António Teixeira

Hi António,

Thank you for your kind words. I also appreciated your paper, which helped me when I was researching how to calculate statistical power.

Thanks also for your suggestion regarding documenting formulas. I am using a more primitive version of the GetFormula formula, which is called FTEXT (i.e. formula text), but I may try to modify it along the lines you are suggesting.

Charles

The site did not accept the characters for “different from”: where you find “shw 0” above, what is intended is

shw <> 0

Great work done Charles. this is the first time i’m having something useful for my Mac excel. your package is going to help me a lot.

will be counting on you for more

Great to hear.

Charles

Charles,

Would like to ask if the Real Statistics volume has an initial count on a stress tester; please forgive me, as I am still a newbie to the Excel programme 🙂

Would be really helpful if it was installed as well

Thank You 🙂

No, if you are referring to stress testing of banks or investments.

Charles

Hi,

I downloaded the Real Statistics pack for Excel 2007, and it does not contain Logistics Regression. May I know if there are any changes made to the pack?

Casper

Casper,

Logistic Regression is now an option of the Regression data analysis tool.

Charles

When I try to use the logitcoeff functions on my table of data, I am just getting a blank answer, any idea what I might be doing wrong? Or could I send you a copy of my sheet for you to take a look? Thanks!

Yes, you can send me a copy.

Charles

Thanks, but it turns out I was just being bad with Excel and array formulas. Your sample Excel file with all of the formulas showed me what I was doing wrong (just took several tries to figure it out).

As a side note, just wanted to say thanks for putting all of this together and being very responsive, it’s all extremely helpful!

I installed your pack into Excel. I found the Sample Size location, but all I want to find the sample size of, let’s say, 1,000 items for an attribute, such as “does documentation support the amount of an expense?” I am not sure what to put in the dialog box. I want 90% confidence, ±5% precision, expected rate 3% (A guess up front). What would I enter in the parameters to get a sample size?

John,

I can’t tell from “does documentation support the amount of an expense?”, which statistical test you are using. I need to know this to answer your question.

Charles

I just want to get the sample size needed for any given “universe” and I am looking for the easiest way to do that for a simple attribute sample. The attribute is a simple yes or no. As you can see, I am no expert at statistics, so I have to keep it as simple as possible – plug in the parameters and get a number.

John,

I understand that you want to keep it simple, but I still need to know what test you are conducting.

Charles

I don’t understand what you mean by what test I am doing. I am just looking at a sample and determining if the staff processed it right – yes or no.

John,

In this case, I don’t understand your question.

Charles

Dear sir,

Please first allow me to congratulate you on this REALLY NEEDED add-in.

I think that I have a simple question, but after 1 week on your website (and having downloaded the add-in), I still cannot find the solution, so I’m obliged to write to you:

I have 40 men and 60 women, to whom I asked “do you like Internet?”. The possible replies were “not at all”, “a bit”, “somehow”, “a lot” and “I can’t live without!” (I consider these Likert scale values as equidistant and, of course, this is just an example with random numbers). I got the following results: Men [2, 6, 15, 10, 7], meaning that men tend to feel “medium”, and Women [1, 3, 15, 25, 16], meaning that women seem more positive.

I want to prove (or not) that Men react differently than Women (at a=0.05). My Statistics professor insists that the result should be p=0.014 (by “linear-by-linear” chi-square), but I think that Pearson (p=0.108) is more correct (at least based on what you mention in your website).

May I ask:

a) What is your opinion,

b) If my professor is correct, how do I do it with your add on and

c) Which group likes internet more?

Thank you very much in advance, people like you tend to be rare nowadays…

Chris,

The chi-square test for independence is used with categorical data. The data that you have is ordered and so chi-square isn’t really the right test. Also you have too many cells with fewer than 5 elements.

Better to use a Likert scale with “not at all”, “a bit”, “somehow”, “a lot” and “I can’t live without!” becoming 1,2,3,4,5 (or something similar). You can then use the Real Statistics Frequency data analysis tool to create the raw data for each sample (frequency table to raw data conversion).

Next you need to conduct a t test or better yet a Mann-Whitney test since the data may not be normally distributed. These are provided by the Real Statistics software. Once you do this, you should get a p-value similar to that given by your professor.

Charles
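The frequency-table-to-raw-data conversion followed by a Mann-Whitney test can also be sketched outside Excel, using SciPy as a stand-in for the Real Statistics tools (the counts are those from the question; no claim is made about matching the professor's 0.014 exactly):

```python
# Expand the Likert frequency counts into raw scores (1..5), then run a
# two-sided Mann-Whitney test. SciPy is used here as a stand-in for the
# Real Statistics Frequency tool plus Mann-Whitney test.
from scipy.stats import mannwhitneyu

men_counts = [2, 6, 15, 10, 7]      # 40 men, scores 1..5
women_counts = [1, 3, 15, 25, 16]   # 60 women, scores 1..5

men = [s for s, n in enumerate(men_counts, start=1) for _ in range(n)]
women = [s for s, n in enumerate(women_counts, start=1) for _ in range(n)]

stat, p = mannwhitneyu(men, women, alternative="two-sided")
print(stat, p)
```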

It works!!! Thank you from the bottom of my heart! If you are ever in Greece for vacations, you’ll always have a place to stay (you can find my e-mail, if need be) !

Chris,

Good to hear.

Charles

Hi!

I tried to use the functions. I can use some, like LogitCoeff, but I can’t use DRegResiduals or any of the Deming functions. Excel doesn’t recognise them. Can you help me?

Thanks

Tury,

What do you see when you enter the formula =VER()

Charles

4.9 Excel 2010/2013/2016

I tried with Excel-Translator in the Italian language but there is no translation for DRegResiduals.

Tury

Tury,

This is a Real Statistics function and is not available in standard Excel. You can use this function when you download the Real Statistics software, which is available for free.

Charles

Hi,

I installed Real Statistics a long time ago, and I can open the “Real Statistics” window with Ctrl-m.

Tury,

If you are not finding DRegResiduals in the version of Real Statistics, then perhaps you need to download a newer version.

Charles

Hi Charles,

Thanks for lots of learning materials. I need to determine whether a positive or negative trend exists, so I wanted to do a Mann-Kendall trend test, and I request your help on how to do this in Excel. I have some information from nist.gov and have followed it, but my data set has more than 30 values, which is not covered there. I read some articles saying it can be done without Mann’s table. Kindly share how it can be done in Excel or R.

Thanking you

Best regards,

Habeeb

Habeeb,

Yes, with this big a sample, you can use the normal distribution approximation which doesn’t require the tables. See the following:

Mann-Whitney Test

Charles

Hi Charles, how do I do normality test for residuals in Excel, when I use Shapiro-Wilk test in SPSS? Thanks so much!

Des,

I don’t use SPSS, but the following webpage shows how to do this in Excel:

http://www.real-statistics.com/tests-normality-and-symmetry/statistical-tests-normality-symmetry/shapiro-wilk-test/

Charles

Hi,

I would like to call Real Statistics functions from Visual Basic to build my own program. Nevertheless, I am having problems with it. The following code does not work.

Dim MyRange As Range

Set MyRange = Range(Cells(2, 1), Cells(Ensayos + 1, 1))

Shap = Application.WorksheetFunction.Shapiro(MyRange)

Thank you.

Best regards,

Antonio.

Antonio,

See http://www.real-statistics.com/real-statistics-environment/calling-real-statistics-functions-in-vba/

Charles

I’ve reinstalled the most recent version of the Resource Package that you have on your site (and this program is amazing btw!). However, I still can’t find many of the functions, such as CRONALPHA or RELIABILITY. It seems like I followed the required steps given I have many of the other functions, such as CHISQ.DIST. Do you have any troubleshooting tips? Thanks!

Heather,

For a list of functions, including links to webpages that provide more details, see the following webpage

http://www.real-statistics.com/free-download/real-statistics-software/

This webpage also provides a list of Real Statistics data analysis tools.

Charles

Mr Charles, I (successfully) installed your *.xlam, but I would like to ask how we may assess the accuracy of functions like KSDIST(x, n), and also the accuracy of the table at:

https://i0.wp.com/www.real-statistics.com/wp-content/uploads/2012/11/one-sample-ks-table.png

I found on the internet a paper written by Professor George Marsaglia and two professors from the University of Hong Kong (Wai Wan Tsang and Jingbo Wang), where, beyond the algorithm, they gave a C program for calculating this function. I copy/pasted it, reformatted it a little, and added parallel functions of my own (where I changed “double” into “long double”), including some entirely my own (long double) functions, such as square root, exponential and logarithm functions, and changed their normalisation constant from 1e140 into 2^4096, so that both the constant and its reciprocal are representable in binary floating-point variables. If you want examples you may contact me at my eMail…

The functions are evaluated in double precision (compatible with Excel).

Charles

Thanks for the fast reply, but shouldn’t the values KSDIST(303485/1000000;25) and KSDIST(303495/1000000;25) straddle 0.02, since in the table the entry for the 2% level and a sample of 25 is 0.30349? And there are more values where the table and the functions from the *.xlam “disagree”…

Both the table values and the Real Statistics function values (i.e. the *.xlam values) are estimates and will likely differ a little for almost all entries. Statistics is, by definition, not an exact science. You are always dealing with estimates, probabilities and interpretations.

These small differences should not matter much.

Charles
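For anyone wanting an independent check, SciPy’s kstwo distribution implements the Marsaglia-Tsang-Wang exact algorithm discussed above; a sketch testing the n = 25 table entry (small numerical differences between implementations are expected):

```python
# Independent check of a one-sample KS table entry: scipy.stats.kstwo
# gives the exact distribution of the two-sided KS statistic D_n.
from scipy.stats import kstwo

n = 25
x = 0.30349            # table entry for n = 25 at alpha = 0.02
p = kstwo.sf(x, n)     # P(D_n >= x); should be close to 0.02
print(p)
```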