significance | Real Statistics Using Excel

Significance Testing of the Logistic Regression Coefficients

Wald statistic. Definition 1: For any logistic regression coefficient b whose standard error is se_b, the Wald statistic is given by the formula Wald = b / se_b. Since the Wald statistic is approximately normal, by Property 3 of Chi-Square Distribution, Wald² is approximately chi-square, and, in fact, Wald² ~ χ²(df). Here df = k – k_0 where k = the … Read More
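As a rough illustration of the definition above, the following Python sketch computes the Wald statistic and its square for a single coefficient. It assumes the single-coefficient case, where Wald² is compared against a chi-square distribution with 1 degree of freedom; the general df = k – k_0 case from the excerpt is not implemented because its definition is truncated here. The function name `wald_test` and the sample values are hypothetical.

```python
import math

def wald_test(b, se_b):
    """Return (Wald, Wald squared, two-sided p-value) for one coefficient.

    Assumes a single coefficient, i.e. chi-square with 1 degree of freedom.
    """
    wald = b / se_b
    # Two-sided p-value from the standard normal: P(|Z| > |wald|).
    # This equals the upper tail of chi-square(1) evaluated at wald**2.
    p_value = math.erfc(abs(wald) / math.sqrt(2.0))
    return wald, wald ** 2, p_value

# Hypothetical coefficient and standard error, for illustration only.
wald, wald_sq, p = wald_test(0.85, 0.30)
print(f"Wald = {wald:.3f}, Wald^2 = {wald_sq:.3f}, p = {p:.4f}")
```

A small p-value suggests that the coefficient differs significantly from zero.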

Testing the significance of extra variables on the model

Basic Concepts In Example 1 of Multiple Regression Analysis we used 3 independent variables: Infant Mortality, White, and Crime, and found that the regression model was a significant fit for the data. We also commented that the White and Crime variables could be eliminated from the model without significantly impacting the accuracy of the model. … Read More

Testing the significance of the slope of the regression line

We now show how to test the value of the slope of the regression line. Basic Approach By Theorem 1 of One Sample Hypothesis Testing for Correlation, under certain conditions, the test statistic t has the property But by Property 1 of Method of Least Squares and by Definition 3 of Regression Analysis and Property … Read More

Multiple t-Tests

In Dealing with Familywise Error, we describe some approaches for handling familywise errors. We now show how to apply these techniques when multiple t-tests are performed. Example Example 1: A school district uses four different methods of teaching their students how to read and wants to find out if there is a significant difference between … Read More

Kendall-Theil-Sen Regression

Basic Concepts. The Kendall-Theil-Sen estimator is a non-parametric method for fitting a line to a set of points (x_1, y_1), …, (x_n, y_n). It is a robust method in that it provides a better-fitting line than ordinary least-squares regression when the data contain outliers. It also doesn’t require that the residuals are normally … Read More
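The robustness described above can be seen in a minimal Python sketch of the estimator, which takes the slope as the median of all pairwise slopes. The intercept convention used here (the median of y_i minus slope times x_i) is one common choice and is an assumption, since the excerpt does not specify it; the function name `theil_sen_line` and the data are hypothetical.

```python
from itertools import combinations
from statistics import median

def theil_sen_line(xs, ys):
    """Fit a line using the median of pairwise slopes.

    Returns (slope, intercept). The intercept is the median of
    y - slope * x, which is an assumed convention.
    """
    points = list(zip(xs, ys))
    # Slopes between every pair of points with distinct x values.
    slopes = [(y2 - y1) / (x2 - x1)
              for (x1, y1), (x2, y2) in combinations(points, 2)
              if x2 != x1]
    if not slopes:
        raise ValueError("need at least two points with distinct x values")
    slope = median(slopes)
    intercept = median(y - slope * x for x, y in points)
    return slope, intercept

# Hypothetical data: y = 2x + 1 with the last point replaced by an outlier.
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 100]
print(theil_sen_line(xs, ys))
```

Because medians are used throughout, the single outlier in this sample does not pull the fitted line away from the other four points.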

Change Point Test for Binary Data

Basic Concepts. When a time series x_1, …, x_n contains only binary values, 0’s and 1’s, then we can use a change point test based on the two-sample Kolmogorov-Smirnov test. Such a time series can occur, for example, when x_i = 1 if the stock market went up on day i and x_i = … Read More

Siegel-Tukey Test for Equal Variability

Basic Concepts. The Siegel-Tukey test (for equal variability) is a nonparametric test to determine whether two samples come from populations with equal variances. When data is clearly normally distributed, a parametric test is preferred. Assumptions: the two (random) samples are independent; the data is at least ordinal; the two populations have the same medians. If … Read More

Somers’ d Measure of Asymmetric Association

Basic Concepts Somers’ d statistic is a measure of (asymmetric) association. d takes values between -1 and 1. A value of 1 or -1 means that the independent variable perfectly predicts the dependent variable: +1 when the relationship is positive and -1 when the relationship is negative. A value of 0 means that there is … Read More
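To make the range of values described above concrete, here is a minimal Python sketch that assumes the usual pair-counting definition of Somers’ d: concordant pairs minus discordant pairs, divided by the number of pairs that are not tied on the independent variable. The excerpt is truncated before it gives a formula, so this definition and the function name `somers_d` are assumptions.

```python
from itertools import combinations

def somers_d(x, y):
    """Somers' d of y (dependent) with respect to x (independent).

    Assumed definition: (concordant - discordant) / pairs not tied on x.
    """
    concordant = discordant = untied_on_x = 0
    for (x1, y1), (x2, y2) in combinations(list(zip(x, y)), 2):
        if x1 == x2:
            continue  # pairs tied on x are excluded from the denominator
        untied_on_x += 1
        product = (x2 - x1) * (y2 - y1)
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
        # product == 0 means tied on y only: counts in the denominator only
    if untied_on_x == 0:
        raise ValueError("all pairs are tied on x")
    return (concordant - discordant) / untied_on_x

# Hypothetical data: a perfect positive and a perfect negative relationship.
print(somers_d([1, 2, 3], [1, 2, 3]))
print(somers_d([1, 2, 3], [3, 2, 1]))
```

The measure is asymmetric because ties on the dependent variable lower its magnitude while ties on the independent variable are ignored, so swapping the arguments can change the result.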

Tc correlation between several judges and a criterion

Basic Concepts. T_C is the average of Kendall’s rank-order correlation coefficient between each judge and the criterion ranking. To calculate T_C, first, create a preference matrix P = [a_ij] as described in Kendall’s u for Paired Rankings. T_C can be calculated via any of the following equivalent formulas where k = the number of judges, … Read More

Kendall’s coefficient of agreement u for paired rankings

Basic Concepts We can also use Kendall’s u (as described in Kendall’s u for Paired Comparisons) when the data are based on ranks, although the significance test for agreement is different. Note that the minimum value for u is -1/(k-1) when k is even and -1/k when k is odd. We often prefer an agreement … Read More
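The lower bound quoted above can be written as a small Python helper. This sketch assumes that k denotes the number of judges, which the excerpt does not state explicitly; the function name `min_kendall_u` is hypothetical.

```python
from fractions import Fraction

def min_kendall_u(k):
    """Minimum value of Kendall's u: -1/(k-1) for even k, -1/k for odd k.

    k is assumed to be the number of judges.
    """
    if k < 2:
        raise ValueError("k must be at least 2")
    return Fraction(-1, k - 1) if k % 2 == 0 else Fraction(-1, k)

# Lower bounds for a few hypothetical panel sizes.
for k in (2, 3, 4, 5):
    print(k, min_kendall_u(k))
```

Exact fractions are used so that the bounds can be compared without rounding error.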