We now define some statistics which are commonly used to characterize data and probability distributions. In particular, we define metrics of central tendency (e.g. mean and median), variability (e.g. variance and standard deviation), symmetry (i.e. skewness) and peakedness (i.e. kurtosis).

In addition we provide some important ways of graphically describing data and distributions, including histograms, box plots and QQ plots.

Topics:

- Measures of Central Tendency
- Measures of Variability
- Symmetry, Skewness and Kurtosis
- Ranking Functions in Excel
- Descriptive Statistics Tools
- Frequency Tables
- Histograms
- Creating Box Plots
- ROC Curve and Classification Table
- Outliers and Robustness
- Box Plots with Outliers
- Dealing with Missing Data
- Assumptions for Statistical Tests
- Data Transformations
- Diversity Indices

Uploaded your stat package to my MS Office 2011 for the Macintosh. Tried entering the MAD state (median absolute deviation) for some financial benchmarking data we are analyzing as it is not a normal distribution. When I enter the MAD state formula I get back error message “#value”? Does the data have to be unformatted? It is currently in currency formatted numbers in the distribution…. please advise.

Thank You,

Rod Warnick

UMass

Rod,

I just checked and for some strange reason when using the MAD function the data can’t be in Currency or Accounting format. It can be formatted as General, Number, Percentage, Fraction or Scientific. The software that calculates MAD uses Excel’s MEDIAN function. In Excel MEDIAN works fine with currency data, but produces an error inside the software program (VBA).

I have not checked all the various functions to see whether this problem appears for other functions, but it does not for the few that I have checked.

Thanks for identifying this problem. For now it is best not to use MAD with currency formatted data.

Charles

Dear Charles,

Thank you again for all your work on the Real-Statistics add-in. I am using the add-in for the Biostatistics course that I teach and the students all enjoy it, too. One item that has come up is the box plot function in the descriptive statistics tool. Customarily, the whiskers would only extend to the largest and smallest data points that lie within a distance 1.5 x the IQR from the Q3 and Q1 edges of the box, respectively. Data points outside these limits would then be plotted individually, and considered outliers. It appears, however, that Real-Statistics simply extends the whiskers to the max and min data points, whether or not they are outliers.

Dear Frank,

You are correct. When I implemented the box plot originally I found I couldn’t find a way to chart the outliers in VBA (the programming language for Microsoft Excel). I will look at this again to see whether I can find a way around this.

Charles

I want to extract numbers in set A and Set B with their corresponding frequencies. For example, 53, 67,78, 80,70 and their respective frequencies are 1,6,3,5,4 in set A and then in Set B,we have 35,70,80,49,43 with respective frequencies of 3,7,8,9,1.How do I extract a number with frequency 5 in set A and frequent 8 in set B ( i.e 80). Thank you

John,

Probably the easiest way is to use Excel’s filter capability by choosing Filter from the Data ribbon. Instructions on how to use this capability can be found on the following webpage:

https://support.office.com/en-us/article/Filter-data-in-a-range-or-table-01832226-31b5-4568-8806-38c37dcc180e

Charles