Excel provides a data analysis tool called **Descriptive Statistics** which produces a summary of the key statistics for a data set.

**Example 1:** Provide a table of the most common descriptive statistics for the scores in column A of Figure 1.

**Figure 1 – Output from Descriptive Statistics data analysis tool**

The output from the tool is shown in the right side of Figure 1. To use the tool, select **Data > Analysis|Data Analysis** and choose the **Descriptive Statistics** option. A dialog box appears as in Figure 2.

Now click on **Input Range** and highlight the scores in column A (i.e. cells A3:A14). If you include the heading, as is done here, check **Labels in first row**. Since we want the output to start in cell C3, click the **Output Range** radio button and insert C3 (or click on cell C3). Finally click the **Summary statistics** checkbox and press the **OK** button.

Note that if we had also checked the **Kth Largest** checkbox, the output would also contain the value for LARGE(A4:A14, *k*) where *k* is the number we insert in the box to the right of the label Kth Largest. Similarly, checking the **Kth Smallest** checkbox outputs SMALL(A4:A14, *k*). The option **Confidence Interval for Mean** option generates a confidence interval using the *t* distribution as explained in One Sample t Test.

**Real Statistics Data Analysis Tool**: The Real Statistics Resource Pack provides the supplemental **Descriptive Statistics and Normality** data analysis tool which outputs the above statistics plus GEOMEAN, HARMEAN, MAD, AAD and IQR. But instead of just generating the numerical value of each statistic, as is in Excel’s **Descriptive Statistics** data analysis tool, the Real Statistics tool outputs the appropriate Excel formula for computing each statistic (see Figure 4 below). Thus whenever the input data values change, the output values will change automatically as well.

Both Excel’s **Descriptive Statistics** and the Real Statistics **Descriptive Statistics and Normality** data analysis tools allow you to report on multiple sets of data at the same time, as shown in the following example.

**Example 2** – Use Excel’s** Descriptive Statistics** data analysis tool as well as the Real Statistics **Descriptive Statistics and Normality** data analysis tool to show the descriptive statistics for the two samples on the left side of Figure 3.

The output from the Excel **Descriptive Statistics** data analysis tool is given on the right side of Figure 3.

To use the Real Statistics data analysis tool, enter **Ctrl-m** and select the **Descriptive Statistics and Normality** option. A dialog box will now appear as shown in Figure 4. Insert A5:B18 into the **Input Range**, make sure **Column headings included with data** is checked and select the **Descriptive Statistics** option.

**Figure 4 – Real Statistics Descriptive Statistics and Normality dialog box**

After clicking on the **OK** button the following output will be displayed:

**Figure 5 – Real Statistics Descriptive Statistics data analysis tool**

As described above, the tool actually generates formulas instead of the numerical values.

Hello Charles,

Thanks for your web site and tools. Very very useful !!

I have 2 questions re QQ plots, that I often use when I want to compare shape/position and spread of n variables (“living” either in n columns, or in 1 column with subscripts in another column):

– Is there an easy way of plotting each variable separately on the same QQ plot ? (I can do it by splitting the data, creating separate QQ plots and then creating a table with all the Data / Std Norm pairs, but it requires some manual work).

– Is there an easy way of replacing the Z-score vertical axis label/scale by a % probability?

Thanks a lot !!!

Emilio,

I am pleased that you are getting value from the website and tools.

1. My main concern about using the same QQ plot for multiple variables is that it may get confusing. It is certainly possible to combine the plots as you have suggested (especially by changing the underlying software), but I haven’t felt the need to do this.

2, Since z scores map into p-values one-to-one it is certainly possible to do this. Is there some reason why you want to do this?

Charles

Thanks Charles,

1- I use it often (using Minitab) because you can very easily compare a few (up to 4 or 5) variables in the same QQ plot. I use it to compare the behaviour of different machines, shifts, locations, raw materials, settings, etc. This allows you to compare at a glance their shape (vs. a normal reference), position (along the horizontal axis), and spread (slope). But I won’t complain if you don’t feel it is a priority ;-).

2- I prefer that view because it allows me to visually estimate probabilities (i.e. areas under the curve between 2 values of the variable).

Thanks again !!, Emilio.

I notice in the Excel descriptive tool there is the option to designate the data is organized in columns or rows. Can you incorporate the same option for your tool?

Scott,

I will add this to my list of enhancements. In the meantime, just transpose the data making the rows into columns. You can transpose data by copying the range of data via

Ctrl-Cand then usingHome>Clipboard|Paste>Transpose.Charles

Hi Charles!

I’m currently using the ms 2013 version. When i want to produce the descriptive statistics of a frequency table, it always generates 2, for each column. My first column is the scores and my second column is the frequency count. I want to generate a table that integrates both, just like what i was doing in the older ms version. Can you help me how? Thanks! great tutorials!

Jasmine,

I don’t recall that Excel produced descriptive statistics for frequency tables even in past versions (Excel 2002, 2003, 2007 or 2010). In any case, you can use the Real Statistics Frequency Table data analysis tool for this purpose. On the dialog box that appears, be sure to choose Frequency Table as the Input format option and click on the Descriptive stats radio button.

You can get more information on the webpage

Histograms

Charles

I am using this tool at the moment to analyse a set of data from 10 M&As but the problem is, it is hard for me to explain/interpret the results in my analysis.

In what way is it hard for your to explain/interpret the results?

Charles

I am attempting to use Realstats add-in for an assignment. When trying to use the Descriptive Statistics, I cannot get it to work. I don’t know where I’m going wrong. I type in my Input Range (D11:D75), I make sure the ‘Column headings’ is ticked and the Descriptive Statistics option is ticked. I then enter my output range. I have tried numerous output ranges. Every time I click ‘OK’ it comes up with ‘Compile error in hidden module: Analysis’. The only time it has given an indication that it may work is when I experimented and made the output range the same as the input, and it then tells me “output will overwrite existing data” which I obviously don’t want. Please help.

Jess,

I am not sure what is going wrong. If you send me an Excel file with your data I can see whether it works properly on my computer. You can get my email address at Contact Us.

Also please answer the following questions:

1. Which version of Excel and Windows or MAC OC are you using?

2. What answer do you get when you enter the formula =VER() in any cell in a spreadsheet?

Charles

Question regarding the MAD:

MAD can be Mean Absolute Deviation or Median Absolute Deviation? With the sample data set this function returns 12. Using =SUM(ABS(K2:K14-AVERAGE(K2:K14)))/COUNT(K2:K14) Excel returns 13.7988165680473 and using =SUM(ABS(K2:K14-MEDIAN(K2:K14)))/COUNT(K2:K14) it returns 13.4615384615385 (both array entered.)

Obviously I am not following something, but what?

Shane,

I am using the following formula for median absolute deviation:

=MEDIAN(ABS(R–MEDIAN(R)))

See the following webpage: Measures of Variability.

Charles

Suggestions: (I am in no way being critical of you wonderful add-in, only making, hopefully constructive, suggestions)

1. No need for a drop-down for the Real Statistics command since there is only one choice.

2. In the Descriptive Statistics module Med-Q1 uses QUARTILE for the box plot output while all the other use QUARTILE.EXC. Probably would be best to go with QUARTILE.EXC for consistency and we never know what Microsoft will do with these semi-depreciated functions.

Shane,

1. I agree, but I don’t know how to write the program which avoids the drop-down.

2. I agree, but since a lot of people still use Excel 2007, which does not support QUARTILE.EXC, I have decided, at least for now, to use QUARTILE as the default.

Charles

Very good teaching but i have not exactly understood statistical tools used to analyse both quantitative and qualitative data