# Real Statistics Data Analysis Tools

Although all the statistical analyses described in this website can be done with standard Excel capabilities, it is often easier to use the supplemental functions and data analysis tools provided in the Real Statistics Resource Pack.

The functions provided in the Real Statistics Resource Pack are summarized in Real Statistics Functions. Here we briefly review the available supplemental data analysis tools.

### Accessing Real Statistics Data Analysis Tools

You can access the Real Statistics data analysis tools by pressing Ctrl-m or via the Add-Ins ribbon (as described in Accessing Real Statistics Tools). One of two dialog boxes will appear which lists all the available supplemental data analysis tools (see Figure 1 and 2).

Figure 1 – Original Real Statistics data analysis tools main menu

Figure 2 – Real Statistics multipage data analysis tools main menu

You next choose one of the data analysis tools from this list. A dialog box will now appear which is similar to that presented in Figure 2 of Excel’s Data Analysis Tools, as described in detail in Using Real Statistics Data Analysis Tools.

The following are the currently supported Real Statistics data analysis tools.

### List of Real Statistics Data Analysis Tools

You can click on any of the data analysis tools listed below (or the listed options within these tools) to get additional information about that tool (or option). The data analysis tools are listed in the tab order of the multipage interface.

### Descriptive Statistics Tab

Descriptive Statistics and Normality, includes:

Frequency Table – creates a frequency table from data

Diversity Indices

ROC Curve and Classification Table

Step Chart

Solve Set of Linear Equations, includes

Extract Columns from Data Range – allows you to create a new data range by selecting certain columns from an existing data range

Reformatting a Data Range

• Reformat Data Range, includes:
• Reshape – reproduce input data in a range of a different shape
• Reverse – reverse the order of the input data range
• Sort – reproduce the data in ascending sorted order
• Sort without duplicates – reproduce the data in sorted order removing duplicates
• Shuffle – randomly permute the data (selection without replacement)
• Randomize – randomly select data from data range (with replacement)
• Remove blank cells – remove all empty cells
• Remove non-numeric cells – remove all cells with non-numeric data
• Remove error cells – replace any error cells by empty cells
• Copy exact formulas – copy a range to a new location without changing the cell addresses

Reformatting a Data Range by Rows

Matrix Operations

• Matrix Operations, includes:
• Transpose – transposes a matrix
• Correlation Matrix – creates the correlation matrix for the specified data range
• Covariance Matrix – creates the population or sample covariance matrix for the specified data range
• Inverse – inverts a square matrix
• Diagonal – creates a diagonal matrix with specified diagonal
• Eigenvalues and Eigenvectors – finds the eigenvalues and eigenvectors of a symmetric square matrix
• Eigenpairs (non-sym) – finds the real eigenvalues and eigenvectors of any square matrix, including non-symmetric matrices
• QR Factorization – finds orthogonal matrix Q and upper triangular R where input matrix = QR
• Schur Factorization – finds orthogonal matrix Q and upper triangular T such that input matrix = QTQT.
• SVD Factorization – finds the Singular Value Decomposition (SVD) consisting of diagonal matrix D and orthogonal matrices U and V where input matrix = UDVT.
• Spectral Factorization – finds the Spectral Decomposition of a symmetric input matrix which is equal to QDQT where D is a diagonal matrix and Q is orthogonal
• Identity Matrix – creates an identity matrix of the specified size
• Empty Box – creates an empty matrix of the specified size and shape

### Regression Tab

Regression, includes

Multiple Linear Regression, includes

Other types of Regression

Binary Logistic and Probit Regression

Multinomial Logistic Regression

Survival Analysis, includes

Confidence and Prediction Interval Plots

### ANOVA Tab

Single Factor Anova, supports Excel and standard formats and includes:

Two Factor Anova, supports Excel and standard formats and includes:

Three Factor Anova, supports standard formats by row and by column and includes:

One Factor Repeated Measures Anova, includes

Two Mixed Repeated Measures Anova, includes

Nested Anova

Randomized Complete Block Design, supports Excel and standard formats and includes:

Split-plot Design, supports Excel and standard formats and includes:

Latin Squares Design, supports Excel and standard formats and includes:

2^k Factorial Design, supports Excel and standard formats and includes:

Two Factor Anova Follow-up

Anova Follow-up (esp. used after Three Factor ANOVA)

Analysis of Covariance (Ancova)

Single Factor MANOVA

• See Multivariate tab below

### Time Series Tab

Basic Time Series Forecasting

ARIMA Models and Forecasting

Time Series Testing

Forecast Accuracy

Correlogram

### Multivariate Analysis Tab

Hotelling’s T-square Test

Single Factor MANOVA

Factor Analysis – includes correlation matrix, eigenvalues/vectors, factor matrix, Varimax rotation, factor scores

Cluster Analysis

Discriminant Analysis

Correspondence Analysis

### Correlation and Reliability Tab

Correlation (one-sample), includes

Internal Consistency Reliability

Interrater Reliability

Item Analysis

• Item analysis (item difficulty, item discrimination and point-serial correlation)

Polychoric Correlation

### Miscellaneous Tab

T Tests and Non-parametric equivalents, includes:

Non-parametric Tests

Chi-square Test for Independence – tests m × n contingency tables for independence, data can alternatively be in stacked format

Cochran-Mantel-Haenszel Test

Goodness of Fit Testing

Distribution Fitting

Resampling

Multiple Imputation (MI)

Full Information Maximum Likelihood (FIML)

Statistical Power and Sample Size Requirements

This data analysis tool supports the following tests:

• One-sample normal test
• Two-sample normal test
• One-sample and paired-sample t test
• Two-sample t test
• One-sample binomial test
• One-sample correlation test
• One-sample variance test
• Two-sample variance test
• Chi-square test (goodness of fit and independence tests)
• One-way ANOVA
• Multiple regression
• Cronbach’s alpha test
• Intraclass correlation ICC(1,1)

### 47 Responses to Real Statistics Data Analysis Tools

1. Jaclyn Norberg says:

Jaclyn

• Charles says:

Jaclyn,

I am not sure what is causing this problem in your case, but the most common cause is trying to open the file containing the software pack. In this case you will indeed be requested to provide a password, but you don’t need to open the file to use the software. It is important that you tell Excel that the software should be considered to be an “add-in”. This is done as described on the website, namely:

1. Move the Resource Pack file to where you want it located on your computer (see our recommendation on the website). Caution: once you install the resource pack at a particular location it will be more difficult to move it later.
2. Select Office Button > Excel Options > Add-Ins in Excel 2007 or File > Help|Options > Add-Ins in Excel 2010/2013, and click on the Go button at the bottom of the window.
3. Check the Realstats option on the dialog box that appears and click the OK button.
4. If this option doesn’t appear, click on Browse to find and choose the realstats.xlam file. Then complete step 2 as described above.

Once you complete these steps you should be able to use the software without providing a password.

Charles

2. Curt Snyder says:

First, I would like to thank you for this project. It is truly wonderful that someone would offer functionality that appears to be this good for free.

I say, “appears to be,” because I have yet to actually be able to run the software. I am on a Mac and continue to encounter the error message of “data library not found” (or something to that effect) and the prompting for a password which was spoken about by Donald McDonald in his post elsewhere on your web site. I wish I had read the post that your add-in was not yet ready for Mac before I willy-nilly proceeded with the install, but, alas, such was not to be the case.

I tried to delete the add-in through the Visual Basic editor, but it prompts me for a password. So, for now, I will just live with the occasional annoyance of my spreadsheets thinking I am trying to invoke your analysis macro when I’m really not.

In any event, I would again like to thank you for your work on this. I found my way to your web site because I am a long-time user of Microsoft’s Analysis Toolpak at work and fully expected it to be present when I purchased Microsoft Office’s new “annual family license” (\$100 per year for five machines) for my small army of Macs-users at home. Regrettably, as you’re probably aware, Microsoft has pulled the Analysis Toolpak from the Mac version (as far as I know, the Windows annual license was unaffected) and is referring those who are affected by this to something called StatsPlus LE.

StatsPlus LE is ok, but does not have the integration that the Analysis Toolpak (or your add-in) possesses: when you run the add-in, it opens a new workbook with the results and leaves you wondering how to link back to your original data so you can make changes to the data and re-run the analysis without going through the whole routine again. To get full integration of StatsPlus LE that the Analysis Toolpak provides (as well as some of the more useful statistical routines), you are required to buy a \$200 license.

One more time, thank you for your work on this. I will certainly be looking forward to the Mac version.

• Charles says:

Hi Curt,
Unfortunately the Real Statistics Resource Pack currently only works in the Microsoft environment. It is not yet available for the Mac. The main problem I have is that I don’t have a Mac myself on which to create the Mac version. I am still looking for someone to give me access to a Mac so that I create and test a Mac version.
Charles

Update: The software now is available for use on the Mac

3. Usman Khalid says:

It is a very useful website. How do I calculate the CORNALPHA using the Resource Pack? I do not think there is an option if and when I press Ctrl+m

Regards

• Charles says:

Usman,
Just enter the function CRONALPHA in any cell in the worksheet, just like any other worksheet function (AVERAGE, MAX, STDEV, etc.). You can also access the Cronbach alpha test by pressing Ctrl-m and choosing the Descriptive Statistics option.
Charles

4. Charles says:

I followed your directions and got to the dialogue box in fig. 2 but each time I send a command it shows a notice; “Compile error in hidden module: Analysis”. How can I solve this problem Sir.

• Charles says:

Charles
Which version of Excel are you using? Which version of the Real Statistics Resource Pack are you using? (you can find this out by enterring the formula =VER() in any cell).
Charles

5. Marco V. says:

Thank You for your great job. I get “Compile error in hidden module: Analysis” while using the “ctrl-m” part of the module. My excel is 2007, SO is Windows XP.
Marco

• Charles says:

Marco,
What do you see when you enter the formula =VER() in any cell?
Charles

6. Marco V. says:

Dear Charles, VER() gives 3.1.2. I have installed the 2010 version on another pc, windows server 2008 r2, and it seem to work. Thank you.

7. kalyani says:

I followed your directions and got to the dialogue box in fig. 2 but each time I send a command it shows a notice; “Compile error in hidden module: Analysis”. How can I solve this problem Sir. Dear Charles, VER() gives 3.6.2. My excel is 2007.

8. Andre says:

Hi, I’m using a brazilian portughese version of excell and installing the realstat2007 version. But when I try to access the resource pack it returns a non specified error and indicates that it may probably be due to a problem running the hidden module. Do you have any clue on how to solve this? Thanks a lot.
Andre

• Charles says:

Andre,
I don’t know why you are getting this error message. What do you get when you enter the formula =VER()? In the error message does it state which hidden module is causing the problem?
Charles

9. Andre says:

Charles, I get the software version: 3.8.1 Excel 2007. When I try to access the realstat tool bar it returns a system error &H80004005 (-21476467229). Non specified error and then a compilation error in the hidden module frmlnput.

In fact, I can use the functions directly (and in fact I do use them this way), but turns the process unproductive.
Andre

• Charles says:

Andre,
What happens if you press Crtl-m instead of accessing the realstats toolbar?
Charles

• Charles says:

Andre,

This seems to be a known bug in Excel. Most people are using the Excel 2007 version of the Real Statistics software without any problems. A few people have reported some problems getting access to the software, but as far as I am aware this the first time this bug has been reported.

If you google “Excel error message &H80004005 (-21476467229)” you will see information about the bug. The webpage http://peltiertech.com/unspecified-painfully-frustrating-error/ seems to summarize the situation well.

Since the bug is only indirectly related to the Real Statistics software, if at all, I don’t know what to suggest. Please let know whether you were using a previous release of the software, in which I can send a copy of that version to you.

Charles

10. Andre says:

Charles, the toolbar opens, but whe I try to run any function it returns the same error as if I access trough the supplement.

11. Lee says:

Hi Charles. Thanks for the availability of this excel add-in. I have attempted to run the SlopeTest function (Excel 2013) and the output is only on cell, the Std Err value. There are no other cells reported. Any suggestions? Thanks

• Charles says:

Lee,
The SlopeTest function is an array function; i.e. the output requires more than one cell. You need to follow the instructions on the webpage Array Functions and Formulas.
Charles.

• Lee says:

Thanks for the quick reply Charles. An oversight on my part. Everything is now working perfectly. Sorry for the inconvenience.

12. Shane Devenshire says:

I downloaded your add-in and I playing with it now. I have a question, not directly related to the add-in – Excel contains the following functions SUMSQ, SUMX2MY2, SUMX2PY2 and SUMXMY2, what practical applications do these functions have – where would on use them?

Thanks,
Shane

• Charles says:

Shane,

SUMSQ is used very often in statistics: to compute variance, Durbin-Watson, regression, length of vector, etc.
SUMXMY2 is used for calculating Durbin-Watson and in K-means cluster analysis

Since SUMX2MY2(R1, R2) = SUMSQ(R1) – SUMSQ(R2) and SUMX2PY2(R1, R2) = SUMSQ(R1) + SUMSQ(R2), they seem much less useful to me.

Charles

13. Egard M. says:

The option of logistic regression does not appear in the dialog box .
I have the 2013 Excel.
Thank you again.

• Charles says:

Choose the Regression option. On the next dialog box that appears, choose the Logistic Regression option.
Charles

14. Nicole says:

Hi,
I am seeing in other posts there is a cluster analysis included but when I download this version I do not see it? Do I need an upgrade?
Nicole

• Charles says:

Nicole,
You should find it under the Multivariate menu choice. If you don’t see it there then you will need to upgrade.
Charles

15. Rajesh Rajamani says:

Came across this hidden gem . I’m a data scientist and mostly play around with R. But this is helpful for my colleagues who are not comfortable with R and not fortunate enough to have Minitab or SPSS in their machine. At least its a stop-gap arrangement where we could run statistical tests locally before communicating findings.

Thanks

16. zineb says:

Merci beaucoup

17. Aakanskha says:

This is very beneficial for me as I am not fortunate enough to have Minitab or SPSS in my official system.
I have installed Real Statistics as per instruction given on site. but when I am trying to use a function by pressing Ctrl +m and selecting the function and data an error comes “Compile error in hidden module: frmMatrix ”
This is happening with each and every function.

• Aakanskha says:

windows 7 ultimate, 32- bit OS, =VER() turns out to 4.5 EXCEL 2007

• Charles says:

I am sorry, but I don’t know what is causing this problem. I am using Real Statistics with Excel 2007 without any problems, but other Excel 2007 users have run into problems similar to yours. I suggest that you look at the comments on the following webpage to see what others have suggested.

In particular, the following suggestion may be useful

“Copy the downloaded file to /Microsoft Office/Office12/Library/Analysis , then restart excel. Then install in normal way.”

Charles

18. Jose says:

Dear Professor,

First of all thanks for this great add-in.
But, who write a Tesis, they have to put/write (at least in APA) the p-value.
But, in some tables, special Contrast, Tukey HSD (I use .. )etc, you only put the “sig” and “no” or “yes”.

It is not possible to put the real p-value in next version?

Thank you, very, very much!
José

• Charles says:

Jose,
That is a reasonable request. I am in the process of testing the next release and I will try to add this information. In the meantime, for Tukey HSD, you can use the QDIST function to get the p-value.
Charles

19. Mike G says:

Thanks for a great tool. I am a NOVICE stat guy just trying to crunch some numbers for fantasy football. The Jenks Natural Breaks algorithm is great for determining the cutoff value for tiers, or what I think you call classes. However, it relies on me telling it how many classes (k) that I want. Is there a way to determine the optimal k? Is it always the case that the larger k is, the better the fit would be? So I guess I would be looking for the lowest k that does not significantly improve with a larger k. If that makes sense, how could I statistically calculate that. I am looking at less than 100 samples, so the number of permutations will not be impossible. Thanks again.

• Charles says:

Mike,
Yes, the larger the number of classes the better the fit would be. In fact, the best fit would occur when each data element is in its own class, but this defeats the purpose of the whole exercise.
I don’t know of a statistical test which determines whether a higher number of classes no longer yields a significant difference. With a small sample, you can experiment until you see that a higher number of classes doesn’t seem to matter much. Usually after about 6 classes, the Jenks Natural Breaks algorithm slows down considerably (assuming that you want to test all possible partitions) and so this may be the determining factor in how many classes to consider.
Charles

20. Nisha Jagasia says:

The control M produces the real stats menu but it does no contain a survival analysis option.. I am trying to perform cox proportional hazards model – why can’t i access survival analysis from menu??

• Charles says:

Nisha,
Sorry, but the Mac version of the software doesn’t yet support these capabilities. Currently, they are only supported in the Windows version.
Charles

• Akash says:

Dear Charles,

Thanks for an amazing tool. Any idea when you might release a Mac version with survival analysis?

Akash

• Charles says:

Akash,
I have the release ready, but I don’t have a Mac to test it on. As soon as I can get a Mac to test on, I will issue the new release.
Charles

• Akash says:

That is great news Charles. Here’s hoping you can get to a Mac soon!

• Akash says:

Charles I have a Mac by the way. Anyway I can help out remotely with testing your new release?

• Charles says:

Akash,
Thanks. I appreciate that. If I can find a Mac to create the release on, then I would be happy to send you the release software to test.
Charles

21. Mulla D I says:

Respected sir,
I want find the “relationship between emotional intelligence and math abilities of secondary school students.”
how can i compare emotional intelligence componets &with what type of math abilities get little confusion. piz help in this regards.

• Charles says:

Mulla,
You haven’t provided enough information for me to respond in any detail. You could start by exploring the correlation coefficient. Perhaps a t test would be useful.
Charles

22. Matha says:

Thank you very much for making this suite available to us. Appreciate your efforts (and the “educational” spirit) in crafting a value added option to a popular workhorse for the masses.

23. Toyang Angco says: