Analysis of Skewness and Kurtosis

Guidelines

Since the skewness and (excess) kurtosis of the normal distribution are both zero, the sample values of these two statistics should be close to zero if the data follow a normal distribution.

A rough measure of the standard error of the skewness is $\sqrt{6/n}$ where n is the sample size.
A rough measure of the standard error of the kurtosis is $\sqrt{24/n}$ where n is the sample size.

If the absolute value of the skewness for the data is more than twice its standard error, this indicates that the data are not symmetric, and therefore not normal. Similarly, if the absolute value of the kurtosis for the data is more than twice its standard error, this is also an indication that the data are not normal.
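
One way to read the factor of two, assuming that for normal data the sample skewness and kurtosis are themselves approximately normally distributed around zero: dividing each statistic by its rough standard error gives an approximate z-score, and the two-tailed critical value at a significance level of .05 is about 1.96, i.e. roughly 2. The symbols $z_{skew}$ and $z_{kurt}$ are introduced here only for illustration.

$$z_{skew} = \frac{skew}{\sqrt{6/n}}, \qquad z_{kurt} = \frac{kurt}{\sqrt{24/n}}, \qquad \text{flag non-normality when } |z| > 1.96 \approx 2$$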

Example

Example 1: Use the above guidelines to gain more evidence as to whether the data in Example 1 of Graphical Tests for Normality and Symmetry are normally distributed.

As we can see from Graphical Tests for Normality and Symmetry, the skewness is SKEW(A4:A23) = .23 (cell D13) with standard error SQRT(6/COUNT(A4:A23)) = .55 (cell D16). Since |.23| = .23 < 2*.55 = 1.10, the skewness is acceptable for a normal distribution. Also, the kurtosis is KURT(A4:A23) = -1.53 (cell D14) with standard error SQRT(24/COUNT(A4:A23)) = 1.10 (cell D17). Since |-1.53| = 1.53 < 2*1.10 = 2.20, the kurtosis is also acceptable for a normal distribution.
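
The same check can be sketched outside Excel. The Python snippet below is only an illustration, not part of the Real Statistics software; the function name `rule_of_thumb` is hypothetical, and the inputs are the rounded values reported for Example 1 (skewness .23, kurtosis -1.53, n = 20).

```python
import math

def rule_of_thumb(skew, kurt, n):
    """Compare |skew| and |kurt| with twice their rough standard errors."""
    se_skew = math.sqrt(6 / n)    # rough standard error of the skewness
    se_kurt = math.sqrt(24 / n)   # rough standard error of the (excess) kurtosis
    return {
        "se_skew": se_skew,
        "se_kurt": se_kurt,
        # True means the value is within two standard errors of zero
        "skew_ok": abs(skew) <= 2 * se_skew,
        "kurt_ok": abs(kurt) <= 2 * se_kurt,
    }

# Rounded values from Example 1 (assumed inputs, n = 20)
result = rule_of_thumb(skew=0.23, kurt=-1.53, n=20)
print(result)
```

Both flags come out as acceptable here, in line with the conclusion above. Keep in mind that this is a rough screen rather than a formal test.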

Jarque-Bera Test

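Related to the above approach is the Jarque-Bera (JB) test for normality, which tests the null hypothesis that data from a sample of size n with skewness skew and kurtosis kurt come from a normally distributed population. This test is based on the following property, which holds approximately for large n when the null hypothesis is true:

$JB = n\left(\frac{skew^2}{6} + \frac{kurt^2}{24}\right) \sim \chi^2(2)$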

For Example 1,

$JB = 20\left(\frac{.23^2}{6} + \frac{(-1.53)^2}{24}\right) = 2.13$

based on using the Excel worksheet functions SKEW and KURT to calculate the sample skewness and kurtosis values. Since CHISQ.DIST.RT(2.13, 2) = .345 > .05, we conclude there isn’t sufficient evidence to rule out the data coming from a normal population.
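
As a cross-check, the calculation can be sketched in Python. This assumes the statistic has the form JB = n(skew²/6 + kurt²/24) with excess kurtosis, and uses the fact that the right-tail probability of a chi-square distribution with 2 degrees of freedom is exp(-x/2), so no statistics library is needed. The function names are hypothetical and the inputs are the rounded Example 1 values.

```python
import math

def jb_statistic(skew, kurt, n):
    """JB statistic from a skewness, an excess kurtosis and a sample size."""
    return n * (skew ** 2 / 6 + kurt ** 2 / 24)

def chi2_right_tail_2df(x):
    """Right-tail probability of chi-square with 2 df (same role as CHISQ.DIST.RT(x, 2))."""
    return math.exp(-x / 2)

# Rounded sample values from Example 1 (assumed inputs)
jb = jb_statistic(skew=0.23, kurt=-1.53, n=20)
p_value = chi2_right_tail_2df(jb)
print(round(jb, 2), round(p_value, 3))
```

With these rounded inputs the sketch reproduces the figures quoted above (about 2.13 and .345). Unrounded skewness and kurtosis values may shift the last digit.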

The JB test can also be performed using the population values of skewness and kurtosis, i.e. using the SKEWP (or SKEW.P) and KURTP functions instead of SKEW and KURT. For Example 1, this version gives JB = 1.93.

Since CHISQ.DIST.RT(1.93, 2) = .382 > .05, once again we conclude there isn’t sufficient evidence to rule out the data coming from a normal population.

Worksheet Functions

Real Statistics Functions: The Real Statistics Resource Pack supplies the following functions.

JARQUE(R1, pop) = the Jarque-Bera test statistic JB for the data in the range R1

JBTEST(R1, pop) = p-value of the Jarque-Bera test on the data in R1

If pop = TRUE (default), the population version of the test is used; otherwise the sample version of the test is used. Any empty cells or cells containing non-numeric data are ignored.

For Example 1, we see that JARQUE(A4:A23) = 1.93 and JBTEST(A4:A23) = .382. Similarly, JARQUE(A4:A23, FALSE) = 2.13 and JBTEST(A4:A23, FALSE) = .345.
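
To make the difference between the two versions concrete, here is a minimal Python sketch. It is not the Real Statistics implementation: the names `jarque` and `jbtest` are hypothetical stand-ins that only mirror the description above. It assumes the population skewness and excess kurtosis are the moment ratios with divisor n, and that the sample versions are the usual bias-adjusted forms (which is what Excel's SKEW and KURT are understood to compute). The data are the 25 values quoted by a reader in the comments below.

```python
import math

def _central_moment(xs, k):
    """k-th central moment with divisor n (population form)."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** k for x in xs) / n

def skew_pop(xs):
    return _central_moment(xs, 3) / _central_moment(xs, 2) ** 1.5

def kurt_pop(xs):
    # Excess kurtosis: zero for a normal distribution
    return _central_moment(xs, 4) / _central_moment(xs, 2) ** 2 - 3

def skew_sample(xs):
    n = len(xs)
    return skew_pop(xs) * math.sqrt(n * (n - 1)) / (n - 2)

def kurt_sample(xs):
    n = len(xs)
    return (n - 1) / ((n - 2) * (n - 3)) * ((n + 1) * kurt_pop(xs) + 6)

def jarque(values, pop=True):
    """JB statistic; non-numeric entries (e.g. None or text) are ignored."""
    xs = [v for v in values
          if isinstance(v, (int, float)) and not isinstance(v, bool)]
    s = skew_pop(xs) if pop else skew_sample(xs)
    k = kurt_pop(xs) if pop else kurt_sample(xs)
    return len(xs) * (s ** 2 / 6 + k ** 2 / 24)

def jbtest(values, pop=True):
    """p-value: right tail of chi-square with 2 df, which equals exp(-x/2)."""
    return math.exp(-jarque(values, pop) / 2)

# The 25 values from the comment thread below
data = [26.83946269, 26.95131935, 8.371060164, 10.40495872, 18.38858378,
        20.12905135, 24.2843167, 1.76670796, 20.19191695, 41.06557085,
        16.09877032, 13.34390071, 0.426210193, 28.31166689, 11.89051087,
        109.3641761, 25.50859431, 61.26802436, 32.5178008, 66.58119511,
        41.27546773, 14.67351611, 2.048435245, 28.01590722, 44.93746991]

print(jarque(data), jbtest(data))                        # population version
print(jarque(data, pop=False), jbtest(data, pop=False))  # sample version
```

For these data the thread reports 26.69155 for the population version and 38.28239 for the sample version, which gives a convenient check on the sketch.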

D’Agostino-Pearson Test

The D’Agostino-Pearson test of normality is also based on the skewness and kurtosis. This test is more accurate, and so is more commonly used, than the JB test. See D’Agostino-Pearson Test for details.

Reference

Wikipedia (2012) Jarque-Bera test
https://en.wikipedia.org/wiki/Jarque%E2%80%93Bera_test

39 thoughts on “Analysis of Skewness and Kurtosis”

Peter Steele

December 23, 2021 at 12:29 pm

In every statistics book, JB is calculated with ((C-3)^2)/24, not just (C^2)/24.
The same is true for the JB test statistic on the wikipedia page you refer to.
Adrian

September 12, 2021 at 10:02 pm

Dear Charles,
Could you tell me a reference for the calculation of the standard errors of the skewness and of the kurtosis?
Kind regards,
Adrian
- Charles
  
  September 14, 2021 at 4:09 pm
  
  Hello Adrian,
  I have just added a reference to the webpage, namely the Wikipedia webpage for the JB test.
  You can get a more exact estimate of the standard errors at
  https://www.real-statistics.com/dagostino-pearson-test/
  Charles
K

April 23, 2021 at 1:25 am

Hi, could someone tell me what the ‘absolute’ skew and kurtosis values are in terms of SPSS output, please? I understand you get the z-scores by computing skew/skew.error, and the same with kurtosis, but I do not know what the ‘absolute’ values are. I am trying to follow guidance from Kim (2013), which says to use absolute skew and kurtosis values with cut-offs of >2 and >7, respectively, for normality.

Thank you!

Reference:
Kim, H. Y. (2013). Statistical notes for clinical researchers: assessing normal distribution (2) using skewness and kurtosis. Restorative dentistry & endodontics, 38(1), 52.
- Charles
  
  April 23, 2021 at 9:42 am
  
  Most likely the reference is to the absolute value function, namely ABS(7) = 7 and ABS(-7) = 7.
  Charles
Tânia

February 18, 2021 at 12:33 pm

Hello!
Are skewness and kurtosis dependent? How to interpret the results when skewness says it is a normal distribution but kurtosis says the opposite?
- Charles
  
  February 18, 2021 at 9:27 pm
  
  Tania,
  There are distributions where the skewness is near zero but the kurtosis is significantly different from zero, and there are other distributions where the kurtosis is near zero but the skewness is significantly different from zero. For data that come from a normally distributed population, neither the skewness nor the kurtosis should be significantly different from zero.
  Charles
Stefano

October 23, 2020 at 5:58 pm

Hello and thank you!
I only have one more, general, question:
Are there rough measures of the standard errors of the mean and the standard deviation too? (I mean, is there a proportionality formula like sqrt(c/n)?)
- Charles
  
  October 25, 2020 at 7:27 pm
  
  Stefano,
  The standard error of the mean is the standard deviation divided by the square root of the sample size.
  For normal distributions, you can use Property 3 at
  https://www.real-statistics.com/chi-square-and-f-distributions/chi-square-distribution/
  The standard error of the variance and standard deviation for any distribution are given in
  https://stats.stackexchange.com/questions/156518/what-is-the-standard-error-of-the-sample-standard-deviation
  Charles
Rashmi

October 14, 2020 at 3:16 pm

What does it indicate if the skewness and kurtosis value is given as …. and no value… can that be ignored or what can be done
- Charles
  
  October 14, 2020 at 5:21 pm
  
  Sorry, but I don’t understand your question.
  Charles
Zahra

October 2, 2020 at 9:33 am

Hi,
Could you please tell me how we can use the skewness and the standard error of the skewness to check for a normal distribution?
- Charles
  
  October 2, 2020 at 9:55 am
  
  Zahra,
  This value is probably more relevant when you have some data and you want to determine whether it is normally distributed. For data to be normally distributed, its skewness value should be close to what is expected of a normal distribution. This is the basis of the d’Agostino-Pearson test for normality. See
  https://www.real-statistics.com/tests-normality-and-symmetry/statistical-tests-normality-symmetry/dagostino-pearson-test/
  Charles
Gberindyer Abraham

May 31, 2020 at 10:19 pm

I have carried out a study and tested my data for normality, and I discovered that one of the variables has a missing value for the kurtosis. What is the problem, or how should this missing kurtosis value be interpreted?
- Charles
  
  June 1, 2020 at 4:14 pm
  
  How big is the data set for this variable? If the one missing value is missing at random, then you should be able to ignore this issue and simply test for kurtosis without the missing value.
  Charles
John

February 28, 2020 at 5:54 am

Hello, if, for example, my skewness is -0.93 and my kurtosis is -1.507, are the data normally distributed? Thank you!
- Charles
  
  February 28, 2020 at 7:18 am
  
  John,
  Skewness and kurtosis for normally distributed data should be close to zero. You can test whether the data is normally distributed at
  https://real-statistics.com/tests-normality-and-symmetry/analysis-skewness-kurtosis/
  https://real-statistics.com/tests-normality-and-symmetry/statistical-tests-normality-symmetry/dagostino-pearson-test/
  Charles
Antonio

June 26, 2019 at 10:01 am

Thanks for this, Charles it is really useful.

One quick thing I don’t really understand, sorry if it’s too basic. To decide whether to reject the null hypothesis (Ho), we look at the right-tail probability of the chi-square distribution with two degrees of freedom at the JB statistic, i.e. CHISQ.DIST.RT(JB statistic, 2). If this figure is bigger than the significance level, then we can’t reject Ho. In the example, for a 5% significance level (or 95% confidence level), we can’t reject the hypothesis that the data follow a normal distribution, as CHISQ.DIST.RT(1.93, 2) = .382 > .05.

If, rather than a 5% significance level, we use 95%, we will reject Ho. Does this make sense? With a lower confidence level we reject Ho?

Thanks in advance for your help
- Charles
  
  June 26, 2019 at 7:05 pm
  
  Hello Antonio,
  If CHISQ.DIST.RT(1.93, 2) = .382 > .05, then don’t reject the null hypothesis that the data comes from a normally distributed population. This means that we are sufficiently satisfied that we have a normal distribution.
  We never use an alpha value bigger than or equal to 50%, and so 95% is not used (except that a confidence level of 95% is the same as a significance level of 1-.95 = .05).
  Charles
soharb

July 11, 2016 at 3:07 pm

I think there is something wrong with this formula.
For example, for this series:
26.83946269
26.95131935
8.371060164
10.40495872
18.38858378
20.12905135
24.2843167
1.76670796
20.19191695
41.06557085
16.09877032
13.34390071
0.426210193
28.31166689
11.89051087
109.3641761
25.50859431
61.26802436
32.5178008
66.58119511
41.27546773
14.67351611
2.048435245
28.01590722
44.93746991

the JARQUE(R1)=38.28239095
but if we use an array formula like this:
=COUNT(A2:A26)*(((((SUM((A2:A26-AVERAGE(A2:A26))^3)/COUNT(A2:A26))/((SUM((A2:A26-AVERAGE(A2:A26))^2)/COUNT(A2:A26))^1.5))^2)/6)+((((SUM((A2:A26-AVERAGE(A2:A26))^4)/COUNT(A2:A26))/((SUM((A2:A26-AVERAGE(A2:A26))^2)/COUNT(A2:A26))^2)-3))^2)/24)
+ CTRL + SHIFT + ENTER
the answer will be: 26.69155055
Note that I am completely sure that this formula gives the Jarque-Bera test statistic.
- Charles
  
  July 12, 2016 at 8:57 am
  
  I am using the following Excel formula =COUNT(A2:A26)*(SKEW(A2:A26)^2/6+KURT(A2:A26)^2/24)
  Charles
  - soharb
    
    July 12, 2016 at 5:17 pm
    
    Then there is something wrong (a bug) in the Excel formula, since I calculated SKEW, KURT and JB with “EViews 9.5” and my array formula turned out to give the correct answer!
    - Charles
      
      July 12, 2016 at 8:58 pm
      
      What values did you get for SKEW and KURT?
      Charles
      - soharb
        
        July 13, 2016 at 10:01 am
        
        EViews 9.5:
        SKEW= 1.769081
        KURT= 3.620125
        JB= 26.69155
        
        Excel regular formula:
        =SKEW(A2:A26) = 1.884063081
        =SKEW.P(A2:A26) =1.769080723
        =KURT(A2:A26) = 4.748928357
        Note: there is no KURT.P!!!
        
        Excel array formula:
        for SKEW
        =((SUM((A2:A26-AVERAGE(A2:A26))^3)/COUNT(A2:A26))/((SUM((A2:A26-AVERAGE(A2:A26))^2)/COUNT(A2:A26))^1.5))
        + CTRL + SHIFT + ENTER
        =1.769080723
        
        for KURT
        =((SUM((A2:A26-AVERAGE(A2:A26))^4)/COUNT(A2:A26))/((SUM((A2:A26-AVERAGE(A2:A26))^2)/COUNT(A2:A26))^2)-3)
        + CTRL + SHIFT + ENTER
        =3.620124598
      - Charles
        
        July 13, 2016 at 10:42 am
        
        Soharb,
        Thanks for sending me this information. It looks like if we use the population values of skewness and kurtosis then we get the result that you have seen from EViews.
        In particular, the Real Statistics Resource Pack has functions SKEWP and KURTP. If these functions are used, then the formula =COUNT(A2:A26)*(SKEWP(A2:A26)^2/6+KURTP(A2:A26)^2/24) yields the result 26.69155.
        Thanks for bringing this up. I will revise the JARQUE and JBTEST functions in the next release of the software.
        Charles
Jpso

June 8, 2016 at 1:52 pm

Hi and congrats for the great initiative.

When you refer to kurtosis, do you mean the excess kurtosis (i.e. kurt-3) or the outright kurtosis? For example, when I perform the “D’Agostino-Pearson Test” as described in the relevant section (i.e. using outright kurtosis), I get results suggesting rejection of the null hypothesis, even if I use Kurt=3, Skew=0, which are the standard values for the normal distribution.

Thank you.
- Charles
  
  June 8, 2016 at 2:27 pm
  
  Jpso,
  I am using excess kurtosis (as does Excel).
  Charles
david oluyole ajekigbe

June 8, 2016 at 12:08 pm

Thank you very much for this information. I have gained a lot from it. It would be appreciated if you could please attend to Zohreh’s question of February 28, 2016 at 9:31 pm. I would also like to know the name of the person, for reference. Thank you.
david
- Charles
  
  June 8, 2016 at 2:33 pm
  
  David,
  
  As I wrote in response to that comment
  
  “We often use alpha = .05 as the significance level for statistical tests. The critical value for a two-tailed test of the normal distribution with alpha = .05 is NORMSINV(1-.05/2) = 1.96, which is approximately 2 standard deviations (i.e. standard errors) from the mean. This is the source of the rule of thumb that you are referring to.
  
  The Jarque-Bera and D’Agostino-Pearson tests for normality are more rigorous versions of this rule of thumb.”
  
  Thus, it is difficult to attribute this rule of thumb to one person, since this goes back to the beginning of statistics, or at least the use of the value 1.96. You will find this value of 1.96 in any elementary book on statistics.
  
  Charles
Denny Yu

March 25, 2016 at 3:49 pm

Thank you very much! The Real Statistics Functions are really of great help.
However, I came across a problem: JBTEST, as well as DPTEST, doesn’t allow ranges expressed in array form. For example, the expression =jbtest(IF(INDIRECT(“G”&6):INDIRECT(“G”&10)0,INDIRECT(“AE”&6):INDIRECT(“AE”&10))) is not recognized by Excel and the result is #VALUE!. By comparing it with another expression, =jbtest(INDIRECT(“AE”&6):INDIRECT(“AE”&10)), in Evaluate Formula, I found that JBTEST can only read data in the form “Am:Bn”, not data expressed as a set of values like “0.1, 0.2, …”. Is there any solution to this? I have to deal with ranges that contain certain values that should not be included in the test.
Thank you again!
- Charles
  
  March 26, 2016 at 10:41 pm
  
  Denny,
  The current implementation of these functions supports only arrays which are ranges. I have just changed this so that they should support any arrays. I will include these changes in the next release of the software. I hope to issue this release in the next few days.
  Charles
  - Denny Yu
    
    March 27, 2016 at 6:49 am
    
    Thanks for replying.
    I’m really looking forward to it.
Zohreh

February 28, 2016 at 2:28 pm

Salaam
Could you please cite the reference for “If the absolute value of the skewness for the data is more than twice the standard error this indicates that the data are not symmetric, and therefore not normal”? I need it. Thanks.
- Charles
  
  February 28, 2016 at 5:29 pm
  
  We often use alpha = .05 as the significance level for statistical tests. The critical value for a two-tailed test of the normal distribution with alpha = .05 is NORMSINV(1-.05/2) = 1.96, which is approximately 2 standard deviations (i.e. standard errors) from the mean. This is the source of the rule of thumb that you are referring to.
  
  The Jarque-Bera and D’Agostino-Pearson tests for normality are more rigorous versions of this rule of thumb.
  
  Charles
  - Zohreh
    
    February 28, 2016 at 9:31 pm
    
    Thanks for replying. I’ve heard that one way to check normality is to divide the skewness by its standard error; if the result falls within the range ±1.96, then normality is satisfied. Using this formula, my data proved not to be normal. I then used the other formula to which you referred, “If the absolute value of the skewness for the data is more than twice the standard error this indicates that the data are not symmetric, and therefore not normal”, and my data turned out to be normal. As I want to use the latter procedure in my study, I need to cite the name of the person whose opinion I will use. By reference I meant: on whose opinion is “If the absolute value of the skewness for the data is …” based? Will you please provide the name of the person?
    Many thanks…
Rajesh

January 6, 2016 at 2:44 pm

Data distribution free how to apply 2 way anova
- Charles
  
  January 7, 2016 at 10:38 am
  
  Sorry, but I don’t understand your question.
  Charles