Studentized Range Distribution

Definition 1: Suppose that we take a sample of size n from each of k populations with the same normal distribution N(μ, σ), and suppose that min is the smallest of these sample means, max is the largest of these sample means, and s² is the pooled sample variance from these samples. Then the following random variable has a Studentized range distribution.

q = (max − min) / (s / √n), where s is the square root of the pooled sample variance
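As a minimal sketch of Definition 1, the following Python computes the statistic for k equal-sized samples, assuming it is the usual studentized range q = (max − min) / (s / √n). The function name studentized_range_stat and the sample data are illustrative only and are not part of the Real Statistics Resource Pack.

```python
from math import sqrt
from statistics import mean, variance


def studentized_range_stat(samples):
    """Return (q, df) for k equal-sized samples, following Definition 1."""
    k = len(samples)
    n = len(samples[0])
    if any(len(s) != n for s in samples):
        raise ValueError("Definition 1 assumes equal sample sizes")
    means = [mean(s) for s in samples]
    # With equal sample sizes, the pooled variance is the plain average
    # of the k sample variances (each has n - 1 degrees of freedom).
    s2 = mean([variance(s) for s in samples])
    q = (max(means) - min(means)) / sqrt(s2 / n)
    df = k * (n - 1)  # within-groups degrees of freedom
    return q, df


# Made-up data: three groups of three observations each.
groups = [
    [1.0, 2.0, 3.0],
    [2.0, 3.0, 4.0],
    [4.0, 5.0, 6.0],
]
q, df = studentized_range_stat(groups)
print(q, df)
```

The returned df is the within-groups degrees of freedom, which is the df argument that the functions described below expect.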

This distribution is related to the t distribution and is especially useful in follow-up testing for ANOVA, such as Tukey’s HSD (see Unplanned Comparisons).

Tables of critical values for this distribution can be found in Studentized Range q Table.

Real Statistics Functions: The following functions are supplied by the Real Statistics Resource Pack:

QDIST(q, k, df) = value at q of the studentized range distribution with k independent variables and df degrees of freedom

QINV(p, k, df, tails) = inverse of the studentized q distribution, i.e. the critical value for the studentized q range; tails = the number of tails and takes on the value 1 or 2 (default); QINV(p, k, df) has the value q such that QDIST(q, k, df) = p.

QCRIT(k, df, α, tails) = the critical value for the studentized q range based on the entries in the tables found in Studentized Range q Table, using linear interpolation for values that fall between entries in the table; α is a number between 0 and 1 (default .05); tails = the number of tails and takes on the value 1 or 2 (default).

Note that theoretically QINV(p, k, df) = QCRIT(k, df, p), but whereas QCRIT does a table lookup, QINV calculates the critical value. Generally, for values shown in the tables QCRIT is more accurate, while for values outside the tables QINV is usually preferred.
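The table-lookup idea can be sketched as follows. This is only an illustration of linear interpolation between two tabulated df rows; it is not the QCRIT implementation, whose interpolation details are not given here. The function name is hypothetical, and the two table entries (k = 3, α = .05) are quoted for illustration and should be checked against the table you actually use.

```python
def interpolate_critical_value(df, df_lo, q_lo, df_hi, q_hi):
    """Linearly interpolate a critical value between two tabulated df rows."""
    if not df_lo <= df <= df_hi:
        raise ValueError("df must lie between the two tabulated rows")
    if df_lo == df_hi:
        return q_lo
    weight = (df - df_lo) / (df_hi - df_lo)
    return q_lo + weight * (q_hi - q_lo)


# Illustrative entries for k = 3, alpha = .05 (verify against your table):
# df = 20 -> 3.578 and df = 24 -> 3.532
q_22 = interpolate_critical_value(22, 20, 3.578, 24, 3.532)
print(round(q_22, 3))
```

A computed inverse such as QINV avoids the need for neighboring table rows, which is why it is the natural choice for values of k, df or α that the table does not cover.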

16 Responses to Studentized Range Distribution

  1. Kevin Nowaczyk says:

    Would you mind sharing the equation behind the QDIST() function? I cannot find much documentation on this distribution. I thought I had found an expression, but when I perform numerical integration, my results do not match table values exactly. They are off by about 10%.

    When k=2, I know it is 2*t.dist(q/sqrt(2),df,true) – 1, and when df approaches infinity, it’s:
    k * ∫ norm.s.dist(z,false)[norm.s.dist(z+w,true) – norm.s.dist(z,true)]ᵏ⁻¹ dz
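The large-df expression quoted in this comment can be checked numerically. The sketch below integrates it with composite Simpson's rule using only the Python standard library; the function names are hypothetical, and this is not the code behind QDIST. For k = 2 the integral has the closed form 2Φ(w/√2) − 1, which gives an independent check.

```python
from math import erf, exp, pi, sqrt


def phi(z):
    """Standard normal density, like norm.s.dist(z, false)."""
    return exp(-0.5 * z * z) / sqrt(2.0 * pi)


def Phi(z):
    """Standard normal CDF, like norm.s.dist(z, true)."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def range_cdf_normal(w, k, lo=-8.0, hi=8.0, steps=800):
    """P(range of k standard normals <= w) via composite Simpson's rule.

    steps must be even; [lo, hi] truncates the infinite integration range.
    """
    if w <= 0:
        return 0.0
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        z = lo + i * h
        f = phi(z) * (Phi(z + w) - Phi(z)) ** (k - 1)
        if i == 0 or i == steps:
            coef = 1
        else:
            coef = 4 if i % 2 else 2
        total += coef * f
    return k * total * h / 3.0


# Compare the integral with the closed form for k = 2.
w = 2.0
print(range_cdf_normal(w, 2), 2.0 * Phi(w / sqrt(2.0)) - 1.0)
```

This covers only the limiting case of known σ (df approaching infinity); finite df needs an additional integration over the distribution of s.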

  2. Kevin Nowaczyk says:

    Would you mind sharing the equation behind the QDIST() function? I cannot find much documentation on this distribution. I thought I had found an expression, but when I perform numerical integration, my results do not match table values exactly. They are off by about 10%.

    • Charles says:

      Kevin,
      It is not an equation, but a fairly complicated program.
      Charles

      • Kevin Nowaczyk says:

        Is it more complicated than a numerical integration? Would you mind explaining the method? Besides providing table values, literature is very spotty on how the values arise.

        When k=2, I know it is 2*t.dist(q/sqrt(2),df,true) – 1, and when df approaches infinity, it’s:
        k * ∫ norm.s.dist(z/sqrt(2),false)[norm.s.dist((z+w)/sqrt(2),true) – norm.s.dist(z/sqrt(2),true)]ᵏ⁻¹ dz. I can integrate the second equation in Excel and get perfect matches between literature and spreadsheet. Replacing norm.s.dist with t.dist is close, but not perfect (unless df approaches infinity of course).
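One standard way to handle finite df, which is not necessarily what QDIST does, is to average the normal-range probability over the distribution of x = s/σ, where df·x² has a chi-square distribution with df degrees of freedom: P(Q ≤ q) = ∫ f(x) · R(q·x) dx, with R(w) = k ∫ φ(z)[Φ(z+w) − Φ(z)]ᵏ⁻¹ dz. This is different from substituting the t distribution for the normal inside the integral, which may explain why that substitution is close but not exact. The sketch below uses hypothetical function names, plain Simpson's rule and fixed truncation limits, so its accuracy in extreme tails is an assumption rather than a guarantee.

```python
from math import atan, erf, exp, lgamma, log, pi, sqrt


def phi(z):
    return exp(-0.5 * z * z) / sqrt(2.0 * pi)


def Phi(z):
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def simpson(f, a, b, steps):
    """Composite Simpson's rule on [a, b]; steps must be even."""
    h = (b - a) / steps
    total = f(a) + f(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


def range_cdf_normal(w, k):
    """P(range of k standard normals <= w): the large-df case."""
    return k * simpson(
        lambda z: phi(z) * (Phi(z + w) - Phi(z)) ** (k - 1), -8.0, 8.0, 400
    )


def scale_density(x, df):
    """Density of x = s/sigma = sqrt(chi2_df / df), for x > 0."""
    log_f = (log(2.0) + 0.5 * df * log(0.5 * df) - lgamma(0.5 * df)
             + (df - 1) * log(x) - 0.5 * df * x * x)
    return exp(log_f)


def studentized_range_cdf(q, k, df, steps=400):
    """P(Q <= q): the normal-range CDF averaged over the density of s/sigma."""
    upper = 1.0 + 8.0 / sqrt(2.0 * df)  # the density is negligible beyond this
    return simpson(
        lambda x: scale_density(x, df) * range_cdf_normal(q * x, k),
        1e-12, upper, steps,
    )


# Check: for k = 2 and df = 1 the t distribution is Cauchy, so the
# k = 2 identity gives P(Q <= q) = 2 * atan(q / sqrt(2)) / pi.
q = 17.969
print(studentized_range_cdf(q, 2, 1), 2.0 * atan(q / sqrt(2.0)) / pi)
```

The nested integration costs steps × 400 integrand evaluations per call, which is acceptable for spot checks but slow if used to fill in an entire table.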

  3. Myra Villaflor Gutierrez says:

    Hello! How can I duplicate the studentized range distribution? And where can I get the different functions you listed? Thank you for your response 😉

  4. Myra Villaflor Gutierrez says:

    Can the formula given above be used to compute the critical values that appear in the Tukey distribution table? Thank you for your response.

  5. rod grisham says:

    QINV as downloaded this week has a last parameter B. I see no documentation about what that is.

    • Charles says:

      Rod,

      If b = TRUE, then instead of using the calculation based on QDIST, the function sometimes uses the critical value found in the table, i.e. the value returned by QCRIT. This is because in extreme situations the calculations using QDIST are not very accurate. The table lookup is used instead when (1) df = 1 and (p <= .025 or p = .05 or p = .1), (2) df = 2 and .001 <= p <= .01, or (3) df = 3 and p = .001. I have not explained this since I don't expect anyone to set b to FALSE (the default is TRUE). I plan to revisit this issue sometime in the future and try to make the calculations for QDIST and QINV more accurate so that the table lookup approach is not necessary. Since I have now improved the approach used for interpolation, I may instead simply use the table values for QINV and QDIST when df = 1 or (df = 2 and p <= .01) or (df = 3 and p <= .001), or something similar. For now, I would simply ignore the b argument.

  6. Daryl says:

    Hello – great Add-In. Thank you. I’m trying to duplicate the Studentized range table, for alpha = 0.05, using your QINV function. All values seem pretty accurate to other tables that I’ve seen, except for the first few values in the first row (v=1). For example, here are the first few values I’m obtaining for that row:
    k=2: 17.969, k=3: 26.227, k=4: 31.998 (they start to match other tables that I’ve seen around k=12 and beyond).

    Do you know why this might be happening? But again, thanks!

    • Charles says:

      Daryl,
      I don’t know why the algorithm I used gives such poor values for df = 1. In fact for small values of alpha less than or equal to .01 the algorithm gives poor values for df = 2 as well. I added an option to simply use the table values in certain cases, but it sounds like I should consider expanding the cases where I use the table values.
      Charles

  7. Colin says:

    Sir

    Is df = n – k or n*k – k?

    Colin

    • Charles says:

      Colin,
      df is simply dfW for single-factor ANOVA: df = n – k, where n = total sample size and k = number of levels. If each level has m elements in the sample, then n = km and so df = mk – k.
      Charles
