Noncentral t Distribution

The t distribution characterizes how the t test statistic is distributed when the null hypothesis is assumed to be true. The noncentral t distribution instead shows how the t test statistic is distributed when the alternative hypothesis is assumed to be true (i.e. when the null hypothesis is assumed to be false). As such it is useful in calculating the power of the usual t tests.

Definition 1: The noncentral t distribution, abbreviated as T(k,δ), has the cumulative distribution function F(t), written as F_{k,δ}(t) when necessary, where k = the degrees of freedom and δ = the noncentrality parameter.

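$$F_{k,\delta}(t) = \Phi(-\delta) + \frac{1}{2}\sum_{j=0}^{\infty}\left[p_j\, I_r\!\left(j+\tfrac{1}{2},\, \tfrac{k}{2}\right) + q_j\, I_r\!\left(j+1,\, \tfrac{k}{2}\right)\right]$$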

when t ≥ 0, where Φ is the distribution function of the standard normal distribution, i.e.

Φ(z) = NORMSDIST(z)

and I_r(a,b) is the distribution function of the beta distribution evaluated at r, i.e.

I_r(a,b) = BETADIST(r,a,b)

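where

$$r = \frac{t^2}{t^2+k}, \qquad p_j = \frac{1}{j!}\, e^{-\delta^2/2}\left(\frac{\delta^2}{2}\right)^{j}, \qquad q_j = \frac{\delta}{\sqrt{2}\,\Gamma\!\left(j+\tfrac{3}{2}\right)}\, e^{-\delta^2/2}\left(\frac{\delta^2}{2}\right)^{j}$$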

When t < 0, the noncentral t distribution is

$$F_{k,\delta}(t) = 1 - F_{k,-\delta}(-t)$$
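As a cross-check on Definition 1, the cdf can also be evaluated without the infinite series. The sketch below is not the Real Statistics algorithm: it assumes the standard representation T = (Z + δ)/S, where Z is standard normal and S = sqrt(V/k) with V chi-square on k degrees of freedom, so that F(t) = E[Φ(tS − δ)], and it approximates that expectation with Simpson's rule. The function names, the integration limit and the number of subintervals are illustrative choices, and k ≥ 1 is assumed.

```python
import math

def norm_cdf(z):
    """Standard normal cdf, Phi(z)."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def scaled_chi_pdf(s, k):
    """Density of S = sqrt(V / k), where V is chi-square with k degrees of freedom."""
    if s <= 0.0:
        # Limit as s -> 0: nonzero only for k = 1 (half-normal density at 0).
        return math.sqrt(2.0 / math.pi) if k == 1 else 0.0
    half_k = 0.5 * k
    log_density = (math.log(2.0) + half_k * math.log(half_k) - math.lgamma(half_k)
                   + (k - 1.0) * math.log(s) - half_k * s * s)
    return math.exp(log_density)

def nct_cdf(t, k, delta, n=2000):
    """P(T <= t) = E[Phi(t*S - delta)], by Simpson's rule over s (n must be even)."""
    upper = 1.0 + 12.0 / math.sqrt(k)  # assumed cutoff: the density of S is negligible beyond it
    h = upper / n
    total = 0.0
    for i in range(n + 1):
        s = i * h
        weight = 1 if i in (0, n) else (4 if i % 2 else 2)
        total += weight * norm_cdf(t * s - delta) * scaled_chi_pdf(s, k)
    return total * h / 3.0

# Compare with the value quoted for NT_DIST(4.5,10,4,TRUE) later in the article.
print(round(nct_cdf(4.5, 10, 4), 6))
```

Because the integral handles negative t directly, it also gives a numerical check of the reflection rule for t < 0: nct_cdf(-1, 10, 2) and 1 - nct_cdf(1, 10, -2) should agree.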

Observation: The probability density function (pdf) of the noncentral t distribution can be calculated as follows:

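$$f(t) = \begin{cases} \dfrac{k}{t}\left[F_{k+2,\delta}\!\left(t\sqrt{1+\tfrac{2}{k}}\right) - F_{k,\delta}(t)\right] & \text{if } t \neq 0 \\[2ex] \dfrac{\Gamma\!\left(\frac{k+1}{2}\right)}{\sqrt{\pi k}\;\Gamma\!\left(\frac{k}{2}\right)}\, e^{-\delta^2/2} & \text{if } t = 0 \end{cases}$$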
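The pdf can be checked numerically in the same spirit. The following sketch again assumes the representation T = (Z + δ)/S with S = sqrt(V/k), which gives f(t) = E[S φ(tS − δ)] where φ is the standard normal density, and evaluates it with Simpson's rule. It is an independent check, not the method used by NT_DIST; all names and tuning constants are illustrative, and k ≥ 1 is assumed. The closed form used for comparison at t = 0 is Γ((k+1)/2) exp(−δ²/2) / (sqrt(πk) Γ(k/2)).

```python
import math

def norm_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def scaled_chi_pdf(s, k):
    """Density of S = sqrt(V / k), where V is chi-square with k degrees of freedom."""
    if s <= 0.0:
        return math.sqrt(2.0 / math.pi) if k == 1 else 0.0
    half_k = 0.5 * k
    log_density = (math.log(2.0) + half_k * math.log(half_k) - math.lgamma(half_k)
                   + (k - 1.0) * math.log(s) - half_k * s * s)
    return math.exp(log_density)

def nct_pdf(t, k, delta, n=2000):
    """f(t) = E[S * phi(t*S - delta)], by Simpson's rule over s (n must be even)."""
    upper = 1.0 + 12.0 / math.sqrt(k)  # assumed cutoff for the density of S
    h = upper / n
    total = 0.0
    for i in range(n + 1):
        s = i * h
        weight = 1 if i in (0, n) else (4 if i % 2 else 2)
        total += weight * s * norm_pdf(t * s - delta) * scaled_chi_pdf(s, k)
    return total * h / 3.0

def nct_pdf_at_zero(k, delta):
    """Closed form of the pdf at t = 0."""
    log_ratio = math.lgamma(0.5 * (k + 1)) - math.lgamma(0.5 * k)
    return math.exp(log_ratio - 0.5 * delta * delta) / math.sqrt(math.pi * k)

print(nct_pdf(0.0, 10, 4), nct_pdf_at_zero(10, 4))  # the two should agree closely
print(round(nct_pdf(4.5, 10, 4), 5))                # compare with NT_DIST(4.5,10,4,FALSE)
```

This form involves no division by t, so it needs no special case at or near t = 0.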

where Γ is the gamma function. The mean and variance of the distribution are

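$$E[T] = \delta\sqrt{\frac{k}{2}}\;\frac{\Gamma\!\left(\frac{k-1}{2}\right)}{\Gamma\!\left(\frac{k}{2}\right)} \quad (k > 1), \qquad \operatorname{Var}(T) = \frac{k(1+\delta^2)}{k-2} - \frac{\delta^2 k}{2}\left[\frac{\Gamma\!\left(\frac{k-1}{2}\right)}{\Gamma\!\left(\frac{k}{2}\right)}\right]^2 \quad (k > 2)$$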
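For a quick numerical feel, the mean and variance can be computed from the standard closed forms E[T] = δ sqrt(k/2) Γ((k−1)/2)/Γ(k/2) for k > 1 and Var(T) = k(1+δ²)/(k−2) − E[T]² for k > 2. The sketch below uses log-gamma to avoid overflow for large k; the function names are illustrative and are not part of the Real Statistics Resource Pack.

```python
import math

def nct_mean(k, delta):
    """Mean of T(k, delta); defined only for k > 1."""
    if k <= 1:
        raise ValueError("the mean requires k > 1")
    gamma_ratio = math.exp(math.lgamma(0.5 * (k - 1)) - math.lgamma(0.5 * k))
    return delta * math.sqrt(0.5 * k) * gamma_ratio

def nct_variance(k, delta):
    """Variance of T(k, delta); defined only for k > 2."""
    if k <= 2:
        raise ValueError("the variance requires k > 2")
    return k * (1.0 + delta * delta) / (k - 2.0) - nct_mean(k, delta) ** 2

print(nct_mean(10, 4), nct_variance(10, 4))
```

For k = 10 and δ = 4 the mean comes out at about 4.33, slightly above δ itself, and for δ = 0 the functions return 0 and k/(k−2), the mean and variance of the central t distribution.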

Observation: The noncentral t distribution has a shape similar to that of the central t distribution (i.e. the ordinary t distribution). The noncentrality parameter indicates how much the distribution is shifted to the right (when δ > 0) or to the left (when δ < 0). When δ = 0, the noncentral t distribution is identical to the central t distribution, and so T(k,0) = T(k).

Observation: The following chart shows the graphs of the noncentral t distribution with 10 degrees of freedom for δ = 0, 2, 4, 6.

Figure 1 – Noncentral t pdf by noncentrality parameter

The following chart shows the graphs of the noncentral t distribution with δ = 2 and the degrees of freedom = 1, 3, 5, 10.

Figure 2 – Noncentral t pdf by degrees of freedom

Real Statistics Functions: The following functions are provided in the Real Statistics Resource Pack:

NT_DIST(t, df, δ, cum, m) = the value at t of the cdf of the noncentral t distribution T(k,δ), with k = df, if cum = TRUE, and the value at t of the noncentral t pdf if cum = FALSE.

NT_INV(p, df, δ, m, iter) = the inverse of the cdf of the noncentral t distribution T(k,δ) at p, i.e. the value of t such that NT_DIST(t, df, δ, TRUE, m) = p.

NT_NCP(p, df, t, m, iter) = the value of the noncentrality parameter δ such that the cdf of the noncentral t distribution T(k,δ) at t is p, i.e. NT_DIST(t, df, δ, TRUE, m) = p.

Here m = the upper limit used in place of ∞ in the infinite sum (a value from 1 to 170, default 120) and iter = the number of iterations used in calculating NT_INV or NT_NCP.
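The source does not say which iterative method NT_INV and NT_NCP use. One simple possibility, shown here purely as an assumption, is bisection: the cdf increases in t and decreases in δ, so either unknown can be bracketed and then halved repeatedly. The sketch relies on a numerical cdf based on the representation T = (Z + δ)/sqrt(V/k) rather than on the series with upper limit m; the names nt_inv_sketch and nt_ncp_sketch are hypothetical, and k ≥ 1 with p not too close to 0 or 1 is assumed.

```python
import math

def nct_cdf(t, k, delta, n=1000):
    """P(T <= t) = E[Phi(t*S - delta)], S = sqrt(chi2_k / k), by Simpson's rule."""
    upper = 1.0 + 12.0 / math.sqrt(k)  # assumed cutoff for the density of S
    h = upper / n
    log_c = math.log(2.0) + 0.5 * k * math.log(0.5 * k) - math.lgamma(0.5 * k)
    total = 0.0
    for i in range(n + 1):
        s = i * h
        if s > 0.0:
            density = math.exp(log_c + (k - 1.0) * math.log(s) - 0.5 * k * s * s)
        else:
            density = math.exp(log_c) if k == 1 else 0.0
        weight = 1 if i in (0, n) else (4 if i % 2 else 2)
        total += weight * 0.5 * math.erfc((delta - t * s) / math.sqrt(2.0)) * density
    return total * h / 3.0

def bisect_root(g, lo, hi, iters=60):
    """Bisection for a monotone g that changes sign on [lo, hi]."""
    g_lo = g(lo)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        g_mid = g(mid)
        if (g_mid > 0.0) == (g_lo > 0.0):
            lo, g_lo = mid, g_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def nt_inv_sketch(p, k, delta, iters=60):
    """t such that nct_cdf(t, k, delta) = p (the cdf increases in t)."""
    g = lambda t: nct_cdf(t, k, delta) - p
    lo, hi, step = delta - 1.0, delta + 1.0, 1.0
    while g(lo) > 0.0:              # widen the bracket to the left
        lo, step = lo - step, 2.0 * step
    step = 1.0
    while g(hi) < 0.0:              # widen the bracket to the right
        hi, step = hi + step, 2.0 * step
    return bisect_root(g, lo, hi, iters)

def nt_ncp_sketch(p, k, t, iters=60):
    """delta such that nct_cdf(t, k, delta) = p (the cdf decreases in delta)."""
    g = lambda d: nct_cdf(t, k, d) - p
    lo, hi, step = t - 1.0, t + 1.0, 1.0
    while g(lo) < 0.0:
        lo, step = lo - step, 2.0 * step
    step = 1.0
    while g(hi) > 0.0:
        hi, step = hi + step, 2.0 * step
    return bisect_root(g, lo, hi, iters)

p = nct_cdf(4.5, 10, 4.0)
print(nt_inv_sketch(p, 10, 4.0), nt_ncp_sketch(p, 10, 4.5))  # should recover 4.5 and 4.0
```

Bisection is slow but robust: each iteration halves the bracket, so the iteration count directly controls the attainable precision.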

Note that NT_DIST(4.5,10,4,FALSE) = .25496 and NT_DIST(4.5,10,4,TRUE) = .603675, which is consistent with the values shown in the green curve of Figure 1.
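To make the opening remark about power concrete, here is a minimal sketch for one specific case that the article does not spell out: a two-sided one-sample t test with n observations and standardized effect size d. Under the alternative, the test statistic is assumed to follow T(n−1, δ) with δ = d·sqrt(n), which is the usual textbook setup. The critical value is found by bisection on the central cdf (δ = 0), and the power is 1 − F(t_crit) + F(−t_crit). The numerical cdf and all names are illustrative; df ≥ 2 is assumed.

```python
import math

def nct_cdf(t, k, delta, n=1000):
    """P(T <= t) = E[Phi(t*S - delta)], S = sqrt(chi2_k / k), by Simpson's rule."""
    upper = 1.0 + 12.0 / math.sqrt(k)  # assumed cutoff for the density of S
    h = upper / n
    log_c = math.log(2.0) + 0.5 * k * math.log(0.5 * k) - math.lgamma(0.5 * k)
    total = 0.0
    for i in range(n + 1):
        s = i * h
        if s > 0.0:
            density = math.exp(log_c + (k - 1.0) * math.log(s) - 0.5 * k * s * s)
        else:
            density = math.exp(log_c) if k == 1 else 0.0
        weight = 1 if i in (0, n) else (4 if i % 2 else 2)
        total += weight * 0.5 * math.erfc((delta - t * s) / math.sqrt(2.0)) * density
    return total * h / 3.0

def t_test_power(n_obs, effect_size, alpha=0.05):
    """Power of a two-sided one-sample t test (assumed setup: delta = d * sqrt(n))."""
    k = n_obs - 1
    delta = effect_size * math.sqrt(n_obs)
    # Two-sided critical value: central t quantile at 1 - alpha/2, by bisection.
    lo, hi = 0.0, 100.0
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if nct_cdf(mid, k, 0.0) < 1.0 - alpha / 2.0:
            lo = mid
        else:
            hi = mid
    t_crit = 0.5 * (lo + hi)
    # Probability of landing in either rejection region under the alternative.
    return 1.0 - nct_cdf(t_crit, k, delta) + nct_cdf(-t_crit, k, delta)

print(t_test_power(20, 0.5))
print(t_test_power(20, 0.0))  # with no effect, the power equals alpha
```

When d = 0 the noncentral distribution reduces to the central one and the function returns α, which is a convenient sanity check.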

9 Responses to Noncentral t Distribution

  1. Jonathan Bechtel says:

    Charles,

    What does the “t” stand for in NT_DIST(t, df, sigma, cum, m)?

    Is it the Tstat derived from the problem? # of tails? You never specify.

    Also, if it’s the Tstat derived from the problem, from what I can tell this ought to be the same as the NCP, which most of the time will give an answer of approximately 0.5.

    Thank you

    • Jonathan Bechtel says:

      Or rather, if you use the same value for t and sigma in the NT_DIST function then it will usually give an answer that’s approximately 0.5.

    • Jonathan Bechtel says:

      Okay, sorry if I’m overcrowding things here, but upon further inspection it looks like if you use the 2-tailed Tcrit for the NT_DIST function this gives answers that are almost identical to the statistical power option in the RealStats plug in.

      Am I correct here?

      Thanks.

    • Charles says:

      Jonathan,
      The formula is NT_DIST(t, df, ncp, cum, m), where ncp is the noncentrality parameter. The t is the same as the t in T.DIST(t,df,cum). In fact, when ncp = 0 then NT_DIST(t,df,0,cum) = T.DIST(t,df,cum).
      Charles

  2. Sam Deem says:

    Hi, Charles

    I have tried to use the function NT_INV to calculate one-sided tolerance intervals, with the parameters p=0.95, df=50, delta=11.63, m=170. It did not provide a solution. In fact, it did not work with df larger than 50 and delta larger than 11. Please help.

    Sam

    • Charles says:

      Sam,

      Some observations:

      1. For some values of p, df and delta the value for m must be less than 170. E.g. for your example, if you change m = 170 to m = 168, the function will generate the correct answer. I need to improve the function to avoid this problem.

      2. There are solutions for values of df larger than 50 or delta larger than 11. E.g. NT_INV(.95,60,12,165) = 14.79959488, NT_INV(.95,100,12,150) = 14.3688301.

      3. But the function doesn’t seem to be able to find all such solutions. E.g. NT_INV(.95,60,20,m) does not find the right value, which I believe is about 24. I need to fix this.

      Thanks for identifying this problem.

      Charles

  3. António Teixeira says:

    Hi,

    I have the following question about this distribution:
    How small must t be before the pdf should be calculated as if t = 0? For example, if we have t = 1E-10 and use the algorithm for t not equal to zero, we can introduce a distortion in the graph that is noticeable in certain cases. We can say that the second algorithm must be used not only for t = 0 but 'in the vicinity of 0'.
    Have you encountered this problem, and do you have any idea how to define its limits?

    António Teixeira

    • Charles says:

      António,

      Excellent point. I have checked the pdf values for t = 1E-5, 1E-6, 1E-7, 1E-8, 1E-9, 1E-10 and 0 with df = 1 to 20 and ncp = 4, 3, 2, 1, .5, .1, .01, all carried out to 8 decimal places.

      For ncp = 3, the value at t = 0 is always the same as the value at t = 1E-9. For 7 values of df, the pdf value at t = 1E-10 is higher than that at t = 1E-9 (theoretically, this shouldn’t happen); the difference is at most .00000003. For 5 values of df, the pdf value at t = 1E-10 is lower than that at t = 1E-9; the difference is at most .00000002.

      For the other values of ncp, the pdf value at t = 0 is usually equal to that at t = 1E-8 or 1E-9, or lies somewhere in between, although occasionally it equals the value at 1E-7. There seems to be more distortion at 1E-10, where fairly often the pdf value is higher than at 1E-9, although sometimes this starts to happen (though not for all values of df) at 1E-9 or 1E-8.

      Based on this analysis, I would say that for ncp >= 1 the second formula for the pdf (the one for t = 0) could be used when t < 1E-8. For ncp < 1, perhaps this should be when t < 1E-7.

      Charles
