  Confidence intervals for linear combinations of variances

    2016-03-19

    Let x_i’s be some observations of independent random variables X_i \sim \theta_i \dfrac{\chi^2_{d_i}}{d_i}. In this article we will take a look at some methods to get a confidence interval about a linear combination of the \theta_i’s. This situation occurs when we are interested in the variance components of an ANOVA model with random effects.

    Satterthwaite method

    In order to get a confidence interval about a linear combination \sum a_i\theta_i with nonnegative coefficients a_i, the Satterthwaite approximation consists in acting as if \sum a_i X_i \sim \left(\sum a_i\theta_i\right) \frac{\chi^2_\nu}{\nu} \quad \textrm{with }\; \nu = \frac{{\left(\sum a_ix_i\right)}^2}{\sum\dfrac{{(a_i x_i)}^2}{d_i}}.

    Thus, taking a 100(1-\alpha)\%-dispersion interval [b^-, b^+] of the \chi^2_\nu distribution, the approximation b^- \leq \nu\frac{\sum a_i x_i}{\sum a_i\theta_i} \leq b^+ inverts into the approximate 100(1-\alpha)\%-confidence interval about \sum a_i\theta_i: \left[\nu\frac{\sum a_ix_i}{b^+}, \nu\frac{\sum a_ix_i}{b^-}\right].
    This interval is returned by the ciSatt function below, taking the quantiles \chi^2_\nu\bigl(\frac{\alpha}{2}\bigr) and \chi^2_\nu\bigl(1-\frac{\alpha}{2}\bigr) for b^- and b^+ respectively.

    ciSatt <- function(x, dofs, a, alpha=0.05){
      s <- sum(a*x)
      # Satterthwaite approximate degrees of freedom
      nu <- s^2/sum((a*x)^2/dofs)
      lwr <- s*nu/qchisq(1-alpha/2, nu)
      upr <- s*nu/qchisq(alpha/2, nu)
      return(c("lwr"=lwr, "upr"=upr))
    }
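
    For illustration, here is a call of ciSatt on some made-up mean squares (the numbers below are arbitrary, chosen for the sake of the example only):

    # three mean squares with 5, 10 and 20 degrees of freedom (arbitrary values)
    ciSatt(x = c(2.1, 1.4, 0.9), dofs = c(5, 10, 20), a = c(1, 0.5, 0.25))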

    Graybill & Wang’s method

    Graybill & Wang provided another method for nonnegative linear combinations. Their approximate 100(1-\alpha)\%-confidence interval about \sum a_i\theta_i is \left[\sum a_i x_i - \sqrt{\sum{(G_ia_ix_i)}^2}, \sum a_i x_i + \sqrt{\sum{(H_ia_ix_i)}^2}\right]

    where G_i = 1 - \frac{d_i}{\chi^2_{d_i}\bigl(1-\frac{\alpha}{2}\bigr)} \quad \text{and }\; H_i = \frac{d_i}{\chi^2_{d_i}\bigl(\frac{\alpha}{2}\bigr)} - 1.
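
    As a minimal sketch, this formula transcribes directly into R (the name ciGW is ours); it coincides with the all(a>0) branch of the ciTing function given in the next section:

    ciGW <- function(x, dofs, a, alpha=0.05){
      # G_i and H_i as defined above (qchisq is vectorized over dofs)
      G <- 1 - dofs/qchisq(1-alpha/2, dofs)
      H <- dofs/qchisq(alpha/2, dofs) - 1
      s <- sum(a*x)
      lwr <- s - sqrt(sum((G*a*x)^2))
      upr <- s + sqrt(sum((H*a*x)^2))
      return(c("lwr"=lwr, "upr"=upr))
    }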

    Ting et al.’s generalization

    Graybill & Wang’s confidence interval has been generalized to the case where some a_i are negative by Ting et al. (see also Burdick et al.). It is returned by the function ciTing given below.

    ciTing <- function(x, dofs, a, alpha=0.05){
      G <- 1 - sapply(dofs, function(d){ d/qchisq(1-alpha/2, d) })
      H <- sapply(dofs, function(d){ d/qchisq(alpha/2, d) }) - 1
      s <- sum(a*x)
      if(all(a > 0)){ # this is Graybill & Wang's confidence interval
        lwr <- s - sqrt(sum((G*a*x)^2))
        upr <- s + sqrt(sum((H*a*x)^2))
        return(c("lwr"=lwr, "upr"=upr))
      }
      pos <- which(a > 0); neg <- which(a < 0)
      # cross terms between the positive and negative parts
      A12 <- sum(sapply(pos, function(q){
        sapply(neg, function(r){
          Fqr <- qf(1-alpha/2, dofs[q], dofs[r]) # alpha/2 in the article is an error
          Lqr <- ((Fqr-1)^2 - G[q]^2*Fqr^2 - H[r]^2)/Fqr
          return(Lqr*a[q]*x[q]*a[r]*x[r])
        })
      }))
      B12 <- sum(sapply(pos, function(q){
        sapply(neg, function(r){
          Fqr <- qf(alpha/2, dofs[q], dofs[r]) # 1-alpha/2 in the article is an error
          Lqr <- ((Fqr-1)^2 - H[q]^2*Fqr^2 - G[r]^2)/Fqr
          return(Lqr*a[q]*x[q]*a[r]*x[r])
        })
      }))
      lwr <- s - sqrt(sum((G[pos]*a[pos]*x[pos])^2) + sum((H[neg]*a[neg]*x[neg])^2) - A12)
      upr <- s + sqrt(sum((H[pos]*a[pos]*x[pos])^2) + sum((G[neg]*a[neg]*x[neg])^2) - B12)
      return(c("lwr"=lwr, "upr"=upr))
    }
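
    As a quick sanity check on made-up values (x0 and dofs0 below are arbitrary), ciTing falls back to the Graybill & Wang interval when all the coefficients are positive, and switches to the generalized interval otherwise:

    x0 <- c(2.1, 1.4, 0.9); dofs0 <- c(5, 10, 20)
    ciTing(x0, dofs0, a = c(1, 0.5, 0.25))   # all coefficients positive: Graybill & Wang branch
    ciTing(x0, dofs0, a = c(1, 0.5, -0.25))  # one negative coefficient: Ting et al. branch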

    We will study the performance of this confidence interval on an example. Some improvements of this interval are given in Ting et al.’s paper, but we do not implement them here.

    Example

    Consider a balanced three-way ANOVA model with one fixed factor and two random factors: y_{hijk} = \underset{\mu_h}{\underbrace{\mu + A_h}} + B_i + C_j + {(AB)}_{hi} + {(AC)}_{hj} + {(BC)}_{ij} + {(ABC)}_{hij} + \epsilon_{hijk}, \\ \quad h = 1,\ldots, H, \quad i = 1,\ldots, I, \quad j = 1, \ldots, J, \quad k = 1, \ldots, K.

    \scriptsize \begin{array}{lccc} \text{Source} & \text{dof} & \text{Mean square} & \text{Expected mean square} \\ B & I-1 & S^2_B & \theta_B = \sigma^2_E + K \sigma^2_{ABC} + HK \sigma^2_{BC} + JK \sigma^2_{AB} + HJK \sigma^2_B \\ C & J-1 & S^2_C & \theta_C = \sigma^2_E + K \sigma^2_{ABC} + HK \sigma^2_{BC} + IK \sigma^2_{AC} + HIK \sigma^2_C \\ A \times B & (H-1)(I-1) & S^2_{AB} & \theta_{AB}= \sigma^2_E + K \sigma^2_{ABC} + JK \sigma^2_{AB} \\ B \times C & (I-1)(J-1) & S^2_{BC} & \theta_{BC} = \sigma^2_E + K \sigma^2_{ABC} + HK \sigma^2_{BC} \\ A \times C & (H-1)(J-1) & S^2_{AC} & \theta_{AC} = \sigma^2_E + K \sigma^2_{ABC} + IK \sigma^2_{AC} \\ A \times B \times C & (H-1)(J-1)(K-1) & S^2_{ABC} & \theta_{ABC} = \sigma^2_E + K \sigma^2_{ABC} \\ E & n - HIJ & S^2_E & \theta_E = \sigma^2_E \end{array}

    In matrix form, the variance components and the expected mean squares are related by \small \begin{pmatrix} \theta_B \\ \theta_C \\ \theta_{AB} \\ \theta_{BC} \\ \theta_{AC} \\ \theta_{ABC} \\ \theta_E \end{pmatrix} = \begin{pmatrix} HJK & 0 & JK & HK & 0 & K & 1 \\ 0 & HIK & 0 & HK & IK & K & 1 \\ 0 & 0 & JK & 0 & 0 & K & 1 \\ 0 & 0 & 0 & HK & 0 & K & 1 \\ 0 & 0 & 0 & 0 & IK & K & 1 \\ 0 & 0 & 0 & 0 & 0 & K & 1 \\ 0 & 0 & 0 & 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} \sigma^2_B \\ \sigma^2_C \\ \sigma^2_{AB} \\ \sigma^2_{BC} \\ \sigma^2_{AC} \\ \sigma^2_{ABC} \\ \sigma^2_E \end{pmatrix}
    and conversely, \small \begin{pmatrix} \sigma^2_B \\ \sigma^2_C \\ \sigma^2_{AB} \\ \sigma^2_{BC} \\ \sigma^2_{AC} \\ \sigma^2_{ABC} \\ \sigma^2_E \end{pmatrix} = \begin{pmatrix} \frac{1}{HJK} & 0 & -\frac{1}{HJK} & -\frac{1}{HJK} & 0 & \frac{1}{HJK} & 0 \\ 0 & \frac{1}{HIK} & 0 & -\frac{1}{HIK} & -\frac{1}{HIK} & \frac{1}{HIK} & 0 \\ 0 & 0 & \frac{1}{JK} & 0 & 0 & -\frac{1}{JK} & 0 \\ 0 & 0 & 0 & \frac{1}{HK} & 0 & -\frac{1}{HK} & 0 \\ 0 & 0 & 0 & 0 & \frac{1}{IK} & -\frac{1}{IK} & 0 \\ 0 & 0 & 0 & 0 & 0 & \frac{1}{K} & -\frac{1}{K} \\ 0 & 0 & 0 & 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} \theta_B \\ \theta_C \\ \theta_{AB} \\ \theta_{BC} \\ \theta_{AC} \\ \theta_{ABC} \\ \theta_E \end{pmatrix}
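
    One can check numerically that the second matrix is the inverse of the first one; here is a small verification with arbitrary values of H, I, J and K (our choice):

    H <- 2; I <- 3; J <- 4; K <- 5
    M <- rbind(
      c(H*J*K, 0, J*K, H*K, 0, K, 1),
      c(0, H*I*K, 0, H*K, I*K, K, 1),
      c(0, 0, J*K, 0, 0, K, 1),
      c(0, 0, 0, H*K, 0, K, 1),
      c(0, 0, 0, 0, I*K, K, 1),
      c(0, 0, 0, 0, 0, K, 1),
      c(0, 0, 0, 0, 0, 0, 1)
    )
    solve(M)  # reproduces the second matrix (entries displayed as decimals)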

    We look for a confidence interval about the reproducibility variance of factor B: \sigma^2_{\textrm{repro}, B} = \sigma^2_B + \sigma^2_{AB} + \sigma^2_{BC} + \sigma^2_{ABC}

    which is the linear combination of the expected mean squares \small \begin{multline} \frac{1}{HJK} \theta_B + \left(\frac{1}{JK}-\frac{1}{HJK}\right)\theta_{AB} + \left(\frac{1}{HK}-\frac{1}{HJK}\right)\theta_{BC} \\ + \left(\frac{1}{HJK}-\frac{1}{JK} + \frac{1}{K}-\frac{1}{HK}\right)\theta_{ABC} - \frac{1}{K} \theta_E \\ = \frac{1}{HJK}\left(\theta_B + (H-1)\theta_{AB} + (J-1)\theta_{BC} + (1-H-J+HJ)\theta_{ABC} - HJ\theta_E \right) \quad (\ast) \end{multline}
    and is estimated by \frac{1}{HJK}\left(S^2_B + (H-1)S^2_{AB} + (J-1)S^2_{BC} + (1-H-J+HJ)S^2_{ABC} - HJS^2_E \right).
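
    As a quick numerical check of (\ast), with arbitrary values of H, J and K (our choice), the unfactored and factored coefficient vectors coincide:

    H <- 2; J <- 4; K <- 5
    a_unfactored <- 1/K * c(1/H/J, 1/J-1/H/J, 1/H-1/H/J, 1/H/J-1/J+1-1/H, -1)
    a_factored <- 1/(H*J*K) * c(1, H-1, J-1, 1-H-J+H*J, -H*J)
    all.equal(a_unfactored, a_factored)  # TRUE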

    Checking the coverage

    Let us check the frequentist coverage of the confidence interval. We first write a function to sample the mean squares:

    rMeanSquares <- function(nsims, H, I, J, K, sigma2B=1, sigma2C=1, sigma2AB=1,
                             sigma2BC=1, sigma2AC=1, sigma2ABC=1, sigma2E=1){
      VCnames <- c("B", "C", "AB", "BC", "AC", "ABC", "E")
      VC <- setNames(c(sigma2B, sigma2C, sigma2AB, sigma2BC, sigma2AC, sigma2ABC, sigma2E), VCnames)
      dofs <- setNames(c(I-1, J-1, (H-1)*(I-1), (I-1)*(J-1), (H-1)*(J-1), (H-1)*(J-1)*(K-1), H*I*J*(K-1)), VCnames)
      # expected mean squares
      M <- rbind(
        c(H*J*K, 0, J*K, H*K, 0, K, 1),
        c(0, H*I*K, 0, H*K, I*K, K, 1),
        c(0, 0, J*K, 0, 0, K, 1),
        c(0, 0, 0, H*K, 0, K, 1),
        c(0, 0, 0, 0, I*K, K, 1),
        c(0, 0, 0, 0, 0, K, 1),
        c(0, 0, 0, 0, 0, 0, 1)
      )
      thetas <- setNames(as.vector(M %*% VC), VCnames)
      # each mean square is theta * chi^2_d / d
      sims <- matrix(numeric(1), nrow=nsims, ncol=7)
      colnames(sims) <- VCnames
      for(j in VCnames){
        sims[,j] <- thetas[j]/dofs[j]*rchisq(nsims, dofs[j])
      }
      attr(sims, "VC") <- VC
      attr(sims, "dofs") <- dofs
      return(sims)
    }

    Here we simulate the mean squares, using values of the degrees of freedom that are not too small.

    # simulations
    H <- 10; I <- 15; J <- 10; K <- 5
    nsims <- 10000
    sims <- rMeanSquares(nsims, H=H, I=I, J=J, K=K)
    dofs <- attr(sims, "dofs")
    VC <- attr(sims, "VC")
    sigma2Brepro <- sum(VC[c("B", "AB", "BC", "ABC")])
    # linear combination
    a <- 1/K*c(1/H/J, 1/J-1/H/J, 1/H-1/H/J, 1/H/J-1/J+1-1/H, -1)
    # confidence intervals
    Bounds <- matrix(numeric(1), nrow=nsims, ncol=2)
    colnames(Bounds) <- c("lwr", "upr")
    for(i in 1:nsims){
      Bounds[i,] <- ciTing(sims[i, c("B", "AB", "BC", "ABC", "E")], dofs[c("B", "AB", "BC", "ABC", "E")], a=a)
    }
    # coverage of the upper one-sided interval:
    mean(Bounds[,"lwr"] < sigma2Brepro)
    ## [1] 0.9697
    # coverage of the lower one-sided interval:
    mean(Bounds[,"upr"] > sigma2Brepro)
    ## [1] 0.981
    # coverage of the two-sided interval:
    mean(Bounds[,"lwr"] < sigma2Brepro & Bounds[,"upr"] > sigma2Brepro)
    ## [1] 0.9507

    As we observe, the difference between the coverage obtained from the simulations and the nominal coverage does not exceed 1\% for each of the three confidence intervals (upper one-sided, lower one-sided and two-sided).

    An example with small degrees of freedom

    Now let us assess the frequentist coverage with H=3, I=3, J=3 and K=5.

    # simulations
    H <- 3; I <- 3; J <- 3; K <- 5
    nsims <- 10000
    set.seed(666)
    sims <- rMeanSquares(nsims, H=H, I=I, J=J, K=K)
    dofs <- attr(sims, "dofs")
    VC <- attr(sims, "VC")
    sigma2Brepro <- sum(VC[c("B", "AB", "BC", "ABC")])
    # linear combination
    a <- 1/K*c(1/H/J, 1/J-1/H/J, 1/H-1/H/J, 1/H/J-1/J+1-1/H, -1)
    # confidence intervals
    Bounds <- matrix(numeric(1), nrow=nsims, ncol=2)
    colnames(Bounds) <- c("lwr", "upr")
    for(i in 1:nsims){
      Bounds[i,] <- ciTing(sims[i, c("B", "AB", "BC", "ABC", "E")], dofs[c("B", "AB", "BC", "ABC", "E")], a=a)
    }
    # coverage of the upper one-sided interval:
    mean(Bounds[,"lwr"] < sigma2Brepro)
    ## [1] 0.9496
    # coverage of the lower one-sided interval:
    mean(Bounds[,"upr"] > sigma2Brepro)
    ## [1] 0.9995
    # coverage of the two-sided interval:
    mean(Bounds[,"lwr"] < sigma2Brepro & Bounds[,"upr"] > sigma2Brepro)
    ## [1] 0.9491

    This time, the coverage is not so close to the nominal level. The upper one-sided confidence interval [lwr, Inf) is too short, and the lower one-sided confidence interval (-Inf, upr] is too long.
    In other words, both the lower bound and the upper bound are higher than desired.
    Let’s have a look at the bounds:

    head(Bounds, 10)
    ##              lwr       upr
    ##  [1,] 2.7472674  86.87765
    ##  [2,] 3.9919586 212.14470
    ##  [3,] 0.6890338  20.96861
    ##  [4,] 2.1775242  40.13811
    ##  [5,] 1.8697206  85.46661
    ##  [6,] 1.9688034 155.66416
    ##  [7,] 1.4805632 108.91109
    ##  [8,] 1.4817979  26.15695
    ##  [9,] 1.6689184  57.09501
    ## [10,] 1.3509618  18.37216

    The upper bound is quite large (recall that \sigma^2_{\textrm{repro},B}=4 here).

    Shortening the intervals with the Satterthwaite approximation

    Recall our linear combination of the mean squares:

    \begin{align*} & a_1 S^2_B + a_2 S^2_{AB} + a_3 S^2_{BC} + a_4 S^2_{ABC} + a_5 S^2_E \\ = & a_1 x_1 + a_2 x_2 + a_3 x_3 + a_4 x_4 + a_5 x_5 \end{align*}

    with coefficients a_1,a_2,a_3,a_4>0, a_5<0, and degrees of freedom 2, 4, 4, 16 and 108.

    A degree of freedom of 2 is pretty small, and it could be the cause of the large upper bound.
    To circumvent this problem, we could try to replace a_1x_1 + a_2x_2 with its Satterthwaite approximation: \underset{y}{\underbrace{a_1 x_1 + a_2 x_2}} + a_3 x_3 + a_4 x_4 + a_5 x_5

    and then apply the Ting et al. interval to the new linear combination y + a_3 x_3 + a_4 x_4 + a_5 x_5. Let’s see what it gives for the second row of simulations:

    x <- sims[2, c("B", "AB", "BC", "ABC", "E")]
    dofs <- c(2, 4, 4, 16, 108)
    # Satterthwaite approximation of a1*x1 + a2*x2
    y <- sum(a[1:2]*x[1:2])
    nu <- y^2/sum((a[1:2]*x[1:2])^2/dofs[1:2])
    x_new <- c(y, x[3], x[4], x[5])
    dofs_new <- c(nu, dofs[3], dofs[4], dofs[5])
    a_new <- c(1, a[3], a[4], a[5])
    # original interval:
    ciTing(x, dofs, a)
    ##        lwr        upr
    ##   3.991959 212.144698
    # new interval:
    ciTing(x_new, dofs_new, a_new)
    ##       lwr       upr
    ##  3.332372 79.301873

    The upper bound is considerably smaller. Now let’s have a look at the coverage when we apply this method to the previous simulations:

    # confidence intervals
    Bounds_new <- matrix(numeric(1), nrow=nsims, ncol=2)
    colnames(Bounds_new) <- c("lwr", "upr")
    for(i in 1:nsims){
      x <- sims[i, c("B", "AB", "BC", "ABC", "E")]
      y <- sum(a[1:2]*x[1:2])
      nu <- y^2/sum((a[1:2]*x[1:2])^2/dofs[1:2])
      x_new <- c(y, x[3], x[4], x[5])
      dofs_new <- c(nu, dofs[3], dofs[4], dofs[5])
      Bounds_new[i,] <- ciTing(x_new, dofs_new, a_new)
    }
    # coverage of the upper one-sided interval:
    mean(Bounds_new[,"lwr"] < sigma2Brepro)
    ## [1] 0.9759
    # coverage of the lower one-sided interval:
    mean(Bounds_new[,"upr"] > sigma2Brepro)
    ## [1] 0.9967
    # coverage of the two-sided interval:
    mean(Bounds_new[,"lwr"] < sigma2Brepro & Bounds_new[,"upr"] > sigma2Brepro)
    ## [1] 0.9726

    This time, the upper one-sided interval achieves a coverage close to the nominal value. The lower one-sided interval still has a coverage that is too large, but the upper bounds we get are generally much smaller:

    head(Bounds)
    ##            lwr       upr
    ## [1,] 2.7472674  86.87765
    ## [2,] 3.9919586 212.14470
    ## [3,] 0.6890338  20.96861
    ## [4,] 2.1775242  40.13811
    ## [5,] 1.8697206  85.46661
    ## [6,] 1.9688034 155.66416
    head(Bounds_new)
    ##            lwr        upr
    ## [1,] 2.3231901  24.761722
    ## [2,] 3.3323718  79.301873
    ## [3,] 0.5833824   6.020251
    ## [4,] 2.0399168  17.245143
    ## [5,] 1.6179591  32.873508
    ## [6,] 1.8678017 119.880194
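
    This two-step recipe can be packaged into a small helper function; below is a minimal sketch, under the assumption that the terms to be merged all have positive coefficients (the name ciTingSatt and the argument collapse are ours):

    ciTingSatt <- function(x, dofs, a, collapse, alpha=0.05){
      # merge the terms indexed by 'collapse' (all with positive coefficients)
      # into a single term with Satterthwaite degrees of freedom
      y <- sum(a[collapse]*x[collapse])
      nu <- y^2/sum((a[collapse]*x[collapse])^2/dofs[collapse])
      ciTing(c(y, x[-collapse]), c(nu, dofs[-collapse]), c(1, a[-collapse]), alpha=alpha)
    }
    # for instance, ciTingSatt(sims[2, c("B","AB","BC","ABC","E")], c(2,4,4,16,108), a, collapse=1:2)
    # reproduces the "new interval" above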

    Note that the method we proposed here is not intended to be a general one. We simply propose that the user explore, with the help of simulations, the performance of the confidence intervals for a specific design (values of H, I, J and K) and for the expected values of the variance components. We also recall that Ting et al.’s paper provides some improvements of the confidence intervals that we did not consider here.

    References

    Graybill & Wang: Confidence intervals on nonnegative linear combinations of variances. Journal of the American Statistical Association 75 (1980), 869-873.

    Ting, Burdick, Graybill, Jeyaratnam & Lu: Confidence intervals on linear combinations of variance components that are unrestricted in sign. Journal of Statistical Computation and Simulation 35 (1990), 135-143.

    Burdick, Borror, Montgomery: Design and Analysis of Gauge R&R Studies. SIAM 2005.