Evaluate confidence intervals for power estimates

Calculates the confidence interval for a power estimate from a simulation study. The confidence interval quantifies uncertainty about the true power parameter.

When too few simulations are used to estimate a test's power, the power estimate is highly uncertain (wide confidence/prediction intervals). When too many are used, computation time may be prohibitive. This function helps you choose the number of simulated datasets needed to reach your desired precision for power before spending computation time on simulations.
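As a rough starting point, the number of simulations needed for a target interval width can be approximated in closed form. The sketch below uses the simpler Wald (normal approximation) width $2z\sqrt{\pi(1-\pi)/n}$, which is not the Wilson or exact method this function implements. `approx_nsims()` is a hypothetical helper written for illustration and is not part of depower.

```r
# Hypothetical helper (not part of depower): approximate number of simulations
# so that a Wald-type interval has total width of about `width`.
approx_nsims <- function(power, width, ci_level = 0.95) {
  z <- qnorm(1 - (1 - ci_level) / 2)  # z_{1 - alpha/2}
  # Solve 2 * z * sqrt(power * (1 - power) / n) = width for n, then round up
  ceiling(4 * z^2 * power * (1 - power) / width^2)
}

# Target: 95% interval about 0.04 wide (roughly +/- 0.02) around 80% power
n_needed <- approx_nsims(power = 0.80, width = 0.04)
n_needed
#> [1] 1537
```

The result is only an approximation. It can be checked by passing it as nsims to eval_power_ci() with the same power value.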

Usage

eval_power_ci(power, nsims, ci_level = 0.95, method = c("wilson", "exact"))

Arguments

power: (numeric: (0, 1))
Hypothetical observed power value(s).
nsims: (integer: [2, Inf))
Number of simulations.
ci_level: (Scalar numeric: 0.95; (0,1))
The confidence level.
method: (Scalar character: "wilson"; c("wilson", "exact"))
Method for computing confidence intervals. One of "wilson" (default) or "exact". See 'Details' for more information.

Value

A list with elements:

Name	Description
`lower`	Lower bound of confidence interval.
`upper`	Upper bound of confidence interval.

Details

Power estimation via simulation is a binomial proportion problem. The confidence interval answers: "What is the plausible range of true power values given my simulation results?"
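To make the binomial framing concrete, the sketch below simulates a hypothetical scenario that is not taken from this package: a one-sample t-test with 20 observations, true mean 0.5, and standard deviation 1. Each simulated dataset yields one reject or fail-to-reject outcome, so the rejection count is a binomial count. The interval comes from base R's prop.test() with correct = FALSE, which computes a Wilson score interval.

```r
set.seed(20240501)  # arbitrary seed, for reproducibility only

nsims <- 500   # number of simulated datasets
alpha <- 0.05  # significance level of each test

# One simulated dataset -> one test -> one TRUE/FALSE rejection indicator
reject <- replicate(nsims, t.test(rnorm(20, mean = 0.5))$p.value <= alpha)

x <- sum(reject)        # number of rejections (binomial count)
power_hat <- x / nsims  # observed power (binomial proportion)

# Wilson score interval for the true power, via base R
ci <- prop.test(x, nsims, correct = FALSE)$conf.int
c(power_hat = power_hat, lower = ci[1], upper = ci[2])
```

Only the pair (x, nsims) enters the interval, which is why eval_power_ci() needs just a power value and a simulation count.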

Let $\pi$ denote the hypothetical true power value, $\hat{\pi} = x/n$ denote the hypothetical observed power value, $n$ denote the number of simulations, and $x = \text{round}(\hat{\pi} \cdot n)$ denote the number of rejections. Two methods are available.

Wilson Score Interval

The Wilson score interval is derived from inverting the score test. Starting with the inequality

$$ \left| \frac{\hat{\pi}-\pi}{\sqrt{\pi(1-\pi)/n}} \right| \le z_{1-\alpha/2}, $$

and solving the resulting quadratic for $\pi$ yields

$$ \frac{\hat{\pi}+\frac{z^2}{2n} \pm z \sqrt{\frac{\hat{\pi}(1-\hat{\pi})}{n}+\frac{z^2}{4n^2}}}{1+\frac{z^2}{n}}, $$

with $z = z_{1-\alpha/2}$ and $\hat{\pi} = x/n$.
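The closed-form expression above translates directly into a few lines of base R. This is a minimal sketch for illustration; `wilson_ci()` is a hypothetical helper and not the package's internal implementation.

```r
# Hypothetical helper (not part of depower): Wilson score interval computed
# from the closed-form expression.
wilson_ci <- function(power, nsims, ci_level = 0.95) {
  z <- qnorm(1 - (1 - ci_level) / 2)   # z_{1 - alpha/2}
  x <- round(power * nsims)            # number of rejections
  p_hat <- x / nsims                   # observed power
  center <- p_hat + z^2 / (2 * nsims)  # numerator, first two terms
  half <- z * sqrt(p_hat * (1 - p_hat) / nsims + z^2 / (4 * nsims^2))
  denom <- 1 + z^2 / nsims
  list(lower = (center - half) / denom, upper = (center + half) / denom)
}

ci <- wilson_ci(power = 0.80, nsims = 1000)
ci
```

For power = 0.80 and nsims = 1000 the bounds should agree with the values reported for the default method in the Examples section (0.774081 and 0.8236229). Because every operation is vectorized, the sketch also accepts vectors for power or nsims.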

Clopper-Pearson Interval

The Clopper-Pearson exact interval inverts the binomial test via Beta quantiles. The bounds $(\pi_L, \pi_U)$ satisfy:

$$P(X \geq x \mid \pi = \pi_L) = \alpha/2$$
$$P(X \leq x \mid \pi = \pi_U) = \alpha/2$$

With $x$ successes in $n$ trials,

$$\pi_L = B^{-1}\left(\frac{\alpha}{2}; x, n-x+1\right)$$
$$\pi_U = B^{-1}\left(1-\frac{\alpha}{2}; x+1, n-x\right)$$

where $B^{-1}(q; a, b)$ is the $q$-th quantile of $\text{Beta}(a, b)$.

This method guarantees at least nominal coverage but is conservative: actual coverage usually exceeds the nominal level, so intervals are wider than strictly needed to attain it.
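The Beta-quantile form of the bounds maps onto base R's qbeta(). The sketch below is illustrative only: `clopper_pearson_ci()` is a hypothetical helper, not the package's implementation, and it assumes 0 < x < n so that both Beta shape parameters are positive.

```r
# Hypothetical helper (not part of depower): Clopper-Pearson interval from
# Beta quantiles. Assumes 0 < x < nsims.
clopper_pearson_ci <- function(power, nsims, ci_level = 0.95) {
  alpha <- 1 - ci_level
  x <- round(power * nsims)  # number of rejections
  list(
    lower = qbeta(alpha / 2, x, nsims - x + 1),
    upper = qbeta(1 - alpha / 2, x + 1, nsims - x)
  )
}

ci <- clopper_pearson_ci(power = 0.80, nsims = 1000)

# Cross-check against base R's exact binomial test, which reports the same
# Clopper-Pearson interval
ref <- binom.test(x = 800, n = 1000)$conf.int
all.equal(c(ci$lower, ci$upper), as.numeric(ref))
```

The cross-check with binom.test() is a quick way to confirm the shape parameters are in the right order, since swapping them is an easy mistake to make.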

Approximate parametric tests

When power is computed using approximate parametric tests (see simulated()), the power estimate and confidence/prediction intervals apply to the Monte Carlo test power $\mu_K = P(\hat{p} \leq \alpha)$ rather than the exact test power $\pi = P(p \leq \alpha)$. These quantities converge as $K$, the number of datasets simulated under the null hypothesis, increases. The minimum observable p-value is $1/(K+1)$, so a rejection is possible only if $1/(K+1) \leq \alpha$, that is, $K \geq 1/\alpha - 1$. For practical accuracy, we recommend $K \geq 5000$ in most scenarios, and in any case $K \gg 1/\alpha - 1$. For example, if $\alpha = 0.05$, use simulated(nsims = 5000).
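The effect of $K$ on the smallest attainable p-value can be tabulated directly. The sketch below assumes the common Monte Carlo p-value estimator $(1 + \#\{T_k \geq T_{obs}\})/(K+1)$, which is consistent with the stated minimum of $1/(K+1)$ but is an assumption about the implementation. `mc_p_value()` is a hypothetical helper and not part of depower.

```r
# Hypothetical helper (assumed estimator, not taken from depower):
# Monte Carlo p-value from an observed statistic and K null statistics.
mc_p_value <- function(t_obs, t_null) {
  (1 + sum(t_null >= t_obs)) / (length(t_null) + 1)
}

# Most extreme case: the observed statistic exceeds all K = 10 null statistics
p_min_10 <- mc_p_value(t_obs = 10, t_null = rep(0, 10))  # equals 1 / 11

# Smallest attainable p-value for several choices of K
alpha <- 0.05
K <- c(10, 100, 1000, 5000)
resolution <- data.frame(
  K = K,
  min_p = 1 / (K + 1),
  can_reject = 1 / (K + 1) <= alpha
)
resolution
```

With K = 10 no rejection is possible at $\alpha = 0.05$, so the estimated power would be exactly zero regardless of the true effect. Larger K also makes the grid of attainable p-values finer around $\alpha$.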

References

Newcombe RG (1998). “Two-sided confidence intervals for the single proportion: comparison of seven methods.” Statistics in Medicine, 17(8), 857–872. ISSN 0277-6715, 1097-0258, doi:10.1002/(SICI)1097-0258(19980430)17:8<857::AID-SIM777>3.0.CO;2-E.

Wilson EB (1927). “Probable Inference, the Law of Succession, and Statistical Inference.” Journal of the American Statistical Association, 22(158), 209–212. ISSN 0162-1459, 1537-274X, doi:10.1080/01621459.1927.10502953.

Clopper CJ, Pearson ES (1934). “The Use of Confidence or Fiducial Limits Illustrated in the Case of the Binomial.” Biometrika, 26(4), 404–413. ISSN 0006-3444, 1464-3510, doi:10.1093/biomet/26.4.404.

Examples

#----------------------------------------------------------------------------
# eval_power_ci() examples
#----------------------------------------------------------------------------
library(depower)

# Expected CI for 80% power with 1000 simulations
eval_power_ci(power = 0.80, nsims = 1000)
#> $lower
#> [1] 0.774081
#> 
#> $upper
#> [1] 0.8236229
#> 

# Compare precision across different simulation counts
eval_power_ci(power = 0.80, nsims = c(100, 500, 1000, 5000))
#> $lower
#> [1] 0.7111708 0.7627109 0.7740810 0.7886843
#> 
#> $upper
#> [1] 0.8666331 0.8327145 0.8236229 0.8108551
#> 

# Compare Wilson vs exact method
eval_power_ci(power = 0.80, nsims = 1000, method = "wilson")
#> $lower
#> [1] 0.774081
#> 
#> $upper
#> [1] 0.8236229
#> 
eval_power_ci(power = 0.80, nsims = 1000, method = "exact")
#> $lower
#> [1] 0.7738406
#> 
#> $upper
#> [1] 0.8243794
#> 

# Vectorized over power values
eval_power_ci(power = c(0.70, 0.80, 0.90), nsims = 1000)
#> $lower
#> [1] 0.6708761 0.7740810 0.8798480
#> 
#> $upper
#> [1] 0.7275932 0.8236229 0.9170906
#> 

# 99% confidence interval
eval_power_ci(power = 0.80, nsims = 1000, ci_level = 0.99)
#> $lower
#> [1] 0.7654881
#> 
#> $upper
#> [1] 0.8305572
#>