Test statistic distribution under the null
Constructs a list which defines the test statistic reference distribution under the null hypothesis.
Arguments
- method
(Scalar string: "approximate")
The method used to derive the distribution of the test statistic under the null hypothesis. Must be one of "approximate" (default) or "exact". See 'Details' for additional information.
- nsims
(Scalar integer: 1000L; [2, Inf))
The number of resamples for method = "approximate". Not used for method = "exact", except when the number of exact resamples exceeds approximately 1e6, in which case method = "approximate" is used as a fallback. In the power() context, nsims defines the number of simulated datasets under the null hypothesis. In this case you would typically set nsims greater than or equal to the number of simulated datasets in the design row of the power analysis. See 'Details' for additional information.
- ncores
(Scalar integer: 1L; [1, Inf))
The number of cores (number of worker processes) to use. Do not set greater than the value returned by parallel::detectCores().
- ...
Optional arguments for internal use.
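As a small illustration of the ncores guidance above, the following sketch caps a requested worker count at the number of detected cores. The variable names (requested, available) are hypothetical and not part of the package; only parallel::detectCores() is taken from the text.

```r
# Hypothetical sketch: keep the requested worker count within the detected cores.
requested <- 4L
available <- parallel::detectCores()

# detectCores() can return NA on some platforms; fall back to one core.
if (is.na(available)) {
  available <- 1L
}

# Never exceed the detected core count and never go below one worker.
ncores <- max(1L, min(requested, available))
ncores
```

The resulting value could then be supplied as the ncores argument.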
Details
The default asymptotic test is performed for distribution = asymptotic().
When setting argument distribution = simulated(method = "exact"), the
exact randomization test is defined by:
Independent two-sample tests
1. Calculate the observed test statistic.
2. Check if length(combn(x=n1+n2, m=n1)) < 1e6.
   - If TRUE, continue with the exact randomization test.
   - If FALSE, revert to the approximate randomization test.
3. For all combn(x=n1+n2, m=n1) permutations:
   - Assign corresponding group labels.
   - Calculate the test statistic.
4. Calculate the exact randomization test p-value as the mean of the logical vector resampled_test_stats >= observed_test_stat.
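The steps for independent samples can be sketched in base R as follows. This is a minimal illustration, not the package's implementation: the helper exact_independent() and its max_resamples argument are hypothetical, the absolute difference in means stands in for the package's test statistic, and choose() is used here to count the relabelings without enumerating them.

```r
# Hypothetical helper: exact randomization test for two independent samples.
# The absolute difference in means is a stand-in for the real test statistic.
exact_independent <- function(x1, x2, max_resamples = 1e6) {
  n1 <- length(x1)
  n2 <- length(x2)
  pooled <- c(x1, x2)

  # Statistic for one labeling: `idx` holds the positions assigned to group 1.
  stat <- function(idx) abs(mean(pooled[idx]) - mean(pooled[-idx]))

  # Step 1: observed statistic (group 1 occupies the first n1 positions).
  observed <- stat(seq_len(n1))

  # Step 2: only enumerate when the number of relabelings is manageable.
  if (choose(n1 + n2, n1) >= max_resamples) {
    stop("Too many relabelings; use an approximate randomization test.")
  }

  # Step 3: combn() visits every assignment of n1 observations to group 1.
  resampled <- combn(n1 + n2, n1, FUN = stat)

  # Step 4: proportion of relabelings at least as extreme as observed.
  mean(resampled >= observed)
}

exact_independent(x1 = c(1, 2, 3), x2 = c(7, 8, 9))
#> [1] 0.1
```

With three observations per group there are 20 relabelings, and only the observed labeling and its mirror image reach the observed statistic, giving 2/20.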
Dependent two-sample tests
1. Calculate the observed test statistic.
2. Check if npairs < 21 (maximum 2^20 resamples).
   - If TRUE, continue with the exact randomization test.
   - If FALSE, revert to the approximate randomization test.
3. For all 2^npairs permutations:
   - Assign corresponding pair labels.
   - Calculate the test statistic.
4. Calculate the exact randomization test p-value as the mean of the logical vector resampled_test_stats >= observed_test_stat.
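For paired data, swapping the labels within a pair is equivalent to flipping the sign of that pair's difference, so the 2^npairs relabelings can be enumerated as sign patterns. The sketch below assumes this sign-flip formulation and uses the absolute mean difference as a stand-in statistic; exact_paired() and max_pairs are hypothetical names, not package functions.

```r
# Hypothetical helper: exact randomization test for paired samples.
# The absolute mean of the paired differences is a stand-in test statistic.
exact_paired <- function(x1, x2, max_pairs = 20L) {
  d <- x1 - x2
  npairs <- length(d)

  # Step 1: observed statistic.
  observed <- abs(mean(d))

  # Step 2: enumerate only when there are at most 2^20 sign patterns.
  if (npairs > max_pairs) {
    stop("Too many pairs; use an approximate randomization test.")
  }

  # Step 3: each row of `signs` is one of the 2^npairs pair relabelings.
  signs <- as.matrix(expand.grid(rep(list(c(-1, 1)), npairs)))
  resampled <- abs(as.vector(signs %*% d) / npairs)

  # Step 4: proportion of relabelings at least as extreme as observed.
  mean(resampled >= observed)
}

exact_paired(x1 = c(5, 6, 7, 8), x2 = c(1, 2, 3, 4))
#> [1] 0.125
```

Here all four differences equal 4, so only the all-positive and all-negative sign patterns match the observed statistic, giving 2/16.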
For argument distribution = simulated(method = "approximate"), the
approximate randomization test is defined by:
Independent two-sample tests
1. Calculate the observed test statistic.
2. For nsims iterations:
   - Randomly assign group labels.
   - Calculate the test statistic.
3. Insert the observed test statistic into the vector of resampled test statistics.
4. Calculate the approximate randomization test p-value as the mean of the logical vector resampled_test_stats >= observed_test_stat.
Dependent two-sample tests
1. Calculate the observed test statistic.
2. For nsims iterations:
   - Randomly assign pair labels.
   - Calculate the test statistic.
3. Insert the observed test statistic into the vector of resampled test statistics.
4. Calculate the approximate randomization test p-value as the mean of the logical vector resampled_test_stats >= observed_test_stat.
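A minimal base R sketch of the approximate procedure for independent samples is shown below. The helper approx_independent() is hypothetical and the absolute difference in means again stands in for the package's test statistic. The paired analogue would, under the same assumptions, draw random sign flips of the paired differences instead of random group labels.

```r
# Hypothetical helper: approximate randomization test, independent samples.
approx_independent <- function(x1, x2, nsims = 1000L) {
  n1 <- length(x1)
  pooled <- c(x1, x2)

  # Statistic for one labeling: `idx` holds the positions assigned to group 1.
  stat <- function(idx) abs(mean(pooled[idx]) - mean(pooled[-idx]))

  # Step 1: observed statistic.
  observed <- stat(seq_len(n1))

  # Step 2: randomly reassign group labels nsims times.
  resampled <- replicate(nsims, stat(sample(length(pooled), n1)))

  # Step 3: insert the observed statistic so the p-value can never be zero.
  resampled <- c(observed, resampled)

  # Step 4: proportion of statistics at least as extreme as observed.
  mean(resampled >= observed)
}

set.seed(1)
p <- approx_independent(
  x1 = c(1, 2, 3, 4, 5),
  x2 = c(11, 12, 13, 14, 15),
  nsims = 199L
)
p
```

Because the observed statistic is included in the vector, the smallest value this function can return with nsims = 199L is 1/200.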
In the power analysis setting, power(), we can simulate data for
groups 1 and 2 using their known distributions under the assumptions of the
null hypothesis. Unlike above where nonparametric randomization tests
are performed, in this setting approximate parametric tests are performed.
For example, power(wald_test_nb(distribution = simulated())) would result
in an approximate parametric Wald test defined by:
1. For each relevant design row in data:
   1. For simulated(nsims=integer()) iterations:
      - Simulate new data for group 1 and group 2 under the null hypothesis.
      - Calculate the Wald test statistic, \(\chi^2_{null}\).
   2. Collect all \(\chi^2_{null}\) into a vector.
   3. For each of the sim_nb(nsims=integer()) simulated datasets:
      - Calculate the Wald test statistic, \(\chi^2_{obs}\).
      - Calculate the p-value based on the empirical null distribution of test statistics, \(\chi^2_{null}\) (the mean of the logical vector null_test_stats >= observed_test_stat).
   4. Collect all p-values into a vector.
   5. Calculate power as sum(p <= alpha) / nsims.
2. Return all results from power().
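The flow for a single design row can be sketched with base R alone. Everything below is an illustrative assumption rather than the package's code: toy_stat() is a hypothetical stand-in for the Wald statistic, stats::rnbinom() with its own size/mu parameterization replaces sim_nb(), the null datasets are assumed to share group 1's mean, and nsims_null and nsims_data play the roles of simulated(nsims = ...) and sim_nb(nsims = ...).

```r
set.seed(2024)

# Hypothetical design row (base R parameterization of the negative binomial).
n1 <- 30
n2 <- 30
mean1 <- 10
ratio <- 2
size <- 2
alpha <- 0.05
nsims_null <- 500L  # number of datasets simulated under the null
nsims_data <- 200L  # number of datasets simulated under the alternative

# Stand-in statistic: squared log ratio of sample means (offset avoids log(0)).
toy_stat <- function(y1, y2) {
  (log(mean(y2) + 0.5) - log(mean(y1) + 0.5))^2
}

# Empirical null distribution: both groups simulated with ratio = 1.
null_test_stats <- replicate(
  nsims_null,
  toy_stat(
    rnbinom(n1, size = size, mu = mean1),
    rnbinom(n2, size = size, mu = mean1)
  )
)

# Statistics for the datasets simulated under the design's alternative.
observed_test_stats <- replicate(
  nsims_data,
  toy_stat(
    rnbinom(n1, size = size, mu = mean1),
    rnbinom(n2, size = size, mu = mean1 * ratio)
  )
)

# One p-value per dataset, read off the empirical null distribution.
p <- vapply(
  observed_test_stats,
  function(s) mean(null_test_stats >= s),
  numeric(1)
)

# Power is the proportion of datasets rejected at level alpha.
power_estimate <- sum(p <= alpha) / nsims_data
power_estimate
```

Note that the resolution of each p-value is limited by nsims_null, which is one reason to keep the number of null datasets at least as large as the number of simulated datasets.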
Randomization tests use the positively biased p-value estimate in the style of Davison and Hinkley (1997) (see also Phipson and Smyth (2010)):
$$ \hat{p} = \frac{1 + \sum_{i=1}^B \mathbb{I} \{\chi^2_i \geq \chi^2_{obs}\}}{B + 1}. $$
The number of resamples, \(B\) (set by nsims), defines the minimum observable p-value, \(1/(B + 1)\)
(e.g. nsims=1000L results in min(p-value)=1/1001).
It's recommended to set \(\text{nsims} \gg \frac{1}{\alpha}\).
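The estimator and its lower bound can be checked numerically with a short sketch. The function p_hat() is a hypothetical helper written for illustration, not a package function.

```r
# Hypothetical helper implementing (1 + sum(I{stat_i >= stat_obs})) / (B + 1).
p_hat <- function(resampled_test_stats, observed_test_stat) {
  B <- length(resampled_test_stats)
  (1 + sum(resampled_test_stats >= observed_test_stat)) / (B + 1)
}

# With B = 1000 resamples and no resampled statistic reaching the observed
# value, the estimate equals its minimum of 1/1001.
min_p <- p_hat(resampled_test_stats = rep(0, 1000), observed_test_stat = 5)
min_p

# The same value results from inserting the observed statistic into the
# vector of resampled statistics and taking the mean of the logical vector.
same_p <- mean(c(5, rep(0, 1000)) >= 5)
```

For alpha = 0.05, for example, 1/alpha is 20, so a default of 1000 resamples leaves the minimum observable p-value well below alpha.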
References
Davison AC, Hinkley DV (1997). Bootstrap Methods and their Application, 1st edition. Cambridge University Press. ISBN 9780521574716, doi:10.1017/CBO9780511802843.
Phipson B, Smyth GK (2010). “Permutation P-values Should Never Be Zero: Calculating Exact P-values When Permutations Are Randomly Drawn.” Statistical Applications in Genetics and Molecular Biology, 9(1). ISSN 1544-6115, doi:10.48550/arXiv.1603.05766.
Examples
#----------------------------------------------------------------------------
# asymptotic() examples
#----------------------------------------------------------------------------
library(depower)
set.seed(1234)
data <- sim_nb(
  n1 = 60,
  n2 = 40,
  mean1 = 10,
  ratio = 1.5,
  dispersion1 = 2,
  dispersion2 = 8
)
data |>
  wald_test_nb(distribution = asymptotic())
#> $chisq
#> [1] 11.35158
#>
#> $df
#> [1] 1
#>
#> $p
#> [1] 0.0007538376
#>
#> $ratio
#> $ratio$estimate
#> [1] 1.542934
#>
#> $ratio$lower
#> [1] NA
#>
#> $ratio$upper
#> [1] NA
#>
#>
#> $mean1
#> [1] 9.316667
#>
#> $mean2
#> [1] 14.375
#>
#> $dispersion1
#> [1] 1.545421
#>
#> $dispersion2
#> [1] 11.08002
#>
#> $n1
#> [1] 60
#>
#> $n2
#> [1] 40
#>
#> $method
#> [1] "Asymptotic Wald test for independent negative binomial ratio of means"
#>
#> $ci_level
#> NULL
#>
#> $equal_dispersion
#> [1] FALSE
#>
#> $link
#> [1] "log"
#>
#> $ratio_null
#> [1] 1
#>
#> $mle_code
#> [1] 0
#>
#> $mle_message
#> [1] "relative convergence (4)"
#>
#----------------------------------------------------------------------------
# simulated() examples
#----------------------------------------------------------------------------
data |>
  wald_test_nb(distribution = simulated(nsims = 200L))
#> $chisq
#> [1] 11.35158
#>
#> $df
#> [1] 1
#>
#> $p
#> [1] 0.00990099
#>
#> $ratio
#> $ratio$estimate
#> [1] 1.542934
#>
#> $ratio$lower
#> [1] NA
#>
#> $ratio$upper
#> [1] NA
#>
#>
#> $mean1
#> [1] 9.316667
#>
#> $mean2
#> [1] 14.375
#>
#> $dispersion1
#> [1] 1.545421
#>
#> $dispersion2
#> [1] 11.08002
#>
#> $n1
#> [1] 60
#>
#> $n2
#> [1] 40
#>
#> $method
#> [1] "Approximate randomization Wald test for independent negative binomial ratio of means"
#>
#> $ci_level
#> NULL
#>
#> $equal_dispersion
#> [1] FALSE
#>
#> $link
#> [1] "log"
#>
#> $ratio_null
#> [1] 1
#>
#> $mle_code
#> [1] 0
#>
#> $mle_message
#> [1] "relative convergence (4)"
#>