Rank difference test
Performs Kornbrot's rank difference test. The rank difference test is a modified Wilcoxon signed-rank test that produces consistent and meaningful results for ordinal or monotonically transformed data.
Usage
rdt(
  data,
  formula,
  conf_level = 0,
  conf_method = "inversion",
  n_resamples = 1000L,
  alternative = "two.sided",
  mu = 0,
  distribution = "auto",
  correct = TRUE,
  zero_method = "wilcoxon",
  agg_fun = "error",
  digits_rank = Inf,
  tol_root = 1e-04
)
rdt2(
  x,
  y,
  conf_level = 0,
  conf_method = "inversion",
  n_resamples = 1000L,
  alternative = "two.sided",
  mu = 0,
  distribution = "auto",
  correct = TRUE,
  zero_method = "wilcoxon",
  digits_rank = Inf,
  tol_root = 1e-04
)
Arguments
- data
(data.frame)
The data frame of interest.
- formula
(formula)
A formula of form:
  - y ~ group | block
  Use when data is in tall format. y is the numeric outcome, group is the binary grouping variable, and block is the subject/item-level variable indicating pairs of observations. group will be converted to a factor and the first level will be the reference value. For example, when levels(data$group) <- c("pre", "post"), the focal level is 'post', so differences are post - pre. Pairs with missing values are silently dropped. See agg_fun for handling duplicate cases of grouping/blocking combinations.
  - y ~ x
  Use when data is in wide format. y and x must be numeric vectors. Differences of ranks correspond with data$y - data$x. Pairs with missing values are silently dropped.
- conf_level
(Scalar numeric: 0; [0, 1))
The confidence level. Set to 0.95 for a 95% confidence interval. If 0 (default), no confidence interval is calculated.
- conf_method
(Scalar character: c("inversion", "percentile", "bca"))
The type of confidence interval. If "inversion" (default), the bounds are computed by inverting the hypothesis test. If "percentile", the bounds are computed using a percentile bootstrap. If "bca", the bounds are computed using a bias-corrected and accelerated (BCa) bootstrap.
- n_resamples
(Scalar integer: 1000L; [10, Inf))
The number of bootstrap resamples. Only used if "percentile" or "bca" confidence intervals are chosen.
- alternative
(Scalar character: c("two.sided", "greater", "less"))
The alternative hypothesis. Must be one of "two.sided" (default), "greater", or "less".
- mu
(Scalar numeric: 0; (-Inf, Inf))
Under the null hypothesis, differences of ranks are assumed to be symmetric around mu.
- distribution
(Scalar character: c("auto", "exact", "asymptotic"))
The method used to calculate the p-value. If "auto" (default), an appropriate method will automatically be chosen (distribution = "exact" when n < 50 or distribution = "asymptotic" otherwise). If "exact", the exact Wilcoxon signed-rank distribution is used. If "asymptotic", the asymptotic normal approximation is used.
- correct
(Scalar logical: c(TRUE, FALSE))
Whether or not to apply a continuity correction to the Z-statistic for the asymptotic approximation of the p-value.
- zero_method
(Scalar character: c("wilcoxon", "pratt"))
The method used to handle differences of ranks equal to zero. If "wilcoxon" (default), zeros are removed prior to ranking (classic Wilcoxon convention). If "pratt", zeros are retained for ranking but excluded from the signed-rank sum.
- agg_fun
(Scalar character or function: "error")
Used for aggregating duplicate cases of grouping/blocking combinations when data is in tall format and formula has structure y ~ group | block. "error" (default) will return an error if duplicate grouping/blocking combinations are encountered. Select one of "first", "last", "sum", "mean", "median", "min", or "max" for built-in aggregation handling (each applies na.rm = TRUE). Or define your own function. For example, myfun <- function(x) {as.numeric(quantile(x, 0.75, na.rm = TRUE))}.
- digits_rank
(Scalar integer: Inf; (0, Inf])
Controls ranking precision. If finite, ranks are computed from base::signif(abs(diffs), digits_rank). If Inf (default), ranks are computed from abs(diffs). Smaller values may introduce ties (because they no longer depend on extremely small numeric differences) and thus change averaged ranks and tie counts.
- tol_root
(Scalar numeric: 1e-4; (0, Inf))
Passed to stats::uniroot(tol = tol_root) calls when conf_level > 0 and distribution = "asymptotic".
- x
(numeric)
Numeric vector of data. Differences of ranks correspond with x - y. Pairs with missing values are silently dropped.
- y
(numeric)
Numeric vector of data. Differences of ranks correspond with x - y. Pairs with missing values are silently dropped.
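As a rough illustration of what zero_method and digits_rank control, the signed-rank sum can be computed by hand in base R. This is a minimal sketch and not the package's internal code: the helper name signed_rank_sum and the toy vectors are made up for the example.

```r
# Hypothetical helper (not part of rankdifferencetest): sum of the ranks of
# the positive differences (W+) under the two zero-handling conventions.
signed_rank_sum <- function(d, zero_method = c("wilcoxon", "pratt")) {
  zero_method <- match.arg(zero_method)
  if (zero_method == "wilcoxon") {
    d <- d[d != 0]   # zeros are removed before ranking
  }
  r <- rank(abs(d))  # average ranks for ties; zeros are ranked under "pratt"
  sum(r[d > 0])      # zeros never contribute to the signed-rank sum
}

d <- c(0, 0, 1, -2, 3)                         # toy differences of ranks
w_wilcoxon <- signed_rank_sum(d, "wilcoxon")   # ranks 1, 2, 3
w_pratt    <- signed_rank_sum(d, "pratt")      # ranks 1.5, 1.5, 3, 4, 5

# digits_rank: rounding to significant digits before ranking can turn
# floating-point near-ties into exact ties.
a <- c(0.1 + 0.2, 0.3, 1)        # first two values differ only by rounding error
ranks_inf <- rank(a)             # analogous to digits_rank = Inf
ranks_7   <- rank(signif(a, 7))  # analogous to digits_rank = 7
```

In this toy case the two zero conventions assign different ranks to the same nonzero differences, and rounding to 7 significant digits makes the first two values of a share an averaged rank.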
Value
A list with the following elements:
| Slot | Subslot | Name | Description |
| 1 | | p_value | p-value. |
| 2 | | statistic | Test statistic. \(W^+\) for the exact Wilcoxon signed-rank distribution. \(Z\) for the asymptotic normal approximation. |
| 3 | | pseudomedian | Measure of centrality. |
| 4 | | lower | Lower bound of confidence interval for the pseudomedian. NULL if no CI requested. |
| 5 | | upper | Upper bound of confidence interval for the pseudomedian. NULL if no CI requested. |
| 6 | | method | Test method. |
| 7 | | info | Additional test information. |
| 7 | 1 | p_value_method | Method used to calculate the p-value. |
| 7 | 2 | pseudomedian_method | Method used to calculate the pseudomedian. |
| 7 | 3 | conf_method | Method used to calculate the confidence interval. |
| 7 | 4 | conf_level_achieved | Achieved confidence level. |
| 7 | 5 | n_sample | Number of observations in the original data. |
| 7 | 6 | n_analytic | Number of observations after removing missing values from the original data. |
| 7 | 7 | n_zeros | Number of zeros among differences of ranks in the analytic data set. |
| 7 | 8 | n_signed | Number of nonzero differences of ranks in the analytic data set. |
| 7 | 9 | n_ties | Number of tied ranks after ranking the absolute differences of ranks. |
| 7 | 10 | data_type | Data type. |
| 7 | 11 | focal_name | Name of the focal variable (differences are focal - reference). |
| 7 | 12 | reference_name | Name of the reference variable (differences are focal - reference). |
| 8 | | call | A named list of the function's arguments (use as.call() to convert to a call; call$distribution may be updated from "exact" to "asymptotic"). |
Details
For paired data, the Wilcoxon signed-rank test operates on differences of the paired values. However, subtraction is not meaningful for ordinal-scale variables. In addition, a nonlinear monotone transformation of the data can change the signed ranks and therefore the p-value. Ranking the original data instead allows meaningful addition and subtraction of ranks, and the rank ordering is preserved (or exactly reversed) under monotonic transformation. Kornbrot developed the rank difference test for these reasons.
Kornbrot recommends that the rank difference test be used in preference to the Wilcoxon signed-rank test in all paired comparison designs where the data are not both of interval scale and of known distribution. The rank difference test preserves good power compared to Wilcoxon's signed-rank test, is more powerful than the sign test, and has the benefit of being a true distribution-free test.
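The sensitivity of the signed-rank test to the measurement scale can be checked directly with base R. The sketch below uses made-up toy data (not from Kornbrot's paper) and stats::wilcox.test() rather than this package's functions.

```r
# Toy paired data, made up for illustration (all values positive and distinct).
x <- c(12, 30, 46, 8, 100, 60)
y <- c(10, 50, 40, 9, 70, 95)

# Wilcoxon signed-rank test on the raw scale and after a log transform.
# The reported statistic V is the sum of ranks of the positive differences.
raw    <- stats::wilcox.test(x, y, paired = TRUE)
logged <- stats::wilcox.test(log(x), log(y), paired = TRUE)
c(raw = unname(raw$statistic), logged = unname(logged$statistic))
c(raw = raw$p.value, logged = logged$p.value)

# Ranks of the pooled observations are unchanged by the same transform,
# so a test built on those ranks is unaffected.
ranks_unchanged <- identical(rank(c(x, y)), rank(log(c(x, y))))
```

For these values the statistic and p-value differ between the two scales, while the pooled ranks are identical.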
The procedure for Kornbrot's rank difference test is as follows:
Combine all \(2n\) paired observations.
Order the values from smallest to largest.
Assign ranks \(1, 2, \dots, 2n\) with average rank for ties.
Perform the Wilcoxon signed-rank test using the paired ranks.
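The four steps above can be sketched in base R as follows. This is an illustration under stated assumptions, not the implementation used by rdt(): the helper rank_difference_sketch is hypothetical, it delegates the final step to stats::wilcox.test(), and the toy data are made up.

```r
# Hypothetical helper illustrating the procedure; rdt() may differ in details
# such as zero handling, continuity correction, and confidence intervals.
rank_difference_sketch <- function(x, y, ...) {
  ok <- stats::complete.cases(x, y)   # drop pairs with missing values
  x <- x[ok]
  y <- y[ok]
  n <- length(x)
  pooled_ranks <- rank(c(x, y))       # steps 1-3: pool, order, rank (ties averaged)
  rx <- pooled_ranks[seq_len(n)]      # ranks belonging to x
  ry <- pooled_ranks[n + seq_len(n)]  # ranks belonging to y
  stats::wilcox.test(rx, ry, paired = TRUE, ...)  # step 4
}

x <- c(12, 30, 46, 8, 100, 60)
y <- c(10, 50, 40, 9, 70, 95)

# Differences of ranks contain ties here, so request the normal approximation.
res_raw <- rank_difference_sketch(x, y, exact = FALSE)
res_log <- rank_difference_sketch(log(x), log(y), exact = FALSE)
```

Because the pooled ranks are the same on both scales, res_raw and res_log carry the same statistic and p-value.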
The test statistic for the rank difference test \((D)\) is not exactly equal to the test statistic of the naive rank-transformed Wilcoxon signed-rank test \((W^+)\). However, using \(W^+\) should result in a conservative estimate for \(D\), and the two converge in distribution as the sample size increases. Kornbrot (1990) discusses methods for calculating \(D\) when \(n \leq 7\) and \(8 \leq n \leq 20\). rdt() uses \(W^+\) instead of \(D\).
See srt() for additional details about implementation of Wilcoxon's signed-rank test.
References
Kornbrot DE (1990). “The rank difference test: A new and meaningful alternative to the Wilcoxon signed ranks test for ordinal data.” British Journal of Mathematical and Statistical Psychology, 43(2), 241–264. ISSN 00071102, doi:10.1111/j.2044-8317.1990.tb00939.x.
Examples
#----------------------------------------------------------------------------
# rdt() example
#----------------------------------------------------------------------------
library(rankdifferencetest)
# Use example data from Kornbrot (1990)
data <- kornbrot_table1
# Create tall-format data for demonstration purposes
data_tall <- reshape(
  data = kornbrot_table1,
  direction = "long",
  varying = c("placebo", "drug"),
  v.names = c("time"),
  idvar = "subject",
  times = c("placebo", "drug"),
  timevar = "treatment",
  new.row.names = seq_len(prod(length(c("placebo", "drug")), nrow(kornbrot_table1)))
)
# Subject and treatment should be factors. The ordering of the treatment factor
# will determine the difference (placebo - drug).
data_tall$subject <- factor(data_tall$subject)
data_tall$treatment <- factor(data_tall$treatment, levels = c("drug", "placebo"))
# Recreate analysis and results from table 3 (page 248) in Kornbrot (1990)
## Divide p-value by 2 for one-tailed probability.
rdt(
  data = data,
  formula = placebo ~ drug,
  alternative = "two.sided",
  distribution = "asymptotic",
  zero_method = "wilcoxon",
  correct = TRUE,
  conf_level = 0.95
)
#> Error in get_dep_data(data = x, formula = formula, single_term = FALSE, agg_fun = agg_fun): unused argument (single_term = FALSE)
rdt2(
  x = data$placebo,
  y = data$drug,
  alternative = "two.sided",
  distribution = "asymptotic",
  zero_method = "wilcoxon",
  correct = TRUE,
  conf_level = 0.95
)
#>
#> asymptotic Kornbrot-Wilcoxon rank difference test
#>
#> p = 0.262
#> Z = 1.12
#> Paired differences of ranks: data$placebo - data$drug
#> Alternative hypothesis: True location shift of paired
#> differences of ranks is not equal to 0
#>
#> Pseudomedian: 1.5
#> 95% CI: -2.5, 6
# The same outcome is seen after transforming time to rate.
## Rate transformation inverts the rank ordering.
data$placebo_rate <- 60 / data$placebo
data$drug_rate <- 60 / data$drug
data_tall$rate <- 60 / data_tall$time
rdt(
  data = data_tall,
  formula = rate ~ treatment | subject,
  alternative = "two.sided",
  distribution = "asymptotic",
  zero_method = "wilcoxon",
  correct = TRUE,
  conf_level = 0.95
)
#> Error in get_dep_data(data = x, formula = formula, single_term = FALSE, agg_fun = agg_fun): unused argument (single_term = FALSE)
# In contrast to the rank difference test, the Wilcoxon signed-rank test
# produces differing results. See table 1 and table 2 (page 245) in
# Kornbrot (1990).
## Divide p-value by 2 for one-tailed probability.
srt(
  data = data,
  formula = placebo ~ drug,
  alternative = "two.sided",
  distribution = "asymptotic",
  zero_method = "wilcoxon",
  correct = TRUE,
  conf_level = 0.95
)
#> Error in get_dep_data(data = x, formula = formula, single_term = TRUE, agg_fun = agg_fun): unused argument (single_term = TRUE)
srt(
  data = data_tall,
  formula = rate ~ treatment | subject,
  alternative = "two.sided",
  distribution = "asymptotic",
  zero_method = "wilcoxon",
  correct = TRUE,
  conf_level = 0.95
)
#> Error in get_dep_data(data = x, formula = formula, single_term = TRUE, agg_fun = agg_fun): unused argument (single_term = TRUE)