Rank difference pseudomedian
Computes the Hodges-Lehmann pseudomedian and bootstrap confidence interval for the differences of ranks.
Usage
rdpmedian(
  data,
  formula,
  conf_level = 0.95,
  conf_method = "percentile",
  n_resamples = 1000L,
  agg_fun = "error"
)

rdpmedian2(
  x,
  y,
  conf_level = 0.95,
  conf_method = "percentile",
  n_resamples = 1000L
)
Arguments
- data
(data.frame)
The data frame of interest.
- formula
(formula)
A formula of form:
  - y ~ group | block
  Use when data is in tall format. y is the numeric outcome, group is the binary grouping variable, and block is the subject/item-level variable indicating pairs of observations. group will be converted to a factor and the first level will be the reference value. For example, when levels(data$group) <- c("pre", "post"), the focal level is "post", so differences are post - pre. Pairs with missing values are silently dropped. See agg_fun for handling duplicate cases of grouping/blocking combinations.
  - y ~ x
  Use when data is in wide format. y and x must be numeric vectors. Differences of ranks correspond with data$y - data$x. Pairs with missing values are silently dropped.
- conf_level
(Scalar numeric: 0.95; [0, 1))
The confidence level. If conf_level = 0, no confidence interval is calculated.
- conf_method
(Scalar character: c("percentile", "bca"))
The type of bootstrap confidence interval.
- n_resamples
(Scalar integer: 1000L; [10L, Inf))
The number of bootstrap resamples. If conf_level = 0, no resampling is performed.
- agg_fun
(Scalar character or function: "error")
Used for aggregating duplicate cases of grouping/blocking combinations when data is in tall format and formula has structure y ~ group | block. "error" (default) will return an error if duplicate grouping/blocking combinations are encountered. Select one of "first", "last", "sum", "mean", "median", "min", or "max" for built in aggregation handling (each applies na.rm = TRUE). Or define your own function. For example, myfun <- function(x) {as.numeric(quantile(x, 0.75, na.rm = TRUE))}.
- x
(numeric)
Numeric vector of data. Differences of ranks correspond with x - y. Pairs with missing values are silently dropped.
- y
(numeric)
Numeric vector of data. Differences of ranks correspond with x - y. Pairs with missing values are silently dropped.
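To make the agg_fun argument more concrete, the following base R sketch shows what collapsing duplicate grouping/blocking combinations could look like. It is an illustration only, not the package's internal code: the data frame dat is made up, and tapply() is used here as a stand-in for whatever aggregation the package performs. The differences shown are on the raw aggregated values, before any ranking.

```r
# Hypothetical tall data: subject 1 has two "pre" measurements (a duplicate cell).
dat <- data.frame(
  y     = c(10, 14, 20, 30, 40),
  group = factor(c("pre", "pre", "post", "pre", "post"), levels = c("pre", "post")),
  block = factor(c(1, 1, 1, 2, 2))
)

# Custom aggregator from the argument description: the 75th percentile.
myfun <- function(x) as.numeric(quantile(x, 0.75, na.rm = TRUE))

# Collapse to one value per block/group cell (assumed stand-in for agg_fun handling).
cells <- tapply(dat$y, list(dat$block, dat$group), myfun)
cells

# The first factor level ("pre") is the reference, so differences are post - pre.
cells[, "post"] - cells[, "pre"]
```

With agg_fun = "error", data like dat would instead be rejected because subject 1 has two "pre" rows.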
Value
A list with the following elements:
| Slot | Subslot | Name | Description |
| 1 | | pseudomedian | Measure of centrality. |
| 2 | | lower | Lower bound of confidence interval for the pseudomedian. |
| 3 | | upper | Upper bound of confidence interval for the pseudomedian. |
| 4 | | method | Estimate method. |
| 5 | | info | Additional information. |
| 5 | 1 | n_sample | Number of observations in the original data. |
| 5 | 2 | n_analytic | Number of observations after removing missing values from the original data. |
| 5 | 3 | data_type | Data type. |
| 5 | 4 | focal_name | Name of the focal variable (differences are focal - reference). |
| 5 | 5 | reference_name | Name of the reference variable (differences are focal - reference). |
| 6 | | call | A named list of the function's arguments (use as.call() to convert to a call). |
Details
This function generates a confidence interval for the pseudomedian based on the observed differences of ranks, not based on an inversion of the rank difference test rdt().
The Hodges-Lehmann estimator is the median of all pairwise averages of the sample values, including each value averaged with itself:
$$\mathrm{HL} = \mathrm{median} \left\{ \frac{x_i + x_j}{2} \right\}_{i \le j}$$
This pseudomedian is a robust, distribution-free estimate of central tendency for a single sample, or a location-shift estimator for paired data. It is resistant to outliers and compatible with rank-based inference.
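The definition above can be sketched directly in base R. This is a minimal illustration of the formula, not the implementation used by the package; the helper name hl_pseudomedian is hypothetical, and the O(n^2) memory use of outer() makes it suitable only for small samples.

```r
# Hodges-Lehmann pseudomedian: median of all pairwise averages
# (x_i + x_j) / 2 with i <= j.
# hl_pseudomedian is a hypothetical helper name, not part of any package.
hl_pseudomedian <- function(x) {
  x <- x[!is.na(x)]
  # Full n x n matrix of pairwise sums.
  sums <- outer(x, x, "+")
  # Keep the upper triangle including the diagonal, i.e. pairs with i <= j.
  walsh <- sums[upper.tri(sums, diag = TRUE)] / 2
  median(walsh)
}

# Pairwise averages of (1, 2, 10) are 1, 1.5, 2, 5.5, 6, 10; their median is 3.75.
hl_pseudomedian(c(1, 2, 10))
```

The outlier 10 pulls the sample mean to about 4.33, while the pseudomedian stays at 3.75.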
The percentile and BCa bootstrap confidence interval methods are described in chapter 5.3 of Davison and Hinkley (1997).
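As a rough illustration of the percentile method, the base R sketch below bootstraps the pseudomedian of rank differences. Several things here are assumptions rather than documented behavior: the data are made up, the ranks are taken jointly over all 2n observations before differencing, and the resampling is done on the rank differences themselves. The package may rank or resample differently, and the BCa adjustment is not shown.

```r
set.seed(42)  # reproducible resamples

# Made-up paired data, for illustration only.
x <- c(5.1, 3.8, 7.2, 6.0, 4.4, 8.3, 5.9, 6.7)
y <- c(4.9, 4.6, 6.1, 6.8, 3.2, 7.5, 6.3, 5.0)
n <- length(x)

# Assumption: rank all 2n values jointly, then difference the paired ranks.
r <- rank(c(x, y))
d <- r[seq_len(n)] - r[n + seq_len(n)]

# Median of all pairwise averages (d_i + d_j) / 2 with i <= j.
hl <- function(d) {
  s <- outer(d, d, "+")
  median(s[upper.tri(s, diag = TRUE)] / 2)
}

# Percentile bootstrap: resample the differences with replacement,
# re-estimate each time, and take empirical quantiles of the estimates.
conf_level <- 0.95
n_resamples <- 1000L
boot <- replicate(n_resamples, hl(sample(d, size = n, replace = TRUE)))
alpha <- 1 - conf_level
ci <- quantile(boot, probs = c(alpha / 2, 1 - alpha / 2), names = FALSE)

c(pseudomedian = hl(d), lower = ci[1], upper = ci[2])
```

Because the interval comes from resampling rather than from inverting a test, its endpoints will vary slightly between runs unless the seed is fixed.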
This function is mainly a wrapper for the function Hmisc::pMedian().
References
Davison AC, Hinkley DV (1997). Bootstrap Methods and their Application, 1st edition. Cambridge University Press. ISBN 9780511802843, doi:10.1017/CBO9780511802843.
Harrell Jr FE (2025). Hmisc: Harrell Miscellaneous. R package version 5.2-4, https://hbiostat.org/R/Hmisc/.
Examples
#----------------------------------------------------------------------------
# rdpmedian() example
#----------------------------------------------------------------------------
library(rankdifferencetest)
# Use example data from Kornbrot (1990)
data <- kornbrot_table1
# Create tall-format data for demonstration purposes
data_tall <- reshape(
  data = kornbrot_table1,
  direction = "long",
  varying = c("placebo", "drug"),
  v.names = c("time"),
  idvar = "subject",
  times = c("placebo", "drug"),
  timevar = "treatment",
  new.row.names = seq_len(prod(length(c("placebo", "drug")), nrow(kornbrot_table1)))
)
# Subject and treatment should be factors. The ordering of the treatment factor
# will determine the difference (placebo - drug).
data_tall$subject <- factor(data_tall$subject)
data_tall$treatment <- factor(data_tall$treatment, levels = c("drug", "placebo"))
# Rate transformation inverts the rank ordering.
data$placebo_rate <- 60 / data$placebo
data$drug_rate <- 60 / data$drug
data_tall$rate <- 60 / data_tall$time
# Estimates
rdpmedian(
  data = data,
  formula = placebo ~ drug
)
#> Error in get_dep_data(data = x, formula = formula, single_term = FALSE, agg_fun = agg_fun): unused argument (single_term = FALSE)
rdpmedian(
  data = data_tall,
  formula = time ~ treatment | subject
)
#> Error in get_dep_data(data = x, formula = formula, single_term = FALSE, agg_fun = agg_fun): unused argument (single_term = FALSE)
rdpmedian2(
  x = data$placebo_rate,
  y = data$drug_rate
)
#>
#> Hodges-Lehmann estimator and percentile bootstrap confidence
#> interval
#>
#> Paired differences of ranks: data$placebo_rate -
#> data$drug_rate
#>
#> Pseudomedian: -1.5
#> 95% CI: -5.5, 2
rdpmedian(
  data = data_tall,
  formula = rate ~ treatment | subject
)
#> Error in get_dep_data(data = x, formula = formula, single_term = FALSE, agg_fun = agg_fun): unused argument (single_term = FALSE)