Assess the range of possible bias based on specified assumptions about how nonrespondents differ from respondents
This range-of-bias analysis assesses the range of possible nonresponse bias under varying assumptions about how nonrespondents differ from respondents. The range of potential bias is calculated for both unadjusted estimates (i.e., those computed with base weights) and nonresponse-adjusted estimates (i.e., those computed with nonresponse-adjusted weights).
Arguments
survey_design: A survey design object created with the 'survey' package
y_var: Name of a variable whose mean or proportion is to be estimated
comparison_cell: (Optional) The name of a variable in the data dividing the sample into cells. If supplied, then the analysis is based on assumptions about differences between respondents and nonrespondents within the same cell. Typically, the variable used is a nonresponse adjustment cell or post-stratification variable.
status: A character string giving the name of the variable representing response/eligibility status. The status variable should have at most four categories, representing eligible respondents (ER), eligible nonrespondents (EN), known ineligible cases (IE), and cases whose eligibility is unknown (UE).
status_codes: A named vector, with four entries named 'ER', 'EN', 'IE', and 'UE'. status_codes indicates how the values of the status variable are to be interpreted.
assumed_multiple: One or more numeric values. Within each nonresponse adjustment cell, the mean for nonrespondents is assumed to be a specified multiple of the mean for respondents. If y_var is a categorical variable, then the assumed nonrespondent mean (i.e., the proportion) in each cell is capped at 1.
assumed_percentile: One or more numeric values, ranging from 0 to 1. Within each nonresponse adjustment cell, the mean of a continuous variable among nonrespondents is assumed to equal a specified percentile of the variable among respondents. The assumed_percentile parameter should be used only when the y_var variable is numeric. Quantiles are estimated with weights, using the function svyquantile(..., qrule = "hf2").
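To make the assumed_multiple rule for a categorical y_var concrete, the base R sketch below applies it by hand. The cell labels and respondent proportions are hypothetical values invented for illustration, and the function's internal computation may differ in detail.

```r
# Hypothetical respondent proportions for one category of y_var,
# by comparison cell (values invented for illustration)
resp_prop <- c(E = 0.90, H = 0.60, M = 0.75)

# Candidate multiples, as would be passed to assumed_multiple
assumed_multiple <- c(0.75, 1.25)

# Assumed nonrespondent proportion = multiple x respondent proportion,
# capped at 1 because a proportion cannot exceed 1
assumed_nonresp_prop <- sapply(
  assumed_multiple,
  function(m) pmin(m * resp_prop, 1)
)
colnames(assumed_nonresp_prop) <- paste0("multiple_", assumed_multiple)

# One row per cell, one column per assumed multiple
assumed_nonresp_prop
```

With a multiple above 1, the cap only binds in cells where the respondent proportion is already high (cell E here), so the implied respondent-nonrespondent gap can vary across cells.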
Returns
A data frame summarizing the range of bias under each assumption. For a numeric outcome variable, there is one row per value of assumed_multiple or assumed_percentile. For a categorical outcome variable, there is one row per combination of category and assumed_multiple or assumed_percentile.
The column bias_of_unadj_estimate is the nonresponse bias of the estimate from respondents produced using the unadjusted weights. The column bias_of_adj_estimate is the nonresponse bias of the estimate from respondents produced using nonresponse-adjusted weights, based on a weighting-class adjustment with comparison_cell as the weighting class variable. If no comparison_cell is specified, the two bias estimates will be the same.
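The two bias columns can be illustrated with a small hand calculation. The base R sketch below uses the standard weighting-class arithmetic with hypothetical per-cell weight totals and means. It ignores ineligible and unknown-eligibility cases, and it is an assumption-laden illustration rather than the function's actual implementation.

```r
# Hypothetical per-cell inputs (all values invented for illustration)
cells <- data.frame(
  cell       = c("A", "B"),
  wt_resp    = c(600, 200),  # sum of base weights among respondents
  wt_nonresp = c(200, 200),  # sum of base weights among nonrespondents
  mean_resp  = c(80, 50)     # respondent mean of y_var
)

# Assumption: nonrespondent mean is 0.9 x the respondent mean in each cell
cells$mean_nonresp <- 0.9 * cells$mean_resp

# Full-sample mean implied by the assumption
wt_total  <- sum(cells$wt_resp + cells$wt_nonresp)
full_mean <- sum(cells$wt_resp * cells$mean_resp +
                 cells$wt_nonresp * cells$mean_nonresp) / wt_total

# Unadjusted estimate: respondents only, base weights
unadj_est <- sum(cells$wt_resp * cells$mean_resp) / sum(cells$wt_resp)

# Weighting-class adjusted estimate: respondent weights in each cell are
# scaled up so they sum to the cell's full weight total
adj_est <- sum((cells$wt_resp + cells$wt_nonresp) * cells$mean_resp) / wt_total

# Bias = respondent-based estimate minus the implied full-sample mean
c(bias_of_unadj_estimate = unadj_est - full_mean,
  bias_of_adj_estimate   = adj_est - full_mean)
```

If the two cells are collapsed into one, the scaling factor is the same for every respondent, so unadj_est and adj_est coincide. This matches the statement that the two bias estimates are the same when no comparison_cell is specified.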
Examples
# Load example data

suppressPackageStartupMessages(library(survey))
data(api)

base_weights_design <- svydesign(
  data = apiclus1,
  id = ~dnum,
  weights = ~pw,
  fpc = ~fpc
) |> as.svrepdesign(type = "JK1")

base_weights_design$variables$response_status <- sample(
  x = c("Respondent", "Nonrespondent"),
  prob = c(0.75, 0.25),
  size = nrow(base_weights_design),
  replace = TRUE
)

# Assess range of bias for mean of `api00`
# based on assuming nonrespondent means
# are equal to the 25th percentile or 75th percentile
# among respondents, within nonresponse adjustment cells

assess_range_of_bias(
  survey_design = base_weights_design,
  y_var = "api00",
  comparison_cell = "stype",
  status = "response_status",
  status_codes = c(
    "ER" = "Respondent",
    "EN" = "Nonrespondent",
    "IE" = "Ineligible",
    "UE" = "Unknown"
  ),
  assumed_percentile = c(0.25, 0.75)
)

# Assess range of bias for proportions of `sch.wide`
# based on assuming nonrespondent proportions
# are equal to some multiple of respondent proportions,
# within nonresponse adjustment cells

assess_range_of_bias(
  survey_design = base_weights_design,
  y_var = "sch.wide",
  comparison_cell = "stype",
  status = "response_status",
  status_codes = c(
    "ER" = "Respondent",
    "EN" = "Nonrespondent",
    "IE" = "Ineligible",
    "UE" = "Unknown"
  ),
  assumed_multiple = c(0.25, 0.75)
)
References
See Petraglia et al. (2016) for an example of a range-of-bias analysis using these methods.
Petraglia, E., Van de Kerckhove, W., and Krenzke, T. (2016). Review of the Potential for Nonresponse Bias in FoodAPS 2012. Prepared for the Economic Research Service, U.S. Department of Agriculture. Washington, D.C.