calculate_response_rates() R function from [nrba]

Calculate Response Rates

Calculates response rates using one of the response rate formulas defined by AAPOR (American Association for Public Opinion Research).


calculate_response_rates(
  data,
  status,
  status_codes = c("ER", "EN", "IE", "UE"),
  weights,
  rr_formula = "RR3",
  elig_method = "CASRO-subgroup",
  e = NULL
)

Arguments

data: A data frame containing the selected sample, one row per case.
status: A character string giving the name of the variable representing response/eligibility status. The status variable should have at most four categories, representing eligible respondents (ER), eligible nonrespondents (EN), known ineligible cases (IE), and cases whose eligibility is unknown (UE).
status_codes: A named vector, with four entries named 'ER', 'EN', 'IE', and 'UE'. status_codes indicates how the values of the status variable are to be interpreted.
weights: (Optional) A character string giving the name of a variable in the data containing the weights to use for calculating weighted response rates.
rr_formula: A character vector including any of the following: 'RR1', 'RR3', and 'RR5'.

These are the names of formulas defined by AAPOR. See the Formulas section below for formulas.
elig_method: If rr_formula='RR3', this specifies how to estimate an eligibility rate for cases with unknown eligibility. Must be one of the following:

'CASRO-overall'

Estimates an eligibility rate using the overall sample. If response rates are calculated for subgroups, the single overall sample estimate will be used as the estimated eligibility rate for subgroups as well.

'CASRO-subgroup'

Estimates eligibility rates separately for each subgroup.

'specified'

With this option, a numeric value is supplied by the user to the parameter e.

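For elig_method='CASRO-overall' or elig_method='CASRO-subgroup', the eligibility rate is estimated as $(ER + EN)/(ER + EN + IE)$, which is equivalent to the formula $e = 1 - ( IE / (ER + EN + IE) )$ given in the Formulas section.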
e: (Required if elig_method='specified'). A numeric value between 0 and 1 specifying the estimated eligibility rate for cases with unknown eligibility. A character string giving the name of a numeric variable may also be supplied; in that case, the eligibility rate must be constant for all cases in a subgroup.
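
As a rough illustration of how a named status_codes vector can be read, the base R sketch below maps raw status values to the four category labels. The raw_status values are made up for the example, and the lookup shown here is an assumption about one way to interpret the mapping, not necessarily how the package does it internally.

```r
# Hypothetical raw status values as they might appear in the status variable
raw_status <- c(1, 4, 2, 1, 3)

# Named vector: names are the four categories, values are the codes used in the data
status_codes <- c("ER" = 1, "EN" = 2, "IE" = 3, "UE" = 4)

# Look up each raw value in status_codes and return the matching category name
status_category <- names(status_codes)[match(raw_status, status_codes)]

print(status_category)
```

A value that does not appear in status_codes would come back as NA in this sketch, which is a convenient way to spot unexpected codes before calculating rates.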

Returns

Output consists of a data frame giving weighted and unweighted response rates. The following columns may be included, depending on the arguments supplied:

RR1_Unweighted
RR1_Weighted
RR3_Unweighted
RR3_Weighted
RR5_Unweighted
RR5_Weighted
n: Total sample size
Nhat: Sum of weights for the total sample
n_ER: Number of eligible respondents
Nhat_ER: Sum of weights for eligible respondents
n_EN: Number of eligible nonrespondents
Nhat_EN: Sum of weights for eligible nonrespondents
n_IE: Number of ineligible cases
Nhat_IE: Sum of weights for ineligible cases
n_UE: Number of cases whose eligibility is unknown
Nhat_UE: Sum of weights for cases whose eligibility is unknown
e_unwtd: If RR3 is calculated, the eligibility rate estimate e used for the unweighted response rate.
e_wtd: If RR3 is calculated, the eligibility rate estimate e used for the weighted response rate.

If the data frame is grouped (e.g., by using df %>% group_by(Region)), then the output contains one row per subgroup.

Formulas

Denote the sample totals as follows:

ER : Total number of eligible respondents
EN : Total number of eligible non-respondents
IE : Total number of ineligible cases
UE : Total number of cases whose eligibility is unknown

For weighted response rates, these totals are calculated using weights.
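
To make the difference between unweighted and weighted totals concrete, here is a small base R sketch. The toy data frame and its column names (status, wt) are hypothetical and exist only for this illustration.

```r
# Hypothetical toy sample: status category and weight for six cases
toy <- data.frame(
  status = c("ER", "ER", "EN", "IE", "UE", "UE"),
  wt     = c(10, 20, 15, 5, 10, 30)
)

# Unweighted totals: number of cases in each category
n_by_status <- table(toy$status)

# Weighted totals: sum of the weights in each category
Nhat_by_status <- tapply(toy$wt, toy$status, sum)

print(n_by_status)
print(Nhat_by_status)
```

These correspond to the n_* and Nhat_* columns described in the Returns section.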

The response rate formulas are then as follows:

RR1 = ER / ( ER + EN + UE )

RR1 essentially assumes that all cases with unknown eligibility are in fact eligible.

RR3 = ER / ( ER + EN + (e * UE) )

RR3 uses an estimate, e, of the eligibility rate among cases with unknown eligibility.

RR5 = ER / ( ER + EN )

RR5 essentially assumes that all cases with unknown eligibility are in fact ineligible.
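
The three formulas can be checked by hand with a few lines of base R. The totals and the value of e below are hypothetical numbers chosen only to make the arithmetic easy to follow.

```r
# Hypothetical totals for each status category
ER <- 400  # eligible respondents
EN <- 300  # eligible nonrespondents
IE <- 100  # known ineligible cases (not used by RR1, RR3, or RR5 directly)
UE <- 200  # cases with unknown eligibility

# Hypothetical eligibility rate among cases with unknown eligibility
e <- 0.5

rr1 <- ER / (ER + EN + UE)      # treats all UE cases as eligible
rr3 <- ER / (ER + EN + e * UE)  # treats a share e of UE cases as eligible
rr5 <- ER / (ER + EN)           # treats all UE cases as ineligible

print(round(c(RR1 = rr1, RR3 = rr3, RR5 = rr5), 3))
```

Because e lies between 0 and 1, RR3 always falls between RR1 and RR5; it equals RR1 when e = 1 and RR5 when e = 0.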

For RR3, an estimate, e, of the eligibility rate among cases with unknown eligibility must be used. AAPOR strongly recommends that the basis for the estimate should be explicitly stated and detailed.

The CASRO methods, which might be appropriate for the design, use the formula $e = 1 - ( IE / (ER + EN + IE) )$ .

For elig_method='CASRO-overall', an estimate is calculated for the overall sample and this single estimate is used when calculating response rates for subgroups.
For elig_method='CASRO-subgroup', estimates are calculated separately for each subgroup.
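
The difference between the two CASRO options can be sketched in base R as follows. The subgroup counts and the helper casro_e() are hypothetical and written only for this illustration; they are not part of the package.

```r
# Hypothetical unweighted counts for two subgroups
counts <- data.frame(
  group = c("A", "B"),
  ER = c(300, 100),
  EN = c(200, 100),
  IE = c(50, 50),
  UE = c(100, 100)
)

# Hypothetical helper implementing e = 1 - IE / (ER + EN + IE)
casro_e <- function(ER, EN, IE) 1 - IE / (ER + EN + IE)

# 'CASRO-subgroup': a separate estimate for each subgroup
counts$e_subgroup <- casro_e(counts$ER, counts$EN, counts$IE)

# 'CASRO-overall': one estimate from the pooled sample, reused for every subgroup
counts$e_overall <- casro_e(sum(counts$ER), sum(counts$EN), sum(counts$IE))

# RR3 under each choice of e
counts$RR3_subgroup <- counts$ER / (counts$ER + counts$EN + counts$e_subgroup * counts$UE)
counts$RR3_overall  <- counts$ER / (counts$ER + counts$EN + counts$e_overall * counts$UE)

print(counts)
```

In this toy example the subgroup with relatively more known ineligible cases (group B) gets a lower subgroup-specific e, and therefore a slightly higher RR3, than it does under the pooled estimate.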

Please consult AAPOR's current Standard Definitions for in-depth explanations.

Examples


# Load the packages used below (dplyr provides %>% and group_by())
library(nrba)
library(dplyr)

# Load example data
data(involvement_survey_srs, package = "nrba")

# Assign a random response status (1 to 4) to each case, for illustration only
involvement_survey_srs[["RESPONSE_STATUS"]] <- sample(
  1:4,
  size = nrow(involvement_survey_srs),
  replace = TRUE
)

# Calculate overall response rates

involvement_survey_srs %>%
  calculate_response_rates(
    status = "RESPONSE_STATUS",
    status_codes = c("ER" = 1, "EN" = 2, "IE" = 3, "UE" = 4),
    weights = "BASE_WEIGHT",
    rr_formula = "RR3",
    elig_method = "CASRO-overall"
  )

# Calculate response rates by subgroup

involvement_survey_srs %>%
  group_by(STUDENT_RACE, STUDENT_SEX) %>%
  calculate_response_rates(
    status = "RESPONSE_STATUS",
    status_codes = c("ER" = 1, "EN" = 2, "IE" = 3, "UE" = 4),
    weights = "BASE_WEIGHT",
    rr_formula = "RR3",
    elig_method = "CASRO-overall"
  )

# Compare alternative approaches for handling cases with unknown eligibility

involvement_survey_srs %>%
  group_by(STUDENT_RACE) %>%
  calculate_response_rates(
    status = "RESPONSE_STATUS",
    status_codes = c("ER" = 1, "EN" = 2, "IE" = 3, "UE" = 4),
    rr_formula = "RR3",
    elig_method = "CASRO-overall"
  )

involvement_survey_srs %>%
  group_by(STUDENT_RACE) %>%
  calculate_response_rates(
    status = "RESPONSE_STATUS",
    status_codes = c("ER" = 1, "EN" = 2, "IE" = 3, "UE" = 4),
    rr_formula = "RR3",
    elig_method = "CASRO-subgroup"
  )

involvement_survey_srs %>%
  group_by(STUDENT_RACE) %>%
  calculate_response_rates(
    status = "RESPONSE_STATUS",
    status_codes = c("ER" = 1, "EN" = 2, "IE" = 3, "UE" = 4),
    rr_formula = "RR3",
    elig_method = "specified",
    e = 0.5
  )

involvement_survey_srs %>%
  transform(e_by_email = ifelse(PARENT_HAS_EMAIL == "Has Email", 0.75, 0.25)) %>%
  group_by(PARENT_HAS_EMAIL) %>%
  calculate_response_rates(
    status = "RESPONSE_STATUS",
    status_codes = c("ER" = 1, "EN" = 2, "IE" = 3, "UE" = 4),
    rr_formula = "RR3",
    elig_method = "specified",
    e = "e_by_email"
  )

References

The American Association for Public Opinion Research. 2016. Standard Definitions: Final Dispositions of Case Codes and Outcome Rates for Surveys. 9th edition. AAPOR.

nrba package

Maintainer: Ben Schneider
License: GPL (>= 3)
Last published: 2023-11-21
