summarize_rep_weights(rep_design, type = "both", by)
Arguments
rep_design: A replicate design object, created with either the survey or srvyr packages.
type: Default is "both". Use type = "overall" for an overall summary of the replicate weights. Use type = "specific" for a summary of each column of replicate weights, with one row of the summary per column.
Use type = "both" for a list containing both summaries, with list elements named "overall" and "specific".
by: (Optional) A character vector with the names of variables used to group the summaries.
Returns
If type = "both" (the default), the result is a list of data frames with names "overall" and "specific". If type = "overall", the result is a data frame providing an overall summary of the replicate weights.
The contents of the "overall" summary are the following:
"nrows": Number of rows for the weights
"ncols": Number of columns of replicate weights
"degf_svy_pkg": The degrees of freedom according to the survey package in R
"rank": The matrix rank as determined by a QR decomposition
"avg_wgt_sum": The average column sum
"sd_wgt_sums": The standard deviation of the column sums
"min_rep_wgt": The minimum value of any replicate weight
"max_rep_wgt": The maximum value of any replicate weight
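To make the definitions above concrete, the following base R sketch computes most of the "overall" quantities by hand for a small, made-up matrix of replicate weights. The matrix `rep_wts` and the data frame `overall` are hypothetical names used only for illustration, and this is not necessarily how the function computes its results internally. "degf_svy_pkg" is omitted because it requires a replicate design object from the survey package.

```r
# Hypothetical 4 x 3 matrix of replicate weights
# (rows = cases, columns = replicates); values are made up for illustration
rep_wts <- matrix(
  c(2, 0, 2,
    2, 2, 0,
    0, 2, 2,
    1, 1, 1),
  nrow = 4, byrow = TRUE
)

# Sum of the weights within each replicate column
col_sums <- colSums(rep_wts)

# One-row data frame mirroring the fields described above
overall <- data.frame(
  nrows       = nrow(rep_wts),
  ncols       = ncol(rep_wts),
  rank        = qr(rep_wts)$rank,  # matrix rank from a QR decomposition
  avg_wgt_sum = mean(col_sums),
  sd_wgt_sums = sd(col_sums),
  min_rep_wgt = min(rep_wts),
  max_rep_wgt = max(rep_wts)
)
print(overall)
```

A rank lower than the number of columns would indicate that some replicate columns are linear combinations of others.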
If type = "specific", the result is a data frame providing a summary of each column of replicate weights, with each column of replicate weights described in a given row of the data frame. The contents of the "specific" summary are the following:
"Rep_Column": The name of a given column of replicate weights. If columns are unnamed, the column number is used instead
"N": The number of entries
"N_NONZERO": The number of nonzero entries
"SUM": The sum of the weights
"MEAN": The average of the weights
"CV": The coefficient of variation of the weights (standard deviation divided by mean)
"MIN": The minimum weight
"MAX": The maximum weight
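The column-level quantities listed above can be sketched in base R in a similar way. The matrix `rep_wts` and the data frame `specific` are hypothetical, and the use of the sample standard deviation (`sd()`, with an n - 1 denominator) in the CV is an assumption rather than something this page confirms.

```r
# Hypothetical 4 x 3 matrix of replicate weights (values made up for illustration)
rep_wts <- matrix(
  c(2, 0, 2,
    2, 2, 0,
    0, 2, 2,
    1, 1, 1),
  nrow = 4, byrow = TRUE
)

# Use column names when available; otherwise fall back to column numbers
col_labels <- if (is.null(colnames(rep_wts))) {
  seq_len(ncol(rep_wts))
} else {
  colnames(rep_wts)
}

col_means <- colMeans(rep_wts)

# One row per replicate column
specific <- data.frame(
  Rep_Column = col_labels,
  N          = nrow(rep_wts),
  N_NONZERO  = colSums(rep_wts != 0),
  SUM        = colSums(rep_wts),
  MEAN       = col_means,
  CV         = apply(rep_wts, 2, sd) / col_means,  # assumes sample SD (n - 1)
  MIN        = apply(rep_wts, 2, min),
  MAX        = apply(rep_wts, 2, max)
)
print(specific)
```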
Examples
# Load example data
suppressPackageStartupMessages(library(survey))
data(api)

dclus1 <- svydesign(id = ~dnum, weights = ~pw, data = apiclus1, fpc = ~fpc)
dclus1$variables$response_status <- sample(
  x = c("Respondent", "Nonrespondent", "Ineligible", "Unknown eligibility"),
  size = nrow(dclus1),
  replace = TRUE
)
rep_design <- as.svrepdesign(dclus1)

# Adjust weights for cases with unknown eligibility
ue_adjusted_design <- redistribute_weights(
  design = rep_design,
  reduce_if = response_status %in% c("Unknown eligibility"),
  increase_if = !response_status %in% c("Unknown eligibility"),
  by = c("stype")
)

# Summarize replicate weights
summarize_rep_weights(rep_design, type = "both")

# Summarize replicate weights by grouping variables
summarize_rep_weights(ue_adjusted_design, type = "overall",
                      by = c("response_status"))
summarize_rep_weights(ue_adjusted_design, type = "overall",
                      by = c("stype", "response_status"))

# Compare replicate weights
rep_wt_summaries <- lapply(
  list("original" = rep_design, "adjusted" = ue_adjusted_design),
  summarize_rep_weights,
  type = "overall"
)
print(rep_wt_summaries)