subsample_replicates function

Retain only a random subset of the replicates in a design

Randomly subsamples the replicates of a survey design object, keeping only a subset of them. The scale factor used in estimation is increased to account for the subsampling.

subsample_replicates(design, n_reps)

Arguments

  • design: A survey design object, created with either the survey or srvyr packages.
  • n_reps: The number of replicates to keep after subsampling

Returns

An updated survey design object, where only a random selection of the replicates has been retained. The overall 'scale' factor for the design (accessed with design$scale) is increased to account for the sampling of replicates.

Statistical Details

Suppose the initial replicate design has $L$ replicates, with respective constants $c_k$ for $k=1,\dots,L$ used to estimate variance with the formula

$$
v_{R} = \sum_{k=1}^L c_k\left(\hat{T}_y^{(k)}-\hat{T}_y\right)^2
$$

With subsampling of replicates, $L_0$ of the original $L$ replicates are randomly selected, and then variances are estimated using the formula:

$$
v_{R} = \frac{L}{L_0} \sum_{k=1}^{L_0} c_k\left(\hat{T}_y^{(k)}-\hat{T}_y\right)^2
$$
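The two formulas above can be compared numerically with a short base R sketch. It does not use the package: the replicate estimates, the constants `c_k`, and the values of `L` and `L_0` are all simulated or hypothetical, and it assumes the replicates are selected by simple random sampling without replacement.

```r
# Illustrative sketch only: all inputs are simulated, not taken from a real design.
set.seed(2023)

L   <- 100                               # replicates in the full design
L_0 <- 25                                # replicates kept after subsampling
c_k <- rep(1 / L, L)                     # hypothetical replicate constants c_k
T_hat     <- 50                          # hypothetical full-sample estimate
T_hat_rep <- T_hat + rnorm(L, sd = 3)    # hypothetical replicate estimates

# Per-replicate contributions c_k * (T_hat^(k) - T_hat)^2
contrib <- c_k * (T_hat_rep - T_hat)^2

# Variance estimate using all L replicates
v_full <- sum(contrib)

# Keep L_0 replicates at random (assumed: without replacement),
# then rescale the sum by L / L_0
kept  <- sample(seq_len(L), size = L_0, replace = FALSE)
v_sub <- (L / L_0) * sum(contrib[kept])

# Repeat the subsampling many times to see how v_sub varies around v_full
v_sub_many <- replicate(5000, {
  s <- sample(seq_len(L), size = L_0, replace = FALSE)
  (L / L_0) * sum(contrib[s])
})

c(full = v_full, one_subsample = v_sub, mean_of_subsamples = mean(v_sub_many))
```

A single subsampled estimate generally differs from the full-replicate estimate, but the average over many random subsamples should sit close to it. This is the sense in which the factor $L/L_0$ accounts for the subsampling.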

This subsampling is suggested for certain replicate designs in Fay (1989). Kim and Wu (2013) provide a detailed theoretical justification and also propose alternative methods of subsampling replicates.
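As a brief sketch of why the factor $L/L_0$ is a natural adjustment, assume the $L_0$ replicates are chosen by simple random sampling without replacement (an assumption here; the description above only says they are randomly selected). Write $S$ for the selected set and $a_k$ for the contribution of replicate $k$; both symbols are introduced only for this illustration. Each replicate is then selected with probability $L_0/L$, so, holding the replicate estimates fixed, the expectation over the random selection recovers the full-replicate estimator:

```latex
a_k = c_k\left(\hat{T}_y^{(k)}-\hat{T}_y\right)^2,
\qquad
\Pr(k \in S) = \frac{L_0}{L}

E_S\!\left[\frac{L}{L_0}\sum_{k \in S} a_k\right]
  = \frac{L}{L_0}\sum_{k=1}^{L} \Pr(k \in S)\, a_k
  = \frac{L}{L_0}\cdot\frac{L_0}{L}\sum_{k=1}^{L} a_k
  = \sum_{k=1}^{L} c_k\left(\hat{T}_y^{(k)}-\hat{T}_y\right)^2
```

This covers only the expectation over the choice of replicates. The added variability from subsampling and the alternative subsampling schemes are treated in the references cited above.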

Examples

library(survey)
library(svrep)  # provides as_fays_gen_rep_design() and subsample_replicates()
set.seed(2023)

# Create an example survey design object
sample_data <- data.frame(
  STRATUM = c(1,1,1,1,2,2,2,2),
  PSU     = c(1,2,3,4,5,6,7,8)
)

survey_design <- svydesign(
  data = sample_data,
  strata = ~ STRATUM,
  ids = ~ PSU,
  weights = ~ 1
)

# Convert to a replicate design
rep_design <- survey_design |>
  as_fays_gen_rep_design(variance_estimator = "Ultimate Cluster")

# Inspect replicates before subsampling
rep_design |> getElement("repweights")

# Inspect replicates after subsampling
rep_design |>
  subsample_replicates(n_reps = 4) |>
  getElement("repweights")

References

Fay, Robert. 1989. "Theory and Application of Replicate Weighting for Variance Calculations." In Proceedings of the Section on Survey Research Methods, 495–500. Alexandria, VA: American Statistical Association. http://www.asasrms.org/Proceedings/papers/1989_033.pdf

Kim, J.K. and Wu, C. 2013. "Sparse and Efficient Replication Variance Estimation for Complex Surveys." Survey Methodology, Statistics Canada, 39(1), 91–120.