Retain only a random subset of the replicates in a design
Randomly subsamples the replicates of a survey design object, to keep only a subset. The scale factor used in estimation is increased to account for the subsampling.
subsample_replicates(design, n_reps)
Arguments
design: A survey design object, created with either the survey or srvyr packages.
n_reps: The number of replicates to keep after subsampling.
Returns
An updated survey design object, where only a random selection of the replicates has been retained. The overall 'scale' factor for the design (accessed with design$scale) is increased to account for the sampling of replicates.
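A minimal usage sketch follows. It assumes the survey and svrep packages are installed and uses the `api` example data shipped with survey. The bootstrap design, the seed, and the choice of 100 and 50 replicates are illustrative assumptions, not recommendations.

```r
library(survey)
library(svrep)

set.seed(2023)  # subsampling is random, so fix the seed for reproducibility

# Example data from the survey package (stratified sample of schools)
data(api, package = "survey")
strat_design <- svydesign(
  id = ~1, strata = ~stype, weights = ~pw, fpc = ~fpc, data = apistrat
)

# Illustrative replicate design with 100 bootstrap replicates
boot_design <- as.svrepdesign(strat_design, type = "bootstrap", replicates = 100)

# Keep a random subset of 50 replicates
sub_design <- subsample_replicates(boot_design, n_reps = 50)

# Number of replicate-weight columns before and after subsampling
ncol(weights(boot_design, type = "analysis"))
ncol(weights(sub_design, type = "analysis"))

# Ratio of the overall scale factors after and before subsampling
sub_design$scale / boot_design$scale

# Point estimates agree; standard errors differ because fewer replicates are used
svymean(~api00, boot_design)
svymean(~api00, sub_design)
```

Because only the replicates change, the point estimate is unaffected; the standard error from the subsampled design is a noisier estimate of the same quantity.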
Statistical Details
Suppose the initial replicate design has $L$ replicates, with respective constants $c_k$ for $k = 1, \dots, L$ used to estimate variance with the formula

$$v_R = \sum_{k=1}^{L} c_k \left( \hat{T}_y^{(k)} - \hat{T}_y \right)^2$$

With subsampling of replicates, $L_0$ of the original $L$ replicates are randomly selected, and then variances are estimated using the formula:

$$v_R = \frac{L}{L_0} \sum_{k=1}^{L_0} c_k \left( \hat{T}_y^{(k)} - \hat{T}_y \right)^2$$
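The two variance formulas can be compared numerically with a short base R sketch. Everything in it is hypothetical: the replicate estimates are simulated, the constants are all set to 1/L, and the subsample is assumed to be a simple random sample without replacement. It is meant only to show how the L/L0 factor enters the calculation, not how the package implements it.

```r
# Illustrative only: simulated replicate estimates, not real survey data.
set.seed(2024)

L  <- 100   # number of replicates in the original design
L0 <- 25    # number of replicates kept after subsampling

T_hat <- 500                         # full-sample estimate (hypothetical)
T_rep <- T_hat + rnorm(L, sd = 10)   # replicate estimates (hypothetical)
c_k   <- rep(1 / L, L)               # replicate constants (hypothetical)

# Variance estimate using all L replicates
v_full <- sum(c_k * (T_rep - T_hat)^2)

# Randomly select L0 of the L replicates, without replacement
keep <- sample(L, size = L0)

# Variance estimate from the subsample, inflated by the factor L / L0
v_sub <- (L / L0) * sum(c_k[keep] * (T_rep[keep] - T_hat)^2)

c(full = v_full, subsampled = v_sub)
```

Without the L/L0 factor, the subsampled sum would cover only a fraction of the terms and would systematically understate the variance.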
This subsampling is suggested for certain replicate designs in Fay (1989). Kim and Wu (2013) provide a detailed theoretical justification and also propose alternative methods of subsampling replicates.
References
Kim, J.K. and Wu, C. 2013. "Sparse and Efficient Replication Variance Estimation for Complex Surveys." Survey Methodology, Statistics Canada, 39(1), 91-120.