A function to sample from a SITAR dataset for experimental design purposes. Two different sampling schemes are offered, based on the values of id and x.
Usage
subsample(x, id, data, prob = 1, xlim = NULL)
Arguments
x: vector of age.
id: factor of subject identifiers.
data: dataframe containing x and id.
prob: scalar defining sampling probability. See Details.
xlim: length 2 vector defining range of x to be selected. See Details.
Returns
Returns a logical vector the length of x, where TRUE indicates a sampled value.
Details
Under the first sampling scheme, xlim is left as NULL (the default), and rows of data are sampled with probability prob without replacement.
Under the second sampling scheme, xlim is set to a range within range(x). Subjects id are then sampled with probability prob without replacement, and all their rows where x lies within xlim are selected. This scheme is useful for testing the power of the model to predict later growth when data are available only up to a certain age. Setting xlim to range(x) allows data to be sampled by subject.
The returned value can be used as the subset argument in sitar or update.sitar.
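The two schemes can be illustrated with a minimal base R sketch. This is not the package's implementation: subsample_sketch and the toy data frame are hypothetical, the function takes plain vectors rather than a data argument, and it assumes that "sampled with probability prob" means each row (or subject) is kept independently with that probability.

```r
# Hypothetical helper for illustration only; not part of the sitar package.
subsample_sketch <- function(x, id, prob = 1, xlim = NULL) {
  if (is.null(xlim)) {
    # Scheme 1 (assumed): keep each row independently with probability prob
    keep <- runif(length(x)) < prob
  } else {
    # Scheme 2 (assumed): keep each subject independently with probability prob,
    # then keep only that subject's rows where x lies within xlim
    ids <- unique(id)
    chosen <- ids[runif(length(ids)) < prob]
    keep <- id %in% chosen & x >= min(xlim) & x <= max(xlim)
  }
  keep
}

# Toy data: 4 subjects measured at 5 ages each (made up for this sketch)
set.seed(1)
toy <- data.frame(
  id  = factor(rep(1:4, each = 5)),
  age = rep(c(6, 8, 10, 12, 14), times = 4)
)

rows <- subsample_sketch(toy$age, toy$id, prob = 0.5)                   # scheme 1
subj <- subsample_sketch(toy$age, toy$id, prob = 0.5, xlim = c(7, 12))  # scheme 2

# Either logical vector can be used to index the data
toy[subj, ]
```

In this sketch, as in the description above, rows belonging to subjects that were not sampled are FALSE under the second scheme, so passing xlim = range(toy$age) keeps or drops each subject as a whole.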
Examples
## draw 50% random sample
s50 <- subsample(age, id, heights, prob=0.5)

## truncate age range to 7-12 for 50% of subjects
t50 <- subsample(age, id, heights, prob=0.5, xlim=c(7,12))