stratrs function

Perform stratified random sampling to balance outcomes

This function performs stratified random sampling so that the outcome is balanced among the shards.

stratrs(y, C=5, P=0)

Arguments

  • y: The binary/categorical/continuous outcome.
  • C: The number of shards to break the data set into.
  • P: For continuous outcomes, the range is broken into P segments via the quantiles. Specifying P=20 seems to work reasonably well.
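To make the role of P concrete, here is a minimal base-R sketch of breaking a continuous outcome into P segments via its quantiles. It is an illustration under stated assumptions, not the package's code: the variable names (`breaks`, `segment`) and the use of `quantile()` with `cut()` are choices made for this example, and the exact cut points used inside stratrs may differ.

```r
# Illustration only: one way to break a continuous outcome into P
# quantile-based segments. This is not taken from the BART source.
set.seed(2)
y <- rnorm(1000)   # simulated continuous outcome
P <- 20            # number of segments, as suggested for the P argument

# P + 1 cut points at equally spaced probabilities define P segments.
breaks <- quantile(y, probs = seq(0, 1, length.out = P + 1))

# include.lowest = TRUE keeps the minimum of y in the first segment;
# labels = FALSE returns integer segment codes 1..P.
segment <- cut(y, breaks = breaks, include.lowest = TRUE, labels = FALSE)

table(segment)
```

Once a continuous outcome has been coded this way, the segment codes can be treated like a categorical outcome for the purpose of stratification.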

Details

To perform BART with large data sets, random sampling is employed to break the data into C shards. Each shard should be balanced with respect to the outcome. For binary/categorical outcomes, stratified random sampling is employed with this function. For continuous outcomes, the range is first broken into P segments via the quantiles (see the P argument).
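The idea of balancing each shard with respect to the outcome can be sketched in base R as follows. The helper `strat_shards_sketch` is hypothetical and written only for this illustration; it is not the implementation used by stratrs. It assumes that stratification means shuffling the observations within each outcome level and dealing them out evenly across the C shards.

```r
# Hypothetical helper for illustration; NOT the BART implementation.
# Within each outcome level, observations are shuffled and then dealt
# round-robin into shards 1..C, so per-level counts differ by at most 1.
strat_shards_sketch <- function(y, C = 5) {
  shard <- integer(length(y))
  for (level in unique(y)) {
    idx <- which(y == level)
    # Shuffle positions with sample.int() rather than sample(idx), which
    # would misbehave when a level has exactly one observation.
    idx <- idx[sample.int(length(idx))]
    shard[idx] <- rep_len(seq_len(C), length(idx))
  }
  shard
}

set.seed(1)
y <- rbinom(1000, 1, 0.1)            # rare binary outcome
shard <- strat_shards_sketch(y, C = 5)
counts <- table(shard, y)            # rows: shards, columns: outcome levels
counts
```

With plain (unstratified) random sampling, a rare outcome level can by chance be scarce in some shard; dealing each level out separately avoids that. Note that this simple round-robin always starts at shard 1, so any leftover observations land in the lower-numbered shards.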

Returns

A vector of the same length as y is returned, with each element giving the shard to which the corresponding observation is assigned.
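A shard-assignment vector of this kind can be used to split row indices by shard. The sketch below is self-contained and does not call stratrs: `shard` is a stand-in built with random labels, and it assumes the shards are labelled 1 to C, which this page does not state explicitly. The names `rows_by_shard` and `rates` are chosen for this example.

```r
# Illustration only: using a shard-assignment vector to split the data.
set.seed(3)
y <- rbinom(200, 1, 0.3)

# Stand-in for a returned assignment vector (assumed labels 1..4);
# in practice this would come from a call such as stratrs(y, C = 4).
shard <- sample(rep_len(1:4, length(y)))

# List of row indices, one element per shard.
rows_by_shard <- split(seq_along(y), shard)

# Outcome rate within each shard, as a quick balance check.
rates <- vapply(rows_by_shard, function(i) mean(y[i]), numeric(1))
rates
```

Each element of `rows_by_shard` can then be used to subset the covariates and the outcome for a separate model fit.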

See Also

rs.pbart

Examples

set.seed(12)
x <- rbinom(25000, 1, 0.1)
a <- stratrs(x)
table(a, x)
z <- pmin(rpois(25000, 0.8), 5)
b <- stratrs(z)
table(b, z)
  • Maintainer: Rodney Sparapani
  • License: GPL (>= 2)
  • Last published: 2024-06-21
