shuffle_grouped_data function

Creates, in a single call, a shuffling function that produces permutations subject to constraints on multiple sample variables, with group sizes that fit one specific allocation variable.

shuffle_grouped_data(
  batch_container,
  allocate_var,
  keep_together_vars = c(),
  keep_separate_vars = c(),
  n_min = NA,
  n_max = NA,
  n_ideal = NA,
  subgroup_var_name = NULL,
  report_grouping_as_attribute = FALSE,
  prefer_big_groups = FALSE,
  strict = TRUE,
  fullTree = FALSE,
  maxCalls = 1e+06
)

Arguments

  • batch_container: Batch container with all samples assigned that are to be grouped and sub-grouped
  • allocate_var: Name of a variable in the samples table to inform possible groupings, as (sub)group sizes must add up to the correct totals
  • keep_together_vars: Vector of column names in sample table; groups are formed by pooling samples with identical values of all those variables
  • keep_separate_vars: Vector of column names in sample table; items with identical values in those variables will not be put into the same subgroup if at all possible
  • n_min: Minimum number of samples in one subgroup (not group); defaults to 1
  • n_max: Maximum number of samples in one subgroup (not group); defaults to the size of the biggest group
  • n_ideal: Ideal number of samples in one subgroup (not group); defaults to the mean of n_min and n_max, rounded down or up depending on the setting of prefer_big_groups
  • subgroup_var_name: An optional column name for the subgroups which are formed (or NULL)
  • report_grouping_as_attribute: Boolean; if TRUE, add an attribute table to the permutation function's output, to be used in scoring during the design optimization
  • prefer_big_groups: Boolean; whether bigger subgroups should be preferred when several solutions are possible
  • strict: Boolean; if TRUE, subgroup size constraints have to be met strictly, implying the possibility of finding no solution at all
  • fullTree: Boolean; if TRUE, enforce a full search of the possibility tree, independent of the value of maxCalls
  • maxCalls: Maximum number of recursive calls in the search tree, to avoid long run times with very large trees
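
The grouping arguments above can be illustrated with a small base R sketch. It is not the package's implementation: the samples data frame and its columns (litter, sex) are made up for illustration, and the rounding direction tied to prefer_big_groups (ceiling when TRUE, floor otherwise) is an assumption. It only shows how pooling on keep_together_vars yields groups and how the documented defaults for n_max and n_ideal could be derived from them.

```r
# Hypothetical sample table; column names are illustrative only
samples <- data.frame(
  id     = 1:10,
  litter = c("A", "A", "A", "A", "B", "B", "B", "C", "C", "C"),
  sex    = c("F", "F", "F", "F", "F", "F", "M", "M", "M", "M")
)

keep_together_vars <- c("litter", "sex")

# Pool samples that share identical values in all keep-together variables
group <- interaction(samples[keep_together_vars], drop = TRUE)
group_sizes <- table(group)

# Documented defaults: n_min is 1, n_max is the size of the biggest group
n_min <- 1
n_max <- max(group_sizes)

# n_ideal: mean of n_min and n_max, rounded down or up.
# Assumption: ceiling if bigger subgroups are preferred, floor otherwise.
prefer_big_groups <- FALSE
mid <- mean(c(n_min, n_max))
n_ideal <- if (prefer_big_groups) ceiling(mid) else floor(mid)

print(group_sizes)
print(c(n_min = n_min, n_max = n_max, n_ideal = n_ideal))
```

Note that the mean is computed as mean(c(n_min, n_max)); in R, mean(n_min, n_max) would treat the second value as the trim argument rather than as a second number.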

Returns

Shuffling function that on each call returns an index vector for a valid sample permutation
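
To make the idea of a function that returns an index vector on each call concrete, here is a minimal base R sketch of a function factory. It is a deliberately simplified stand-in, not the algorithm used by shuffle_grouped_data: the hypothetical helper make_block_shuffler only permutes whole groups and the samples within them, so that pooled samples stay adjacent, and it ignores allocation, subgroup size, and keep-separate constraints.

```r
# Hypothetical helper: returns a function that yields a new index vector per call
make_block_shuffler <- function(group) {
  # Indices of the samples belonging to each group
  members <- split(seq_along(group), group)
  function() {
    # Randomize the order of the groups, then the order within each group
    shuffled <- lapply(
      members[sample(length(members))],
      function(i) i[sample(length(i))]
    )
    unlist(shuffled, use.names = FALSE)
  }
}

set.seed(1)
litter <- c("A", "A", "A", "B", "B", "C", "C", "C", "C")
shuffle <- make_block_shuffler(litter)

idx <- shuffle()      # index vector describing one permutation of the samples
print(idx)
print(litter[idx])    # samples from the same litter remain next to each other
```

Each call to shuffle() draws a fresh permutation, which mirrors how a shuffling function is invoked repeatedly during design optimization.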