data_prep function

Data Preparation

Prepares a dataset for batch effect diagnostics, harmonization, and post-harmonization downstream analysis in the ComBatFamQC package.

data_prep(
  stage = "harmonization",
  result = NULL,
  features = NULL,
  batch = NULL,
  covariates = NULL,
  df = NULL,
  type = "lm",
  random = NULL,
  smooth = NULL,
  interaction = NULL,
  smooth_int_type = NULL,
  predict = FALSE,
  object = NULL
)

Arguments

  • stage: The stage of analysis the data is being prepared for: "harmonization" or "residual".
  • result: A list derived from visual_prep() that contains dataset and batch effect diagnostic information for Shiny visualization. Can be skipped if features, batch, covariates and df are provided.
  • features: The names of the features to be harmonized. Can be skipped if result is provided.
  • batch: The name of the batch variable. Can be skipped if result is provided.
  • covariates: The names of the covariates supplied to the model. Can be skipped if result is provided.
  • df: The dataset to be harmonized. Can be skipped if result is provided.
  • type: The name of a regression model to be used in batch effect diagnostics, harmonization, and the post-harmonization stage: "lmer", "lm", "gam".
  • random: The variable name of the random effect in a linear mixed-effects model.
  • smooth: The name of the covariates that require a smooth function.
  • interaction: Expression of the interaction terms supplied to the model (e.g., "age,diagnosis").
  • smooth_int_type: A vector indicating the types of interaction in gam models. Defaults to NULL. "linear" represents linear interaction terms; "categorical-continuous" and "factor-smooth" both represent categorical-continuous interactions ("factor-smooth" includes the categorical variable as part of the smooth); "tensor" represents interactions between variables on different scales; and "smooth-smooth" represents interactions between smoothed variables.
  • predict: A boolean indicating whether to apply an existing model to a new dataset (TRUE) or run ComBat from scratch (FALSE). Currently only works for "original ComBat" and "ComBat-GAM".
  • object: An existing ComBat model.
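
As a rough illustration of how type, random, and interaction fit together, the sketch below prepares data for a linear mixed-effects model. It assumes the adni dataset contains columns named "timedays", "SEX", "DIAGNOSIS", and "subid"; only "AGE" and "manufac" are confirmed by the documented example, so adjust the names to your own data.

```r
# Minimal sketch, not a definitive recipe. Column names other than "AGE" and
# "manufac" are assumptions about the adni dataset.
if (requireNamespace("ComBatFamQC", quietly = TRUE)) {
  library(ComBatFamQC)

  prep_lmer <- data_prep(
    stage = "harmonization",
    features = colnames(adni)[43:53],
    batch = "manufac",
    covariates = c("timedays", "AGE", "SEX", "DIAGNOSIS"),
    df = adni,
    type = "lmer",                       # linear mixed-effects model
    random = "subid",                    # assumed subject identifier column
    interaction = "timedays,DIAGNOSIS"   # comma-separated pair, as in "age,diagnosis"
  )
}
```

The random argument is only meaningful with type = "lmer"; smooth and smooth_int_type apply to type = "gam" in the same way.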

Returns

data_prep returns a list containing the processed data and parameter-related information for batch effect diagnostics, harmonization, and post-harmonization downstream analysis.
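
Because the element names of the returned list are not enumerated here, a quick way to see what is available is to inspect the object directly. The sketch below reuses the arguments from the documented example and makes no assumption about the list's contents.

```r
# Minimal sketch: inspect the structure of the list returned by data_prep().
if (requireNamespace("ComBatFamQC", quietly = TRUE)) {
  library(ComBatFamQC)

  prep <- data_prep(
    stage = "harmonization",
    features = colnames(adni)[43:53],
    batch = "manufac",
    covariates = "AGE",
    df = head(adni, 100),
    type = "lm"
  )

  stopifnot(is.list(prep))    # the documentation states a list is returned
  print(names(prep))          # top-level element names
  str(prep, max.level = 1)    # one-level overview of each element
}
```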

Examples

data_prep(
  stage = "harmonization",
  result = NULL,
  features = colnames(adni)[43:53],
  batch = "manufac",
  covariates = "AGE",
  df = head(adni, 100),
  type = "lm",
  random = NULL,
  smooth = NULL,
  interaction = NULL,
  smooth_int_type = NULL,
  predict = FALSE,
  object = NULL
)
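
The example above supplies features, batch, covariates, and df directly. The alternative route described under result is to pass the output of visual_prep() instead. The sketch below is hedged: the visual_prep() argument names shown are assumed to mirror those of data_prep(), so check that function's own help page before relying on them.

```r
# Minimal sketch, assuming visual_prep() accepts type, features, batch,
# covariates, and df under these names (not confirmed on this page).
if (requireNamespace("ComBatFamQC", quietly = TRUE)) {
  library(ComBatFamQC)

  # Run batch effect diagnostics first (this step may take a while).
  diag_result <- visual_prep(
    type = "lm",
    features = colnames(adni)[43:53],
    batch = "manufac",
    covariates = "AGE",
    df = adni
  )

  # With result supplied, features, batch, covariates, and df can be skipped.
  prep_from_result <- data_prep(stage = "harmonization", result = diag_result)
}
```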