Prepares a dataset for batch effect diagnostics, harmonization, and post-harmonization downstream analysis within the ComBatFamQC package.
data_prep(stage = "harmonization", result = NULL, features = NULL, batch = NULL, covariates = NULL, df = NULL, type = "lm", random = NULL, smooth = NULL, interaction = NULL, smooth_int_type = NULL, predict = FALSE, object = NULL)
Arguments
stage: The stage of analysis for which the data are being prepared: "harmonization" or "residual".
result: A list derived from visual_prep() that contains dataset and batch effect diagnostic information for Shiny visualization. Can be skipped if features, batch, covariates and df are provided.
features: The names of the features to be harmonized. Can be skipped if result is provided.
batch: The name of the batch variable. Can be skipped if result is provided.
covariates: The names of the covariates supplied to the model. Can be skipped if result is provided.
df: The dataset to be harmonized. Can be skipped if result is provided.
type: The name of a regression model to be used in batch effect diagnostics, harmonization, and the post-harmonization stage: "lmer", "lm", "gam".
random: The name of the random-effect variable in a linear mixed-effects model.
smooth: The names of the covariates that require a smooth function.
interaction: Expression of the interaction terms supplied to the model (e.g., "age,diagnosis").
smooth_int_type: A vector that indicates the type of each interaction in gam models. Defaults to NULL. "linear" represents linear interaction terms; "categorical-continuous" and "factor-smooth" both represent categorical-continuous interactions ("factor-smooth" includes the categorical variable as part of the smooth); "tensor" represents interactions between variables on different scales; and "smooth-smooth" represents interactions between smoothed variables.
predict: A Boolean indicating whether to run ComBat from scratch or apply an existing model to a new dataset (currently only works for "original ComBat" and "ComBat-GAM").
object: Existing ComBat model.
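The comma-separated form of the interaction argument can be read as a pair of variables that interact. The base R sketch below is only an illustration of that reading for a linear model: it is not the package's internal parsing, which may differ, and the response name y and the covariate sex are hypothetical.

```r
# Illustration only: how a comma-separated interaction spec such as
# "age,diagnosis" corresponds to a standard R formula term.
# This is NOT ComBatFamQC's internal code; "y" and "sex" are hypothetical.
interaction <- c("age,diagnosis")
covariates <- c("age", "diagnosis", "sex")

# Replace the comma with ":" to obtain the R interaction operator.
int_terms <- gsub(",", ":", interaction, fixed = TRUE)

# Assemble the right-hand side and a full formula for the response "y".
rhs <- paste(c(covariates, int_terms), collapse = " + ")
model_formula <- as.formula(paste("y ~", rhs))

print(model_formula)
```

Each element of the interaction vector describes one interaction, so several interactions would be supplied as several comma-separated strings.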
Returns
data_prep returns a list containing the processed data and parameter-related information for batch effect diagnostics, harmonization, and post-harmonization downstream analysis.
Examples
data_prep(stage = "harmonization", result = NULL, features = colnames(adni)[43:53], batch = "manufac", covariates = "AGE", df = head(adni, 100), type = "lm", random = NULL, smooth = NULL, interaction = NULL, smooth_int_type = NULL, predict = FALSE, object = NULL)
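To look at what the call returns, the result can be stored and inspected. The sketch below repeats the example call, leaving arguments at their NULL or FALSE defaults, and assumes the ComBatFamQC package is installed and makes the adni data available once loaded. The object name prep is hypothetical, and no element names of the returned list are assumed.

```r
# Guarded so the script still runs where ComBatFamQC is not installed.
if (requireNamespace("ComBatFamQC", quietly = TRUE)) {
  library(ComBatFamQC)

  # Same call as the documented example, with the result kept for inspection.
  # "prep" is a hypothetical object name.
  prep <- data_prep(
    stage = "harmonization",
    features = colnames(adni)[43:53],
    batch = "manufac",
    covariates = "AGE",
    df = head(adni, 100),
    type = "lm"
  )

  # The documentation only states that a list is returned, so the
  # top-level structure is printed rather than accessed by name.
  print(is.list(prep))
  str(prep, max.level = 1)
}
```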