Feature Generation for Contamination Detection Model
Generates features from each pair of input VCF objects for training the contamination detection model.
generate_feature(file, hom_p = 0.999, het_p = 0.5, hom_rho = 0.005, het_rho = 0.1, mixture, homcut = 0.99, highcut = 0.7, hetcut = 0.3)
file
: VCF input object
hom_p
: The initial value for p in the homozygous Beta-Binomial model; default is 0.999
het_p
: The initial value for p in the heterozygous Beta-Binomial model; default is 0.5
hom_rho
: The initial value for rho in the homozygous Beta-Binomial model; default is 0.005
het_rho
: The initial value for rho in the heterozygous Beta-Binomial model; default is 0.1
mixture
: A vector indicating whether each sample is contaminated: 0 for pure, 1 for contaminated
homcut
: Cutoff allele frequency value between hom and high; default is 0.99
highcut
: Cutoff allele frequency value between high and het; default is 0.7
hetcut
: Cutoff allele frequency value between het and low; default is 0.3

Returns a data frame with all features for training the contamination detection model.
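
The three cutoff arguments can be read as splitting the allele-frequency range into four regions (low, het, high, hom). The base R sketch below illustrates that reading with the documented defaults. `classify_af` is a hypothetical helper written for this example, not a function of the package, and the handling of values that fall exactly on a cutoff is an assumption; the package may treat boundaries differently.

```r
# Hypothetical helper (not part of the package): label each allele frequency
# with the region it falls in, using the documented default cutoffs.
classify_af <- function(af, homcut = 0.99, highcut = 0.7, hetcut = 0.3) {
  stopifnot(is.numeric(af), all(af >= 0 & af <= 1),
            hetcut < highcut, highcut < homcut)
  # Assumption: a value equal to a cutoff is assigned to the upper region,
  # i.e. the intervals are closed on the left ([hetcut, highcut), ...).
  cut(af,
      breaks = c(-Inf, hetcut, highcut, homcut, Inf),
      labels = c("low", "het", "high", "hom"),
      right = FALSE)
}

# Example allele frequencies, made up for illustration.
af <- c(0.05, 0.30, 0.52, 0.85, 0.995, 1.00)
regions <- classify_af(af)

# Count how many variants land in each region.
print(table(regions))
```

Counts or proportions per region, as tabulated at the end, are the kind of summary that could feed a feature table; which summaries `generate_feature` actually computes is not stated here.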