generate_feature function

Feature Generation for Contamination Detection Model

Generates features from each pair of input VCF objects for training the contamination detection model.

generate_feature(file, hom_p = 0.999, het_p = 0.5, hom_rho = 0.005, het_rho = 0.1, mixture, homcut = 0.99, highcut = 0.7, hetcut = 0.3)

Arguments

  • file: VCF input object
  • hom_p: Initial value of p in the homozygous Beta-Binomial model; default is 0.999
  • het_p: Initial value of p in the heterozygous Beta-Binomial model; default is 0.5
  • hom_rho: Initial value of rho in the homozygous Beta-Binomial model; default is 0.005
  • het_rho: Initial value of rho in the heterozygous Beta-Binomial model; default is 0.1
  • mixture: A vector indicating whether each sample is contaminated: 0 for pure, 1 for contaminated. This argument has no default and must be supplied
  • homcut: Allele frequency cutoff between the hom and high regions; default is 0.99
  • highcut: Allele frequency cutoff between the high and het regions; default is 0.7
  • hetcut: Allele frequency cutoff between the het and low regions; default is 0.3
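
The three cutoffs split the allele frequency range into four regions (low, het, high, hom). The sketch below illustrates that partition in base R. It is not the package's internal code: `classify_af` is a hypothetical helper, and the choice of which side of each cutoff is inclusive is an assumption, since the argument descriptions do not specify it.

```r
# Hypothetical helper, not part of the package: label allele frequencies
# by region using the three cutoffs described in the arguments above.
classify_af <- function(af, homcut = 0.99, highcut = 0.7, hetcut = 0.3) {
  stopifnot(is.numeric(af), all(af >= 0 & af <= 1, na.rm = TRUE))
  stopifnot(hetcut < highcut, highcut < homcut)
  # Assumption: intervals are left-closed, [lower, upper).
  # include.lowest = TRUE closes the last interval so af == 1 counts as "hom".
  cut(af,
      breaks = c(0, hetcut, highcut, homcut, 1),
      labels = c("low", "het", "high", "hom"),
      right = FALSE, include.lowest = TRUE)
}

# Made-up allele frequencies, for illustration only
af <- c(0.05, 0.30, 0.48, 0.85, 0.995, 1)
regions <- classify_af(af)
table(regions)
```

Passing different values for `homcut`, `highcut`, and `hetcut` moves the region boundaries accordingly; the helper only requires that hetcut < highcut < homcut.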

Returns

A data frame containing all features used to train the contamination detection model.

  • Maintainer: Tao Jiang
  • License: GPL-2
  • Last published: 2018-06-15