consecutiveRuns function

Function to detect consecutive runs in a vector (individual's genotypes)

This is a core function. It implements the consecutive method for detecting runs in diploid genomes (see Marras et al. 2015).

consecutiveRuns(indGeno, individual, mapFile, ROHet = TRUE, minSNP = 3, maxOppositeGenotype = 1, maxMiss = 1, minLengthBps = 1000, maxGap = 10^6)

Arguments

  • indGeno: vector of 0/1/NAs of individual genotypes (0: homozygote; 1: heterozygote)
  • individual: list of group (breed, population, case/control etc.) and ID of individual sample
  • mapFile: Plink map file (for SNP position)
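  • ROHet: logical; if TRUE, detect runs of heterozygosity (ROHet), if FALSE, detect runs of homozygosity (ROHom)
  • minSNP: minimum number of SNPs in a run
  • maxOppositeGenotype: maximum number of opposite-genotype SNPs allowed in a run (heterozygous SNPs in a ROHom, homozygous SNPs in a ROHet)
  • maxMiss: maximum number of missing SNPs allowed in a run
  • minLengthBps: minimum length of a run in bps
  • maxGap: maximum distance (in bps) between consecutive SNPs for them to still be considered part of a potential run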
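
As an illustration of the 0/1/NA encoding expected for indGeno, the following minimal sketch converts pairs of alleles into that vector. The helper encode_genotypes and its missing_code argument are hypothetical (they are not part of the documented API), and the sketch assumes missing alleles are coded as "0", as in Plink ped files.

```r
# Hypothetical helper, not part of the documented API: encode allele pairs
# as 0 (homozygote), 1 (heterozygote) or NA (missing genotype).
# Assumption: missing alleles are coded as "0", as in Plink ped files.
encode_genotypes <- function(allele1, allele2, missing_code = "0") {
  stopifnot(length(allele1) == length(allele2))
  # TRUE (two different alleles) becomes 1, FALSE (same allele) becomes 0
  geno <- as.integer(allele1 != allele2)
  # Any genotype with a missing allele is set to NA
  geno[allele1 == missing_code | allele2 == missing_code] <- NA_integer_
  geno
}

# Toy data: five SNPs for one individual
a1 <- c("A", "A", "G", "0", "C")
a2 <- c("A", "G", "G", "0", "T")
indGeno <- encode_genotypes(a1, a2)
```

Here the second and fifth SNPs are heterozygous (1), the first and third are homozygous (0), and the fourth is missing (NA).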

Returns

A data frame of runs per individual sample

Details

The consecutive method detects runs by scanning SNP loci consecutively along the genome; no sliding windows are used. Checks on the minimum number of SNPs, the maximum number of opposite and missing genotypes, the maximum gap between adjacent loci, and the minimum length of the run are implemented (as in the sliding window method). Both runs of homozygosity (RoHom) and runs of heterozygosity (RoHet) can be searched for (option ROHet: TRUE/FALSE).
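
The scanning logic described above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the actual implementation of consecutiveRuns: the function consecutive_scan_sketch, its geno and pos arguments, and the output column names are all hypothetical, and it handles a single chromosome only. A run is assumed to start and end on a SNP of the target genotype, with tolerated opposite and missing SNPs counted in between.

```r
# Simplified, hypothetical sketch of a consecutive scan (single chromosome).
# geno: vector of 0/1/NA genotypes; pos: SNP positions in bps, sorted.
consecutive_scan_sketch <- function(geno, pos, ROHet = TRUE, minSNP = 3,
                                    maxOppositeGenotype = 1, maxMiss = 1,
                                    minLengthBps = 1000, maxGap = 10^6) {
  target <- if (ROHet) 1 else 0
  # Return a one-row data frame if the candidate run passes the final checks
  as_run <- function(s, e) {
    nSNP <- e - s + 1
    len <- pos[e] - pos[s]
    if (nSNP >= minSNP && len >= minLengthBps) {
      data.frame(from = pos[s], to = pos[e], nSNP = nSNP, lengthBps = len)
    } else {
      NULL
    }
  }
  found <- list()
  start <- NA_integer_  # index of the first SNP of the open run (NA: none)
  last <- NA_integer_   # index of the last target-genotype SNP seen
  nOpp <- 0
  nMiss <- 0
  for (i in seq_along(geno)) {
    open <- !is.na(start)
    isMatch <- !is.na(geno[i]) && geno[i] == target
    gapTooBig <- open && i > 1 && (pos[i] - pos[i - 1]) > maxGap
    # Count tolerated opposite/missing genotypes inside an open run
    if (open && !isMatch && !gapTooBig) {
      if (is.na(geno[i])) nMiss <- nMiss + 1 else nOpp <- nOpp + 1
    }
    tooMany <- nOpp > maxOppositeGenotype || nMiss > maxMiss
    # Close the run at its last matching SNP when a limit is exceeded
    if (open && (gapTooBig || tooMany)) {
      found <- c(found, list(as_run(start, last)))
      start <- NA_integer_
      nOpp <- 0
      nMiss <- 0
      open <- FALSE
    }
    # A SNP of the target genotype opens a new run or extends the open one
    if (isMatch) {
      if (!open) start <- i
      last <- i
    }
  }
  # Close a run that is still open at the end of the chromosome
  if (!is.na(start)) found <- c(found, list(as_run(start, last)))
  found <- Filter(Negate(is.null), found)
  if (length(found) == 0) {
    return(data.frame(from = numeric(0), to = numeric(0),
                      nSNP = numeric(0), lengthBps = numeric(0)))
  }
  do.call(rbind, found)
}

# Toy example: ten SNPs spaced 1000 bps apart
geno <- c(1, 1, 1, 0, 1, 1, 0, 0, 1, 1)
pos <- seq(1000, by = 1000, length.out = 10)
runs <- consecutive_scan_sketch(geno, pos, ROHet = TRUE)
```

With the default thresholds, the toy data yield a single run of heterozygosity spanning the first six SNPs (one homozygous SNP is tolerated); the last two heterozygous SNPs are discarded because they fall below minSNP.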