Function to retrieve most common runs in the population
This function takes as input either the run results or the output of snpInsideRuns (the proportion of times each SNP falls inside a run) for the population/group, and returns the subset of runs most commonly found in that group/population. The threshold parameter defines what counts as most common (e.g. present in at least 50% or 70% of the sampled individuals).
runs: R object (dataframe) with results on detected runs
SnpInRuns: dataframe with the proportion of times each SNP falls inside a run in the population (output from snpInsideRuns)
genotypeFile: PLINK ped file (used for SNP positions)
mapFile: PLINK map file (used for SNP positions)
threshold: value between 0 and 1 (default 0.7) setting the minimum proportion of individuals that must carry a run for it to be reported (e.g. 0.7 for 70%)
Returns
A dataframe with the most common runs detected in the sampled individuals. For each run, the output reports the group/population, the chromosome, the start and end positions, and the number of SNPs included in the run.
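
To make the role of threshold concrete, here is a minimal conceptual sketch in Python. It is not the package's implementation. It assumes the per-SNP proportions are already available as (group, chromosome, position, proportion) tuples, and that a "most common run" is a stretch of consecutive SNPs whose proportion is at or above the threshold. The function name most_common_runs, the field names, and the sample data are all hypothetical and chosen only for illustration.

```python
from itertools import groupby


def most_common_runs(snp_props, threshold=0.7):
    """Merge consecutive SNPs with proportion >= threshold into runs.

    snp_props: iterable of (group, chrom, position, proportion) tuples
    (a hypothetical layout, assumed here for illustration).
    """
    def close(group, chrom, positions, out):
        # Record one finished run: its limits and how many SNPs it spans.
        out.append({
            "group": group,
            "chrom": chrom,
            "start": positions[0],
            "end": positions[-1],
            "n_snp": len(positions),
        })

    runs = []
    # Sort so SNPs are ordered by position within each group/chromosome.
    ordered = sorted(snp_props, key=lambda r: (r[0], r[1], r[2]))
    for (group, chrom), rows in groupby(ordered, key=lambda r: (r[0], r[1])):
        current = []
        for _, _, pos, prop in rows:
            if prop >= threshold:
                current.append(pos)
            elif current:
                # A SNP below the threshold ends the current run.
                close(group, chrom, current, runs)
                current = []
        if current:
            close(group, chrom, current, runs)
    return runs


# Made-up example data: proportion of individuals with each SNP inside a run.
snps = [
    ("breedA", 1, 100, 0.20),
    ("breedA", 1, 200, 0.80),
    ("breedA", 1, 300, 0.90),
    ("breedA", 1, 400, 0.75),
    ("breedA", 1, 500, 0.40),
    ("breedA", 1, 600, 0.70),
    ("breedA", 2, 150, 0.95),
    ("breedA", 2, 250, 0.10),
]

for run in most_common_runs(snps, threshold=0.7):
    print(run)
```

With the default of 0.7, the SNPs at positions 200 to 400 on chromosome 1 are merged into a single run of three SNPs, while the isolated SNPs at 600 (chromosome 1) and 150 (chromosome 2) each form a run of one SNP. Raising the threshold shrinks or removes runs, which is the trade-off the threshold parameter controls.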