check for matching (or close to matching) genotypes in a data frame
A simple function that examines all pairs of fish in the data frame and returns a tibble of those pairs for which the fraction of loci with genotypes missing in neither fish is >= min_frac_non_miss, and whose genotypes match at a fraction >= min_frac_matching of those shared non-missing loci.
D: a two-column format genetic dataset, with "repunit", "collection", and "indiv" columns, as well as a "sample_type" column whose entries are "reference", "mixture", or a combination of both.
gen_start_col: the index of the first column of genetic data in D
min_frac_non_miss: the minimum fraction of loci at which both members of the pair must have non-missing genotypes in order for the pair to be reported
min_frac_matching: the minimum fraction of shared non-missing loci at which the two individuals' genotypes must match for them to be reported as a matching pair
Returns
a tibble ...
Examples
# one pair found in the internal alewife data set:
close_matching_samples(alewife, 17)
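
To make the two thresholds concrete, here is a minimal base-R sketch of how the two fractions could be computed for a single pair of fish. It is not the package's implementation: `pair_match_stats()`, the toy matrices, and the threshold values are hypothetical, and it assumes that the denominator of the first fraction is the total number of loci and that two genotypes match when their unordered allele pairs are equal.

```r
# Hypothetical helper, not part of the package. Each individual is a matrix
# with one row per locus and two columns (one per gene copy); NA = missing.
pair_match_stats <- function(a, b) {
  stopifnot(identical(dim(a), dim(b)), ncol(a) == 2)
  # A locus is "shared" only if neither fish has a missing allele there
  shared <- complete.cases(a) & complete.cases(b)
  # Assumption: allele order within a locus is irrelevant, so sort each row
  sort_rows <- function(m) cbind(pmin(m[, 1], m[, 2]), pmax(m[, 1], m[, 2]))
  sa <- sort_rows(a[shared, , drop = FALSE])
  sb <- sort_rows(b[shared, , drop = FALSE])
  n_match <- sum(sa[, 1] == sb[, 1] & sa[, 2] == sb[, 2])
  c(
    # Assumption: fraction is taken over all loci
    frac_non_miss = sum(shared) / nrow(a),
    frac_matching = if (any(shared)) n_match / sum(shared) else NA_real_
  )
}

# Toy data: four loci, fish1 is missing at the third locus
fish1 <- matrix(c(1, 2,  3, 3,  NA, NA,  4, 5), ncol = 2, byrow = TRUE)
fish2 <- matrix(c(2, 1,  3, 3,  6, 6,    4, 4), ncol = 2, byrow = TRUE)

res <- pair_match_stats(fish1, fish2)

# Illustrative thresholds only; a pair is reported when both conditions hold
reported <- res[["frac_non_miss"]] >= 0.7 && res[["frac_matching"]] >= 0.9
```

In this toy case three of the four loci are shared and two of those three match, so the pair passes the non-missing threshold but fails the matching threshold and would not be reported.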