estimate_p_hat function

estimate_p_hat Estimates probability of linkage between two individuals

This function computes the probability that pathogen sequences from two individuals randomly sampled from their respective population groups (e.g. communities) are linked.

estimate_p_hat(df_counts, ...)

## Default S3 method:
estimate_p_hat(df_counts, ...)

Arguments

  • df_counts: A data.frame returned by prep_p_hat()
  • ...: Further arguments.

Returns

Returns a data.frame containing:

  • H1_group, Name of population group 1
  • H2_group, Name of population group 2
  • number_hosts_sampled_group_1, Number of individuals sampled from population group 1
  • number_hosts_sampled_group_2, Number of individuals sampled from population group 2
  • number_hosts_population_group_1, Estimated number of individuals in population group 1
  • number_hosts_population_group_2, Estimated number of individuals in population group 2
  • max_possible_pairs_in_sample, Number of distinct possible transmission pairs between individuals sampled from population groups 1 and 2
  • max_possible_pairs_in_population, Number of distinct possible transmission pairs between individuals in population groups 1 and 2
  • num_linked_pairs_observed, Number of observed directed transmission pairs between samples from population groups 1 and 2
  • p_hat, Probability that pathogen sequences from two individuals randomly sampled from their respective population groups are linked

Details

For a population group pairing (u, v), p_hat is computed as the fraction of distinct possible pairs between samples from groups u and v that are linked. Note: The number of distinct possible (u, v) pairs in the sample is the product of the numbers of individuals sampled in groups u and v. If u = v, the number of distinct possible pairs is the number of individuals sampled in population group u choose 2. See the bumblebee website for more details: https://magosil86.github.io/bumblebee/.
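The calculation described above can be sketched in base R. This is a minimal illustration, not the package's implementation: the counts are made up, and the helper name max_pairs_in_sample and its same_group argument are hypothetical. How the package itself tallies linked pairs (for example, how direction is handled) is not shown here.

```r
# Hypothetical counts, for illustration only (not taken from the bumblebee data sets)
n_sampled_u <- 50   # individuals sampled from population group u
n_sampled_v <- 40   # individuals sampled from population group v
n_linked_uv <- 10   # linked pairs observed between samples from u and v

# Hypothetical helper: number of distinct possible pairs in the sample
max_pairs_in_sample <- function(n_u, n_v, same_group = FALSE) {
  if (same_group) {
    choose(n_u, 2)   # within-group pairing (u == v): n choose 2
  } else {
    n_u * n_v        # between-group pairing (u != v): product of sample sizes
  }
}

# Between-group pairing: 50 * 40 = 2000 distinct possible pairs
pairs_uv <- max_pairs_in_sample(n_sampled_u, n_sampled_v)

# Fraction of distinct possible pairs that are linked: 10 / 2000 = 0.005
p_hat_uv <- n_linked_uv / pairs_uv

# Within-group pairing: choose(50, 2) = 1225 distinct possible pairs
pairs_uu <- max_pairs_in_sample(n_sampled_u, n_sampled_u, same_group = TRUE)
```

In the data.frame returned by estimate_p_hat(), the corresponding quantities appear to be the max_possible_pairs_in_sample, num_linked_pairs_observed and p_hat columns.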

Methods (by class)

  • default: Estimates probability of linkage between two individuals

Examples

library(bumblebee)
library(dplyr)

# Estimate the probability of linkage between two individuals randomly sampled from
# two population groups of interest.

# We shall use the data of HIV transmissions within and between intervention and control
# communities in the BCPP/Ya Tsie HIV prevention trial. To learn more about the data, see
# ?counts_hiv_transmission_pairs and ?sampling_frequency

# Prepare input to estimate p_hat

# View counts of observed directed HIV transmissions within and between intervention
# and control communities
counts_hiv_transmission_pairs

# View the estimated number of individuals with HIV in intervention and control
# communities and the number of individuals sampled from each
sampling_frequency

results_prep_p_hat <- prep_p_hat(group_in = sampling_frequency$population_group,
                                 individuals_sampled_in = sampling_frequency$number_sampled,
                                 individuals_population_in = sampling_frequency$number_population,
                                 linkage_counts_in = counts_hiv_transmission_pairs,
                                 verbose_output = FALSE)

# View results
results_prep_p_hat

# Estimate p_hat
results_estimate_p_hat <- estimate_p_hat(df_counts = results_prep_p_hat)

# View results
results_estimate_p_hat

References

  1. Magosi LE, et al., Deep-sequence phylogenetics to quantify patterns of HIV transmission in the context of a universal testing and treatment trial – BCPP/Ya Tsie trial. To submit for publication, 2021.
  2. Carnegie, N.B., et al., Linkage of viral sequences among HIV-infected village residents in Botswana: estimation of linkage rates in the presence of missing data. PLoS Computational Biology, 2014. 10(1): p. e1003430.

See Also

See prep_p_hat to prepare the input data for estimating p_hat.

  • Maintainer: Lerato E Magosi
  • License: MIT + file LICENSE
  • Last published: 2021-05-11