estimate_prob_group_pairing_and_linked function

estimate_prob_group_pairing_and_linked: Estimates joint probability of linkage

This function computes the joint probability that a pair of pathogen sequences comes from a specific population group pairing and is linked.

Usage

estimate_prob_group_pairing_and_linked(
  df_counts_and_p_hat,
  individuals_population_in,
  ...
)

## Default S3 method:
estimate_prob_group_pairing_and_linked(
  df_counts_and_p_hat,
  individuals_population_in,
  verbose_output = FALSE,
  ...
)

Arguments

  • df_counts_and_p_hat: A data.frame returned by estimate_p_hat()
  • individuals_population_in: A numeric vector of the estimated number of individuals per population group
  • ...: Further arguments.
  • verbose_output: A logical value indicating whether to display intermediate output. (Default is FALSE)

Returns

Returns a data.frame containing:

  • H1_group, Name of population group 1
  • H2_group, Name of population group 2
  • number_hosts_sampled_group_1, Number of individuals sampled from population group 1
  • number_hosts_sampled_group_2, Number of individuals sampled from population group 2
  • number_hosts_population_group_1, Estimated number of individuals in population group 1
  • number_hosts_population_group_2, Estimated number of individuals in population group 2
  • max_possible_pairs_in_sample, Number of distinct possible transmission pairs between individuals sampled from population groups 1 and 2
  • max_possible_pairs_in_population, Number of distinct possible transmission pairs between individuals in population groups 1 and 2
  • num_linked_pairs_observed, Number of observed directed transmission pairs between samples from population groups 1 and 2
  • p_hat, Probability that pathogen sequences from two individuals randomly sampled from their respective population groups are linked
  • prob_group_pairing_and_linked, Probability that a pair of pathogen sequences is from a specific population group pairing and is linked

Details

For a population group pairing (u,v), the joint probability that a pair is from groups (u,v) and is linked is computed as

(N_uv / N_choose_2) * p_hat_uv,

where,

  • N_uv = N_u * N_v: maximum number of distinct possible (u,v) pairs in the population
  • p_hat_uv: probability of linkage between two individuals randomly sampled from groups u and v
  • N_choose_2 = (N * (N - 1)) / 2: number of all distinct possible pairs in the population.
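
The formula above can be checked by hand with a short sketch. The group sizes and the value of p_hat_uv below are hypothetical, chosen only for illustration. The sketch also assumes a population made up of just two groups, with N taken as the sum of the group sizes. It illustrates the arithmetic only and is not the package's internal implementation.

```r
# Hypothetical population sizes for two groups (illustrative values only)
N_u <- 1000          # individuals in group u
N_v <- 1500          # individuals in group v
N   <- N_u + N_v     # assumption: N is the total over all groups (only two here)

# Hypothetical probability of linkage between groups u and v
p_hat_uv <- 0.002

# Maximum number of distinct possible (u,v) pairs in the population
N_uv <- N_u * N_v

# Number of all distinct possible pairs in the population: N choose 2
N_choose_2 <- choose(N, 2)

# Joint probability that a pair is from groups (u,v) and is linked
prob_group_pairing_and_linked <- (N_uv / N_choose_2) * p_hat_uv
prob_group_pairing_and_linked
```

Because N_uv / N_choose_2 is the share of all possible pairs that belong to the (u,v) pairing, the joint probability can never exceed p_hat_uv itself.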

See the bumblebee website for more details: https://magosil86.github.io/bumblebee/

Methods (by class)

  • default: Estimates joint probability of linkage

Examples

library(bumblebee)
library(dplyr)

# Estimate joint probability that a pair is from a specific group pairing and linked

# We shall use the data of HIV transmissions within and between intervention and control
# communities in the BCPP/Ya Tsie HIV prevention trial. To learn more about the data, see
# ?counts_hiv_transmission_pairs and ?sampling_frequency

# Load and view data
#
# The input data comprises counts of observed directed HIV transmission pairs
# within and between intervention and control communities in the BCPP/Ya Tsie
# trial, sampling information and the probability of linkage between individuals
# sampled from intervention and control communities (i.e. p_hat)
#
# See ?estimate_p_hat() for details on estimating p_hat
results_estimate_p_hat <- estimated_hiv_transmission_flows[, c(1:10)]

results_estimate_p_hat

# Estimate prob_group_pairing_and_linked
results_prob_group_pairing_and_linked <- estimate_prob_group_pairing_and_linked(
  df_counts_and_p_hat = results_estimate_p_hat,
  individuals_population_in = sampling_frequency$number_population)

# View results
results_prob_group_pairing_and_linked

References

  1. Magosi LE, et al., Deep-sequence phylogenetics to quantify patterns of HIV transmission in the context of a universal testing and treatment trial – BCPP/Ya Tsie trial. To submit for publication, 2021.
  2. Carnegie, N.B., et al., Linkage of viral sequences among HIV-infected village residents in Botswana: estimation of linkage rates in the presence of missing data. PLoS Computational Biology, 2014. 10(1): p. e1003430.

See Also

See estimate_p_hat() to prepare the input data for estimating prob_group_pairing_and_linked.

  • Maintainer: Lerato E Magosi
  • License: MIT + file LICENSE
  • Last published: 2021-05-11