prep_p_hat function

prep_p_hat Prepares input data to estimate p_hat

This function generates the variables required for estimating p_hat, the probability that pathogen sequences from two individuals randomly sampled from their respective population groups are linked. For a population group pairing (u, v), prep_p_hat determines all possible group pairings, i.e. (uu, uv, vu, vv).
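The enumeration of group pairings described above can be sketched in base R. This is an illustration only, not the package's internal code; the group labels `u` and `v` and the `pairing` column are hypothetical names chosen for the example.

```r
# Hypothetical group labels; any character vector of strata works the same way.
groups <- c("u", "v")

# expand.grid() builds every ordered combination of the two columns,
# so both (u, v) and (v, u) appear alongside (u, u) and (v, v).
pairings <- expand.grid(
  H1_group = groups,
  H2_group = groups,
  stringsAsFactors = FALSE
)

# Label each pairing by concatenating H1_group and H2_group.
pairings$pairing <- paste0(pairings$H1_group, pairings$H2_group)

pairings
```

With two groups this yields four rows, one per pairing; with k groups it yields k^2 rows.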

prep_p_hat(
  group_in,
  individuals_sampled_in,
  individuals_population_in,
  linkage_counts_in,
  ...
)

## Default S3 method:
prep_p_hat(
  group_in,
  individuals_sampled_in,
  individuals_population_in,
  linkage_counts_in,
  verbose_output = FALSE,
  ...
)

Arguments

  • group_in: A character vector indicating population groups/strata (e.g. communities, age-groups, genders or trial arms) between which transmission flows will be evaluated,

  • individuals_sampled_in: A numeric vector indicating the number of individuals sampled per population group,

  • individuals_population_in: A numeric vector of the estimated number of individuals per population group,

  • linkage_counts_in: A data.frame of counts of linked pairs identified between samples of each population group pairing of interest.

    The data.frame should contain the following three fields:

    • H1_group (character) Name of population group 1
    • H2_group (character) Name of population group 2
    • number_linked_pairs_observed (numeric) Number of observed directed transmission pairs between samples from population groups 1 and 2
  • ...: Further arguments.

  • verbose_output: A logical value indicating whether to display intermediate output (default: FALSE).
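As a rough illustration of the input shapes listed above, the following base R sketch builds the four inputs by hand and runs a few sanity checks. All values are made up for the example and are not data from the package; the field names follow the list above, and the checks are suggestions rather than validation that prep_p_hat is known to perform.

```r
# Made-up values for illustration only; not data from the package.
group_in <- c("intervention", "control")
individuals_sampled_in <- c(100, 80)
individuals_population_in <- c(500, 400)

# One row per group pairing of interest, using the three fields
# named in the Arguments list.
linkage_counts_in <- data.frame(
  H1_group = c("intervention", "intervention", "control", "control"),
  H2_group = c("intervention", "control", "intervention", "control"),
  number_linked_pairs_observed = c(10, 3, 2, 6),
  stringsAsFactors = FALSE
)

# Sanity checks one might run before calling prep_p_hat():
# the three vectors are parallel, samples do not exceed the population,
# and every group in the linkage counts is a known group.
stopifnot(
  length(group_in) == length(individuals_sampled_in),
  length(group_in) == length(individuals_population_in),
  all(individuals_sampled_in <= individuals_population_in),
  all(linkage_counts_in$H1_group %in% group_in),
  all(linkage_counts_in$H2_group %in% group_in)
)
```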

Returns

Returns a data.frame containing:

  • H1_group, Name of population group 1
  • H2_group, Name of population group 2
  • number_hosts_sampled_group_1, Number of individuals sampled from population group 1
  • number_hosts_sampled_group_2, Number of individuals sampled from population group 2
  • number_hosts_population_group_1, Estimated number of individuals in population group 1
  • number_hosts_population_group_2, Estimated number of individuals in population group 2
  • max_possible_pairs_in_sample, Number of distinct possible transmission pairs between individuals sampled from population groups 1 and 2
  • max_possible_pairs_in_population, Number of distinct possible transmission pairs between individuals in population groups 1 and 2
  • num_linked_pairs_observed, Number of observed directed transmission pairs between samples from population groups 1 and 2
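To make the two "distinct possible pairs" columns concrete, here is a minimal sketch under a stated assumption: within a group, distinct pairs are counted as unordered pairs, n(n - 1)/2, and between two different groups as the product of the group sizes. This page does not spell out the formula, so the package's own calculation may differ. The helper `max_possible_pairs` is hypothetical and not part of bumblebee.

```r
# Hypothetical helper, not part of bumblebee.
# Assumption: within-group pairs are unordered, n * (n - 1) / 2;
# between-group pairs are every cross combination, n_1 * n_2.
max_possible_pairs <- function(group_1, group_2, n_1, n_2) {
  ifelse(group_1 == group_2, n_1 * (n_1 - 1) / 2, n_1 * n_2)
}

# Within a group of 100 sampled individuals: 100 * 99 / 2 = 4950
within_pairs <- max_possible_pairs("u", "u", 100, 100)

# Between groups of 100 and 80 sampled individuals: 100 * 80 = 8000
between_pairs <- max_possible_pairs("u", "v", 100, 80)
```

Under the same assumption, the helper applies equally to the sample counts and to the estimated population counts, since only the group sizes change.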

Details

Counts of observed directed transmission pairs can be obtained from deep-sequence phylogenetic data (via phyloscanner) or from known epidemiological contacts. Note: deep-sequence data are also commonly referred to as high-throughput or next-generation sequence data. See the References to learn more about phyloscanner.

Methods (by class)

  • default: Prepares input data to estimate p_hat

Examples

library(bumblebee)
library(dplyr)

# Prepare input to estimate p_hat

# We shall use the data of HIV transmissions within and between intervention and control
# communities in the BCPP/Ya Tsie HIV prevention trial. To learn more about the data
# ?counts_hiv_transmission_pairs and ?sampling_frequency

# View counts of observed directed HIV transmissions within and between intervention
# and control communities
counts_hiv_transmission_pairs

# View the estimated number of individuals with HIV in intervention and control
# communities and the number of individuals sampled from each
sampling_frequency

results_prep_p_hat <- prep_p_hat(
  group_in = sampling_frequency$population_group,
  individuals_sampled_in = sampling_frequency$number_sampled,
  individuals_population_in = sampling_frequency$number_population,
  linkage_counts_in = counts_hiv_transmission_pairs,
  verbose_output = TRUE
)

# View results
results_prep_p_hat

References

  1. Magosi, L.E., et al., Deep-sequence phylogenetics to quantify patterns of HIV transmission in the context of a universal testing and treatment trial – BCPP/Ya Tsie trial. To submit for publication, 2021.
  2. Ratmann, O., et al., Inferring HIV-1 transmission networks and sources of epidemic spread in Africa with deep-sequence phylogenetic analysis. Nature Communications, 2019. 10(1): p. 1411.
  3. Wymant, C., et al., PHYLOSCANNER: Inferring Transmission from Within and Between-Host Pathogen Genetic Diversity. Molecular Biology and Evolution, 2017. 35(3): p. 719-733.

See Also

estimate_p_hat

  • Maintainer: Lerato E Magosi
  • License: MIT + file LICENSE
  • Last published: 2021-05-11