edgelist_from_base function

Compute the edgelist of a network from a database of movement records.

This function computes the edgelist of a network of facilities across which subjects can be transferred. The edgelist is computed from a database that contains the records of the subjects' stays in the facilities.

edgelist_from_base(
  base,
  window_threshold = 365,
  count_option = "successive",
  prob_params = c(0.0036, 1/365, 0.128),
  condition = "dates",
  noloops = TRUE,
  nmoves_threshold = NULL,
  flag_vars = NULL,
  flag_values = NULL,
  verbose = FALSE
)

Arguments

  • base: (data.table) A database of records of stays of subjects in facilities. The table should have at least the following columns:

    • subjectID (character) unique subject identifier
    • facilityID (character) unique facility identifier
    • admDate (POSIXct) date of admission to the facility
    • disDate (POSIXct) date of discharge from the facility
  • window_threshold: (integer) A number of days. If two stays of a subject at two facilities occurred within this window, this constitutes a connection between the two facilities (given that potential other conditions are met).

  • count_option: (character) How to count connections. Options are "successive", "probability" or "all". See details.

  • prob_params: (vector of numeric) Three numerical values used to calculate the probability that a movement causes an introduction from hospital A to hospital B. For use with count_option="probability". See Donker T, Wallinga J, Grundmann H. (2010) doi:10.1371/journal.pcbi.1000715 for more details.

    • prob_params[1] is the rate of acquisition in hospital A (related to LOS in hospital A). Default: 0.0036
    • prob_params[2] is the rate of loss of colonisation (related to time between admissions). Default: 1/365
    • prob_params[3] is the rate of transmission to other patients in hospital B (related to LOS in hospital B). Default: 0.128

  • condition: (character) Condition(s) used to decide what constitutes a connection. Can be "dates", "flags", or "both". See details.

  • noloops: (boolean) Whether to remove transfers within the same node (loops). Defaults to TRUE, which removes loops (sets the matrix diagonal to 0).

  • nmoves_threshold: (numeric) A threshold for the minimum number of subject transfers between two facilities. Set to NULL to deactivate. Defaults to NULL.

  • flag_vars: (list) Additional variables that can help flag a transfer, besides the dates of admission and discharge. Must be a named list of two character vectors giving the names of the columns that can flag a transfer: the column that can flag a potential origin, and the column that can flag a potential target. The list must be named with "origin" and "target". E.g.: list("origin" = "var1", "target" = "var2"). See details.

  • flag_values: (list) A named list of two character vectors containing the values of the variables in flag_vars that are matched to flag a potential transfer. The list must be named with "origin" and "target". The character vectors may be of length greater than one. E.g.: list("origin" = c("value1", "value2"), "target" = c("value2", "value2")). The values in "origin" and "target" are the values that flag a potential origin of a transfer, or a potential target, respectively. See details.

  • verbose: (boolean) Set to TRUE to print computation steps. Defaults to FALSE.
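
To make the date condition concrete, here is a minimal base-R sketch of how successive stays of one subject could be turned into connections under window_threshold. It is an illustration, not the package's implementation: the toy table `stays`, the choice to measure the gap from discharge to the next admission, and the inclusive comparison (`<=`) are all assumptions made for this example.

```r
# Toy records for one subject, using the column names described above.
# Plain data.frame for illustration; the function itself expects a data.table.
stays <- data.frame(
  subjectID  = c("s1", "s1", "s1"),
  facilityID = c("f1", "f2", "f3"),
  admDate    = as.POSIXct(c("2020-01-01", "2020-01-20", "2021-06-01"), tz = "UTC"),
  disDate    = as.POSIXct(c("2020-01-10", "2020-01-25", "2021-06-05"), tz = "UTC"),
  stringsAsFactors = FALSE
)

window_threshold <- 365  # days

# Order stays by subject and admission date, then pair each stay with the next one
stays <- stays[order(stays$subjectID, stays$admDate), ]
n <- nrow(stays)
same_subject <- stays$subjectID[-n] == stays$subjectID[-1]

# ASSUMPTION: the gap runs from discharge to the next admission
gap_days <- as.numeric(difftime(stays$admDate[-1], stays$disDate[-n], units = "days"))

# ASSUMPTION: a gap exactly equal to the threshold still counts
connected <- same_subject & gap_days <= window_threshold

# Long-format edgelist: one row per retained movement
el_long <- data.frame(
  origin = stays$facilityID[-n][connected],
  target = stays$facilityID[-1][connected],
  stringsAsFactors = FALSE
)
```

In this toy data the move from f1 to f2 falls inside the window, while the move from f2 to f3 happens more than a year after discharge and is dropped.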

Returns

A list of two data.tables, which are the edgelists. One in long format (el_long), and one aggregated by pair of nodes (el_aggr).

Details

The edgelist contains the information on the connections between nodes of the network, that is the movements of subjects between facilities. The edgelist can be in two different formats: long or aggregated. In long format, each row corresponds to a single movement between two facilities, therefore only two columns are needed, one containing the origin facilities of a movement, the other containing the target facilities. In aggregated format, the edgelist is aggregated by unique pairs of origin-target facilities.
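
The relationship between the two formats can be sketched in base R as follows. This is a hedged illustration only: the facility names are made up, and the count column name `N` is an assumption for this example rather than a documented column of el_aggr.

```r
# Long format: one row per movement, origin and target columns only
el_long <- data.frame(
  origin = c("f1", "f1", "f2", "f1"),
  target = c("f2", "f2", "f3", "f3"),
  stringsAsFactors = FALSE
)

# Aggregated format: one row per unique origin-target pair, with a movement count.
# ASSUMPTION: the count column is called N here purely for illustration.
el_long$N <- 1L
el_aggr <- aggregate(N ~ origin + target, data = el_long, FUN = sum)

print(el_aggr)
```

The counts in the aggregated table sum to the number of rows of the long table, so no movement is lost in the aggregation.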

Examples

mydb <- create_fake_subjectDB(n_subjects = 100, n_facilities = 10)
myBase <- checkBase(mydb)
edgelist_from_base(myBase)

See Also

matrix_from_edgelist, matrix_from_base