Record Linkage and Epidemiological Case Definitions in 'R'
Sub-criteria attributes.
Vectorised approach to group operations.
Vector combinations
Nested sorting
d_report
Unlink group identifiers
Labelling in diyar
epid object
Group dated events into episodes.
Link events to chronological episodes.
Multistage record linkage
Record linkage
Grammatical lists.
Convert an edge list to record identifiers.
Combinations and permutations of record-sets.
Create epid and pid objects with index of matching records
Merge group identifiers
number_line object
number_line
Overlapping number line objects
pane object
Distribute events into specified intervals.
pid objects
Predefined logical tests in ‘diyar’
Modify sub_criteria objects
Schema diagram for group identifiers
Set operations on number line objects
Match criteria
Windows and lengths
'diyar' is an R package for iterative and batched record linkage and for applying epidemiological case definitions. It can be used for deterministic or probabilistic record linkage, or for multistage record linkage that combines both approaches. It implements nested match criteria and provides mechanisms to address missing data and conflicting matches during stepwise record linkage. Case definitions are implemented by assigning records to groups based on match criteria such as person or place, and on overlapping times or durations of events, e.g. sample collection dates or periods of hospital stay. Matching records are assigned a unique group ID. Index and duplicate records can then be removed or analysed further as required.
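The idea of stepwise linkage with a shared group ID can be illustrated with a minimal, language-agnostic sketch. It is written in Python rather than R and does not use or reproduce the 'diyar' API: the function name `link_records`, the field names, and the sample records are all hypothetical. It assumes exact (deterministic) matching only, treats a missing value as a non-match, and merges groups transitively across stages, which is a simplification of how conflicting matches might be resolved in practice.

```python
def link_records(records, criteria):
    """Assign a group ID to each record by exact matching, one criterion per stage.

    records  : list of dicts, one per record (hypothetical structure)
    criteria : list of field names, applied in order as linkage stages
    A missing value (None) is never treated as a match.
    """
    # Union-find forest: every record starts as its own group.
    parent = list(range(len(records)))

    def find(i):
        # Follow parent links to the group's root, shortening the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for field in criteria:
        first_seen = {}  # value -> index of the first record holding that value
        for i, rec in enumerate(records):
            value = rec.get(field)
            if value is None:
                continue  # missing data: skip rather than match
            if value in first_seen:
                a, b = find(first_seen[value]), find(i)
                # Keep the earliest record as the root, so it acts as the index record.
                parent[max(a, b)] = min(a, b)
            else:
                first_seen[value] = i

    # Group ID = 1-based position of the group's earliest (index) record.
    return [find(i) + 1 for i in range(len(records))]


records = [
    {"name": "ann", "phone": "111"},
    {"name": "ann", "phone": None},
    {"name": None, "phone": "111"},
    {"name": "bob", "phone": "222"},
    {"name": None, "phone": None},
]
group_ids = link_records(records, ["name", "phone"])
print(group_ids)  # [1, 1, 1, 4, 5]
```

In this sketch the first three records end up in one group: the second matches the first on name, and the third matches the first on phone. The last record has no usable values and stays in a group of its own. A record whose group ID equals its own position is the index record of its group, and the rest are duplicates that could be dropped or analysed further.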
Useful links