ds2dd_detailed function

Extract data from a Stata file for a data dictionary

ds2dd_detailed(
  data,
  add.auto.id = FALSE,
  date.format = "dmy",
  form.name = NULL,
  form.sep = NULL,
  form.prefix = TRUE,
  field.type = NULL,
  field.label = NULL,
  field.label.attr = "label",
  field.validation = NULL,
  metadata = names(REDCapCAST::redcapcast_meta),
  convert.logicals = FALSE
)

Arguments

  • data: data frame
  • add.auto.id: flag to add id column
  • date.format: date format, character string. One of "ymd", "dmy" or "mdy". Default is "dmy".
  • form.name: manually specify form name(s). Vector of length 1 or ncol(data). Default is NULL and "data" is used.
  • form.sep: If the supplied data set has form names as a suffix or prefix to the column/variable names, the separator can be specified. If supplied, form.name is ignored. Default is NULL.
  • form.prefix: Flag to set if form is prefix (TRUE) or suffix (FALSE) to the column names. Assumes all columns have pre- or suffix if specified.
  • field.type: manually specify field type(s). Vector of length 1 or ncol(data). Default is NULL and "text" is used for everything but factors, which will get "radio".
  • field.label: manually specify field label(s). Vector of length 1 or ncol(data). Default is NULL and colnames(data) is used or attribute field.label.attr for haven_labelled data set (imported .dta file with haven::read_dta()).
  • field.label.attr: attribute name for named labels for haven_labelled data set (imported .dta file with haven::read_dta()). Default is "label".
  • field.validation: manually specify field validation(s). Vector of length 1 or ncol(data). Default is NULL and levels() are used for factors or attribute factor.labels.attr for haven_labelled data set (imported .dta file with haven::read_dta()).
  • metadata: REDCap metadata headings. Default is names(REDCapCAST::redcapcast_meta).
  • convert.logicals: convert logicals to factor. Default is FALSE.
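
To make the form.sep and form.prefix naming convention concrete, the sketch below splits a few column names by hand in base R. It only illustrates the convention described above. It is not the function's internal implementation, and the column names are made up for the example.

```r
# Hypothetical column names carrying the form name as a prefix,
# separated from the field name by "__" (i.e. form.sep = "__", form.prefix = TRUE)
cols <- c("a__record_id", "a__Sepal.Length", "b__Species")

# Split each name on the literal separator
parts <- strsplit(cols, "__", fixed = TRUE)

# With a prefix convention the form name comes first, the field name second
form_names  <- vapply(parts, `[`, character(1), 1)
field_names <- vapply(parts, `[`, character(1), 2)

data.frame(column = cols, form = form_names, field = field_names)
```

With form.prefix = FALSE the two positions would be swapped, so the form name would be read from the part after the separator.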

Returns

list of length 2
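
A minimal sketch of inspecting the returned list, assuming the REDCapCAST package is installed. The element names "data" and "meta" are taken from how the examples on this page access the result with purrr::pluck(); the exact columns of each element are not shown here and should be checked on your own output.

```r
library(REDCapCAST)

# Build a data dictionary from a built-in data set, adding an ID column
out <- ds2dd_detailed(iris, add.auto.id = TRUE)

# The result is a list of length 2
length(out)
names(out)

# Look at each element before uploading anything to REDCap
str(out$data, max.level = 1)
head(out$meta)
```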

Details

This function is a natural development of the ds2dd() function. It assumes that the first column is the ID column; no checks are performed. Always inspect the data dictionary before upload.
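
Since the ID column assumption is not checked, a small pre-flight check can be run before calling the function. The helper below is a hypothetical base R sketch and is not part of REDCapCAST; it only tests that the first column has no missing values and no duplicates, which may or may not be the right criteria for a given project.

```r
# Hypothetical helper (not part of REDCapCAST): basic sanity checks
# on the first column, which ds2dd_detailed() treats as the ID column
check_id_column <- function(data) {
  id <- data[[1]]
  list(
    name     = names(data)[1],
    complete = !anyNA(id),             # no missing IDs
    unique   = anyDuplicated(id) == 0  # no repeated IDs
  )
}

# mtcars has no ID column; its first column (mpg) contains repeated values
check_id_column(mtcars)

# A data set with an explicit, unique ID in the first position
with_id <- cbind(record_id = seq_len(nrow(mtcars)), mtcars)
check_id_column(with_id)
```

If the check fails, either reorder the columns so the ID comes first or let the function create one with add.auto.id = TRUE.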

Ensure that the data set is formatted with as much information as possible.

field.type can be supplied manually (see Arguments).

Examples

## Basic parsing with default options
requireNamespace("REDCapCAST")
redcapcast_data |>
  dplyr::select(-dplyr::starts_with("redcap_")) |>
  ds2dd_detailed()

## Adding a record_id field
iris |> ds2dd_detailed(add.auto.id = TRUE)

## Passing form name information to function
iris |>
  ds2dd_detailed(
    add.auto.id = TRUE,
    form.name = sample(c("b", "c"), size = 6, replace = TRUE, prob = rep(.5, 2))
  ) |>
  purrr::pluck("meta")

mtcars |>
  dplyr::mutate(unknown = NA) |>
  numchar2fct() |>
  ds2dd_detailed(add.auto.id = TRUE)

## Using column name suffix to carry form name
data <- iris |>
  ds2dd_detailed(add.auto.id = TRUE) |>
  purrr::pluck("data")
names(data) <- glue::glue("{sample(x = c('a','b'),size = length(names(data)), replace=TRUE,prob = rep(x=.5,2))}__{names(data)}")
data |> ds2dd_detailed(form.sep = "__")