redcap_column_sanitize() R function from REDCapR

Sanitize to adhere to REDCap character encoding requirements

Replace non-ASCII characters with legal characters that won't cause problems when writing to a REDCap project.


redcap_column_sanitize(
  d,
  column_names = colnames(d),
  encoding_initial = "latin1",
  substitution_character = "?"
)

Arguments

d: The base::data.frame() or tibble::tibble() containing the dataset used to update the REDCap project. Required.
column_names: An array of character values indicating the names of the variables to sanitize. Optional.
encoding_initial: The character encoding of the values before sanitizing, as understood by base::iconv(). Defaults to "latin1". Optional.
substitution_character: The character value that replaces characters that could not be converted. Defaults to "?". Optional.

Returns

A data frame with the same columns, but whose character values have been sanitized.

Details

Letters like an accented 'A' are replaced with a plain 'A'.

This is a thin wrapper around base::iconv(). The ASCII//TRANSLIT option does the actual transliteration work. As of R 3.1.0, operating systems use similar, but not identical, iconv implementations to convert the characters, so the same input can produce different output on different platforms. Keep this in mind if you notice OS-dependent differences.
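
The transliteration step described above can be approximated with a direct call to base::iconv(). This is a minimal sketch, not the REDCapR source: sanitize_sketch is a hypothetical name, and the mapping of encoding_initial to iconv's from argument and of substitution_character to its sub argument is an assumption based on the argument descriptions. No expected output is shown because the transliterated letters can differ across operating systems.

```r
# Hypothetical helper: a sketch of the underlying iconv() call,
# not the actual REDCapR implementation.
sanitize_sketch <- function(x, encoding_initial = "latin1", substitution_character = "?") {
  base::iconv(
    x    = x,
    from = encoding_initial,       # Encoding the input is assumed to be in.
    to   = "ASCII//TRANSLIT",      # Ask iconv to transliterate into ASCII.
    sub  = substitution_character  # Replaces input that cannot be converted.
  )
}

# The first two values contain latin1 bytes; the third is already plain ASCII.
x <- c("Ekstr\xf8m", "J\xf6reskog", "plain ascii")
sanitized <- sanitize_sketch(x)

# The accented letters may be rendered differently depending on the OS.
print(sanitized)
```

Values that are already plain ASCII should pass through unchanged; only the non-ASCII letters are subject to the platform's transliteration rules.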

Examples


# Typical examples are not shown because they require non-ASCII encoding,
#   which makes the package documentation less portable.

dirty <- data.frame(
  id     = 1:3,
  names  = c("Ekstr\xf8m", "J\xf6reskog", "bi\xdfchen Z\xfcrcher")
)

REDCapR::redcap_column_sanitize(dirty)
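
Because column_names defaults to colnames(d), every column is passed to the sanitizer unless told otherwise. The sketch below shows one way to limit the call to the character columns and to choose a different substitution character. The helper variable character_columns is illustrative and not part of REDCapR, and the call is guarded so the snippet still runs when the package is not installed.

```r
dirty <- data.frame(
  id    = 1:3,
  names = c("Ekstr\xf8m", "J\xf6reskog", "bi\xdfchen Z\xfcrcher"),
  stringsAsFactors = FALSE  # Keep `names` as character on R versions before 4.0.
)

# Names of the columns that hold character values (illustrative helper variable).
character_columns <- colnames(dirty)[vapply(dirty, is.character, logical(1))]

# Run only if REDCapR is available.
if (requireNamespace("REDCapR", quietly = TRUE)) {
  clean <- REDCapR::redcap_column_sanitize(
    d                      = dirty,
    column_names           = character_columns,
    substitution_character = "_"
  )
  print(clean)
}
```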

Author(s)

Will Beasley

REDCapR package

Maintainer: Will Beasley
License: MIT + file LICENSE
Last published: 2025-01-11
