utils_num_str function

Utilities for handling numbers and strings

  • all_lower_case(): Translate all non-numeric strings of a data frame to lower case.
  • all_upper_case(): Translate all non-numeric strings of a data frame to upper case.
  • all_title_case(): Translate all non-numeric strings of a data frame to title case.
  • first_upper_case(): Translate the first word of a string to upper case.
  • extract_number(): Extract the number(s) from a string.
  • extract_string(): Extract all strings, ignoring case.
  • find_text_in_num(): Find text characters in a numeric sequence and return the row index.
  • has_text_in_num(): Inspect columns for text in numeric sequences and issue a warning if text is found.
  • remove_space(): Remove all blank spaces from a string.
  • remove_strings(): Remove all strings from a variable.
  • replace_number(): Replace numbers with a replacement.
  • replace_string(): Replace all strings with a replacement, ignoring case.
  • round_cols(): Round a selected column or a whole data frame to significant figures.
  • tidy_strings(): Tidy up character strings, non-numeric columns, or any selected columns in a data frame by putting all words in upper case, replacing any space, tabulation, or punctuation character with '_', and inserting '_' between lower and upper case. For example, str = c("Env1", "env 1", "env.1") should, by definition, represent a single level in plant breeding trials (environment 1); tidy_strings(str) returns c("ENV_1", "ENV_1", "ENV_1"). See the Examples section for more examples.
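
The transformation described for tidy_strings() can be approximated in base R. The sketch below is an illustration under stated assumptions, not the package's implementation: tidy_sketch() is a hypothetical helper, and the step that inserts the separator between a letter and a digit is inferred from the "Env1" example rather than stated in the description.

```r
# Hypothetical base-R approximation of the behaviour described for tidy_strings().
tidy_sketch <- function(x, sep = "_") {
  # Put the separator between a lower-case letter and a following upper-case one
  x <- gsub("([a-z])([A-Z])", paste0("\\1", sep, "\\2"), x)
  # Replace runs of spaces, tabs, and punctuation with the separator
  x <- gsub("[[:space:][:punct:]]+", sep, x)
  # Assumption: also separate a letter from a following digit ("Env1" -> "Env_1")
  x <- gsub("([[:alpha:]])([0-9])", paste0("\\1", sep, "\\2"), x)
  # Put everything in upper case
  toupper(x)
}

str <- c("Env1", "env 1", "env.1")
tidy_sketch(str)
```

With these three inputs the sketch collapses every spelling to "ENV_1", which matches the example given in the description.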

Usage

all_upper_case(.data, ...)
all_lower_case(.data, ...)
all_title_case(.data, ...)
first_upper_case(.data, ...)
extract_number(.data, ..., pattern = NULL)
extract_string(.data, ..., pattern = NULL)
find_text_in_num(.data, ...)
has_text_in_num(.data)
remove_space(.data, ...)
remove_strings(.data, ...)
replace_number(.data, ..., pattern = NULL, replacement = "", ignore_case = FALSE)
replace_string(.data, ..., pattern = NULL, replacement = "", ignore_case = FALSE)
round_cols(.data, ..., digits = 2)
tidy_strings(.data, ..., sep = "_")

Arguments

  • .data: A data frame

  • ...: The argument depends on the function used.

    • For round_cols(), ... are the variables to round. If no variable is specified, all numeric variables in .data are used.
    • For all_lower_case(), all_upper_case(), all_title_case(), extract_number(), extract_string(), remove_strings(), and tidy_strings(), ... are the variables to which the function is applied. If no variable is specified, the function is applied to all non-numeric variables in .data.
  • pattern: A string to be matched. Regular expression syntax is also allowed.

  • replacement: A string for replacement.

  • ignore_case: If FALSE (default), the pattern matching is case sensitive and if TRUE, case is ignored during matching.

  • digits: The number of significant figures.

  • sep: A character string to separate the terms. Defaults to "_".
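
The roles of pattern, replacement, and ignore_case can be illustrated with base R's gsub(). This is only an analogy for the matching semantics described above, assuming the package functions behave similarly; it does not call the package, and the genotypes vector is made up for illustration.

```r
# Illustrative vector (not from the package data sets)
genotypes <- c("G1", "g2", "G10")

# Case-sensitive matching (analogous to ignore_case = FALSE): "g2" is untouched
case_sensitive <- gsub("G", "GENOTYPE_", genotypes)

# Case-insensitive matching (analogous to ignore_case = TRUE): all entries change
case_insensitive <- gsub("G", "GENOTYPE_", genotypes, ignore.case = TRUE)

# A regular expression as the pattern: drop every run of digits
letters_only <- gsub("[0-9]+", "", genotypes)

case_sensitive    # "GENOTYPE_1"  "g2"          "GENOTYPE_10"
case_insensitive  # "GENOTYPE_1"  "GENOTYPE_2"  "GENOTYPE_10"
letters_only      # "G" "g" "G"
```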

Examples

library(metan)

################ Rounding numbers ###############
# All numeric columns
round_cols(data_ge2, digits = 1)

# Round specific columns
round_cols(data_ge2, EP, digits = 1)

########### Extract or replace numbers ##########
# Extract numbers
extract_number(data_ge, GEN)

# Replace numbers
replace_number(data_ge, GEN)
replace_number(data_ge, GEN, pattern = 1, replacement = "_one")

########## Extract, replace or remove strings ##########
# Extract strings
extract_string(data_ge, GEN)

# Replace strings
replace_string(data_ge, GEN)
replace_string(data_ge, GEN, pattern = "G", replacement = "GENOTYPE_")

# Remove strings
remove_strings(data_ge)
remove_strings(data_ge, ENV)

############ Find text in numeric sequences ###########
mixed_text <- data.frame(data_ge)
mixed_text[2, 4] <- "2..503"
mixed_text[3, 4] <- "3.2o75"
find_text_in_num(mixed_text, GY)

############# upper, lower and title cases ############
gen_text <- c("This is the first string.", "this is the second one")
all_lower_case(gen_text)
all_upper_case(gen_text)
all_title_case(gen_text)
first_upper_case(gen_text)

# A whole data frame
all_lower_case(data_ge)

############### Tidy up messy text string ##############
messy_env <- c("ENV 1", "Env 1", "Env1", "env1", "Env.1", "Env_1")
tidy_strings(messy_env)

messy_gen <- c("GEN1", "gen 2", "Gen.3", "gen-4", "Gen_5", "GEN_6")
tidy_strings(messy_gen)

messy_int <- c("EnvGen", "Env_Gen", "env gen", "Env Gen", "ENV.GEN", "ENV_GEN")
tidy_strings(messy_int)

library(tibble)
# Or a whole data frame
df <- tibble(Env = messy_env,
             gen = messy_gen,
             Env_GEN = interaction(Env, gen),
             y = rnorm(6, 300, 10))
df
tidy_strings(df)
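
The idea behind the "find text in numeric sequences" example can be sketched in base R without the package. This is a minimal illustration, assuming that an entry counts as "text" when it is non-missing but cannot be parsed as a number; find_text_sketch() and the gy vector are hypothetical and are not how the package necessarily implements find_text_in_num().

```r
# Hypothetical helper: indices of entries that are not parseable as numbers
find_text_sketch <- function(x) {
  x <- as.character(x)
  # as.numeric() yields NA (with a warning) for malformed numbers
  parsed <- suppressWarnings(as.numeric(x))
  # Flag entries that were present but failed to parse
  which(!is.na(x) & is.na(parsed))
}

# Same kinds of typos as in the example above: a doubled dot and a letter "o"
gy <- c("2.61", "2..503", "3.2o75", "2.89")
find_text_sketch(gy)  # 2 3
```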

Author(s)

Tiago Olivoto tiagoolivoto@gmail.com