descr function

Univariate Statistics for Numerical Data

Calculates mean, sd, min, Q1*, median, Q3*, max, MAD, IQR*, CV, skewness*, SE.skewness*, and kurtosis* on numerical vectors. (*) Not available when using sampling weights.
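
As a rough orientation, several of these statistics can be reproduced in base R for the unweighted case. This is only a sketch: the sample vector is made up, and the quantile type and MAD scaling constant used here are base R defaults, which are assumed (not confirmed) to match what descr reports.

```r
# Base R sketch of some statistics listed above (unweighted case).
# The data are illustrative; defaults such as the quantile type are assumptions.
x <- c(2, 4, 4, 5, 7, 9, 11)

res <- c(
  mean = mean(x),
  sd   = sd(x),
  min  = min(x),
  q1   = unname(quantile(x, 0.25)),
  med  = median(x),
  q3   = unname(quantile(x, 0.75)),
  max  = max(x),
  mad  = mad(x),           # median absolute deviation, scaled by 1.4826 by default
  iqr  = IQR(x),
  cv   = sd(x) / mean(x)   # coefficient of variation
)

print(round(res, 2))
```

Skewness, its standard error, and kurtosis are left out because base R has no built-in functions for them.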

descr(
  x,
  var = NULL,
  stats = st_options("descr.stats"),
  na.rm = TRUE,
  round.digits = st_options("round.digits"),
  transpose = st_options("descr.transpose"),
  order = "sort",
  style = st_options("style"),
  plain.ascii = st_options("plain.ascii"),
  justify = "r",
  headings = st_options("headings"),
  display.labels = st_options("display.labels"),
  split.tables = 100,
  weights = NULL,
  rescale.weights = FALSE,
  ...
)

Arguments

  • x: A numerical vector or a data frame.
  • var: Unquoted expression referring to a specific column in x. Provides support for piped function calls (e.g. my_df |> descr(my_var)).
  • stats: Character. Which stats to produce. Either all (default), fivenum, common (see Details), or a selection of: mean, sd, min, q1, med, q3, max, mad, iqr, cv, skewness, se.skewness, kurtosis, n.valid, n, and pct.valid. Can be set globally via st_options, option descr.stats. See Details.
  • na.rm: Logical. Argument to be passed to statistical functions. Defaults to TRUE.
  • round.digits: Numeric. Number of significant digits to display. Defaults to 2. Can be set globally with st_options.
  • transpose: Logical. Make variables appear as columns, and stats as rows. Defaults to FALSE. Can be set globally with st_options, option descr.transpose.
  • order: Character. When analyzing more than one variable, this parameter determines how to order variables. Valid values are sort (or simply s), preserve (or p), or a vector containing all variable names in the desired order. Defaults to sort.
  • style: Character. Style to be used by pander. One of simple (default), grid, rmarkdown, or jira. Can be set globally with st_options.
  • plain.ascii: Logical. pander argument; when TRUE (default), no markup characters will be used (useful when printing to console). If style = 'rmarkdown' is specified, value is set to FALSE automatically. Can be set globally using st_options.
  • justify: Character. Alignment of numbers in cells; l for left, c for center, or r for right (default). Has no effect on html tables.
  • headings: Logical. Set to FALSE to omit heading section. Can be set globally via st_options. TRUE by default.
  • display.labels: Logical. Show variable / data frame labels in heading section. Defaults to TRUE. Can be set globally with st_options.
  • split.tables: Numeric. pander argument that specifies how many characters wide a table can be. 100 by default.
  • weights: Numeric. Vector of weights having same length as x. NULL (default) indicates that no weights are used.
  • rescale.weights: Logical. When set to TRUE, a global constant is applied to the weights so that the total count equals nrow(x). FALSE by default.
  • ...: Additional arguments passed to pander or format.
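
The effect of rescale.weights can be pictured with a small base R sketch. It assumes the "global constant" is simply nrow(x) / sum(weights), which is the only single multiplier that makes the weights sum to nrow(x); the data frame and weights are made up, and this is not the package's own code.

```r
# Hypothetical illustration of weight rescaling; not the package's implementation.
x <- data.frame(score = c(10, 20, 30, 40))
w <- c(1.5, 2.5, 3.0, 5.0)   # sampling weights, one per row of x

k <- nrow(x) / sum(w)        # the single global constant
w_rescaled <- w * k

sum(w)                       # total weighted count before rescaling
sum(w_rescaled)              # matches nrow(x) after rescaling

# Multiplying every weight by the same constant leaves the weighted mean unchanged
weighted.mean(x$score, w)
weighted.mean(x$score, w_rescaled)
```

Because the constant is the same for every observation, rescaling changes the reported counts but not ratio-type statistics such as the weighted mean.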

Returns

An object having classes matrix and summarytools containing the statistics, with extra attributes useful to other functions/methods.

Details

Since version 1.1, the stats argument can be set in a more flexible way; keywords (all, common, fivenum) can be combined with single statistics, or their negation. For instance, using stats = c("all", "-q1", "-q3") would show all except q1 and q3.
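
The way keywords, single statistics, and negations combine can be sketched in base R. The helper resolve_stats and the presets list below are hypothetical names introduced for illustration; the contents of the all preset are taken from the stats argument description, while the contents of fivenum and common are assumptions. The package's internal logic may differ.

```r
# Hypothetical helper showing how keywords and negations could be resolved.
# Preset contents are assumptions for illustration, not read from the package.
presets <- list(
  all     = c("mean", "sd", "min", "q1", "med", "q3", "max", "mad", "iqr",
              "cv", "skewness", "se.skewness", "kurtosis",
              "n.valid", "n", "pct.valid"),
  fivenum = c("min", "q1", "med", "q3", "max"),
  common  = c("mean", "sd", "min", "med", "max", "n.valid", "pct.valid")
)

resolve_stats <- function(stats, presets) {
  is_neg   <- startsWith(stats, "-")
  negated  <- sub("^-", "", stats[is_neg])   # statistics to drop
  positive <- stats[!is_neg]                 # keywords and single statistics
  # Expand each keyword into its preset; keep single statistics as they are
  expanded <- unlist(lapply(positive, function(s) {
    if (s %in% names(presets)) presets[[s]] else s
  }))
  setdiff(unique(expanded), negated)
}

resolve_stats(c("all", "-q1", "-q3"), presets)   # everything except q1 and q3
resolve_stats(c("common", "n"), presets)         # the common preset plus n
```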

For further customization, you could redefine any preset in the following manner: .st_env$descr.stats$common <- c("mean", "sd", "n"). Use caution when modifying .st_env, and reload the package if errors ensue. Changes are temporary and will not persist across R sessions.

Examples

library(summarytools)

data("exams")

# All stats (default behavior) for all numerical variables
descr(exams)

# Show only "common" statistics, plus "n"
descr(exams, stats = c("common", "n"))

# Selection of statistics, transposing the results
descr(exams, stats = c("mean", "sd", "min", "max"), transpose = TRUE)

# Rmarkdown-ready
descr(exams, plain.ascii = FALSE, style = "rmarkdown")

# Grouped statistics
data("tobacco")
with(tobacco, stby(BMI, gender, descr, check.nas = FALSE))

# Grouped statistics in tidy table:
tb(with(tobacco, stby(BMI, age.gr, descr, stats = "common")))

## Not run:
# Show in Viewer (or browser if not in RStudio)
view(descr(exams))

# Save to html file with title
print(descr(exams),
      file = "descr_exams.html",
      report.title = "BMI by Age Group",
      footnote = "<b>Schoolyear:</b> 2018-2019<br/><b>Semester:</b> Fall")
## End(Not run)

Author(s)

Dominic Comtois, dominic.comtois@gmail.com