summarizeNumerics function

Extracts numeric variables and presents a summary in a workable format.

Finds the numeric variables and ignores the others (see summarizeFactors for a function that handles non-numeric variables). It provides quantiles (specified by probs) as well as other summary statistics (specified by stats). Results are returned in a data frame. The main benefits compared to R's default summary are: 1) more summary information is returned for each variable (for example, dispersion), 2) the results are returned in a form that is easy to use in further analysis, and 3) the variables in the output may be alphabetized.
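The general idea described above (keep only numeric columns, compute per-variable statistics, return a data frame with one row per variable) can be sketched in base R. This is an illustrative sketch only, not the package's implementation; the toy data and the object names dat, num_dat, and summ are hypothetical.

```r
# Hypothetical toy data: two numeric columns and one factor.
dat <- data.frame(
  income = c(40, 52, 61, 75),
  age    = c(23, 35, 41, NA),
  region = factor(c("n", "s", "n", "e"))
)

# Keep the numeric variables and ignore the others.
is_num  <- vapply(dat, is.numeric, logical(1))
num_dat <- dat[, is_num, drop = FALSE]

# One row per variable, one column per summary element.
summ <- data.frame(
  mean = vapply(num_dat, mean, numeric(1), na.rm = TRUE),
  sd   = vapply(num_dat, sd, numeric(1), na.rm = TRUE),
  row.names = names(num_dat)
)

# Optional alphabetical ordering of the variables.
summ <- summ[order(rownames(summ)), , drop = FALSE]
print(summ)
```

Because the result is an ordinary data frame, individual values can be pulled out by name, for example summ["age", "mean"].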

summarizeNumerics( dat, alphaSort = FALSE, probs = c(0, 0.5, 1), stats = c("mean", "sd", "skewness", "kurtosis", "nobs", "nmiss"), na.rm = TRUE, unbiased = TRUE, digits = 2 )

Arguments

  • dat: a data frame or a matrix

  • alphaSort: If TRUE, the columns are re-organized in alphabetical order. If FALSE, they are presented in the original order.

  • probs: Controls calculation of quantiles (see the R quantile function's probs argument). If FALSE or NULL, no quantile estimates are provided. Default is c("min" = 0, "med" = 0.5, "max" = 1.0), which will appear in output as c("min", "med", "max"). Other values between 0 and 1 are allowed. For example, c(0.3, 0.7) will appear in output as pctile_30% and pctile_70%.

  • stats: A vector including any of these: c("min", "med", "max", "mean", "sd", "var", "skewness", "kurtosis", "nobs", "nmiss"). The default in the usage line is c("mean", "sd", "skewness", "kurtosis", "nobs", "nmiss"); together with the default probs, this covers all of the listed statistics except var. "nobs" means the number of observations with non-missing, finite scores (not NA, NaN, -Inf, or Inf). "nmiss" is the number of cases with values of NA. If FALSE or NULL, none of these are provided.

  • na.rm: Should missing data be removed before calculating summaries? Default is TRUE.

  • unbiased: If TRUE (default), skewness and kurtosis are calculated with the bias-corrected (N-1) divisor in the standard deviation.

  • digits: Number of digits reported after the decimal point. Default is 2.
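Several of the arguments above can be illustrated with base R alone. The following is a minimal sketch, assuming a moment-based skewness definition; skew_sketch is a hypothetical helper written for this illustration, and the package's exact formulas may differ.

```r
# Hypothetical vector with one NA and one infinite value.
x <- c(2, 4, 4, 5, 10, NA, Inf)
x_ok <- x[is.finite(x)]

# probs: base R labels these quantiles "30%" and "70%"; the "pctile_"
# prefix mirrors the output labels described for probs above.
q <- quantile(x_ok, probs = c(0.3, 0.7))
names(q) <- paste0("pctile_", names(q))

# nobs: non-missing, finite scores. nmiss: NA values.
# Note that base R's is.na() is also TRUE for NaN.
nobs  <- sum(is.finite(x))
nmiss <- sum(is.na(x))

# unbiased: assumed here to switch the standard deviation divisor
# between N-1 (TRUE) and N (FALSE) inside a moment-based skewness.
skew_sketch <- function(x, unbiased = TRUE) {
  x <- x[is.finite(x)]
  n <- length(x)
  dev <- x - mean(x)
  divisor <- if (unbiased) n - 1 else n
  s <- sqrt(sum(dev^2) / divisor)
  mean(dev^3) / s^3
}

print(q)
print(c(nobs = nobs, nmiss = nmiss))
print(c(unbiased = skew_sketch(x, TRUE), biased = skew_sketch(x, FALSE)))
```

In this sketch the N-1 divisor gives a slightly larger standard deviation, so the skewness estimate is smaller in magnitude than with the N divisor.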

Returns

a data.frame with one column per summary element (rows are the variables).
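A short usage sketch follows, using only the arguments documented above. It assumes the rockchalk package, which provides summarizeNumerics, is installed; the toy data frame dat and the object res are hypothetical names chosen for illustration.

```r
# Run only if the rockchalk package is available.
if (requireNamespace("rockchalk", quietly = TRUE)) {
  # Hypothetical toy data: two numeric columns and one factor.
  dat <- data.frame(
    x = c(1, 2, 3, 4, NA),
    y = c(10, 20, 30, 40, 50),
    g = factor(c("a", "b", "a", "b", "a"))
  )

  # Request the 30th and 70th percentiles plus a few statistics,
  # with the variables sorted alphabetically.
  res <- rockchalk::summarizeNumerics(
    dat,
    alphaSort = TRUE,
    probs = c(0.3, 0.7),
    stats = c("mean", "sd", "nobs", "nmiss")
  )
  print(res)
}
```

The factor column g should be ignored, and because the result is a data frame it can be subset or merged like any other.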

See Also

summarize and summarizeFactors

Author(s)

Paul E. Johnson pauljohn@ku.edu

  • Maintainer: Paul E. Johnson
  • License: GPL (>= 3.0)
  • Last published: 2022-08-06
