compare_groups function

Compare groups

Compares groups by (1) creating a histogram by group; (2) summarizing descriptive statistics by group; and (3) conducting pairwise comparisons (t-tests and Mann-Whitney tests).
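The three steps can be approximated with base R alone. The sketch below is not the package's implementation; it only illustrates what each step computes, using two species from the built-in iris data. The object names (dat, hist_by_group, desc_stats, welch_result, mw_result) are illustrative and do not come from the package.

```r
# Base-R sketch of the three steps (an illustration, not the package's code).
dat <- iris[iris$Species %in% c("setosa", "versicolor"), ]
dat$Species <- droplevels(dat$Species)

# (1) Histogram by group: compute the bins without opening a graphics device.
hist_by_group <- lapply(
  split(dat$Sepal.Length, dat$Species),
  function(x) hist(x, plot = FALSE)
)

# (2) Descriptive statistics by group.
desc_stats <- aggregate(
  Sepal.Length ~ Species,
  data = dat,
  FUN = function(x) {
    c(n = length(x), mean = mean(x), sd = sd(x), median = median(x))
  }
)

# (3) Pairwise comparison: t.test() runs Welch's test by default
# (var.equal = FALSE); wilcox.test() is the Mann-Whitney test.
# exact = FALSE avoids the warning about ties in the data.
welch_result <- t.test(Sepal.Length ~ Species, data = dat)
mw_result <- wilcox.test(Sepal.Length ~ Species, data = dat, exact = FALSE)

print(desc_stats)
print(welch_result$p.value)
print(mw_result$p.value)
```

With more than two groups, the same tests would be repeated for every pair of groups, which is where p-value adjustment becomes relevant.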

compare_groups(
  data = NULL,
  iv_name = NULL,
  dv_name = NULL,
  sigfigs = 3,
  stats = "basic",
  welch = TRUE,
  cohen_d = TRUE,
  cohen_d_w_ci = TRUE,
  adjust_p = "holm",
  bonferroni = NULL,
  mann_whitney = TRUE,
  t_test_stats = TRUE,
  round_p = 3,
  anova = FALSE,
  round_f = 2,
  round_t = 2,
  round_t_test_df = 2,
  save_as_png = FALSE,
  png_name = NULL,
  xlab = NULL,
  ylab = NULL,
  x_limits = NULL,
  x_breaks = NULL,
  x_labels = NULL,
  width = 5000,
  height = 3600,
  units = "px",
  res = 300,
  layout_matrix = NULL,
  col_names_nicer = TRUE,
  convert_dv_to_numeric = TRUE
)

Arguments

  • data: a data object (a data frame or a data.table)
  • iv_name: name of the independent variable (grouping variable)
  • dv_name: name of the dependent variable (measure variable of interest)
  • sigfigs: number of significant digits to round to
  • stats: statistics to calculate for each group. If stats = "basic", group size, mean, standard deviation, median, minimum, and maximum will be calculated. If stats = "all", in addition to the aforementioned statistics, standard error, 95% confidence and prediction intervals, skewness, and kurtosis will also be calculated. The stats argument can also be a character vector with types of statistics to calculate. For example, entering stats = c("mean", "median") will calculate mean and median. By default, stats = "basic"
  • welch: Should Welch's t-tests be conducted? By default, welch = TRUE
  • cohen_d: if cohen_d = TRUE, Cohen's d statistics will be included in the pairwise comparison data.table.
  • cohen_d_w_ci: if cohen_d_w_ci = TRUE, Cohen's d with 95% CI will be included in the output data.table.
  • adjust_p: the name of the method to use to adjust p-values. If adjust_p = "holm", the Holm method will be used; if adjust_p = "bonferroni", the Bonferroni method will be used. By default, adjust_p = "holm"
  • bonferroni: The use of this argument is deprecated. Use the 'adjust_p' argument instead. If bonferroni = TRUE, the Bonferroni correction will be applied to the p-values from the t-tests or Mann-Whitney tests.
  • mann_whitney: if TRUE, Mann-Whitney test results will be included in the pairwise comparison data.table. If FALSE, Mann-Whitney tests will not be performed.
  • t_test_stats: if t_test_stats = FALSE, the t-test statistic and degrees of freedom will be excluded from the pairwise comparison data.table. (default = TRUE)
  • round_p: number of decimal places to which to round p-values (default = 3)
  • anova: Should a one-way ANOVA be conducted and reported? By default, anova = FALSE, but when there are more than two levels in the independent variable, the value will change such that anova = TRUE.
  • round_f: number of decimal places to which to round the F statistic (default = 2)
  • round_t: number of decimal places to which to round the t statistic (default = 2)
  • round_t_test_df: number of decimal places to which to round the degrees of freedom for t tests (default = 2)
  • save_as_png: if save_as_png = "all" or if save_as_png = TRUE, the histogram by group, descriptive statistics by group, and pairwise comparison results will be saved as a PNG file.
  • png_name: name of the PNG file to be saved. By default, the name will be "compare_groups_results_" followed by a timestamp of the current time. The timestamp will be in the format, jan_01_2021_1300_10_000001, where "jan_01_2021" would indicate January 01, 2021; 1300 would indicate 13:00 (i.e., 1 PM); and 10_000001 would indicate 10.000001 seconds after the hour.
  • xlab: title of the x-axis for the histogram by group. If xlab = FALSE, the title will be removed. By default (i.e., if no input is given), dv_name will be used as the title.
  • ylab: title of the y-axis for the histogram by group. If ylab = FALSE, the title will be removed. By default (i.e., if no input is given), iv_name will be used as the title.
  • x_limits: a numeric vector with values of the endpoints of the x axis.
  • x_breaks: a numeric vector indicating the points at which to place tick marks on the x axis.
  • x_labels: a vector containing labels for the tick marks on the x axis.
  • width: width of the PNG file (default = 5000)
  • height: height of the PNG file (default = 3600)
  • units: the units for the width and height arguments. Can be "px" (pixels), "in" (inches), "cm", or "mm". By default, units = "px".
  • res: The nominal resolution in ppi which will be recorded in the png file, if a positive integer. Used for units other than the default. By default, res = 300
  • layout_matrix: The layout argument for arranging plots and tables using the grid.arrange function.
  • col_names_nicer: if col_names_nicer = TRUE, column names will be converted from snake_case to a more readable format.
  • convert_dv_to_numeric: logical. Should the values in the dependent variable be converted to numeric for plotting the histograms? (default = TRUE)
  • holm: if holm = TRUE, the relevant p-values will be adjusted using the Holm method (also known as the Holm-Bonferroni or Bonferroni-Holm method)
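To make the difference between the two adjust_p options concrete, the sketch below applies both adjustments to three made-up p-values with base R's p.adjust(). The values in raw_p are invented for illustration, and whether compare_groups calls p.adjust() internally is an assumption rather than something stated here.

```r
# Three hypothetical raw p-values from three pairwise comparisons.
raw_p <- c(0.010, 0.020, 0.030)

# Holm: multiply the smallest p by 3, the next by 2, the largest by 1,
# then enforce that adjusted values never decrease.
holm_p <- p.adjust(raw_p, method = "holm")

# Bonferroni: multiply every p by the number of comparisons (capped at 1).
bonf_p <- p.adjust(raw_p, method = "bonferroni")

print(holm_p)  # 0.03 0.04 0.04
print(bonf_p)  # 0.03 0.06 0.09
```

The Holm-adjusted values are never larger than the Bonferroni-adjusted ones, so Holm is the less conservative of the two.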

Returns

The output will be a list of (1) a ggplot object (histogram by group); (2) a data.table with descriptive statistics by group; and (3) a data.table with pairwise comparison results. If save_as_png = TRUE, the plot and tables will also be saved to the local drive as a PNG file.

Examples

## Not run:
compare_groups(data = iris, iv_name = "Species", dv_name = "Sepal.Length")
compare_groups(
  data = iris, iv_name = "Species", dv_name = "Sepal.Length",
  x_breaks = 4:8)
# Welch's t-test
compare_groups(
  data = mtcars, iv_name = "am", dv_name = "hp")
# A Student's t-test
compare_groups(
  data = mtcars, iv_name = "am", dv_name = "hp", welch = FALSE)
## End(Not run)
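For comparison, the Welch and Student's t-tests on the same mtcars variables can be run directly with base R's t.test(). This is a sketch of the underlying tests only, not of what compare_groups does internally; the object names welch_test and student_test are illustrative.

```r
# Welch's t-test: does not assume equal variances (t.test() default).
welch_test <- t.test(hp ~ am, data = mtcars)

# Student's t-test: assumes equal variances in the two groups.
student_test <- t.test(hp ~ am, data = mtcars, var.equal = TRUE)

# Student's test uses n1 + n2 - 2 degrees of freedom (32 - 2 = 30 here);
# Welch's test uses a smaller, usually fractional, approximation.
print(unname(student_test$parameter))
print(unname(welch_test$parameter))
```

The fractional degrees of freedom from Welch's test are the reason an argument such as round_t_test_df is useful when reporting results.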