Compares groups by (1) creating a histogram by group; (2) summarizing descriptive statistics by group; and (3) conducting pairwise comparisons (t-tests and Mann-Whitney tests).
data: a data object (a data frame or a data.table)
iv_name: name of the independent variable (grouping variable)
dv_name: name of the dependent variable (measure variable of interest)
sigfigs: number of significant digits to round to
stats: statistics to calculate for each group. If stats = "basic", group size, mean, standard deviation, median, minimum, and maximum will be calculated. If stats = "all", in addition to the aforementioned statistics, standard error, 95% confidence and prediction intervals, skewness, and kurtosis will also be calculated. The stats argument can also be a character vector with types of statistics to calculate. For example, entering stats = c("mean", "median") will calculate mean and median. By default, stats = "basic"
welch: Should Welch's t-tests be conducted? By default, welch = TRUE
cohen_d: if cohen_d = TRUE, Cohen's d statistics will be included in the pairwise comparison data.table.
cohen_d_w_ci: if cohen_d_w_ci = TRUE, Cohen's d with 95% CI will be included in the output data.table.
adjust_p: the name of the method to use to adjust p-values. If adjust_p = "holm", the Holm method will be used; if adjust_p = "bonferroni", the Bonferroni method will be used. By default, adjust_p = "holm"
bonferroni: The use of this argument is deprecated. Use the 'adjust_p' argument instead. If bonferroni = TRUE, the Bonferroni correction will be applied to p-values from the t-tests or Mann-Whitney tests.
mann_whitney: if TRUE, Mann-Whitney test results will be included in the pairwise comparison data.table. If FALSE, Mann-Whitney tests will not be performed.
t_test_stats: if t_test_stats = FALSE, the t statistic and degrees of freedom will be excluded from the pairwise comparison data.table. (default = TRUE)
round_p: number of decimal places to which to round p-values (default = 3)
anova: Should a one-way ANOVA be conducted and reported? By default, anova = FALSE, but when there are more than two levels in the independent variable, the value will change such that anova = TRUE.
round_f: number of decimal places to which to round the F statistic (default = 2)
round_t: number of decimal places to which to round the t statistic (default = 2)
round_t_test_df: number of decimal places to which to round the degrees of freedom for t tests (default = 2)
save_as_png: if save_as_png = "all" or if save_as_png = TRUE, the histogram by group, descriptive statistics by group, and pairwise comparison results will be saved as a PNG file.
png_name: name of the PNG file to be saved. By default, the name will be "compare_groups_results_" followed by a timestamp of the current time. The timestamp will be in the format jan_01_2021_1300_10_000001, where "jan_01_2021" would indicate January 01, 2021; 1300 would indicate 13:00 (i.e., 1 PM); and 10_000001 would indicate 10.000001 seconds after the hour.
xlab: title of the x-axis for the histogram by group. If xlab = FALSE, the title will be removed. By default (i.e., if no input is given), dv_name will be used as the title.
ylab: title of the y-axis for the histogram by group. If ylab = FALSE, the title will be removed. By default (i.e., if no input is given), iv_name will be used as the title.
x_limits: a numeric vector with values of the endpoints of the x axis.
x_breaks: a numeric vector indicating the points at which to place tick marks on the x axis.
x_labels: a vector containing labels for the tick marks on the x axis.
width: width of the PNG file (default = 5000)
height: height of the PNG file (default = 3600)
units: the units for the width and height arguments. Can be "px" (pixels), "in" (inches), "cm", or "mm". By default, units = "px".
res: The nominal resolution in ppi which will be recorded in the png file, if a positive integer. Used for units other than the default. By default, res = 300
layout_matrix: The layout argument for arranging plots and tables using the grid.arrange function.
col_names_nicer: if col_names_nicer = TRUE, column names will be converted from snake_case to a more readable format.
convert_dv_to_numeric: logical. Should the values in the dependent variable be converted to numeric for plotting the histograms? (default = TRUE)
holm: if holm = TRUE, the relevant p-values will be adjusted using the Holm method (also known as the Holm-Bonferroni or Bonferroni-Holm method)
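To illustrate what the adjust_p and holm options control, the sketch below applies the Holm and Bonferroni corrections to three made-up p-values using p.adjust() from base R's stats package. Whether the function calls p.adjust() internally is an assumption, and the p-values are hypothetical.

```r
# Hypothetical raw p-values from three pairwise comparisons
raw_p <- c(0.01, 0.02, 0.03)

# Holm: step-down procedure; the smallest p-value is multiplied by the
# number of tests, the next smallest by one fewer, and so on, with the
# adjusted values forced to be non-decreasing
p_holm <- p.adjust(raw_p, method = "holm")

# Bonferroni: every p-value is multiplied by the number of tests (capped at 1)
p_bonferroni <- p.adjust(raw_p, method = "bonferroni")

print(data.frame(raw_p, p_holm, p_bonferroni))
```

With these inputs, Holm yields 0.03, 0.04, 0.04 and Bonferroni yields 0.03, 0.06, 0.09, which shows why Holm is the less conservative of the two.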
Returns
The output will be a list of (1) a ggplot object (histogram by group); (2) a data.table with descriptive statistics by group; and (3) a data.table with pairwise comparison results. If save_as_png = TRUE, the plot and tables will also be saved to the local drive as a PNG file.
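The two tables described above can be approximated with base R alone. The following is a minimal sketch, not the function's actual implementation: it uses the built-in PlantGrowth data set as a stand-in (weight as the dependent variable, group as the independent variable), skips the histogram, returns plain data frames rather than data.tables, and assumes Welch's t-tests and Mann-Whitney tests with Holm-adjusted p-values. The names desc_stats and pairwise, and all column names, are hypothetical.

```r
# Built-in example data: plant weight (DV) by treatment group (IV)
dt <- PlantGrowth
dv <- dt$weight
iv <- dt$group

# Descriptive statistics by group (the "basic" set of statistics)
desc_stats <- do.call(rbind, lapply(split(dv, iv), function(x) {
  data.frame(n = length(x), mean = mean(x), sd = sd(x),
             median = median(x), min = min(x), max = max(x))
}))
desc_stats <- cbind(group = rownames(desc_stats), desc_stats)

# Pairwise comparisons for every pair of group levels
group_pairs <- combn(levels(iv), 2, simplify = FALSE)
pairwise <- do.call(rbind, lapply(group_pairs, function(pr) {
  x <- dv[iv == pr[1]]
  y <- dv[iv == pr[2]]
  tt <- t.test(x, y, var.equal = FALSE)   # Welch's t-test
  mw <- wilcox.test(x, y, exact = FALSE)  # Mann-Whitney (normal approximation)
  data.frame(group_1 = pr[1], group_2 = pr[2],
             t = unname(tt$statistic), df = unname(tt$parameter),
             t_test_p = tt$p.value, mann_whitney_p = mw$p.value)
}))

# Adjust p-values across the pairwise tests with the Holm method
pairwise$t_test_p_holm <- p.adjust(pairwise$t_test_p, method = "holm")
pairwise$mann_whitney_p_holm <- p.adjust(pairwise$mann_whitney_p,
                                         method = "holm")

print(desc_stats, digits = 3)
print(pairwise, digits = 3)
```

Because Welch's t-test does not assume equal variances, the df column will generally hold non-integer values, which is why a separate rounding option for degrees of freedom is useful.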