This implementation contrasts the empirical distribution of a measurement variable against an assumed distribution. The approach is adapted from the idea of rootograms (Tukey 1977), which are also applicable to count data (Kleiber and Zeileis 2016).
resp_vars: variable the name of the continuous measurement variable
study_data: data.frame the data frame that contains the measurements
label_col: variable attribute the name of the column in the metadata with labels of variables
item_level: data.frame the data frame that contains metadata attributes of study data
dist_col: variable attribute the name of the variable attribute in meta_data that provides the expected distribution of a study variable
guess: logical estimate the distribution parameters from the data
par1: numeric first parameter of the distribution if applicable
par2: numeric second parameter of the distribution if applicable
end_digits: logical internal use. Check for end digit preferences
flip_mode: enum default | flip | noflip | auto. Should the plot be in default orientation, flipped, not flipped or auto-flipped. Not all options are always supported. In general, this can be controlled by setting the option with options(dataquieR.flip_mode = ...). If called from dq_report, you can also pass flip_mode to all function calls or set it for specific calls using specific_args.
meta_data: data.frame old name for item_level
meta_data_v2: character path to a workbook-like metadata file, see prep_load_workbook_like_file for details. ALL LOADED DATAFRAMES WILL BE PURGED, using prep_purge_data_frame_cache, if you specify meta_data_v2.
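As a small illustration of the flip_mode argument, the following sketch sets the plot orientation globally. It assumes that dataquieR.flip_mode is an ordinary R option that can be set with base R's options(); only the option name and the value "auto" are taken from the argument description above, everything else is illustrative.

```r
# Assumption: dataquieR.flip_mode is a regular R option set via base options().
# options() returns the previous values, which makes it easy to restore them.
old <- options(dataquieR.flip_mode = "auto")

# Inspect the value that is now active for subsequent plotting calls
current <- getOption("dataquieR.flip_mode")
print(current)

# Restore the previous setting once the plots have been created
options(old)
```

Saving and restoring the previous value keeps the change local to one part of a script, which can be preferable to changing the orientation for a whole session.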
Returns
a list with:
ResultData: data.frame underlying the plot
SummaryPlot: ggplot2::ggplot probability distribution plot
SummaryTable: data.frame with the columns Variables and FLG_acc_ud_shape
ALGORITHM OF THIS IMPLEMENTATION:
This implementation is restricted to data of type float or integer.
Missing codes are removed from resp_vars (if defined in the metadata).
The user must specify the column of the metadata that contains the expected probability distribution (currently only: normal, uniform, gamma).
The parameters of each distribution can either be estimated from the data or be specified by the user.
A histogram-like plot contrasts the empirical distribution with the assumed (theoretical) distribution.
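
The steps above can be sketched in base R. This is only an illustration of the general idea for the normal case, not the code used by this function: the simulated data, the number of bins and all object names (x, mu, sigma, result) are assumptions made for the example.

```r
# Illustrative sketch only; not the dataquieR implementation.
set.seed(1)
x <- rnorm(500, mean = 50, sd = 10)  # simulated measurements (assumption)

# Estimate the parameters of the assumed normal distribution from the data
# (corresponds to letting the function guess the parameters)
mu <- mean(x)
sigma <- sd(x)

# Bin the data without plotting and keep the observed counts per bin
h <- hist(x, breaks = 20, plot = FALSE)

# Expected counts per bin: sample size times the probability mass that the
# assumed distribution places between consecutive bin breaks
expected <- length(x) * diff(pnorm(h$breaks, mean = mu, sd = sigma))

# Data underlying the plot: observed vs. expected frequencies per bin
result <- data.frame(
  mid = h$mids,
  observed = h$counts,
  expected = expected
)

# Histogram-like display: bars show observed counts, the line expected counts
plot(h, main = "Observed vs. expected counts (normal)", xlab = "x")
lines(result$mid, result$expected, type = "b", col = "red")
```

For a uniform or gamma assumption, the same pattern would apply with punif() or pgamma() in place of pnorm(), using parameters that are either supplied by the user or estimated from the data.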