Create a 'violin plot' or density plot of the distribution of a set of variables
Among the many ways to describe a data set, one is a density plot for each value of a grouping variable and another is a violin plot of multiple variables. A density plot overlays the densities of the different groups, which makes effect sizes visible. A violin plot is similar to a box plot but shows the shape of the actual distribution. Lines for the median and the 25th and 75th percentiles are added to the display. If a grouping variable is specified, violinBy will draw violin plots for each variable and for each group. Data points may be drawn as well, in what is known as a "raincloud plot".
x: A matrix or data.frame (can be expressed in formula input)
var: The variable(s) to display
grp: The grouping variable(s)
data: The name of the data object if using formula input
grp.name: If the grouping variable is specified, what names should be given to the groups? Defaults to 1:ngrp
ylab: The y label
xlab: The x label
main: Figure title
vertical: If TRUE, plot the violins vertically, otherwise horizontally
dots: If TRUE, add a stripchart with the data points
rain: If TRUE, draw a half violin with rain drops
jitter: If doing a stripchart, then jitter the points this much
errors: If TRUE, add error bars or cats eyes to the violins
eyes: If TRUE and errors=TRUE, then draw cats eyes
alpha: The degree of transparency (0 = fully transparent ... 1 = not transparent)
adjust: Adjusts the smoothing of the density estimate; useful when plotting variables like height
freq: If TRUE, then plot frequencies (n * density)
restrict: Restrict the density to the observed max and min of the data
xlim: If not specified, will be .5 beyond the number of variables
ylim: If not specified, determined by the data
add: Allows overplotting
col: Allows for specification of colours. The default for 2 groups is blue and red; for more group levels, rainbow colours are used.
pch: The plot character for the mean is by default a small filled circle. To not show the mean, use pch=NA
scale: If NULL, scale the widths by the square root of sample size, otherwise scale by the value supplied.
legend: If not NULL, draw a legend at c(topleft,topright,top,left,right)
...: Other graphic parameters
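The adjust, restrict and freq arguments describe operations on a kernel density estimate. The sketch below illustrates those ideas with base R's density() function on simulated data. It is an illustration only and is not taken from the violinBy source; the variable names and the simulated heights are hypothetical.

```r
# Illustration only: base R density(), not the internals of violinBy
set.seed(42)
height <- round(rnorm(300, mean = 170, sd = 8))  # hypothetical heights, whole cm

d_default <- density(height)               # default bandwidth
d_smooth  <- density(height, adjust = 2)   # doubled bandwidth gives a smoother curve

# Restrict the estimate to the observed range of the data
d_restricted <- density(height, from = min(height), to = max(height))

# Frequencies rather than densities: n * density
freq_curve <- length(height) * d_restricted$y

plot(d_default, main = "Effect of adjust")
lines(d_smooth, lty = 2)
```

Rounded variables such as height can produce a lumpy density at the default bandwidth, which is why a larger adjust value is sometimes preferred.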
Details
Describe the data using a violin plot. Change alpha to modify the shading. The grp variable may be used to draw separate violin plots for each of multiple groups.
For relatively small data sets (fewer than about 500-1000 cases), it is informative to also show the actual data points. This is done with the dots=TRUE option. The jitter value is arbitrarily set to .05, but making it larger (say .1 or .2) will display more points.
Perhaps even prettier is to draw "raincloud" plots (half violins with rain drops) by setting rain=TRUE.
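A minimal sketch of the dots, jitter and rain options, using only arguments documented above. The data frame sim and its columns score and group are simulated here for illustration and are not part of the package; the sketch assumes the psych package is installed.

```r
library(psych)  # assumes psych is installed

set.seed(1)
# Hypothetical data: 200 simulated scores in two groups
sim <- data.frame(score = c(rnorm(100, 50, 10), rnorm(100, 55, 10)),
                  group = rep(1:2, each = 100))

# Violins with the raw data points overlaid, jittered a little more than the default
violinBy(score ~ group, data = sim, dots = TRUE, jitter = .1)

# Raincloud version: half violins with the points drawn as rain drops
violinBy(score ~ group, data = sim, rain = TRUE, pch = ".", vertical = FALSE)
```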
Returns
The density (y axis) by value (x axis) of the data (for densityBy) or a violin plot for each variable (perhaps broken down by groups)
Author(s)
William Revelle
Note
Formula input added July 12, 2020
See Also
describe, describeBy and statsBy for descriptive statistics; error.bars, error.bars.by, bi.bars, histBy and scatterHist for other graphic displays
Examples
violin(bfi[4:8])
violin(SATV + SATQ ~ gender, data=sat.act, grp.name=cs(MV,FV,MQ,FQ))  #formula input
violinBy(bfi, var=4:7, grp="gender", grp.name=c("M","F"))
#rain does not work for multiple DVs, just up to 2 IVs
violinBy(SATV ~ education, data=sat.act, rain=TRUE, pch=".", vertical=FALSE)  #rain
densityBy(SATV ~ gender, data=sat.act, legend=1)  #formula input