Create a Stan data list from an item response matrix or from long-form data.
This function prepares item response data, creating a data list that may be passed to irt_stan.
irt_data(
  response_matrix = matrix(),
  y = integer(),
  ii = integer(),
  jj = integer(),
  covariates = data.frame(),
  formula = NULL,
  integerize = TRUE,
  validate_regression = TRUE
)
Arguments
response_matrix: An item response matrix. Columns represent items and rows represent persons. NA may be supplied for missing responses. The lowest score for each item should be 0, except for rating scale models. y, ii, and jj should not be supplied if a response matrix is given.
y: A vector of scored responses for long-form data. The lowest score for each item should be 0, except for rating scale models. NAs are not permitted; omit missing responses instead. Required if response_matrix is not supplied.
ii: A vector indexing the items in y. This must consist of consecutive integers starting at 1. labelled_integer may be used to create a suitable vector. Required if response_matrix is not supplied.
jj: A vector indexing the persons in y. This must consist of consecutive integers starting at 1. labelled_integer may be used to create a suitable vector. Required if response_matrix is not supplied.
covariates: An optional data frame containing (only) person covariates. It must have either one row per person or the same number of rows as the length of y, ii, and jj. If it has one row per person, the rows must be in the same order as the response matrix (or unique(jj)). If it has as many rows as the length of y, ii, and jj, the rows must be in the same order as jj (for example, it may be a subset of columns from the same data frame that contains y, ii, and jj).
formula: An optional formula for the latent regression that is applied to covariates. The left side should be blank (for example, ~ v1 + v2). If set to NULL (the default), covariates is used directly as the design matrix for the latent regression. When the regression includes only a model intercept, that intercept represents the mean of the person distribution.
integerize: Whether to apply labelled_integer to ii and jj. Defaults to TRUE, which is appropriate unless the inputs are already consecutive integers.
validate_regression: Whether to check the latent regression equation and covariates for compatibility with the prior distributions for the coefficients. Defaults to TRUE and throws a warning if problems are identified.
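The relationship between the wide-form (response_matrix) and long-form (y, ii, jj) inputs described above can be sketched in base R. This is only an illustration under stated assumptions: the toy matrix resp and the data frame long are made up, and the match() call is a plain stand-in for building consecutive integer indices, not necessarily how labelled_integer is implemented.

```r
# Toy wide-form response matrix: 3 persons (rows) by 2 items (columns).
# This matrix is hypothetical and exists only for illustration.
resp <- matrix(c(1,  0,
                 NA, 1,
                 0,  1),
               nrow = 3, byrow = TRUE,
               dimnames = list(c("p1", "p2", "p3"), c("item_a", "item_b")))

# Long form: one row per item-person pair (matrix is unrolled column by column).
long <- data.frame(
  y      = as.vector(resp),
  item   = colnames(resp)[as.vector(col(resp))],
  person = rownames(resp)[as.vector(row(resp))],
  stringsAsFactors = FALSE
)

# NAs are not permitted in y, so missing responses are dropped.
long <- long[!is.na(long$y), ]

# Consecutive integer indices starting at 1 (a base R stand-in, see lead-in).
long$ii <- match(long$item, sort(unique(long$item)))
long$jj <- match(long$person, sort(unique(long$person)))
```

The resulting long$y, long$ii, and long$jj have the shape that the y, ii, and jj arguments expect, so either form should describe the same set of observed responses.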
Returns
A data list suitable for irt_stan.
Examples
# For a response matrix ("wide-form" data) with person covariates:
spelling_list <- irt_data(response_matrix = spelling[, 2:5],
                          covariates = spelling[, "male", drop = FALSE],
                          formula = ~ rescale_binary(male))

# For long-form data (one row per item-person pair):
agg_list_1 <- irt_data(y = aggression$poly,
                       ii = aggression$item,
                       jj = aggression$person)

# Add a latent regression and use labelled_integer() with the items
agg_list_2 <- irt_data(y = aggression$poly,
                       ii = labelled_integer(aggression$description),
                       jj = aggression$person,
                       covariates = aggression[, c("male", "anger")],
                       formula = ~ 1 + rescale_continuous(male) * rescale_continuous(anger))
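To see what a one-sided formula such as the ones above contributes to a latent regression, it may help to expand one with base R's model.matrix(). This sketch assumes the formula is turned into a design matrix in a comparable way; the covs data frame is hypothetical, and the rescaling helpers are left out to keep the example self-contained.

```r
# Hypothetical person covariates: one row per person.
covs <- data.frame(male = c(0, 1, 1), anger = c(10, 20, 30))

# Expand a blank-left-side formula into a design matrix with an intercept,
# both main effects, and their interaction.
W <- model.matrix(~ 1 + male * anger, data = covs)

colnames(W)  # "(Intercept)" "male" "anger" "male:anger"
```

Each column of W corresponds to one regression coefficient, which is why the formula and the covariates need to be compatible with the priors that validate_regression checks.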
See Also
See labelled_integer for a means of creating appropriate inputs for ii and jj. See irt_stan to fit a model to the data list.