modelStatistics() R function from [ndl]

Calculate a range of goodness-of-fit measures for an object fitted with a multivariate statistical method that yields probability estimates for outcomes.

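modelStatistics calculates a range of goodness-of-fit measures from the observed outcomes, the predicted outcomes, and the matrix of estimated outcome probabilities.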


Usage

modelStatistics(observed, predicted, frequency=NA, p.values,
     n.data, n.predictors, outcomes=levels(as.factor(observed)),
     p.normalize=TRUE, cross.tabulation=TRUE,
     p.zero.correction=1/(NROW(p.values)*NCOL(p.values))^2)

Arguments

observed: observed values of the response variable
predicted: predicted values of the response variable; typically the outcome estimated to have the highest probability
frequency: frequencies of observed and predicted values; if NA, a frequency of 1 is assumed for every pair of observed and predicted values
p.values: matrix of probabilities for all values of the response variable (i.e. outcomes)
n.data: total frequency of the data points in the model
n.predictors: number of predictor levels in model
outcomes: a vector with the possible values of the response variable
p.normalize: if TRUE, probabilities are normalized so that the probabilities of all outcomes sum to 1 for each data point
cross.tabulation: if TRUE, statistics on the crosstabulation of observed and predicted response values are calculated with crosstableStatistics
p.zero.correction: a correction that slightly adjusts response/outcome-specific probability estimates that are exactly P=0; necessary for the proper calculation of pseudo-R-squared statistics; by default calculated from the dimensions of the probability matrix p.values.
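
The effect of p.normalize and p.zero.correction can be illustrated with a minimal base R sketch. The probability matrix below is hypothetical, and the order of the two steps (zeros replaced first, rows rescaled afterwards) is an assumption made for illustration, not a description of what modelStatistics does internally.

```r
# Hypothetical probability matrix: 3 data points (rows) x 2 outcomes (columns)
p.values <- matrix(c(0.6, 0.2,
                     0.0, 0.5,
                     0.3, 0.3),
                   nrow=3, byrow=TRUE,
                   dimnames=list(NULL, c("A", "B")))

# Default correction as given in the function signature
p.zero.correction <- 1/(NROW(p.values)*NCOL(p.values))^2

# Assumed behaviour: replace probabilities that are exactly 0,
# so that log(P) stays finite in the likelihood calculations
p.adjusted <- p.values
p.adjusted[p.adjusted == 0] <- p.zero.correction

# Assumed behaviour for p.normalize=TRUE: rescale each row to sum to 1
p.normalized <- p.adjusted / rowSums(p.adjusted)
```

Without the zero correction, a data point whose observed outcome has P=0 would contribute log(0) = -Inf to the log-likelihood, which is why the argument matters for the pseudo-R-squared statistics.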

Returns

A list with the following components:

loglikelihood.null: Loglikelihood for null model
loglikelihood.model: Loglikelihood for fitted model
deviance.null: Null deviance
deviance.model: Model deviance
R2.likelihood: (McFadden's) R-squared
R2.nagelkerke: Nagelkerke's R-squared
AIC.model: Akaike's Information Criterion
BIC.model: Bayesian Information Criterion
C: index of concordance C (for binary response variables only)
crosstable: Crosstabulation of observed and predicted outcomes, if cross.tabulation=TRUE
crosstableStatistics(crosstable): Various statistics calculated on crosstable with crosstableStatistics, if cross.tabulation=TRUE
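
The likelihood-based components are related to one another through standard textbook definitions, which the following base R sketch reproduces on hypothetical data. It assumes a null model that predicts each outcome with its overall relative frequency and uses n.predictors as the parameter count in AIC and BIC; whether modelStatistics follows exactly these conventions is not stated here, so treat the sketch as an illustration only.

```r
# Hypothetical observed outcomes and outcome probabilities (rows sum to 1)
observed <- c("A", "B", "A", "B", "A")
p.values <- matrix(c(0.8, 0.2,
                     0.3, 0.7,
                     0.6, 0.4,
                     0.4, 0.6,
                     0.9, 0.1),
                   ncol=2, byrow=TRUE,
                   dimnames=list(NULL, c("A", "B")))
n.data <- length(observed)
n.predictors <- 2   # hypothetical parameter count

# Model log-likelihood: log probability assigned to each observed outcome
idx <- cbind(seq_len(n.data), match(observed, colnames(p.values)))
loglikelihood.model <- sum(log(p.values[idx]))

# Null log-likelihood: every data point gets the overall outcome proportions
p.null <- prop.table(table(observed))
loglikelihood.null <- sum(log(as.numeric(p.null[observed])))

# Deviances are -2 times the corresponding log-likelihoods
deviance.null <- -2 * loglikelihood.null
deviance.model <- -2 * loglikelihood.model

# McFadden's and Nagelkerke's pseudo-R-squared
R2.likelihood <- 1 - loglikelihood.model / loglikelihood.null
R2.nagelkerke <- (1 - exp(-(deviance.null - deviance.model) / n.data)) /
   (1 - exp(-deviance.null / n.data))

# Information criteria with n.predictors as the number of parameters
AIC.model <- 2 * n.predictors - 2 * loglikelihood.model
BIC.model <- n.predictors * log(n.data) - 2 * loglikelihood.model
```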

References

Arppe, A. 2008. Univariate, bivariate and multivariate methods in corpus-based lexicography -- a study of synonymy. Publications of the Department of General Linguistics, University of Helsinki, No. 44. URN: http://urn.fi/URN:ISBN:978-952-10-5175-3.

Arppe, A., and Baayen, R. H. (in prep.) Statistical modeling and the principles of human learning.

Hosmer, David W., Jr., and Stanley Lemeshow 2000. Applied Logistic Regression (2nd edition). New York: Wiley.

Author(s)

Antti Arppe and Harald Baayen

Examples


library(ndl)
data(think)
think.ndl <- ndlClassify(Lexeme ~ Agent + Patient, data=think)

# Convert activations to outcome probabilities and predicted outcomes
think.probs <- acts2probs(think.ndl$activationMatrix)
probs <- think.probs$p
preds <- think.probs$predicted

# n.predictors is taken here as the number of cells in the weight matrix
n.data <- nrow(think)
n.predictors <- nrow(think.ndl$weightMatrix) *
   ncol(think.ndl$weightMatrix)
modelStatistics(observed=think$Lexeme, predicted=preds, p.values=probs,
   n.data=n.data, n.predictors=n.predictors)
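
The crosstable and C components listed under Returns can also be illustrated without fitting a model. The sketch below uses hypothetical binary data and the usual pairwise definition of the index of concordance (the share of positive/negative pairs in which the positive case receives the higher probability, with ties counted as one half); this definition is an assumption and may differ in detail from the implementation used by the package.

```r
# Hypothetical binary response and estimated probabilities of outcome "A"
observed <- c("A", "B", "A", "B", "A")
p.A <- c(0.8, 0.3, 0.45, 0.6, 0.9)

# Predicted outcome: the one with the higher estimated probability
predicted <- ifelse(p.A >= 0.5, "A", "B")

# Cross-tabulation of observed and predicted outcomes
crosstable <- table(observed, predicted)
accuracy <- sum(diag(crosstable)) / sum(crosstable)

# Index of concordance over all (observed "A", observed "B") pairs
pos <- p.A[observed == "A"]
neg <- p.A[observed == "B"]
diffs <- outer(pos, neg, "-")
C <- (sum(diffs > 0) + 0.5 * sum(diffs == 0)) / length(diffs)
```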

ndl package

Maintainer: Tino Sering
License: GPL-3
Last published: 2018-09-10
