Statistics and Data Sets for Corpus Frequency Data
Random samples from data frames (corpora)
Split string into words, similar to qw() in Perl (corpora)
Propagate vector to single-row or single-column matrix (corpora)
Compute association scores for collocation analysis (corpora)
P-values of the binomial test for frequency counts (corpora)
P-values of Pearson's chi-squared test for frequency comparisons (corp...
Pearson's chi-squared statistic for frequency comparisons (corpora)
Build contingency tables for frequency comparison (corpora)
corpora: Statistical Inference from Corpus Frequency Data
Colour palettes for linguistic visualization (corpora)
P-values of Fisher's exact test for frequency comparisons (corpora)
Compute best-practice keyness measures (corpora)
Confidence interval for proportion based on frequency counts (corpora)
Simulated census data for examples and illustrations (corpora)
Simulated study on effectiveness of language course (corpora)
Simulated type and token counts for Wikipedia articles (corpora)
Show p-values as significance stars (corpora)
P-values of the z-score test for frequency counts (corpora)
The z-score statistic for frequency counts (corpora)
Utility functions for the statistical analysis of corpus frequency data. This package is a companion to the open-source course "Statistical Inference: A Gentle Introduction for Computational Linguists and Similar Creatures" ('SIGIL').