Probabilistic Suffix Trees and Variable Length Markov Chains
Mining contexts
Plot single nodes of a probabilistic suffix tree
Empirical conditional probability distributions of order L
Generate sequences using a probabilistic suffix tree
Impute missing values using a probabilistic suffix tree
Log-Likelihood of a variable length Markov chain model
Extract the number of observations to which a VLMC model is fitted
Retrieve the node labels of a PST
Compute probabilistic divergence between two PST
Plot a PST
PST based pattern mining
Plotting a branch of a probabilistic suffix tree
Prediction quality plot
Compute the probability of categorical sequences using a probabilistic...
Print method for objects of class PSTf and PSTr
Prune a probabilistic suffix tree
Flat representation of a probabilistic suffix tree
Nested representation of a probabilistic suffix tree
Build a probabilistic suffix tree
Retrieve counts or next symbol probability distribution
Example sequence data set
Longitudinal data on self rated health
Extract a subtree from a segmented PST
Summary of variable length Markov chain model
AIC, AICc or BIC based model selection
Provides a framework for analysing state sequences with probabilistic suffix trees (PST), the data structure used to store variable length Markov chains (VLMC). Besides functions for learning and optimizing VLMC models, the PST package includes tools for analysing sequence data with these models: visualization, sequence prediction, generation of artificial sequences, and context and pattern mining. The package is tailored to the social sciences: VLMC models can be learned from sets of individual sequences that may contain missing values, and case weights are taken into account. The package can also compute the probabilistic divergence between two models and fit segmented VLMC, in which sub-models fitted to distinct strata of the learning sample are stored in a single PST. This software results from research carried out within the framework of the Swiss National Centre of Competence in Research LIVES, which is financed by the Swiss National Science Foundation. The authors are grateful to the Swiss National Science Foundation for its financial support.
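To make the idea of a variable length Markov chain concrete, the following is a minimal conceptual sketch in plain Python, not the package's own R interface. The function names `count_contexts` and `next_symbol_distribution` and the toy sequences are hypothetical and chosen only for illustration. The sketch counts which symbol follows each context of length 0 up to a maximum order, then predicts from the longest suffix of the history that was actually observed. It leaves out pruning, case weights, and missing-value handling.

```python
from collections import Counter, defaultdict


def count_contexts(sequences, max_order):
    """Count next-symbol occurrences after every context of length 0..max_order."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for i, symbol in enumerate(seq):
            # A context is the k symbols immediately preceding position i.
            for k in range(0, min(max_order, i) + 1):
                context = tuple(seq[i - k:i])
                counts[context][symbol] += 1
    return counts


def next_symbol_distribution(counts, history, max_order):
    """Return the longest observed suffix of `history` and its next-symbol distribution."""
    for k in range(min(max_order, len(history)), -1, -1):
        context = tuple(history[len(history) - k:])
        if context in counts:
            total = sum(counts[context].values())
            probs = {s: n / total for s, n in counts[context].items()}
            return context, probs
    raise ValueError("no training data available")


# Toy data (hypothetical): two short sequences over the alphabet {a, b}.
counts = count_contexts(["abab", "abba"], max_order=2)

# The context "ab" was observed, so the order-2 distribution is used.
print(next_symbol_distribution(counts, "ab", max_order=2))

# The context "aa" was never observed, so the lookup backs off to "a".
print(next_symbol_distribution(counts, "aa", max_order=2))
```

The second call shows the variable length aspect: the memory used for prediction depends on which contexts are present in the tree, so different histories can be predicted from contexts of different orders.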