prune function

Prune k-gram objects

Prune M-gram frequency tables or Stupid Back-Off prediction tables for an M-gram model to a smaller order N.

Usage

prune(object, N, ...)

## S3 method for class 'sbo_kgram_freqs'
prune(object, N, ...)

## S3 method for class 'sbo_predtable'
prune(object, N, ...)

Arguments

  • object: An object of class sbo_kgram_freqs or sbo_predtable.
  • N: A positive integer of length one. The N-gram order of the new object.
  • ...: further arguments passed to or from other methods.

Returns

An object of the same class as the input object.

Details

This generic function provides a helper to prune M-gram frequency tables or M-gram models, represented by sbo_kgram_freqs and sbo_predtable objects respectively, to objects of a smaller N-gram order, N < M. For k-gram frequency objects, the frequency tables for k > N are simply dropped. For sbo_predtable objects, the predictions coming from the nested N-gram model are retained instead. In both cases, all attributes other than the k-gram order (such as the corpus preprocessing function, or the lambda penalty in Stupid Back-Off training) are left unchanged.
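The idea of dropping the tables for k > N can be illustrated with a toy sketch in base R. The list layout and the names toy_freqs and toy_prune are hypothetical and chosen only for illustration; they do not reflect the internal structure of sbo_kgram_freqs objects or the package's actual implementation.

```r
# Toy stand-in for a 3-gram frequency object (hypothetical layout,
# NOT the internal representation used by the sbo package).
toy_freqs <- list(
  N = 3L,
  preprocess = tolower,  # stands in for the corpus preprocessing function
  counts = list(
    c("the" = 5L, "cat" = 2L),  # 1-gram counts
    c("the cat" = 2L),          # 2-gram counts
    c("the cat sat" = 1L)       # 3-gram counts
  )
)

# Hypothetical helper: keep the tables for k <= N, leave everything else as is.
toy_prune <- function(object, N) {
  stopifnot(is.numeric(N), length(N) == 1L, N >= 1, N < object$N)
  object$counts <- object$counts[seq_len(N)]  # drop tables for k > N
  object$N <- as.integer(N)
  object
}

pruned <- toy_prune(toy_freqs, N = 2)
length(pruned$counts)                              # number of tables retained
identical(pruned$preprocess, toy_freqs$preprocess) # other attributes untouched
```

In this sketch only the order and the list of count tables change; every other element of the object is carried over as is, which mirrors the behaviour described above.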

Examples

# Drop k-gram frequencies for k > 2
freqs <- twitter_freqs
summary(freqs)
freqs <- prune(freqs, N = 2)
summary(freqs)

# Extract a 2-gram model from a larger 3-gram model
pt <- twitter_predtable
summary(pt)
pt <- prune(pt, N = 2)
summary(pt)

Author(s)

Valerio Gherardi