weblmCalculateConditionalProbability function

Calculates the conditional probability that a word follows a sequence of words.

This function calculates the conditional probability that a particular word will follow a given sequence of words. The input string must be in ASCII format.

Internally, this function invokes the Microsoft Cognitive Services Web Language Model REST API documented at https://www.microsoft.com/cognitive-services/en-us/web-language-model-api/documentation.

You MUST have a valid Microsoft Cognitive Services account and an API key for this function to work properly. See https://www.microsoft.com/cognitive-services/en-us/pricing for details.

weblmCalculateConditionalProbability(precedingWords, continuations, modelToUse = "body", orderOfNgram = 5L)

Arguments

  • precedingWords: (character) Character string for which to calculate continuation probabilities. Must be in ASCII format.
  • continuations: (character vector) Vector of words following precedingWords for which to calculate conditional probabilities.
  • modelToUse: (character) Which language model to use, supported values: "title", "anchor", "query", or "body" (optional, default: "body")
  • orderOfNgram: (integer) Which order of N-gram to use, supported values: 1L, 2L, 3L, 4L, or 5L (optional, default: 5L)
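The constraints listed above can be checked locally before a request is sent. The following is a minimal sketch in base R, not part of the package: the helper names isAscii and checkWeblmArgs are hypothetical, and the function itself may perform its own validation.

```r
# Hypothetical helper (not part of the package): TRUE if every string in x
# contains only ASCII characters (code points 0-127).
isAscii <- function(x) {
  all(vapply(
    enc2utf8(x),
    function(s) all(utf8ToInt(s) <= 127L),
    logical(1)
  ))
}

# Hypothetical pre-flight check mirroring the documented argument constraints.
# Stops with an error if any constraint is violated.
checkWeblmArgs <- function(precedingWords, continuations,
                           modelToUse = "body", orderOfNgram = 5L) {
  stopifnot(
    is.character(precedingWords), length(precedingWords) == 1L,
    is.character(continuations), length(continuations) >= 1L,
    isAscii(precedingWords), isAscii(continuations),
    modelToUse %in% c("title", "anchor", "query", "body"),
    orderOfNgram %in% 1:5
  )
  invisible(TRUE)
}

# Passes silently for the arguments used in the Examples section
checkWeblmArgs("hello world wide", c("web", "range", "open"), "title", 4L)
```

Running such a check first avoids spending an API call on a request with an unsupported model name or non-ASCII input.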

Returns

An S3 object of class weblm. The results are stored in the results data frame inside this object. The data frame has one row per continuation word and holds the preceding words, the continuation word, and its log(probability) in the probability column.
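Because the probability column holds log-probabilities, converting them back to plain probabilities can make the continuations easier to compare. The sketch below builds a stand-in data frame from the values shown in the example output on this page instead of calling the API. The logarithm base is an assumption (base 10 here, exposed as a parameter) and should be confirmed against the Web Language Model API documentation. toProbability is a hypothetical helper name.

```r
# Stand-in for conditionalProbabilities$results, using the values shown in
# the example output on this page (no API call is made).
results <- data.frame(
  words = rep("hello world wide", 3),
  word = c("web", "range", "open"),
  probability = c(-0.32, -2.403, -2.97),
  stringsAsFactors = FALSE
)

# Hypothetical helper: convert log-probabilities to probabilities.
# ASSUMPTION: the service reports base-10 logarithms; pass base = exp(1)
# if they turn out to be natural logarithms.
toProbability <- function(logProb, base = 10) {
  base^logProb
}

results$p <- toProbability(results$probability)

# Rank continuations from most to least likely
results[order(results$p, decreasing = TRUE), c("word", "p")]
```

Whatever the base, the ordering of the continuations is unchanged, so ranking directly on the probability column gives the same result.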

Examples

## Not run:
library(pander)  # provides pandoc.table()

tryCatch({

  # Calculate conditional probability a particular word will follow a given sequence of words
  conditionalProbabilities <- weblmCalculateConditionalProbability(
    precedingWords = "hello world wide",       # ASCII only
    continuations = c("web", "range", "open"), # ASCII only
    modelToUse = "title",                      # "title"|"anchor"|"query"|"body"(default)
    orderOfNgram = 4L                          # 1L|2L|3L|4L|5L(default)
  )

  # Class and structure of conditionalProbabilities
  class(conditionalProbabilities)
  #> [1] "weblm"

  str(conditionalProbabilities, max.level = 1)
  #> List of 3
  #>  $ results:'data.frame': 3 obs. of 3 variables:
  #>  $ json   : chr "{"results":[{"words":"hello world wide","word":"web", __truncated__ }]}
  #>  $ request:List of 7
  #>   ..- attr(*, "class")= chr "request"
  #>  - attr(*, "class")= chr "weblm"

  # Print results
  pandoc.table(conditionalProbabilities$results)
  #> -------------------------------------
  #>      words         word   probability
  #> ---------------- ------ -------------
  #> hello world wide  web       -0.32
  #>
  #> hello world wide range     -2.403
  #>
  #> hello world wide  open      -2.97
  #> -------------------------------------

}, error = function(err) {

  # Print the message of the error caught by tryCatch
  message(conditionMessage(err))

})

## End(Not run)

Author(s)

Phil Ferriere pferriere@hotmail.com

  • Maintainer: Phil Ferriere
  • License: MIT + file LICENSE
  • Last published: 2016-06-15