weblmCalculateJointProbability function

Calculates the joint probability that a sequence of words will appear together.

This function calculates the joint probability that a particular sequence of words will appear together. The input string must be in ASCII format.

Internally, this function invokes the Microsoft Cognitive Services Web Language Model REST API documented at https://www.microsoft.com/cognitive-services/en-us/web-language-model-api/documentation.

You MUST have a valid Microsoft Cognitive Services account and an API key for this function to work properly. See https://www.microsoft.com/cognitive-services/en-us/pricing for details.

Usage

weblmCalculateJointProbability(inputWords, modelToUse = "body", orderOfNgram = 5L)

Arguments

  • inputWords: (character vector) Vector of character strings for which to calculate the joint probability. Must be in ASCII format.
  • modelToUse: (character) Which language model to use, supported values: "title", "anchor", "query", or "body" (optional, default: "body")
  • orderOfNgram: (integer) Which order of N-gram to use, supported values: 1L, 2L, 3L, 4L, or 5L (optional, default: 5L)
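The constraints listed above can be checked locally before a request is sent. The following is a minimal sketch, not part of the package: `checkJointProbabilityArgs` is a hypothetical helper name, and the package may already perform its own validation in a different way.

```r
# Hypothetical helper, not part of mscsweblm4r: checks the documented
# constraints on the three arguments before any request is sent.
checkJointProbabilityArgs <- function(inputWords,
                                      modelToUse = "body",
                                      orderOfNgram = 5L) {
  stopifnot(is.character(inputWords),
            length(inputWords) > 0L,
            !anyNA(inputWords))

  # A string is ASCII when every code point is below 128.
  isAscii <- vapply(
    inputWords,
    function(s) all(utf8ToInt(enc2utf8(s)) < 128L),
    logical(1),
    USE.NAMES = FALSE
  )
  if (!all(isAscii)) {
    stop("Non-ASCII input at position(s): ",
         paste(which(!isAscii), collapse = ", "))
  }

  # match.arg() stops with an error when the value matches none of the
  # supported models (it does accept unambiguous partial matches).
  modelToUse <- match.arg(modelToUse, c("title", "anchor", "query", "body"))

  if (!is.integer(orderOfNgram) || length(orderOfNgram) != 1L ||
      !(orderOfNgram %in% 1:5)) {
    stop("orderOfNgram must be one of 1L, 2L, 3L, 4L, or 5L")
  }

  invisible(TRUE)
}

# Passes silently: ASCII input, supported model, integer order.
checkJointProbabilityArgs(c("where is", "San Francisco"), "query", 4L)
```

Failing early on a non-ASCII string or an unsupported order avoids spending an API call on a request that the arguments section says is invalid.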

Returns

An S3 object of class weblm. The results are stored in the results data frame inside this object. That data frame has one row per input word sequence, with the sequence in the words column and its log(probability) in the probability column.
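Because the values are log-probabilities, they can be compared by simple addition and subtraction. The sketch below needs no API key: it rebuilds a data frame with the same two columns, using the values printed in the Examples section, and compares the joint log-probability of a phrase with the sum of the log-probabilities of its individual words. The variable names are illustrative only, and with a live call `jointProbabilities$results` would take the place of the hand-built `results`.

```r
# Values copied from the printed table in the Examples section; a live call
# would supply jointProbabilities$results instead.
results <- data.frame(
  words = c("where", "is", "san", "francisco",
            "where is", "san francisco", "where is san francisco"),
  probability = c(-3.378, -2.607, -3.292, -4.051, -3.961, -4.086, -7.998),
  stringsAsFactors = FALSE
)

# Named lookup: word sequence -> log-probability.
logProb <- setNames(results$probability, results$words)

phrase <- "where is san francisco"
unigrams <- strsplit(phrase, " ", fixed = TRUE)[[1]]

# If the words occurred independently, the joint log-probability would be
# the sum of the single-word log-probabilities.
independentLogProb <- sum(logProb[unigrams])

# A positive gap means the phrase is more likely than independence predicts.
gain <- logProb[[phrase]] - independentLogProb
print(gain)
```

With these values the single-word log-probabilities sum to -13.328, while the joint value is -7.998, a gap of 5.33 log units in favor of the words appearing together. The comparison does not depend on which logarithm base the service uses, since only differences of log values are involved.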

Examples

## Not run:
tryCatch({

  # Calculate joint probability a particular sequence of words will appear together
  jointProbabilities <- weblmCalculateJointProbability(
    inputWords = c("where", "is", "San", "Francisco",
                   "where is", "San Francisco",
                   "where is San Francisco"),  # ASCII only
    modelToUse = "query",                      # "title"|"anchor"|"query"|"body"(default)
    orderOfNgram = 4L                          # 1L|2L|3L|4L|5L(default)
  )

  # Class and structure of jointProbabilities
  class(jointProbabilities)
  #> [1] "weblm"

  str(jointProbabilities, max.level = 1)
  #> List of 3
  #>  $ results:'data.frame': 7 obs. of 2 variables:
  #>  $ json   : chr "{"results":[{"words":"where","probability":-3.378}, __truncated__ ]}
  #>  $ request:List of 7
  #>   ..- attr(*, "class")= chr "request"
  #>  - attr(*, "class")= chr "weblm"

  # Print results (pandoc.table() is provided by the pander package)
  pandoc.table(jointProbabilities$results)
  #> ------------------------------------
  #>         words           probability
  #> ---------------------- -------------
  #>         where             -3.378
  #>
  #>           is              -2.607
  #>
  #>          san              -3.292
  #>
  #>       francisco           -4.051
  #>
  #>        where is           -3.961
  #>
  #>     san francisco         -4.086
  #>
  #> where is san francisco    -7.998
  #> ------------------------------------

}, error = function(err) {

  # Print error
  geterrmessage()

})
## End(Not run)

Author(s)

Phil Ferriere pferriere@hotmail.com

  • Maintainer: Phil Ferriere
  • License: MIT + file LICENSE
  • Last published: 2016-06-15