Compute element-wise string distances between two H2OFrames
Computes element-wise string distances between two H2OFrames. Both frames must have the same shape (N x M) and contain only string or factor columns. Returns an H2OFrame of shape N x M.
h2o.stringdist(x, y, method = c("lv", "lcs", "qgram", "jaccard", "jw", "soundex"), compare_empty = TRUE)
Arguments
x: An H2OFrame
y: A comparison H2OFrame
method: A string identifying the string distance measure to use. Must be one of:
    "lv" - Levenshtein distance
    "lcs" - Longest common substring distance
    "qgram" - q-gram distance
    "jaccard" - Jaccard distance between q-gram profiles
    "jw" - Jaro or Jaro-Winkler distance
    "soundex" - Distance based on Soundex encoding
compare_empty: If set to FALSE, empty strings are handled as NaNs. Defaults to TRUE.
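To make "element-wise" concrete, here is a minimal sketch in base R that compares the i-th string of one vector with the i-th string of another using Levenshtein distance, the measure selected by "lv". It uses utils::adist rather than H2O, so it runs without a cluster. It is only an illustration of the measure: the values h2o.stringdist returns may be scaled or normalized differently, and the variable names a, b and lv are illustrative.

```r
# Base R illustration only; this does not call H2O.
a <- c("Martha", "Dwayne", "Dixon")
b <- c("Marhta", "Duane", "Dicksonx")

# adist() returns the full pairwise Levenshtein matrix (every a[i] vs every b[j]);
# its diagonal holds the element-wise distances a[i] vs b[i].
lv <- diag(utils::adist(a, b))
print(lv)
```

h2o.stringdist applies the same pairing across every cell of the two frames, which is why both frames must have the same shape.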
Examples
## Not run:
h2o.init()
x <- as.h2o(c("Martha", "Dwayne", "Dixon"))
y <- as.character(as.h2o(c("Marhta", "Duane", "Dicksonx")))
h2o.stringdist(x, y, method = "jw")
## End(Not run)