Fast Fuzzy String Joins for Data Frames
Fuzzy anti join
Fuzzy full join
Fuzzy inner join
Fuzzy left join
Fuzzy right join
Fuzzy semi join
Fuzzy join backend using 'data.table' + 'C++' row binding
Join two tables based on fuzzy string matching
fuzzystring: Fast Fuzzy String Joins for Data Frames
Perform fuzzy joins on data frames using approximate string matching. Implements all standard join types (inner, left, right, full, semi, anti) and supports multiple string distance metrics from the 'stringdist' package, including Levenshtein, Damerau-Levenshtein, Jaro-Winkler, and Soundex. Features a high-performance 'data.table' backend with 'C++' row binding for efficient processing of large datasets. Well suited to matching misspellings, inconsistent labels, and messy user input, or to reconciling datasets whose identifiers vary slightly. Optionally returns the computed distance for each matched pair alongside the matched records.
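To make the idea of a fuzzy inner join concrete, here is a minimal base R sketch. It does not use the package's own functions or its 'data.table' backend: the helper name `fuzzy_inner_join_sketch()`, its arguments, and the example data are hypothetical, and the generalized Levenshtein distance from `utils::adist()` stands in for the 'stringdist' metrics named above.

```r
# Conceptual sketch only: NOT the fuzzystring API.
# fuzzy_inner_join_sketch() and its arguments are hypothetical names.
fuzzy_inner_join_sketch <- function(x, y, by_x, by_y, max_dist = 2) {
  # Pairwise generalized Levenshtein distances: rows index x, columns index y.
  d <- utils::adist(x[[by_x]], y[[by_y]])

  # Keep every (row of x, row of y) pair whose distance is within the threshold.
  hits <- which(d <= max_dist, arr.ind = TRUE)
  hits <- hits[order(hits[, "row"], hits[, "col"]), , drop = FALSE]

  # Bind the matching rows side by side and attach the distance for each pair.
  out <- cbind(
    x[hits[, "row"], , drop = FALSE],
    y[hits[, "col"], , drop = FALSE]
  )
  out$distance <- d[hits]
  rownames(out) <- NULL
  out
}

# Made-up example data with small spelling differences.
customers <- data.frame(
  name = c("Jonathan", "Maria", "Stephen"),
  id = 1:3,
  stringsAsFactors = FALSE
)
orders <- data.frame(
  customer = c("Jonathon", "Marla", "Zoe"),
  amount = c(10, 25, 40),
  stringsAsFactors = FALSE
)

matched <- fuzzy_inner_join_sketch(customers, orders, "name", "customer", max_dist = 1)
print(matched)
```

With a threshold of one edit, "Jonathan" pairs with "Jonathon" and "Maria" with "Marla", while "Stephen" and "Zoe" have no partner and are dropped, as an inner join would. The sketch compares every pair of rows, so it only suits small inputs; it is meant to show the matching rule, not the performance characteristics described above.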