Soundex was designed for a very specific purpose. It is very culture-dependent a...

Declanomous · on Oct 21, 2016

Soundex works fine as part of a larger process, especially when combined with other kinds of normalization. You still need a human to make the final judgement on matches. Over the course of a year I have to match 100k names against a database of 850k people. Soundex is great for flagging names that might match, or for flagging matches that might be incorrect. I use Soundex in combination with NYSIIS, Double Metaphone, lists of commonly confused names, etc. Before I created our current matching process, we were creating approximately 5-10k duplicate records a year.

Quick edit: Our data sources are handwritten and typed names, often transcribed by a second party. So algorithms that detect transposition errors as well as phonetic errors are really helpful.
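
To illustrate the transposition point: the comment does not say which algorithm is used, so the following is only a sketch of one common choice, the optimal string alignment (restricted Damerau-Levenshtein) distance, which counts a swap of two adjacent characters as a single edit. The names `osa_distance` and `flag_candidates`, and the `max_dist` threshold, are hypothetical.

```python
def osa_distance(a: str, b: str) -> int:
    """Optimal string alignment distance between a and b.

    Counts insertions, deletions, substitutions, and swaps of two
    adjacent characters, each as one edit.
    """
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all of a[:i]
    for j in range(cols):
        d[0][j] = j  # insert all of b[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            # Adjacent transposition, e.g. "HT" vs "TH".
            if (i > 1 and j > 1
                    and a[i - 1] == b[j - 2]
                    and a[i - 2] == b[j - 1]):
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


def flag_candidates(name, known_names, max_dist=1):
    """Return known names within max_dist edits of name (case-insensitive).

    The threshold is an assumption; a human still reviews the flagged names.
    """
    key = name.upper()
    return [k for k in known_names if osa_distance(key, k.upper()) <= max_dist]
```

For example, `osa_distance("MARTHA", "MARHTA")` is 1, because the swapped `TH`/`HT` counts as one edit rather than two substitutions.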

trentnelson · on Oct 22, 2016

I've used a Python implementation of soundex() in a production data mining app to help resolve things like ECQUADOR->ECUADOR. Worked well (as an entity resolution mechanism among many others).
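
For reference, a minimal sketch of American Soundex in Python. This is not the implementation mentioned above, just one way to write the standard rules: keep the first letter, encode the remaining consonants as digits, collapse adjacent letters with the same code, treat H and W as transparent, and pad or truncate to four characters.

```python
# Consonant groups and their Soundex digits. Vowels, Y, H and W have no code.
_CODES = {}
for _letters, _digit in (
    ("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
    ("L", "4"), ("MN", "5"), ("R", "6"),
):
    for _ch in _letters:
        _CODES[_ch] = _digit


def soundex(name: str) -> str:
    """Return the 4-character American Soundex key, or "" if no A-Z letters."""
    letters = [c for c in name.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    first = letters[0]
    digits = []
    prev = _CODES.get(first, "")  # the first letter's code also suppresses repeats
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate letters with the same code
        code = _CODES.get(ch, "")
        if code and code != prev:
            digits.append(code)
        prev = code  # a vowel resets prev, so a repeated code after it is kept
    return (first + "".join(digits) + "000")[:4]
```

With this sketch, `soundex("ECQUADOR")` and `soundex("ECUADOR")` both give `E236`, since C and Q share a code and collapse into one digit, so the misspelling lands in the same bucket as the correct name.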