A Daitch-Mokotoff Soundex Function for R

April 2021

An online paper by team members Anton Perdoncin & Pierre Mercklé


Aidelman, Ajdelman, Edelman, Ejdelman; Morgenstern, Morgensztern, Morgiensztern; Raizl, Rachel, Ruchla, Rajzla, Rechla; Leibush, Lejbus, Lejbusz: within each of these four lists, the patronyms or first names sound alike but are spelled differently. How can the phonetic correspondence between orthographic variants of the same name be detected automatically?

Anton Perdoncin & Pierre Mercklé propose an R function that converts names into Soundex codes according to the Daitch-Mokotoff rules.

The function is available in datatools, a new R package (under development) that can be downloaded from GitHub: https://github.com/pmerckle/datatools.
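
As a minimal sketch of how one might try the package, the snippet below installs it from the GitHub repository named above and lists what it exports. It assumes the repository is laid out as a standard R package that the remotes package can install. The name of the Daitch-Mokotoff function is not given here, so the sketch lists the exported objects rather than guessing it.

```r
# Assumption: the repository is a standard R package installable via remotes.
if (!requireNamespace("remotes", quietly = TRUE)) {
  install.packages("remotes")
}

# Install the development version from the GitHub repository cited above.
remotes::install_github("pmerckle/datatools")

# Attach the package and list its exported objects to locate
# the Daitch-Mokotoff Soundex function.
library(datatools)
ls("package:datatools")
```

Because the package is under development, the exported names may change between versions; the package's own help pages are the reference for the function's arguments.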

