A Probabilistic Model for Learning Concatenative Morphology

Part of Advances in Neural Information Processing Systems 15 (NIPS 2002)

Authors

Matthew G. Snover, Michael R. Brent

Abstract

This paper describes a system for the unsupervised learning of morphological suffixes and stems from word lists. The system is composed of a generative probability model and hill-climbing and directed search algorithms. By extracting and examining morphologically rich subsets of an input lexicon, the directed search identifies highly productive paradigms. The hill-climbing algorithm then further maximizes the probability of the hypothesis. Quantitative results are shown by measuring the accuracy of the morphological relations identified. Experiments in English and Polish, as well as comparisons with another recent unsupervised morphology learning algorithm, demonstrate the effectiveness of this technique.
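
To make the notion of a paradigm concrete, the following is a minimal Python sketch that groups stems by the set of suffixes they take in a small word list. It is an illustration only, not the authors' generative model or search procedure: the function name `candidate_paradigms`, the `max_suffix_len` cutoff, and the toy word list are all assumptions introduced here.

```python
from collections import defaultdict

def candidate_paradigms(words, max_suffix_len=3):
    """Group stems by the set of suffixes they occur with (hypothetical helper)."""
    lexicon = set(words)
    suffixes_by_stem = defaultdict(set)
    for word in lexicon:
        # Try every split into stem + suffix; "" stands for the empty suffix.
        first_cut = max(1, len(word) - max_suffix_len)
        for cut in range(first_cut, len(word) + 1):
            stem, suffix = word[:cut], word[cut:]
            suffixes_by_stem[stem].add(suffix)

    stems_by_paradigm = defaultdict(set)
    for stem, suffixes in suffixes_by_stem.items():
        # Keep only stems observed with at least two distinct suffixes.
        if len(suffixes) > 1:
            stems_by_paradigm[frozenset(suffixes)].add(stem)
    return stems_by_paradigm

# Toy lexicon (an assumption, not data from the paper).
words = ["walk", "walks", "walked", "talk", "talks", "talked", "jump", "jumps"]
paradigms = candidate_paradigms(words)
for suffixes, stems in paradigms.items():
    print(sorted(suffixes), sorted(stems))
```

Exhaustive splitting of this kind also produces spurious groupings, such as the stem "wal" with the suffixes "k", "ks", and "ked". Ranking competing analyses is the role a probability model and search would play; this sketch makes no attempt at that.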