Part of Advances in Neural Information Processing Systems 9 (NIPS 1996)
David J. Miller, Hasan Uyar
We address statistical classifier design given a mixed training set consisting of a small labelled feature set and a (generally larger) set of unlabelled features. This situation arises, e.g., for medical images, where although training features may be plentiful, expensive expertise is required to extract their class labels. We propose a classifier structure and learning algorithm that make effective use of unlabelled data to improve performance. The learning is based on maximization of the total data likelihood, i.e., over both the labelled and unlabelled data subsets. Two distinct EM learning algorithms are proposed, differing in the EM formalism applied for unlabelled data. The classifier, based on a joint probability model for features and labels, is a "mixture of experts" structure that is equivalent to the radial basis function (RBF) classifier, but unlike RBFs, is amenable to likelihood-based training. The scope of application for the new method is greatly extended by the observation that test data, or any new data to classify, is in fact additional unlabelled data. Thus, a combined learning/classification operation, much akin to what is done in image segmentation, can be invoked whenever there is new data to classify. Experiments with data sets from the UC Irvine database demonstrate that the new learning algorithms and structure achieve substantial performance gains over alternative approaches.
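To make the idea of maximizing the total data likelihood over both subsets concrete, the following is a minimal sketch and not the authors' algorithm. It assumes diagonal-covariance Gaussian mixture components, each carrying its own class-probability table, and uses a single EM variant in which only the component assignment is treated as missing; the abstract does not specify these details. All names (`fit_semi_supervised`, `predict`, `n_comp`, `alpha`, `beta`) are hypothetical.

```python
import numpy as np

def log_gauss_diag(X, mu, var):
    # log N(x | mu_k, diag(var_k)) for every point and component -> shape (N, K)
    diff = X[:, None, :] - mu[None, :, :]
    return -0.5 * (np.sum(diff ** 2 / var[None], axis=2)
                   + np.sum(np.log(2 * np.pi * var), axis=1)[None, :])

def posteriors(X, alpha, mu, var, extra=None):
    # Component posteriors; `extra` optionally adds a per-point log term (N, K).
    log_r = np.log(alpha)[None, :] + log_gauss_diag(X, mu, var)
    if extra is not None:
        log_r[:len(extra)] += extra
    log_r -= log_r.max(axis=1, keepdims=True)  # stabilise before exponentiating
    r = np.exp(log_r)
    return r / r.sum(axis=1, keepdims=True)

def fit_semi_supervised(X_lab, y_lab, X_unl, n_comp, n_classes, n_iter=50):
    X = np.vstack([X_lab, X_unl])  # labelled rows first, then unlabelled
    N, n_lab = X.shape[0], len(X_lab)
    # Farthest-point initialisation of the means (an assumption of this sketch).
    idx = [0]
    for _ in range(n_comp - 1):
        d = np.min(((X[:, None, :] - X[idx][None]) ** 2).sum(-1), axis=1)
        idx.append(int(np.argmax(d)))
    mu = X[idx].copy()
    var = np.tile(X.var(axis=0) + 1e-6, (n_comp, 1))
    alpha = np.full(n_comp, 1.0 / n_comp)                 # mixing weights
    beta = np.full((n_comp, n_classes), 1.0 / n_classes)  # P(class | component)
    onehot = np.eye(n_classes)[y_lab]
    for _ in range(n_iter):
        # E-step: labelled points also weight each component by P(label | component).
        label_term = np.log(beta[:, y_lab].T + 1e-12)
        r = posteriors(X, alpha, mu, var, extra=label_term)
        # M-step: all points update the mixture; only labelled points update beta.
        Nk = r.sum(axis=0) + 1e-12
        alpha = Nk / N
        mu = (r.T @ X) / Nk[:, None]
        diff2 = (X[:, None, :] - mu[None]) ** 2
        var = np.einsum('nk,nkd->kd', r, diff2) / Nk[:, None] + 1e-6
        beta = r[:n_lab].T @ onehot + 1e-6
        beta /= beta.sum(axis=1, keepdims=True)
    return alpha, mu, var, beta

def predict(X, alpha, mu, var, beta):
    # P(class | x) = sum_k P(k | x) * P(class | k); return the most probable class.
    return np.argmax(posteriors(X, alpha, mu, var) @ beta, axis=1)
```

In this sketch the unlabelled points shape the component means, variances, and mixing weights, while the few labelled points determine how components map to classes. Because new data to classify is itself unlabelled, it could be appended to `X_unl` and the model refit before calling `predict`, in the spirit of the combined learning/classification operation described above.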