Part of Advances in Neural Information Processing Systems 13 (NIPS 2000)
Thore Graepel, Ralf Herbrich
We present an algorithm that samples the hypothesis space of kernel classifiers. A uniform prior over normalised weight vectors, combined with a likelihood based on a model of label noise, leads to a piecewise constant posterior that can be sampled by the kernel Gibbs sampler (KGS). The KGS is a Markov chain Monte Carlo method that chooses a random direction in parameter space and samples from the resulting piecewise constant density along the chosen line. The KGS can be used as an analytical tool for the exploration of Bayesian transduction, Bayes point machines, active learning, and evidence-based model selection on small data sets that are contaminated with label noise. For a simple toy example, we demonstrate experimentally how a Bayes point machine based on the KGS outperforms an SVM that is incapable of taking label noise into account.
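To make the sampling step concrete, the following is a minimal sketch of one update of this kind, not the authors' implementation. It rests on several assumptions that the abstract does not state: a linear kernel (so the weight vector lives in input space), a posterior height of exp(-noise_penalty * number_of_training_errors) on each constant piece, and a great circle on the unit sphere standing in for the "line" so that the weight vector stays normalised. The names kgs_step, bayes_point and noise_penalty are hypothetical.

```python
import math
import random


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def kgs_step(w, X, y, noise_penalty, rng):
    """One update of a unit weight vector w (illustrative sketch only)."""
    # Random unit direction v orthogonal to w.
    g = [rng.gauss(0.0, 1.0) for _ in w]
    proj = dot(g, w)
    v = [gi - proj * wi for gi, wi in zip(g, w)]
    norm_v = math.sqrt(dot(v, v))
    v = [vi / norm_v for vi in v]

    # On the circle w(t) = cos(t) w + sin(t) v, the signed margin of
    # example i is a_i cos(t) + b_i sin(t); it changes sign at two
    # angles that lie pi apart.
    a = [yi * dot(xi, w) for xi, yi in zip(X, y)]
    b = [yi * dot(xi, v) for xi, yi in zip(X, y)]
    roots = [math.atan2(-ai, bi) % math.pi for ai, bi in zip(a, b)]
    edges = sorted([0.0, 2 * math.pi] + roots + [r + math.pi for r in roots])

    # The error count is constant between consecutive edges, so the
    # density is piecewise constant; evaluate it at segment midpoints.
    masses = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        mid = 0.5 * (lo + hi)
        errors = sum(
            ai * math.cos(mid) + bi * math.sin(mid) <= 0
            for ai, bi in zip(a, b)
        )
        # Assumed label-noise model: each error scales the height
        # by exp(-noise_penalty).
        masses.append((hi - lo) * math.exp(-noise_penalty * errors))

    # Pick a segment in proportion to its mass, then a uniform angle in it.
    k = rng.choices(range(len(masses)), weights=masses)[0]
    t = rng.uniform(edges[k], edges[k + 1])
    w_new = [math.cos(t) * wi + math.sin(t) * vi for wi, vi in zip(w, v)]
    norm = math.sqrt(dot(w_new, w_new))
    return [wi / norm for wi in w_new]


def bayes_point(samples):
    """Normalised average of sampled weight vectors (centre-of-mass estimate)."""
    mean = [sum(col) / len(samples) for col in zip(*samples)]
    norm = math.sqrt(dot(mean, mean))
    return [m / norm for m in mean]
```

A chain would be run by calling kgs_step repeatedly from some unit vector and passing the collected samples to bayes_point. A small noise_penalty tolerates mislabelled points, while a large one concentrates the samples on weight vectors that make few training errors.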