Griff Bilbro, David van den Bout
We apply the theory of Tishby, Levin, and Solla (TLS) to two problems. First we analyze an elementary problem for which we find the predictions consistent with conventional statistical results. Second we numerically examine the more realistic problem of training a competitive net to learn a probability density from samples. We find TLS useful for predicting average training behavior.
1 TLS APPLIED TO LEARNING DENSITIES
Recently a theory of learning has been constructed which describes the learning of a relation from examples (Tishby, Levin, and Solla, 1989; Schwartz, Samalam, Solla, and Denker, 1990). The original derivation relies on a statistical mechanics treatment of the probability of independent events in a system with a specified average value of an additive error function.
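In sketch form, with notation assumed here rather than taken from this excerpt, constraining the average of an additive error $E(w,T) = \sum_{i=1}^{m} \epsilon(w, z_i)$ over $m$ independent examples and maximizing entropy yields a Gibbs distribution over the parameters:

$$P(w \mid T) \;=\; \frac{P^{(0)}(w)\, e^{-\beta E(w,T)}}{Z(T)},$$

where $P^{(0)}$ is the prior over parameter space, $\beta$ is the Lagrange multiplier that fixes the average error, and $Z(T)$ normalizes.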
The resulting theory is not restricted to learning relations, and it is not essentially statistical mechanical. The TLS theory can be derived from the principle of maximum entropy, a general inference tool which produces probabilities characterized by certain values of the averages of specified functions (Jaynes, 1979). A TLS theory can be constructed whenever the specified function is additive and associated with independent examples.

In this paper we treat the problem of learning a probability density from samples. Consider the model as some function $p(z \mid w)$ of fixed form with adjustable parameters $w$, which are to be chosen to approximate $\bar{p}(z)$, where the overbar denotes the true density. All we know about $\bar{p}$ are the elements of a training set $T$ which are drawn independently from it.
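As a concrete, minimal sketch of this setup (not the competitive-net experiment of this paper; the model, data, and constants below are illustrative assumptions), one can fit an adjustable density $p(z \mid w)$ to samples from an unknown $\bar{p}$ by maximum likelihood, here with a two-component Gaussian mixture trained by EM:

    import numpy as np

    rng = np.random.default_rng(0)

    # Training set T: samples from a (here synthetic) true density p-bar.
    T = np.concatenate([rng.normal(-2.0, 0.5, 300),
                        rng.normal(1.5, 1.0, 700)])

    K = 2                                   # mixture components
    w_pi = np.full(K, 1.0 / K)              # mixing weights
    w_mu = rng.choice(T, K, replace=False)  # component means
    w_sig = np.full(K, T.std())             # component std devs

    def gauss(z, mu, sig):
        return np.exp(-0.5 * ((z - mu) / sig) ** 2) / (sig * np.sqrt(2 * np.pi))

    for _ in range(100):                    # EM iterations
        # E step: responsibilities r[n, k] = P(component k | sample n, w)
        r = w_pi * gauss(T[:, None], w_mu, w_sig)
        r /= r.sum(axis=1, keepdims=True)
        # M step: re-estimate w to increase the average log-likelihood
        Nk = r.sum(axis=0)
        w_pi = Nk / len(T)
        w_mu = (r * T[:, None]).sum(axis=0) / Nk
        w_sig = np.sqrt((r * (T[:, None] - w_mu) ** 2).sum(axis=0) / Nk)

    ll = np.log((w_pi * gauss(T[:, None], w_mu, w_sig)).sum(axis=1)).mean()
    print("means:", w_mu, "stds:", w_sig, "weights:", w_pi)
    print(f"average log-likelihood: {ll:.3f}")

In TLS terms the additive error here is the negative log-likelihood, $\epsilon(w, z_i) = -\log p(z_i \mid w)$, whose average over $T$ each EM iteration does not increase.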