Part of Advances in Neural Information Processing Systems 18 (NIPS 2005)
Yoshua Bengio, Nicolas Le Roux, Pascal Vincent, Olivier Delalleau, Patrice Marcotte
Convexity has recently received a lot of attention in the machine learning community, and the lack of convexity has been seen as a major disadvantage of many learning algorithms, such as multi-layer artificial neural networks. We show that training multi-layer neural networks in which the number of hidden units is learned can be viewed as a convex optimization problem. This problem involves an infinite number of variables, but it can be solved by incrementally inserting one hidden unit at a time, each time finding a linear classifier that minimizes a weighted sum of errors.
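The incremental procedure described above can be illustrated with a minimal sketch. It is not the authors' implementation, and several choices are assumptions made only for illustration: a squared loss, sign-threshold hidden units, an unregularized least-squares refit of the output weights after each insertion, and a random search over hyperplanes (the hypothetical helper `fit_weighted_classifier`) standing in for the step that finds a linear classifier minimizing a weighted sum of errors.

```python
import numpy as np


def hidden_unit(X, w, b):
    """Sign-threshold hidden unit: +1 or -1 for each row of X."""
    return np.where(X @ w + b >= 0, 1.0, -1.0)


def fit_weighted_classifier(X, targets, weights, rng, n_candidates=200):
    """Hypothetical stand-in for the weighted-error minimizer.

    Random search over hyperplanes; returns the candidate (w, b) with the
    lowest weighted sum of classification errors. This is an assumption of
    the sketch, not an exact solver.
    """
    best, best_err = None, np.inf
    for _ in range(n_candidates):
        w = rng.normal(size=X.shape[1])
        # Place the hyperplane through the midpoint of two random examples.
        i, j = rng.integers(len(X), size=2)
        b = -w @ (X[i] + X[j]) / 2.0
        err = np.sum(weights * (hidden_unit(X, w, b) != targets))
        if err < best_err:
            best, best_err = (w, b), err
    return best


def fit_incremental(X, y, n_units=10, seed=0):
    """Insert hidden units one at a time (squared loss assumed)."""
    rng = np.random.default_rng(seed)
    H = np.ones((len(X), 1))          # column of ones acts as output bias
    alpha = np.array([y.mean()])      # output weights, bias only at first
    units = []
    mse_history = [np.mean((y - H @ alpha) ** 2)]
    for _ in range(n_units):
        # Negative gradient of the squared loss w.r.t. the current output.
        residual = y - H @ alpha
        # Its sign gives the targets; its magnitude gives the error weights.
        targets = np.where(residual >= 0, 1.0, -1.0)
        w, b = fit_weighted_classifier(X, targets, np.abs(residual), rng)
        units.append((w, b))
        H = np.column_stack([H, hidden_unit(X, w, b)])
        # Refit all output weights; convex for a fixed set of hidden units.
        alpha, *_ = np.linalg.lstsq(H, y, rcond=None)
        mse_history.append(np.mean((y - H @ alpha) ** 2))
    return units, alpha, mse_history
```

Because the output weights are refit by least squares after every insertion, the training error in this sketch cannot increase as units are added. A regularized or non-squared convex loss would change the refit step and the weighting of examples, but not the overall loop.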