Spectral Relaxation for K-means Clustering

Part of Advances in Neural Information Processing Systems 14 (NIPS 2001)

Bibtex Metadata Paper

Authors

Hongyuan Zha, Xiaofeng He, Chris Ding, Ming Gu, Horst D. Simon

Abstract

The popular K-means clustering partitions a data set by minimiz(cid:173) ing a sum-of-squares cost function. A coordinate descend method is then used to find local minima. In this paper we show that the minimization can be reformulated as a trace maximization problem associated with the Gram matrix of the data vectors. Furthermore, we show that a relaxed version of the trace maximization problem possesses global optimal solutions which can be obtained by com(cid:173) puting a partial eigendecomposition of the Gram matrix, and the cluster assignment for each data vectors can be found by comput(cid:173) ing a pivoted QR decomposition of the eigenvector matrix. As a by-product we also derive a lower bound for the minimum of the sum-of-squares cost function.