Part of Advances in Neural Information Processing Systems 11 (NIPS 1998)
Very Fast EM-based Mixture Model Clustering using Multiresolution kd-trees

Andrew Moore
Abstract

Clustering is important in many fields including manufacturing, biology, finance, and astronomy. Mixture models are a popular approach due to their statistical foundations, and EM is a very popular method for finding mixture models. EM, however, requires many accesses of the data, and thus has been dismissed as impractical (e.g. [9]) for data mining of enormous datasets. We present a new algorithm, based on the multiresolution kd-trees of [5], which dramatically reduces the cost of EM-based clustering, with savings rising linearly with the number of datapoints. Although presented here for maximum likelihood estimation of Gaussian mixture models, it is also applicable to non-Gaussian models (provided class densities are monotonic in Mahalanobis distance), mixed categorical/numeric clusters, and Bayesian methods such as Autoclass [1].
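As a concrete illustration of the data structure the abstract refers to, the following is a minimal Python sketch of a multiresolution kd-tree node that caches sufficient statistics (count, vector sum, and sum of outer products) for all points beneath it, so that EM can process an entire node in one step rather than touching each datapoint. The class name, splitting rule, and leaf-size threshold are illustrative assumptions, not the paper's exact implementation.

```python
import numpy as np

class MrkdNode:
    """Hypothetical mrkd-tree node caching sufficient statistics for its points."""
    def __init__(self, points, min_count=8):
        self.count = points.shape[0]           # number of points under this node
        self.sum = points.sum(axis=0)          # sum of the points
        self.sumsq = points.T @ points         # sum of outer products x x^T
        self.lo = points.min(axis=0)           # bounding-box corners
        self.hi = points.max(axis=0)
        self.left = self.right = None
        if self.count > min_count:             # split on the widest dimension
            dim = int(np.argmax(self.hi - self.lo))
            median = np.median(points[:, dim])
            mask = points[:, dim] <= median
            if mask.any() and (~mask).any():   # avoid degenerate splits
                self.left = MrkdNode(points[mask], min_count)
                self.right = MrkdNode(points[~mask], min_count)

# Example: build a tree over 10,000 random 2-D points.
root = MrkdNode(np.random.randn(10000, 2))
```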
1 Learning Mixture Models

In a Gaussian mixture model (e.g. [3]), we assume that datapoints $\{x_1 \ldots x_R\}$ have been generated independently by the following model: $P(x \mid \theta) = \sum_{j=1}^{N} p_j \, P(x \mid c_j, \theta)$, where $\theta$ denotes all the parameters of the mixture: the class probabilities $p_j$ (where $p_j = P(c_j \mid \theta)$), the class centers $\mu_j$ and the class covariances $\Sigma_j$. The job of a mixture model learner is to find a good estimate of the model, and Expectation Maximization (EM), also known as "Fuzzy $k$-means", is a popular algorithm for doing so.
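To make the model and EM's E-step concrete, here is a minimal Python sketch (function names are illustrative, not from the paper) of the mixture density $P(x \mid \theta) = \sum_j p_j \, P(x \mid c_j, \theta)$ and of the "fuzzy" class responsibilities $P(c_j \mid x_i, \theta)$ that EM computes for every datapoint.

```python
import numpy as np
from scipy.stats import multivariate_normal

def mixture_density(x, p, mu, sigma):
    """P(x | theta) = sum_j p_j N(x; mu_j, Sigma_j) for one point x."""
    return sum(p[j] * multivariate_normal.pdf(x, mean=mu[j], cov=sigma[j])
               for j in range(len(p)))

def e_step(X, p, mu, sigma):
    """Responsibilities w[i, j] = P(c_j | x_i, theta) for all R points in X."""
    R, N = X.shape[0], len(p)
    w = np.empty((R, N))
    for j in range(N):
        w[:, j] = p[j] * multivariate_normal.pdf(X, mean=mu[j], cov=sigma[j])
    return w / w.sum(axis=1, keepdims=True)    # normalize over classes
```

In plain EM, every iteration evaluates `e_step` over all $R$ datapoints; this per-point cost is precisely what the kd-tree-based algorithm amortizes over whole nodes.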