Gaussian parsimonious clustering models

Abstract

Gaussian clustering models are useful both for understanding clustering criteria and for suggesting powerful new ones. Banfield and Raftery, Biometrics 49, 803–821 (1993), considered a parameterization of the variance matrix Σk of a cluster Pk in terms of its eigenvalue decomposition, Σk = λk Dk Ak Dk′, where λk defines the volume of Pk, Dk is an orthogonal matrix that defines its orientation, and Ak is a diagonal matrix with determinant 1 that defines its shape. This parameterization allows us to propose many general clustering criteria, from the simplest one (spherical clusters with equal volumes, which leads to the classical k-means criterion) to the most complex one (unknown and different volumes, orientations and shapes for all clusters). Optimization methods for deriving the maximum likelihood estimates are discussed, as is the practical usefulness of these models. We analyse in particular the influence of the cluster volumes. We report Monte Carlo simulations and an application to stellar data, both of which clearly illustrate the relevance of allowing clusters to have different volumes.
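
To make the decomposition Σk = λk Dk Ak Dk′ concrete, here is a minimal NumPy sketch that splits a single covariance matrix into volume, orientation and shape, and then rebuilds it. The function names `decompose_covariance` and `compose_covariance` are hypothetical and are not taken from the paper. The sketch assumes a symmetric positive definite matrix and orders the eigenvalues from largest to smallest. Because Ak has determinant 1 and Dk is orthogonal, the volume follows as λk = |Σk|^(1/d), where d is the dimension.

```python
import numpy as np


def decompose_covariance(sigma):
    """Split a symmetric positive definite matrix into (lam, D, A).

    lam : scalar volume, equal to det(sigma) ** (1 / d)
    D   : orthogonal matrix of eigenvectors (orientation)
    A   : diagonal matrix with determinant 1 (shape)
    Illustrative helper only; the name is not from the paper.
    """
    sigma = np.asarray(sigma, dtype=float)
    d = sigma.shape[0]
    # eigh is intended for symmetric matrices; eigenvalues come back ascending.
    eigvals, eigvecs = np.linalg.eigh(sigma)
    order = np.argsort(eigvals)[::-1]  # reorder to descending
    eigvals = eigvals[order]
    D = eigvecs[:, order]
    # det(sigma) is the product of the eigenvalues, so lam is their geometric mean.
    lam = np.prod(eigvals) ** (1.0 / d)
    A = np.diag(eigvals / lam)
    return lam, D, A


def compose_covariance(lam, D, A):
    """Rebuild the covariance matrix as lam * D A D'."""
    return lam * D @ A @ D.T


if __name__ == "__main__":
    sigma = np.array([[2.0, 0.6],
                      [0.6, 1.0]])
    lam, D, A = decompose_covariance(sigma)
    print("volume:", lam)
    print("shape diagonal:", np.diag(A))
    print("reconstruction matches:", np.allclose(compose_covariance(lam, D, A), sigma))
```

Under this reading, the simplest model in the abstract corresponds to forcing Ak to the identity and λk to a common value for every cluster, while the most general model leaves λk, Dk and Ak free for each cluster.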

Keywords: Gaussian mixture, Eigenvalue decomposition, Cluster volumes

Article history: Received 5 October 1993, Revised 8 September 1994, Accepted 23 September 1994, Available online 7 June 2001.

Official paper URL: https://doi.org/10.1016/0031-3203(94)00125-6