The use of kernel principal component analysis to model data distributions

Authors:

Abstract

We describe the use of kernel principal component analysis (KPCA) to model data distributions in high-dimensional spaces. We show that a previous approach to representing non-linear data constraints using KPCA is not generally valid, and introduce a new ‘proximity to data’ measure that behaves correctly. We investigate the relation between this measure and the actual density for various low-dimensional data distributions. We demonstrate the effectiveness of the method by applying it to the higher-dimensional case of modelling an ensemble of images of handwritten digits, showing how it can be used to extract the digit information from noisy input images.
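For a concrete starting point, the sketch below (plain NumPy) implements one common KPCA-based proximity quantity: the squared feature-space distance between a mapped test point and its projection onto the leading kernel principal components. This is an illustrative assumption, not necessarily the 'proximity to data' measure introduced in the paper; the Gaussian kernel, its width `sigma`, the component count, and the helper names `rbf_kernel`, `fit_kpca`, and `proximity` are all choices made for this example.

```python
# Illustrative sketch only: feature-space reconstruction error under kernel PCA,
# one common way to score "proximity to data". Kernel, sigma, and component
# count are assumed values, not taken from the paper.
import numpy as np


def rbf_kernel(A, B, sigma=1.0):
    """Gaussian (RBF) kernel matrix between rows of A and rows of B."""
    sq = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
    return np.exp(-sq / (2 * sigma**2))


def fit_kpca(X, n_components=5, sigma=1.0):
    """Fit kernel PCA: centre the kernel matrix and keep the leading eigenpairs."""
    n = X.shape[0]
    K = rbf_kernel(X, X, sigma)
    one = np.full((n, n), 1.0 / n)
    Kc = K - one @ K - K @ one + one @ K @ one        # centred kernel matrix
    lam, alpha = np.linalg.eigh(Kc)                   # ascending eigenvalues
    lam = lam[::-1][:n_components]
    alpha = alpha[:, ::-1][:, :n_components]
    alpha = alpha / np.sqrt(lam)                      # normalise so lam_i * ||alpha_i||^2 = 1
    return {"X": X, "K": K, "alpha": alpha, "sigma": sigma}


def proximity(model, Y):
    """Squared feature-space distance from phi(y) to the KPCA subspace.
    Smaller values mean the point lies closer to the modelled distribution."""
    X, K, alpha, sigma = model["X"], model["K"], model["alpha"], model["sigma"]
    k_xy = rbf_kernel(Y, X, sigma)                    # k(y, x_j) for each test point y
    mean_k = K.mean()                                 # (1/n^2) * sum_jl k(x_j, x_l)
    # centred test kernel rows k_c(y, x_j)
    k_c = k_xy - k_xy.mean(1, keepdims=True) - K.mean(0)[None, :] + mean_k
    beta = k_c @ alpha                                # projections onto the components
    # centred self-similarity k_c(y, y); for the RBF kernel k(y, y) = 1
    k_yy = 1.0 - 2.0 * k_xy.mean(1) + mean_k
    return k_yy - np.sum(beta**2, axis=1)


# Toy usage: points near the training ring score lower than a distant outlier.
rng = np.random.default_rng(0)
theta = rng.uniform(0, 2 * np.pi, 200)
X = np.c_[np.cos(theta), np.sin(theta)] + 0.05 * rng.standard_normal((200, 2))
model = fit_kpca(X, n_components=8, sigma=0.5)
print(proximity(model, np.array([[1.0, 0.0], [3.0, 3.0]])))
```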

Keywords: Feature extraction, Principal components, Kernel functions, Density estimation, Support vector machines, Kernel PCA

Article history: Received 3 May 2001, Accepted 15 January 2002, Available online 17 February 2006.

DOI: https://doi.org/10.1016/S0031-3203(02)00051-1