Sparse Generalised Principal Component Analysis

Authors:

Highlights:

• A method for sparse feature extraction of exponential family data is presented.

• It extends the previous method of Generalised Principal Component Analysis.

• Applications to text data are performed using a simple Poisson model.

• Superior performance to current state-of-the-art methodology is shown on synthetic data.

• Performance on a healthcare dataset is on par with state-of-the-art methodology.

Keywords: Dimension reduction, PCA, Text mining, Exponential family

Review history: Received 3 August 2017, Revised 24 May 2018, Accepted 15 June 2018, Available online 18 June 2018, Version of Record 26 June 2018.

Official paper URL: https://doi.org/10.1016/j.patcog.2018.06.014