Penalized regression-based bicluster localization

Authors:

Highlights:

Abstract

Biclustering (also called co-clustering or two-mode clustering) is a classical unsupervised learning method that has been applied in many different fields in recent years. Several types of biclustering methods have been developed, including probabilistic methods, two-way clustering methods, and variance minimization methods. To the best of our knowledge, however, few regression-based methods have been proposed. Such methods have been applied to traditional clustering, where they can improve both computational efficiency and clustering accuracy. In this paper, we present a penalized regression-based method for localizing biclusters (PRbiclust). By imposing Truncated LASSO Penalty (TLP) and group TLP terms to penalize the column vectors and the row vectors in the regression model, the method recovers the structure of biclusters in the data matrix. The model is formulated as an optimization problem with nonconvex penalties, and a computationally efficient algorithm is proposed to solve it. Convergence of the algorithm is proved. To extract the biclusters from the recovered data matrix, we propose a graph-based localization method. We also propose an evaluation criterion that measures the efficiency of bicluster localization in the presence of noise entries. We apply the proposed method to simulated datasets with different setups and to a real dataset. Experiments show that the method captures the bicluster structure well and outperforms existing methods.
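
The abstract does not state the penalty formulas, so the following is only a minimal sketch under an assumption: it uses the form of the truncated LASSO penalty commonly found in the literature, min(|x|, tau), and a group version that applies the same truncation to a Euclidean norm. The function names `tlp`, `tlp_dc_parts`, and `group_tlp` and the threshold `tau` are hypothetical and are not taken from the paper. The sketch shows why such a penalty suits difference of convex (DC) programming: it can be written as the difference of two convex functions.

```python
import numpy as np


def tlp(x, tau):
    """Truncated LASSO penalty in its assumed form: min(|x|, tau), elementwise."""
    return np.minimum(np.abs(x), tau)


def tlp_dc_parts(x, tau):
    """Return convex functions g and h evaluated at x such that TLP = g - h.

    g(x) = |x| and h(x) = max(|x| - tau, 0) are both convex,
    which is the decomposition a DC scheme would linearize.
    """
    g = np.abs(x)
    h = np.maximum(np.abs(x) - tau, 0.0)
    return g, h


def group_tlp(v, tau):
    """Group TLP in its assumed form: min(||v||_2, tau) for a whole vector v."""
    return min(float(np.linalg.norm(v)), tau)


# Small values are penalized like the LASSO; large values are capped at tau.
x = np.array([-3.0, -0.5, 0.0, 0.2, 2.0])
penalty = tlp(x, tau=1.0)
g, h = tlp_dc_parts(x, tau=1.0)
```

Because the penalty stops growing once |x| exceeds `tau`, large differences are not shrunk further, which is the usual motivation for preferring it over a plain LASSO penalty.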

Keywords: Biclustering, Penalized regression-based model, Alternating direction method of multipliers (ADMM), Difference of convex (DC) programming

Review history: Received 2 December 2019, Revised 1 February 2021, Accepted 31 March 2021, Available online 20 April 2021, Version of Record 4 May 2021.

Official paper URL: https://doi.org/10.1016/j.patcog.2021.107984