Generalized regression model for sequence matching and clustering
Authors: Hansheng Lei, Venu Govindaraju
Abstract
Linear relations have been found valuable for rule discovery in stock data, for example: if stock X goes up by a, stock Y will go down by b. Traditional linear regression models the linear relation between two sequences faithfully. However, if a user needs to cluster stocks into groups whose sequences are highly linear or similar to one another, comparing the sequences one pair at a time is prohibitively expensive. In this paper, we present the generalized regression model (GRM), which matches the linearity of multiple sequences at a time. GRM also gives strong heuristic support for graceful and efficient clustering. Experiments on stocks in the NASDAQ market efficiently mined interesting clusters of stock trends.
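The contrast between pairwise regression and a joint linearity check over several sequences can be illustrated with a minimal sketch. It is not the paper's GRM formulation, which the abstract does not spell out. It assumes that joint linearity can be approximated by the share of total variance captured by the leading eigenvalue of the sequences' covariance matrix, an assumption motivated only by the "Eigenvalue and eigenvector" keyword. The function names `pairwise_r2` and `joint_linearity` are hypothetical.

```python
import numpy as np

def pairwise_r2(x, y):
    """R^2 of the simple linear regression y ~ a*x + b (two sequences only)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    r = np.corrcoef(x, y)[0, 1]  # Pearson correlation
    return r * r                 # equals R^2 for simple regression with intercept

def joint_linearity(seqs):
    """Assumed proxy (not the paper's definition): fraction of total variance
    along the leading principal direction of k sequences taken together.
    A value of 1.0 means every sequence is an exact linear function of the others."""
    X = np.asarray(seqs, dtype=float)       # shape (k, n): k sequences of length n
    X = X - X.mean(axis=1, keepdims=True)   # center each sequence
    cov = X @ X.T / X.shape[1]              # k x k covariance matrix
    eigvals = np.linalg.eigvalsh(cov)       # ascending order, symmetric input
    return eigvals[-1] / eigvals.sum()

# Toy data: y is an exact linear function of x, w is an unrelated zigzag.
x = np.arange(10.0)
y = -2.0 * x + 3.0
w = np.array([1.0, -1.0] * 5)

print(pairwise_r2(x, y))           # close to 1: x and y are linearly related
print(joint_linearity([x, y]))     # close to 1: one direction explains everything
print(joint_linearity([x, y, w]))  # below 1: w does not fit the common line
```

A single eigen-decomposition scores all k sequences at once, whereas the pairwise route needs k(k-1)/2 regressions. Note that this proxy is scale-dependent; standardizing each sequence first (that is, using the correlation matrix) is one possible variation.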
Keywords: Similarity measure, Sequence matching, Sequence clustering, Generalized regression model, Eigenvalue and eigenvector
Paper URL: https://doi.org/10.1007/s10115-006-0008-8