An explanation of the effectiveness of latent semantic indexing by means of a Bayesian regression model

Authors:

Highlights:

Abstract

Latent Semantic Indexing (LSI) is an effective automated method for determining whether a document is relevant to a reader, based on a few words or an abstract describing the reader's needs. A particular feature of LSI is its ability to deal automatically with synonyms. LSI is generally explained in terms of a mathematical concept called the Singular Value Decomposition and statistical methods such as factor analysis. This paper looks at LSI from a different perspective, comparing it to statistical regression and Bayesian methods. The relationships found can be useful in explaining the performance of LSI and in suggesting variations on the LSI approach.
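As an illustration of the conventional Singular Value Decomposition view mentioned in the abstract (not the Bayesian regression model that the paper itself develops), the following minimal sketch shows how a rank-reduced SVD can match a query to a document that shares no terms with it. The vocabulary, the toy term-document matrix, the choice of k = 2, and the helper names `lsi_fit`, `fold_in`, and `cosine` are all assumptions made for this example.

```python
import numpy as np

# Hypothetical toy corpus: rows are terms, columns are documents d0..d3.
terms = ["car", "automobile", "engine", "flower", "petal"]
A = np.array([
    [1, 0, 0, 0],  # car
    [0, 1, 0, 0],  # automobile
    [1, 1, 0, 0],  # engine
    [0, 0, 1, 1],  # flower
    [0, 0, 1, 0],  # petal
], dtype=float)


def cosine(u, v):
    """Cosine similarity, defined as 0.0 when either vector is zero."""
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    return float(u @ v / denom) if denom > 0 else 0.0


def lsi_fit(A, k):
    """Truncated SVD: keep the k largest singular triplets of A."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    # Rows of the third return value are documents in the k-dim latent space.
    return U[:, :k], s[:k], Vt[:k, :].T


def fold_in(q, U_k, s_k):
    """Project a term-space query into the latent space: S_k^-1 U_k^T q."""
    return (U_k.T @ q) / s_k


# Query containing only the term "car".
q = np.array([1, 0, 0, 0, 0], dtype=float)

U_k, s_k, docs_k = lsi_fit(A, k=2)
q_k = fold_in(q, U_k, s_k)

n_docs = A.shape[1]
raw_scores = [cosine(q, A[:, j]) for j in range(n_docs)]      # term space
lsi_scores = [cosine(q_k, docs_k[j]) for j in range(n_docs)]  # latent space

for j in range(n_docs):
    print(f"d{j}: raw={raw_scores[j]:.3f}  lsi={lsi_scores[j]:.3f}")
```

In this toy setup, document d1 contains "automobile" and "engine" but not "car", so its term-space similarity to the query is zero. After rank reduction, "car" and "automobile" fall on the same latent dimension because both co-occur with "engine", and d1 scores as highly as d0. This is the synonym behaviour the abstract refers to, shown only on assumed data.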

Keywords:

Article history: Available online 23 February 1999.

Official paper URL: https://doi.org/10.1016/0306-4573(95)00055-0