Level search schemes for information filtering and retrieval
Authors:
Abstract
Latent semantic indexing (LSI) has been demonstrated to outperform lexical matching in information retrieval. However, the enormous cost associated with the singular value decomposition (SVD) of the large term-by-document matrix becomes a barrier for its application to scalable information retrieval. This work shows that information filtering using level search techniques can reduce the SVD computation cost for LSI. For each query, level search extracts a much smaller subset of the original term-by-document matrix, containing on average 27% of the original non-zero entries. When LSI is applied to such subsets, the average precision can degrade by as much as 23% due to level search filtering. However, for some document collections an increase in precision has also been observed. Further enhancement of level search can be based on a pruning scheme which deletes terms connected to only one document from the query-specific submatrix. Such pruning has achieved a 65% reduction (on average) in the number of non-zeros with a precision loss of 5% for most collections.
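The filtering and pruning steps described in the abstract can be illustrated with a small sketch. The abstract does not specify the algorithm, so the code below rests on an assumption: level search is modeled as a breadth-first expansion over the bipartite term-document graph, starting from the query terms and alternating between terms and documents. The names `level_search`, `submatrix`, `prune_singletons`, and the `max_levels` parameter are hypothetical and chosen only for this illustration. LSI (a truncated SVD) would then be applied to the reduced submatrix rather than to the full term-by-document matrix.

```python
from collections import defaultdict


def level_search(postings, query_terms, max_levels=2):
    """Breadth-first expansion from the query terms (assumed model).

    postings maps each term to the set of document ids containing it.
    Even levels move from terms to documents, odd levels from documents
    back to terms. Returns the sets of terms and documents reached.
    """
    # Invert the postings so documents can be expanded back to terms.
    doc_terms = defaultdict(set)
    for term, doc_ids in postings.items():
        for doc_id in doc_ids:
            doc_terms[doc_id].add(term)

    terms = {t for t in query_terms if t in postings}
    docs = set()
    frontier = set(terms)
    for level in range(max_levels):
        if level % 2 == 0:
            # Terms -> documents not seen so far.
            new = set().union(*(postings[t] for t in frontier)) - docs
            docs |= new
        else:
            # Documents -> terms not seen so far.
            new = set().union(*(doc_terms[d] for d in frontier)) - terms
            terms |= new
        frontier = new
        if not frontier:
            break
    return terms, docs


def submatrix(postings, terms, docs):
    """Keep only the non-zero entries (term, doc) inside the reached sets."""
    sub = {}
    for term in terms:
        kept = postings[term] & docs
        if kept:
            sub[term] = kept
    return sub


def prune_singletons(sub):
    """Drop terms connected to only one document in the submatrix."""
    return {term: doc_ids for term, doc_ids in sub.items() if len(doc_ids) > 1}


def nonzeros(matrix):
    """Count the non-zero entries of a term -> documents mapping."""
    return sum(len(doc_ids) for doc_ids in matrix.values())


# Toy collection (invented for illustration only).
postings = {
    "svd": {0, 1},
    "matrix": {1, 2},
    "query": {0},
    "graph": {3},
    "index": {2, 3},
}
terms, docs = level_search(postings, ["svd"], max_levels=2)
sub = submatrix(postings, terms, docs)
pruned = prune_singletons(sub)
print(nonzeros(postings), nonzeros(sub), nonzeros(pruned))
```

In this toy collection the full matrix has 8 non-zero entries, the query-specific submatrix has 4, and pruning leaves 2. These counts only show the mechanics and say nothing about the percentages reported in the abstract.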
Keywords: Filtering, Latent semantic indexing, Level search, Scalable information retrieval
Article history: Received 20 February 2000, Accepted 14 June 2000, Available online 5 February 2001.
Official article URL: https://doi.org/10.1016/S0306-4573(00)00032-7