Computationally Efficient Approximation of a Probabilistic Model for Document Representation in the WEBSOM Full-Text Analysis Method
作者:S. Kaski
摘要
WEBSOM is a recently developed neural method for exploring full-text document collections, for information retrieval, and for information filtering. In WEBSOM the full-text documents are encoded as vectors in a document space somewhat like in earlier information retrieval methods, but in WEBSOM the document space is formed in an unsupervised manner using the Self-Organizing Map algorithm. In this article the document representations the WEBSOM creates are shown to be computationally efficient approximations of the results of a certain probabilistic model. The probabilistic model incorporates information about the similarity of use of different words to take into account their semantic relations.
论文关键词:data mining, feature extraction, information retrieval, Self-Organizing Map (SOM), text analysis
论文评审过程:
论文官网地址:https://doi.org/10.1023/A:1009618125967