Natural language information retrieval

Authors:

Highlights:

Abstract

In this paper we describe an information retrieval system in which advanced natural language processing techniques are used to enhance the effectiveness of term-based document retrieval. The backbone of our system is a traditional statistical engine that builds inverted index files from preprocessed documents, and then searches and ranks the documents in response to user queries. Natural language processing is used to (a) preprocess the documents in order to extract content-carrying terms, (b) discover inter-term dependencies and build a conceptual hierarchy specific to the database domain, and (c) process the user's natural language requests into effective search queries. Over the course of the Text REtrieval Conferences TREC-1 and TREC-2, our system has evolved from a scaled-up prototype, originally tested on such collections as CACM-3204 and Cranfield, to its present form, which can effectively process hundreds of millions of words of unrestricted text.
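
As an illustration of the statistical backbone described above, the following minimal sketch builds an inverted index from documents that have already been reduced to terms, then ranks documents for a query. The abstract does not specify the weighting scheme, so the tf-idf scoring used here is an assumption, as are the function names `build_index` and `search` and the toy documents; this is not the system's actual implementation.

```python
import math
from collections import Counter, defaultdict


def build_index(docs):
    """Build an inverted index: term -> {doc_id: term frequency}.

    `docs` maps a document id to its list of already extracted terms
    (the NLP preprocessing step is assumed to have run beforehand).
    """
    index = defaultdict(dict)
    for doc_id, terms in docs.items():
        for term, tf in Counter(terms).items():
            index[term][doc_id] = tf
    return index


def search(index, n_docs, query_terms):
    """Rank documents by summed tf-idf over the query terms (assumed scheme)."""
    scores = defaultdict(float)
    for term in query_terms:
        postings = index.get(term, {})
        if not postings:
            continue  # term does not occur in the collection
        idf = math.log(n_docs / len(postings))
        for doc_id, tf in postings.items():
            scores[doc_id] += tf * idf
    # Highest score first; ties broken by document id for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical toy collection, for illustration only.
docs = {
    "d1": ["information", "retrieval", "system"],
    "d2": ["natural", "language", "processing"],
    "d3": ["natural", "language", "information", "retrieval"],
}
index = build_index(docs)
ranked = search(index, len(docs), ["natural", "language", "retrieval"])
for doc_id, score in ranked:
    print(doc_id, round(score, 3))
```

In this sketch the query terms stand in for the output of step (c), and the term lists stand in for the output of step (a); the inter-term dependencies and conceptual hierarchy of step (b) are not modelled.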

Keywords:

Review history: Available online 21 February 2000.

Official paper URL: https://doi.org/10.1016/0306-4573(94)00055-8