Concept-based ranking: a case study in the juridical domain

Abstract

We explore the idea of automatically using the concepts of a thesaurus to improve search results. Our focus is on improving average precision, not recall. This is of interest because thesauri are commonly regarded as recall-enhancing devices that do little to improve precision.

In our approach, the query terms are used to match concepts in the thesaurus. These concepts are then used to find other related concepts (narrower, broader, and synonym concepts), which are interpreted as independent sources of evidential knowledge. Each source of evidence is used to produce a separate concept-based ranking of the documents in the collection. These partial rankings are then combined into a final ranking by means of a Bayesian belief network.

To validate our ideas, we conduct a case study in the juridical domain. Using a juridical thesaurus and a test collection of more than 500 thousand juridical documents, obtained from the legal court system in Brazil, we compare our concept-based ranking with the standard vectorial ranking. The results indicate improvements in average precision of roughly 30%.
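The pipeline described above (match query terms to thesaurus concepts, expand to related concepts, rank once per evidence source, then combine) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy thesaurus `THESAURUS`, the three documents in `DOCS`, the raw term-count cosine score, and in particular the noisy-OR rule used to merge the partial rankings. The abstract does not specify the topology or the conditional probabilities of the belief network, so the noisy-OR rule is only a simple stand-in for a disjunctive combination of independent evidence, not the authors' model.

```python
from collections import Counter
from math import sqrt

# Hypothetical toy thesaurus: concept -> related concepts by relation type.
THESAURUS = {
    "theft": {"synonym": ["larceny"], "narrow": ["burglary"], "broad": ["crime"]},
}

# Hypothetical miniature collection standing in for the juridical documents.
DOCS = {
    "d1": "the court ruled on a larceny charge",
    "d2": "burglary is a crime against property",
    "d3": "contract dispute over property lease",
}

def cosine(terms, doc_text):
    """Cosine similarity between raw term-count vectors (no idf weighting)."""
    q = Counter(terms)
    d = Counter(doc_text.split())
    dot = sum(q[t] * d[t] for t in q)
    norm = sqrt(sum(v * v for v in q.values())) * sqrt(sum(v * v for v in d.values()))
    return dot / norm if norm else 0.0

def evidence_sources(query_terms):
    """Build one term list per evidence source: the query itself plus each relation."""
    sources = {"original": list(query_terms)}
    for relation in ("synonym", "narrow", "broad"):
        terms = []
        for t in query_terms:
            terms.extend(THESAURUS.get(t, {}).get(relation, []))
        sources[relation] = terms
    return sources

def combined_ranking(query_terms):
    """Score each document per source, then merge the scores with a noisy-OR."""
    sources = evidence_sources(query_terms)
    scores = {}
    for doc_id, text in DOCS.items():
        miss = 1.0  # probability that no evidence source supports the document
        for terms in sources.values():
            miss *= 1.0 - cosine(terms, text)
        scores[doc_id] = 1.0 - miss
    ranking = sorted(scores, key=scores.get, reverse=True)
    return ranking, scores

ranking, scores = combined_ranking(["theft"])
print(ranking)
```

In this toy setting the query term "theft" appears in none of the documents, so a purely term-based ranking scores all three at zero, while the concept-based sources still bring up the documents that mention "larceny", "burglary", or "crime". A document supported by several sources gets a higher combined score than one supported by a single source.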

Keywords: Concept-based ranking, Bayesian networks, Juridical domain, Thesaurus

Article history: Available online 2 June 2004.

Paper URL: https://doi.org/10.1016/j.ipm.2004.04.015