On-line legal aid: Markov chain model for efficient retrieval of legal documents

Authors:

Abstract

It is widely accepted that effective data clustering is the key to good performance in large databases. In any large document database, clustering is essential for efficient searching and browsing, and therefore for retrieval. Cluster analysis identifies groups, or clusters, of similar objects in multi-dimensional space [1]. Conventional document retrieval systems match a query against individual documents, whereas a clustered search compares the query with clusters of documents, thereby achieving more efficient retrieval. Because document databases are dynamic, most require their clusters to be updated periodically. Experimental evidence, however, shows that clustered searches are substantially less effective than conventional searches of the corresponding non-clustered documents. In this paper, we investigate the present clustering criteria and their drawbacks. We propose a new approach to clustering and explain why it should be tested and, if proved beneficial, adopted.
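The contrast between conventional and clustered search can be illustrated with a minimal sketch. It is not the method proposed in the paper: the term-weight vectors, document names, cluster assignments, cosine similarity measure and centroid representation below are all assumptions chosen for illustration. The sketch counts how many similarity comparisons each strategy performs.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length term-weight vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def flat_search(query, docs):
    """Conventional search: compare the query with every document."""
    best = max(docs, key=lambda name: cosine(query, docs[name]))
    return best, len(docs)  # (best match, number of comparisons)

def clustered_search(query, docs, clusters):
    """Clustered search: choose the closest cluster, then search only inside it."""
    centroids = {
        cid: centroid([docs[name] for name in names])
        for cid, names in clusters.items()
    }
    best_cluster = max(centroids, key=lambda cid: cosine(query, centroids[cid]))
    members = clusters[best_cluster]
    best = max(members, key=lambda name: cosine(query, docs[name]))
    return best, len(clusters) + len(members)

# Hypothetical term-weight vectors for six legal documents (illustrative only).
docs = {
    "contract_a": [3, 0, 1, 0],
    "contract_b": [2, 1, 0, 0],
    "contract_c": [3, 1, 0, 0],
    "tort_a": [0, 3, 0, 1],
    "tort_b": [0, 2, 1, 1],
    "tort_c": [1, 3, 0, 0],
}
# Hypothetical, hand-assigned clusters (no clustering algorithm is run here).
clusters = {
    "contracts": ["contract_a", "contract_b", "contract_c"],
    "torts": ["tort_a", "tort_b", "tort_c"],
}
query = [2, 0, 1, 0]

flat_result = flat_search(query, docs)
clustered_result = clustered_search(query, docs, clusters)
print(flat_result, clustered_result)
```

In this toy setting both strategies return the same document, and the clustered search needs fewer comparisons. A relevant document that lies outside the selected cluster would never be examined, which is one way a clustered search can be less effective than a conventional one.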

Keywords: Document management, Document retrieval, Clustering, Optical disks, Markov chain

Article history: Received 9 April 1997, Revised 6 October 1997, Accepted 12 November 1997, Available online 5 January 1999.

Official paper URL: https://doi.org/10.1016/S0262-8856(98)00061-4