Automatic thesaurus construction using Bayesian networks
Authors:
Abstract
Automatic thesaurus construction extracts term relations mechanically. A popular method uses statistical analysis to discover these relations. For low-frequency terms, however, the statistics are too unreliable to decide how the terms are related; this is generally referred to as the data-sparseness problem. Unfortunately, many studies have shown that low-frequency terms are the most useful ones in thesaurus construction. This paper characterizes the statistical behavior of terms by using an inference network and develops a formal approach to the data-sparseness problem, which is crucial in constructing a thesaurus. Experiments show the validity of this approach.
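To make the data-sparseness problem concrete, the following minimal sketch scores term pairs with a co-occurrence statistic over a toy document collection. It is an illustration only: the Dice coefficient, the function name `dice`, and the five example documents are all assumptions chosen for this sketch, and none of them is the inference-network approach the paper develops.

```python
def dice(term_a, term_b, docs):
    """Dice coefficient of two terms, from document-level co-occurrence counts."""
    df_a = sum(1 for d in docs if term_a in d)   # documents containing term_a
    df_b = sum(1 for d in docs if term_b in d)   # documents containing term_b
    both = sum(1 for d in docs if term_a in d and term_b in d)
    if df_a + df_b == 0:
        return 0.0
    return 2.0 * both / (df_a + df_b)

# Hypothetical toy collection: each document is a set of index terms.
docs = [
    {"network", "inference", "term"},
    {"network", "inference", "retrieval"},
    {"network", "term", "retrieval"},
    {"inference", "term", "thesaurus"},
    {"network", "inference"},
]

# "network" and "inference" each occur in four documents.
frequent = dice("network", "inference", docs)

# "thesaurus" occurs in a single document.
rare = dice("thesaurus", "term", docs)

# Drop the one document that mentions "thesaurus" and re-estimate.
docs_without = docs[:3] + docs[4:]
frequent_without = dice("network", "inference", docs_without)
rare_without = dice("thesaurus", "term", docs_without)

print(frequent, frequent_without)
print(rare, rare_without)
```

In this toy collection, removing one document shifts the score of the frequent pair only moderately, while the score of the pair involving the low-frequency term collapses to zero, because that score rested on a single observation.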
Keywords:
Review history: Received 15 December 1995, Accepted 1 March 1996, Available online 19 February 1999.
Official paper URL: https://doi.org/10.1016/0306-4573(96)00026-X