An efficient approach for measuring semantic relatedness using Wikipedia bidirectional links

Authors: Xinhua Zhu, Qingsong Guo, Bo Zhang, Fei Li

Abstract

Measuring the semantic relatedness between concepts is a fundamental research topic in natural language processing. Among Wikipedia-based measures, the link-based model is the most promising, because the manually defined links in Wikipedia are refined and close to human semantics. This paper proposes a Wikipedia two-way link model to extend the existing Wikipedia one-way out-link model, which has a low dimension and high efficiency and is easy to implement and reproduce. First, the model combines the out-links and in-links of a concept in Wikipedia into a bidirectional link vector that serves as the concept's semantic interpreter, and it uses a TF*IDF-based bidirectional weighting method to uniformly calculate the strength of the mutual association between a given concept and each of its out-link or in-link concepts. Second, we propose a disambiguation strategy based on the social awareness of senses: it takes the out-links of a disambiguation page in the order in which they occur on that page and uses an adjustable threshold to determine how many senses are selected. Moreover, we propose new vector similarity metrics based on the logarithm and the exponent to improve the overall performance of semantic relatedness measurements based on Wikipedia links. Experimental results on several well-recognized datasets demonstrate that our model surpasses the popular Naïve Explicit Semantic Analysis (Naïve-ESA) and Wikipedia Out-Link vector-based Measure (WOLM) methods on current Wikipedia versions, and that our bidirectional link model significantly improves the performance of the existing one-way link model in practical applications.
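
The idea of a bidirectional link vector can be illustrated with a minimal sketch. The abstract does not give the weighting formula or the proposed logarithm- and exponent-based metrics, so everything specific here is an assumption made for illustration: the toy link graph, the function names, the TF*IDF-style weight (link directions present times a log inverse of the neighbour's degree), and the use of plain cosine similarity as a stand-in for the paper's metrics.

```python
import math
from collections import defaultdict

# Hypothetical toy link graph: page -> pages it links to (its out-links).
OUT_LINKS = {
    "Car": ["Engine", "Wheel", "Road"],
    "Truck": ["Engine", "Wheel", "Cargo"],
    "Engine": ["Car", "Truck", "Fuel"],
    "Wheel": ["Car"],
    "Road": ["Car", "Truck"],
    "Cargo": ["Truck"],
    "Fuel": ["Engine"],
    "Banana": ["Fruit"],
    "Fruit": ["Banana"],
}


def build_in_links(out_links):
    """Invert the out-link map to obtain the in-links of every page."""
    in_links = defaultdict(list)
    for source, targets in out_links.items():
        for target in targets:
            in_links[target].append(source)
    return in_links


def bidirectional_vector(concept, out_links, in_links):
    """Sparse vector over all pages linked to or from `concept`.

    The weight is an assumed TF*IDF-style score, not the paper's formula:
    tf  = number of directions in which the link exists (1 or 2),
    idf = log(1 + N / degree), so heavily linked neighbours count less.
    """
    n_pages = len(out_links)
    outs = set(out_links.get(concept, []))
    ins = set(in_links.get(concept, []))
    vector = {}
    for neighbour in outs | ins:
        tf = (neighbour in outs) + (neighbour in ins)
        degree = len(set(out_links.get(neighbour, [])) | set(in_links.get(neighbour, [])))
        vector[neighbour] = tf * math.log(1 + n_pages / degree)
    return vector


def cosine(u, v):
    """Plain cosine similarity between two sparse vectors (dicts)."""
    dot = sum(weight * v[key] for key, weight in u.items() if key in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


IN_LINKS = build_in_links(OUT_LINKS)


def relatedness(a, b):
    """Relatedness of two concepts under the assumptions stated above."""
    return cosine(
        bidirectional_vector(a, OUT_LINKS, IN_LINKS),
        bidirectional_vector(b, OUT_LINKS, IN_LINKS),
    )
```

In this toy graph, `relatedness("Car", "Truck")` is positive because the two pages share linked neighbours, while `relatedness("Car", "Banana")` is zero because they share none. A one-way out-link model would be obtained by dropping `ins` from the vector construction.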

Keywords: Semantic relatedness, Link vector, Vector similarity metric, Disambiguation, Wikipedia

Review process:

Paper URL: https://doi.org/10.1007/s10489-019-01452-1