Combining Specialized Word Embeddings and Subword Semantic Features for Lexical Entailment Recognition

Authors:

Highlights:

Abstract

Lexical Entailment Recognition (LER) aims to identify the is-a relation between words. The problem has recently attracted attention in natural language processing because of its applications to a variety of downstream tasks. However, almost all prior studies have focused on datasets that contain only single words, so handling compound words effectively remains a challenge. In this study, we propose a novel method called LERC (Lexical Entailment Recognition Combination) that addresses this problem by combining embedding representations with subword semantic features. First, a word embedding model specialized for the LER tasks is trained. Second, the subword semantic information of each word pair is used to compute an additional feature vector, which is combined with the embedding vectors for supervised classification. We consider three LER tasks: Lexical Entailment Detection, Lexical Entailment Directionality, and Lexical Entailment Determination. Experimental results on several benchmark datasets in English and Vietnamese demonstrate that the subword semantic feature is useful for these tasks. Moreover, LERC outperforms several recently published methods.
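
The combination step described in the abstract (embedding vectors joined with a subword feature vector before supervised classification) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration, since the abstract does not define them: the toy vectors, the averaging of token vectors for compound terms, the three subword features, and the helper names `embed`, `subword_features`, and `pair_features`. It is not the authors' implementation.

```python
# Illustrative sketch only. The embeddings and subword features below are
# hypothetical stand-ins, not the ones defined in the paper.

def embed(term, vectors, dim):
    """Average the vectors of a term's tokens (assumed strategy for compounds).
    Tokens missing from `vectors` contribute a zero vector."""
    tokens = term.lower().split()
    total = [0.0] * dim
    for tok in tokens:
        vec = vectors.get(tok, [0.0] * dim)
        total = [a + b for a, b in zip(total, vec)]
    return [x / len(tokens) for x in total]


def subword_features(hypo, hyper):
    """Toy subword features for a candidate (hyponym, hypernym) pair."""
    hypo_tokens = hypo.lower().split()
    hyper_tokens = hyper.lower().split()
    shared = set(hypo_tokens) & set(hyper_tokens)
    return [
        # 1.0 if both terms end in the same token (shared head word)
        1.0 if hypo_tokens[-1] == hyper_tokens[-1] else 0.0,
        # 1.0 if the second term is a token-level suffix of the first
        1.0 if hypo_tokens[-len(hyper_tokens):] == hyper_tokens else 0.0,
        # fraction of the first term's distinct tokens found in the second
        len(shared) / len(set(hypo_tokens)),
    ]


def pair_features(hypo, hyper, vectors, dim):
    """Concatenate both term embeddings with the subword feature vector."""
    return embed(hypo, vectors, dim) + embed(hyper, vectors, dim) + subword_features(hypo, hyper)


# Hypothetical 2-dimensional embeddings, for demonstration only.
TOY_VECTORS = {"guide": [1.0, 0.0], "dog": [0.0, 1.0], "cat": [0.5, 0.5]}

features = pair_features("guide dog", "dog", TOY_VECTORS, 2)
print(features)
```

The resulting list is what a supervised classifier would receive as input for one word pair; the choice of classifier is left open here because the abstract does not name one.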

Keywords: Lexical entailment, Hypernymy detection, Taxonomic relation, Lexical entailment recognition

Article history: Received 5 February 2022, Revised 18 June 2022, Accepted 14 August 2022, Available online 24 August 2022, Version of Record 12 September 2022.

Official paper URL: https://doi.org/10.1016/j.datak.2022.102077