Performing word sense disambiguation at the border between unsupervised and knowledge-based techniques

Authors: Florentina Hristea, Marius Popescu, Monica Dumitrescu

Abstract

This paper gives a full presentation of a word sense disambiguation method that was introduced in Hristea and Popescu (Fundam Inform 91(3–4):547–562, 2009) and has so far been tested on adjectives (Hristea and Popescu in Fundam Inform 91(3–4):547–562, 2009) and verbs (Hristea in Int Rev Comput Softw 4(1):58–67, 2009). We extend the method to nouns and draw conclusions about its performance across all three parts of speech. The method lies at the border between unsupervised and knowledge-based techniques: it performs unsupervised word sense disambiguation with an underlying Naïve Bayes model, while using WordNet as the knowledge source for feature selection. Its performance is compared with that of previous approaches that rely on completely different feature sets. Test results for all three parts of speech show that feature selection using a WordNet-type knowledge source is more effective for disambiguation than local-type features (such as part-of-speech tags).
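The general idea of unsupervised disambiguation with a Naïve Bayes model restricted to a selected feature vocabulary can be illustrated with a small sketch. This is not the authors' implementation: the function name `em_naive_bayes`, the toy contexts, and the hand-written vocabulary (a stand-in for a feature set that would be derived from WordNet) are all assumptions made for illustration, and the parameters are estimated with a plain EM loop.

```python
import math
import random


def em_naive_bayes(contexts, vocabulary, n_senses, n_iter=50, seed=0):
    """Soft-cluster contexts into n_senses using EM on a Naive Bayes mixture.

    Only words found in `vocabulary` are used as features (hypothetical
    stand-in for WordNet-based feature selection).
    Returns one posterior distribution over senses per context.
    """
    rng = random.Random(seed)
    vocab = sorted(vocabulary)
    # Feature selection: drop every context word outside the vocabulary.
    docs = [[w for w in ctx if w in vocabulary] for ctx in contexts]

    # Random soft initialisation of the sense posteriors.
    post = []
    for _ in docs:
        row = [rng.random() + 1e-3 for _ in range(n_senses)]
        total = sum(row)
        post.append([r / total for r in row])

    for _ in range(n_iter):
        # M-step: re-estimate sense priors and word probabilities
        # from the current posteriors, with add-one smoothing.
        priors = [
            (sum(p[s] for p in post) + 1.0) / (len(docs) + n_senses)
            for s in range(n_senses)
        ]
        word_probs = []
        for s in range(n_senses):
            counts = {w: 1.0 for w in vocab}
            for doc, p in zip(docs, post):
                for w in doc:
                    counts[w] += p[s]
            total = sum(counts.values())
            word_probs.append({w: c / total for w, c in counts.items()})

        # E-step: recompute posteriors in log space for numerical stability.
        for i, doc in enumerate(docs):
            logs = [
                math.log(priors[s]) + sum(math.log(word_probs[s][w]) for w in doc)
                for s in range(n_senses)
            ]
            top = max(logs)
            exps = [math.exp(l - top) for l in logs]
            norm = sum(exps)
            post[i] = [e / norm for e in exps]
    return post


# Toy contexts of the ambiguous word "bank" (illustrative data only).
contexts = [
    ["the", "bank", "loan", "money"],
    ["a", "money", "bank", "loan"],
    ["bank", "deposit", "money", "account"],
    ["river", "bank", "water", "shore"],
    ["muddy", "bank", "river", "fishing"],
    ["shore", "water", "bank", "boat"],
]
# Hand-written stand-in for a WordNet-derived feature vocabulary.
wordnet_like_vocab = {
    "loan", "money", "deposit", "account",
    "river", "water", "shore", "fishing", "boat",
}
posteriors = em_naive_bayes(contexts, wordnet_like_vocab, n_senses=2)
```

Each row of `posteriors` is a distribution over the induced senses for one context. The induced senses are unlabeled clusters, so mapping them to dictionary senses is a separate step that this sketch does not cover.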

Keywords: Word sense disambiguation, Unsupervised disambiguation, Knowledge-based disambiguation, Bayesian classification, The EM algorithm, WordNet

Paper URL: https://doi.org/10.1007/s10462-009-9117-6