A semantic approach for text clustering using WordNet and lexical chains

Authors:

Highlights:

• A modified WordNet-based similarity measure for word sense disambiguation.

• Lexical chains as a text representation that ideally covers the theme of texts.

• The extracted core semantics are sufficient to reduce the dimensionality of the feature set.

• The proposed scheme is able to correctly estimate the true number of clusters.

• The topic labels are good indicators for recognizing and understanding the clusters.
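
The highlights name lexical chains as the text representation but do not spell out how the chains are built. The following is only a minimal sketch of a generic greedy chaining pass, not the paper's algorithm. The `RELATED` table is a hand-written, hypothetical stand-in for WordNet relations, and `related`, `build_chains`, `core_features`, and the `min_len` threshold are illustrative names and choices that do not come from the source.

```python
from typing import FrozenSet, List, Set

# Hypothetical stand-in for WordNet relations: unordered pairs of words
# that this toy example treats as semantically related.
RELATED: Set[FrozenSet[str]] = {
    frozenset({"car", "vehicle"}),
    frozenset({"car", "engine"}),
    frozenset({"bank", "money"}),
    frozenset({"money", "loan"}),
}


def related(a: str, b: str) -> bool:
    """Two words are related if they are identical or listed in RELATED."""
    return a == b or frozenset({a, b}) in RELATED


def build_chains(words: List[str]) -> List[List[str]]:
    """Greedily append each word to the first chain holding a related word."""
    chains: List[List[str]] = []
    for word in words:
        for chain in chains:
            if any(related(word, member) for member in chain):
                chain.append(word)
                break
        else:
            # No existing chain accepts the word, so it starts a new chain.
            chains.append([word])
    return chains


def core_features(chains: List[List[str]], min_len: int = 2) -> List[str]:
    """Keep only words from chains with at least min_len members (assumed cutoff)."""
    return [w for chain in chains if len(chain) >= min_len for w in chain]


words = ["car", "bank", "engine", "money", "vehicle", "loan", "tree"]
chains = build_chains(words)
features = core_features(chains)
```

In this toy run, the isolated word `tree` forms a singleton chain and is dropped by `core_features`, which illustrates how keeping only the stronger chains could shrink the feature set. A real system would replace `RELATED` with lookups against WordNet after word sense disambiguation.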


Keywords: Text clustering, WordNet, Lexical chains, Core semantic features

Review history: Available online 18 October 2014.

Official paper URL: https://doi.org/10.1016/j.eswa.2014.10.023