Turning from TF-IDF to TF-IGM for term weighting in text classification
Authors:
Highlights:
• A new supervised term weighting scheme called TF-IGM is proposed.
• It adopts a new statistical model to measure a term's class distinguishing power.
• It makes full use of the fine-grained term distribution across different classes.
• It is adaptive to different text datasets by providing options or parameters.
• It outperforms TF-IDF and state-of-the-art supervised term weighting schemes.
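To make the highlights concrete, here is a minimal sketch of how an inverse-gravity-moment weight might be computed. The formula is not given in this listing. It is recalled from the published paper and should be treated as an assumption: a term's per-class frequencies are sorted in descending order, IGM is the largest frequency divided by the rank-weighted sum of all of them, and the final weight is tf * (1 + lambda * IGM). The function names and the default lambda = 7.0 are illustrative choices, not taken from this page.

```python
def igm(class_freqs):
    """Inverse gravity moment of one term's per-class frequencies.

    Assumed form: f_1 / sum(f_r * r), with frequencies ranked in
    descending order and r starting at 1.
    """
    ranked = sorted(class_freqs, reverse=True)
    moment = sum(f * r for r, f in enumerate(ranked, start=1))
    # A term that never occurs has no distinguishing power.
    return ranked[0] / moment if moment else 0.0


def tf_igm(tf, class_freqs, lam=7.0):
    """Assumed TF-IGM weight: tf * (1 + lam * igm); lam is a tunable parameter."""
    return tf * (1.0 + lam * igm(class_freqs))


# A term concentrated in one class gets the maximum IGM of 1.0,
# while a term spread evenly over three classes gets 1/6.
concentrated = igm([10, 0, 0])
uniform = igm([5, 5, 5])
```

Under these assumptions, a term that occurs in only one class has its term frequency scaled by 1 + lambda, while a term spread evenly over many classes stays close to its raw term frequency. That matches the "class distinguishing power" idea in the highlights.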
Keywords: Term weighting, Text classification, Inverse gravity moment (IGM), Class distinguishing power, Classifier
Review history: Received 26 April 2016, Revised 9 August 2016, Accepted 5 September 2016, Available online 9 September 2016, Version of Record 17 September 2016.
Official paper URL: https://doi.org/10.1016/j.eswa.2016.09.009