Balancing between over-weighting and under-weighting in supervised term weighting

Authors:

Highlights:

• Show the importance of the trade-off between over-weighting and under-weighting.

• Propose a revision of add-one smoothing on delta smoothed idf (dsidf).

• Present three regularization techniques to reduce over-weighting.

• Propose a new supervised term weighting scheme, regularized entropy (re).


Keywords: Text categorization, Supervised term weighting, Over-weighting

Review history: Received 20 June 2015, Revised 30 September 2016, Accepted 20 October 2016, Available online 2 November 2016, Version of Record 19 January 2017.

Official paper URL: https://doi.org/10.1016/j.ipm.2016.10.003