Experiments in term weighting for novelty mining

Abstract

Obtaining new information quickly is becoming crucial in today’s economy. A wealth of information, both offline and online, is easily acquired, which exacerbates the problem of information overload. Novelty mining detects documents or sentences that contain novel information and presents those results directly to users (Tang, Tsai, & Chen, 2010). Many methods and algorithms for novelty mining have been studied, but none has compared and discussed the impact of term weighting on the evaluation measures. This paper reports experiments conducted to recommend the best term weighting function for both document-level and sentence-level novelty mining.

Keywords: Novelty mining, Novelty detection, Term weighting, Binary, Term frequency, Inverse document frequency, Threshold, Novelty dataset

Review history: Available online 4 May 2011.

Official URL: https://doi.org/10.1016/j.eswa.2011.04.218