Term discrimination for text search tasks derived from negative binomial distribution

Authors:

Highlights:

• A new collection frequency weighting scheme derived from the negative binomial distribution model of term occurrences is proposed.

• A factorial experiment is designed to assess the overall performance of the various term discrimination methods.

• Our proposed term discrimination method offers a significant gain in accuracy compared to the IDF and RIDF schemes.
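
For orientation, the two baselines named in the highlights can be sketched as follows. This is a minimal illustration that assumes the commonly used definitions: IDF as the negative base-2 log of the fraction of documents containing the term, and residual IDF (RIDF) as the observed IDF minus the IDF predicted by a Poisson model of term occurrences. The function and parameter names (`idf`, `ridf`, `df`, `cf`, `n_docs`) are illustrative choices, and the negative binomial scheme proposed in the paper is not reproduced here.

```python
import math


def idf(df, n_docs):
    """Inverse document frequency: -log2 of the fraction of documents
    that contain the term (df = document frequency)."""
    return -math.log2(df / n_docs)


def ridf(df, cf, n_docs):
    """Residual IDF: observed IDF minus the IDF expected under a Poisson
    model with rate cf / n_docs (cf = total occurrences in the collection)."""
    rate = cf / n_docs
    # Under a Poisson model, P(term appears at least once) = 1 - exp(-rate).
    expected_idf = -math.log2(1.0 - math.exp(-rate))
    return idf(df, n_docs) - expected_idf


if __name__ == "__main__":
    # Hypothetical counts: two terms with the same collection frequency,
    # one concentrated in few documents, one spread across many.
    print(ridf(df=10, cf=100, n_docs=1000))  # concentrated ("bursty") term
    print(ridf(df=95, cf=100, n_docs=1000))  # evenly spread term
```

Under these definitions, a term whose occurrences cluster in few documents receives a larger RIDF than one spread evenly, since the Poisson model predicts it should appear in more documents than it does.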


Keywords: Term discrimination, Poisson–Gamma mixtures, Negative binomial model, Text search tasks

Review history: Received 14 October 2015, Revised 29 March 2017, Accepted 5 January 2018, Available online 30 January 2018, Version of Record 30 January 2018.

Official paper URL: https://doi.org/10.1016/j.ipm.2018.01.003