Assessing the impact of Stemming Accuracy on Information Retrieval – A multilingual perspective

Authors:

Highlights:

• We evaluated the quality of many stemmers for English, French, Spanish, and Portuguese using both intrinsic and extrinsic metrics.

• We found that the two types of measures are correlated, but less strongly than one might have expected.

• In none of the languages was the most accurate stemmer the one that produced the largest improvement in Information Retrieval.
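
The comparison described in the highlights can be illustrated with a small rank-correlation sketch. This is not the authors' procedure: Spearman's rho is assumed here as one plausible way to compare an intrinsic score (stemming accuracy) with an extrinsic one (retrieval improvement), and the names `intrinsic` and `extrinsic` and every number in them are made-up placeholders, not results from the paper.

```python
from math import sqrt


def ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical placeholder scores for four stemmers (NOT from the paper).
intrinsic = [0.70, 0.80, 0.90, 0.95]  # assumed stemming accuracy
extrinsic = [0.02, 0.05, 0.04, 0.03]  # assumed gain in a retrieval metric

rho = spearman(intrinsic, extrinsic)
most_accurate = max(range(len(intrinsic)), key=lambda i: intrinsic[i])
best_for_retrieval = max(range(len(extrinsic)), key=lambda i: extrinsic[i])

print(f"Spearman rho: {rho:.2f}")
print("Most accurate stemmer is also best for retrieval:",
      most_accurate == best_for_retrieval)
```

With these placeholder values the correlation is positive but weak, and the most accurate stemmer is not the one with the largest retrieval gain, which is the kind of pattern the highlights report.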

Keywords: Stemming, Information Retrieval, Evaluation, Multilingual

Publication history: Available online 18 April 2016, Version of Record 22 July 2016.

Official paper URL: https://doi.org/10.1016/j.ipm.2016.03.004