Learning bag-of-embedded-words representations for textual information retrieval
Authors:
Highlights:
• A novel BoF-based model is proposed for efficiently representing text documents.
• A weighting mask (similar to the traditional BoW weighting schemes) is learned.
• The bag-of-embedded-words (BoEW) model is optimized end-to-end (from the word embeddings to the weighting mask).
• The learned representation can be efficiently fine-tuned using relevance feedback.
• The proposed method is evaluated using three text collections from different domains.
Keywords: Word embeddings, Bag-of-words, Bag-of-features, Dictionary learning, Relevance feedback, Information retrieval
Review history: Received 23 October 2017, Revised 9 February 2018, Accepted 8 April 2018, Available online 10 April 2018, Version of Record 18 April 2018.
Official paper URL: https://doi.org/10.1016/j.patcog.2018.04.008