Real-time Lexicon-free Scene Text Retrieval
Authors:
Highlights:
• An improved method achieves state-of-the-art performance in image-based text retrieval.
• Results are presented for different CNN backbones, each modified to predict a PHOC for every detected text instance.
• The effect of different PHOC dimensions is explored and analyzed.
• The PHOC embedding allows retrieval of out-of-vocabulary words unseen at training time.
• The proposed method achieves state-of-the-art results on a multilingual dataset whose samples were unseen at training time.
• The method is faster than the previous state of the art, allowing real-time retrieval in videos.
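To make the PHOC (Pyramidal Histogram of Characters) idea in the highlights concrete, here is a minimal sketch of how a query string could be turned into a binary PHOC vector and compared against candidates. It is an illustration only, not the paper's implementation: the alphabet, the pyramid levels `(1, 2, 3)`, the 50% overlap rule, and the helper names `phoc`, `cosine`, and `rank` are all assumptions, and the candidate vectors are computed from strings as a stand-in for the PHOCs a CNN would predict from image regions.

```python
import math
import string

# Assumed alphabet (36 symbols) and pyramid levels; both are illustrative choices.
ALPHABET = string.ascii_lowercase + string.digits
LEVELS = (1, 2, 3)


def phoc(word, levels=LEVELS, alphabet=ALPHABET):
    """Return a binary PHOC vector of length len(alphabet) * sum(levels)."""
    word = word.lower()
    n = len(word)
    vec = []
    for level in levels:
        for region in range(level):
            hist = [0] * len(alphabet)
            for k, ch in enumerate(word):
                if ch not in alphabet:
                    continue
                # Integer arithmetic in units of 1 / (n * level) avoids float ties.
                # Character k spans [k*level, (k+1)*level]; the region spans
                # [region*n, (region+1)*n].
                overlap = min((k + 1) * level, (region + 1) * n) - max(k * level, region * n)
                # Assign the character if at least half of it lies in the region.
                if 2 * overlap >= level:
                    hist[alphabet.index(ch)] = 1
            vec.extend(hist)
    return vec


def cosine(a, b):
    """Cosine similarity between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def rank(query, candidates):
    """Sort candidate strings by PHOC similarity to the query, best first."""
    q = phoc(query)
    scored = [(cosine(q, phoc(c)), c) for c in candidates]
    return [c for _, c in sorted(scored, key=lambda s: -s[0])]
```

Because the vector is built from characters and their positions rather than from a fixed word list, any string can serve as a query, which is what makes out-of-vocabulary retrieval possible. For example, `rank("text", ["exit", "text", "next"])` places `"text"` first under these assumptions.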
Keywords: Image retrieval, Scene text detection, Scene text recognition, Word spotting, Convolutional neural networks, Region proposal networks, PHOC
Review history: Received 6 May 2019, Revised 19 August 2020, Accepted 9 September 2020, Available online 10 September 2020, Version of Record 1 November 2020.
Official paper URL: https://doi.org/10.1016/j.patcog.2020.107656