Temporal feature enhancement network with external memory for live-stream video object detection

Authors:

Highlights:

• Aggregating features from only the past few frames can improve detection accuracy.

• Aggregation by the attention mechanism performs better than similarity-based methods.

• Aggregating coarse features can improve accuracy while keeping real-time performance.
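
The idea behind these highlights can be illustrated with a minimal sketch. This is not the paper's implementation: it assumes plain scaled dot-product attention with a residual connection, and the names `attention_aggregate` and `PastFrameMemory` and the `max_frames` parameter are hypothetical. Feature vectors from the last few frames are kept in a bounded memory, and each current-frame feature is enhanced with an attention-weighted sum of those stored features.

```python
import math
from collections import deque


def attention_aggregate(current, memory):
    """Add an attention-weighted sum of memory vectors to each current vector.

    current: list of feature vectors (lists of floats) from the current frame.
    memory:  non-empty list of feature vectors from past frames, same dimension.
    """
    dim = len(current[0])
    enhanced = []
    for query in current:
        # Scaled dot-product score between the query and every memory vector.
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in memory
        ]
        # Numerically stable softmax over the scores.
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        # Weighted sum of memory vectors, added back as a residual (an assumption).
        context = [
            sum(w * key[i] for w, key in zip(weights, memory))
            for i in range(dim)
        ]
        enhanced.append([q + c for q, c in zip(query, context)])
    return enhanced


class PastFrameMemory:
    """Hypothetical external memory holding features of the last few frames only."""

    def __init__(self, max_frames=3):
        # deque with maxlen drops the oldest frame automatically.
        self.frames = deque(maxlen=max_frames)

    def enhance(self, features):
        memory = [vec for frame in self.frames for vec in frame]
        # The first frame has no history, so it passes through unchanged.
        out = attention_aggregate(features, memory) if memory else features
        self.frames.append(features)
        return out
```

Because the memory only ever contains past frames, a sketch like this fits the live-stream setting, where future frames are unavailable. Keeping the stored features coarse (few vectors per frame) keeps the per-frame cost small.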

Keywords: Video object detection, Video analysis, Object detection

Review history: Received 8 February 2021, Revised 16 May 2022, Accepted 12 June 2022, Available online 13 June 2022, Version of Record 25 June 2022.

Paper URL: https://doi.org/10.1016/j.patcog.2022.108847