Prediction and Description of Near-Future Activities in Video
Authors:
Highlights:
• Our work provides a description of near-future activities from current observations.
• Ours is one of the earliest works on captioning near-future events in videos.
• We use the spatio-temporal relationships among activities and objects for label prediction.
• We use a sequence-to-sequence learning-based approach for mapping labels to captions.
• We perform extensive experiments to show the effectiveness of the proposed framework.
Keywords: Label, Caption, LSTM, Fully connected network, Sequence-to-sequence
Review history: Received 29 May 2020, Revised 21 May 2021, Accepted 24 May 2021, Available online 29 May 2021, Version of Record 12 June 2021.
Paper URL: https://doi.org/10.1016/j.cviu.2021.103230