Maximization and restoration: Action segmentation through dilation passing and temporal reconstruction

Authors:

Highlights:

• We propose a divide-and-conquer method that first maximizes frame accuracy and then reconstructs the features to reduce over-segmentation.

• The dilation passing network propagates long- and short-range features, enabling a better understanding of the relations between frames.

• The temporal reconstruction network uses a convolutional encoder-decoder to capture local context for temporal consistency among frames.

• Our model achieves meaningful improvements over state-of-the-art models on three challenging datasets.
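
The trade-off between frame accuracy and over-segmentation named in the highlights can be illustrated with a toy example. The sketch below is not the paper's method or its evaluation code; the label names, the sequences, and the helper functions `segments` and `frame_accuracy` are hypothetical and exist only to show how a prediction can score well per frame while being split into far more segments than the ground truth.

```python
from itertools import groupby


def segments(labels):
    """Collapse frame-wise labels into (label, start, length) runs."""
    runs, start = [], 0
    for label, run in groupby(labels):
        length = len(list(run))
        runs.append((label, start, length))
        start += length
    return runs


def frame_accuracy(pred, truth):
    """Fraction of frames whose predicted label matches the ground truth."""
    return sum(p == t for p, t in zip(pred, truth)) / len(truth)


# Hypothetical ground truth: two actions of five frames each.
truth = ["pour"] * 5 + ["stir"] * 5

# Hypothetical prediction: mostly correct, but with isolated flipped frames.
pred = ["pour", "pour", "stir", "pour", "pour",
        "stir", "stir", "pour", "stir", "stir"]

print("frame accuracy:", frame_accuracy(pred, truth))
print("ground-truth segments:", len(segments(truth)))
print("predicted segments:", len(segments(pred)))
```

Here only two frames are wrong, yet the prediction breaks into six segments instead of two. This is the kind of fragmentation that a temporal consistency stage is meant to reduce.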

Keywords: Action segmentation, Temporal segmentation, Video understanding

Review history: Received 12 December 2021, Revised 11 April 2022, Accepted 29 April 2022, Available online 2 May 2022, Version of Record 4 May 2022.

Official paper URL: https://doi.org/10.1016/j.patcog.2022.108764