Learning Enriched Global Context Information for Human Pose Estimation

Authors: Rui Wang, Ruyi Liu, Yanping Li, Xiangyang Wang

Abstract

A classic approach to human pose estimation generates a heatmap centered on each keypoint location and uses it as a small-region representation for supervised learning. Networks built on this approach need to learn multi-scale feature maps and global context information under different receptive fields. For human pose estimation, a larger receptive field captures more of the human body structure, which carries more global, higher-level semantic features, and it captures more long-distance connections between keypoints. However, convolution is a local operation: it struggles to capture global relationships and to take the surrounding pixel information fully into account. Furthermore, the resolution of the detected results for a small-region representation is generally very low, which limits the accuracy of keypoint detection. In this paper, we propose a switchable convolution operation that adaptively selects a larger receptive field and obtains richer global context information. In addition, we use a dual attention unit to reconstruct the feature map, strengthening useful features and further enhancing the structural information between human body parts in the heatmap. Experiments on the COCO and MPII datasets show that our method effectively improves performance on human pose estimation.
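The heatmap representation and its resolution limit can be illustrated with a minimal sketch. It is not taken from the paper: the Gaussian shape, the sigma value, the 256x192 input size, the stride of 4, and the function names gaussian_heatmap and decode_argmax are all assumptions chosen for illustration. The sketch builds a target heatmap for one keypoint, decodes it with a plain argmax, and shows that the decoded location can only fall on multiples of the stride.

```python
import math


def gaussian_heatmap(width, height, cx, cy, sigma=2.0):
    """Return a height x width grid whose values peak near (cx, cy).

    A Gaussian is assumed here; the abstract only says the heatmap is
    centered on the keypoint location.
    """
    return [
        [math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2.0 * sigma ** 2))
         for x in range(width)]
        for y in range(height)
    ]


def decode_argmax(heatmap, stride):
    """Map the highest-scoring heatmap cell back to input-image pixels."""
    cells = ((y, x) for y in range(len(heatmap)) for x in range(len(heatmap[0])))
    best_y, best_x = max(cells, key=lambda p: heatmap[p[0]][p[1]])
    return best_x * stride, best_y * stride


# Hypothetical setup: 256x192 (height x width) input, stride 4 -> 64x48 heatmap.
stride = 4
keypoint = (101.0, 151.0)  # (x, y) in input-image pixels, illustrative values
hm = gaussian_heatmap(48, 64, keypoint[0] / stride, keypoint[1] / stride)
decoded = decode_argmax(hm, stride)

print("ground truth:", keypoint)
print("decoded:     ", decoded)
```

With these assumed values, the decoded point snaps to the nearest heatmap cell, so the recovered coordinates differ from the ground truth by up to half a stride per axis. This is one concrete way to read the abstract's remark that the low resolution of the small-region representation limits keypoint accuracy.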

Keywords: Human pose estimation, Global context information, Switchable convolution, Dual attention

Review process:

Official paper URL: https://doi.org/10.1007/s11063-021-10699-0