Feature selection accelerated convolutional neural networks for visual tracking
Authors: Zhiyan Cui, Na Lu
Abstract
Most existing tracking methods based on convolutional neural network (CNN) models are too slow for real-time applications, despite their excellent tracking accuracy compared with traditional methods. CNN tracking solutions are also memory intensive and require considerable computational resources. In this paper, we propose a time-efficient and accurate tracking scheme: a feature selection accelerated CNN (FSNet) tracking solution based on MDNet (Multi-Domain Network). The large number of convolutional operations is a major contributor to the high computational cost of MDNet. To reduce the computational complexity, we incorporate an efficient mutual information-based feature selection over the convolutional layer that reduces the redundancy in the feature maps. Because tracking is a typical binary classification problem, redundant feature maps can simply be pruned with an insignificant influence on tracking performance. To further accelerate the CNN tracking solution, a RoIAlign layer is added so that convolution can be applied to the entire image instead of separately to each RoI (Region of Interest). The bilinear interpolation of RoIAlign also reduces misalignment errors of the tracked target. In addition, a new fine-tuning strategy is used in the fully-connected layers to accelerate the online updating process. By combining the above strategies, the accelerated CNN runs at 60 FPS (Frames Per Second) on the GPU, compared with 1 FPS for the original MDNet, with a very low impact on tracking accuracy. We evaluated the proposed solution on four benchmarks: OTB50, OTB100, VOT2016 and UAV123. The extensive comparison results verify the superior performance of FSNet.
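The RoIAlign step mentioned in the abstract can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the function names `bilinear_sample` and `roi_align`, the average pooling within each bin, and the default `out_size` and `samples` values are all assumptions chosen for illustration. The sketch only shows the general idea that a region with fractional coordinates is read from a shared feature map by bilinear interpolation rather than by rounding to the nearest cell.

```python
import math


def bilinear_sample(feature_map, y, x):
    """Sample a 2-D feature map (list of rows) at a fractional point (y, x)."""
    height = len(feature_map)
    width = len(feature_map[0])
    # Clamp the sampling point to the valid range of the map.
    y = min(max(y, 0.0), height - 1)
    x = min(max(x, 0.0), width - 1)
    y0, x0 = int(math.floor(y)), int(math.floor(x))
    y1, x1 = min(y0 + 1, height - 1), min(x0 + 1, width - 1)
    wy, wx = y - y0, x - x0
    # Interpolate along x on the two neighbouring rows, then along y.
    top = (1 - wx) * feature_map[y0][x0] + wx * feature_map[y0][x1]
    bottom = (1 - wx) * feature_map[y1][x0] + wx * feature_map[y1][x1]
    return (1 - wy) * top + wy * bottom


def roi_align(feature_map, roi, out_size=2, samples=2):
    """Pool a RoI into an out_size x out_size grid.

    roi is (y_start, x_start, y_end, x_end) in feature-map coordinates
    and may be fractional. Each output bin averages samples x samples
    bilinearly interpolated points (an illustrative choice).
    """
    y_start, x_start, y_end, x_end = roi
    bin_h = (y_end - y_start) / out_size
    bin_w = (x_end - x_start) / out_size
    output = []
    for i in range(out_size):
        row = []
        for j in range(out_size):
            total = 0.0
            for sy in range(samples):
                for sx in range(samples):
                    # Regularly spaced sampling points inside bin (i, j).
                    y = y_start + (i + (sy + 0.5) / samples) * bin_h
                    x = x_start + (j + (sx + 0.5) / samples) * bin_w
                    total += bilinear_sample(feature_map, y, x)
            row.append(total / (samples * samples))
        output.append(row)
    return output


# Toy 4x4 feature map with value 4*y + x at cell (y, x).
toy_map = [[4 * r + c for c in range(4)] for r in range(4)]
pooled = roi_align(toy_map, (0.5, 0.5, 2.5, 2.5))
```

Because the feature map is computed once for the whole image, many candidate regions can be pooled from it without rerunning the convolutional layers for each region.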
Keywords: Visual tracking, Mutual information, Feature selection, RoIAlign
Paper URL: https://doi.org/10.1007/s10489-021-02234-4