Siamese visual tracking combining granular level multi-scale features and global information

Authors:

Highlights:

• An improved Siamese tracking network is constructed via Res2Net and transformers.

• Multi-scale information from granular levels is used via feature extraction modules.

• A cross-attention module is used to learn the connection of different features.

• A self-attention module is employed to establish long-range dependencies.

• Empirical studies on public datasets demonstrate the effectiveness of our models.
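The highlights name two attention roles: cross-attention to relate different features, and self-attention to capture long-range dependencies within one feature map. The sketch below illustrates only the generic scaled dot-product mechanism behind both, in plain Python. It is not the paper's architecture: the function names, the toy `template` and `search` vectors, and the absence of learned projection weights, multiple heads, and Res2Net features are all simplifying assumptions made for illustration.

```python
import math

def softmax(row):
    # Numerically stable softmax over one list of scores.
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    # Generic scaled dot-product attention over lists of vectors.
    # No learned projections are applied (an assumption of this sketch).
    d = len(keys[0])
    out = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in keys]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, values))
                    for j in range(len(values[0]))])
    return out

def self_attention(x):
    # Every position attends to every other position of the same feature set.
    return attention(x, x, x)

def cross_attention(search, template):
    # Search-region positions query the template features.
    return attention(search, template, template)

# Hypothetical toy features: 2 template positions, 3 search positions, dim 2.
template = [[1.0, 0.0], [0.0, 1.0]]
search = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

fused = cross_attention(search, template)
refined = self_attention(search)
```

In this toy setup each row of `fused` is a convex combination of the template vectors, weighted by how closely the corresponding search position matches each template position.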

Keywords: Visual tracking, Siamese network, Multi-scale feature, Self attention, Transformer

Article history: Received 26 December 2021, Revised 9 July 2022, Accepted 11 July 2022, Available online 16 July 2022, Version of Record 21 July 2022.

Official paper URL: https://doi.org/10.1016/j.knosys.2022.109435