Multi-modal feature fusion for geographic image annotation
Authors:
Highlights:
• Multi-modal feature construction: for the shallow modality, we propose a mixed shallow feature model that combines Color, LBP, and SIFT features to represent the extrinsic visual properties of geographic images; for the deep modality, we design a specialized DCNN to extract their intrinsic semantic information (a feature-extraction sketch follows this list).
• Multi-modal feature fusion: we propose a multi-modal feature fusion model based on DBNs and an RBM to build a powerful joint representation for geographic images. The model has proved effective at capturing both intrinsic and extrinsic semantic information (a fusion sketch also follows this list).
• Open geographic image dataset: we have built a geographic image dataset containing 300 images (600 × 600 pixels) covering six typical scene types, such as urban, rural, and mountain areas.
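To make the shallow-modality construction concrete, below is a minimal sketch of a mixed Color + LBP + SIFT descriptor using OpenCV and scikit-image. The bin counts, LBP parameters, and the mean-pooling of SIFT descriptors are illustrative assumptions, not the paper's exact design (the paper more likely aggregates SIFT with a learned codebook).

```python
import cv2
import numpy as np
from skimage.feature import local_binary_pattern

def shallow_features(image_bgr):
    """Mixed shallow descriptor (sketch): color histogram + LBP histogram
    + pooled SIFT. All dimensions below are illustrative assumptions."""
    # Color modality: 32-bin histogram per BGR channel, L1-normalized.
    color = np.concatenate([
        cv2.calcHist([image_bgr], [c], None, [32], [0, 256]).ravel()
        for c in range(3)])
    color /= color.sum() + 1e-8

    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)

    # Texture modality: uniform LBP codes (values 0..9 for P=8),
    # histogrammed over the whole image.
    lbp = local_binary_pattern(gray, P=8, R=1, method="uniform")
    lbp_hist, _ = np.histogram(lbp, bins=10, range=(0, 10), density=True)

    # Local-structure modality: 128-D SIFT descriptors, mean-pooled here
    # for simplicity instead of a bag-of-words codebook.
    sift = cv2.SIFT_create()
    _, desc = sift.detectAndCompute(gray, None)
    sift_vec = desc.mean(axis=0) if desc is not None else np.zeros(128)

    return np.concatenate([color, lbp_hist, sift_vec])
```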
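The fusion step can be sketched in a similarly reduced form. The paper stacks modality-specific DBNs before a joint layer; the snippet below collapses that to a single BernoulliRBM from scikit-learn over the concatenated modalities, so it illustrates only the idea of learning a joint hidden representation. The feature dimensions and RBM hyperparameters are placeholders.

```python
import numpy as np
from sklearn.neural_network import BernoulliRBM
from sklearn.preprocessing import MinMaxScaler

# Toy stand-ins for per-modality features; real inputs would come from the
# shallow extractor above and the penultimate layer of the DCNN.
rng = np.random.default_rng(0)
shallow = rng.random((300, 170))   # e.g. color + LBP + pooled-SIFT vectors
deep = rng.random((300, 256))      # e.g. DCNN activations

# Scale to [0, 1]: BernoulliRBM expects probability-like inputs.
joint_in = MinMaxScaler().fit_transform(np.hstack([shallow, deep]))

# One joint RBM over the concatenated modalities; the modality-specific
# DBN stacks used in the paper are omitted from this sketch.
rbm = BernoulliRBM(n_components=128, learning_rate=0.05,
                   n_iter=20, random_state=0)
joint_repr = rbm.fit_transform(joint_in)  # fused representation, 300 x 128
print(joint_repr.shape)
```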
Keywords: Convolutional neural networks (CNNs), Deep learning, Geographic image annotation, Multi-modal feature fusion
Article history: Received 18 November 2016; Revised 23 April 2017; Accepted 30 June 2017; Available online 11 July 2017; Version of Record 18 September 2017.
DOI: https://doi.org/10.1016/j.patcog.2017.06.036