Feature fusion within local region using localized maximum-margin learning for scene categorization

Abstract

In visual recognition tasks such as scene categorization, representing an image by its local features (e.g., with the bag-of-visual-word (BOVW) model or the bag-of-contextual-visual-word (BOCVW) model) has become one of the most popular and successful approaches. In this paper, we propose a method that uses localized maximum-margin learning to fuse different types of features during BOCVW modeling for eventual scene classification. The proposed method fuses multiple features at the stage where either the best contextual visual word is selected to represent a local region (hard assignment) or the probabilities of the candidate contextual visual words that may represent the unknown region are estimated (soft assignment). The method has two merits: (1) errors caused by the ambiguity of a single feature when assigning local regions to contextual visual words can be corrected, or the probabilities of the candidate contextual visual words can be estimated more accurately; and (2) it offers a more flexible way of fusing these features, because the similarity metric is determined locally by localized maximum-margin learning. Experimental evaluation of the proposed method indicates its effectiveness.
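
The difference between hard and soft assignment with fused features can be illustrated with a minimal sketch. It is not the paper's algorithm: the function names, the toy distances, the fixed fusion weights, and the softmax used for soft assignment are all assumptions made for illustration. In the proposed method the similarity metric is determined locally by maximum-margin learning, which is not reproduced here.

```python
import math

def fused_distances(per_feature_dists, weights):
    """Weighted sum of per-feature distances to each candidate word.

    per_feature_dists: one list of distances per feature type
    weights: one weight per feature type (hypothetical fixed values here;
             the paper determines the metric locally by learning)
    """
    n_words = len(per_feature_dists[0])
    return [
        sum(w * dists[k] for w, dists in zip(weights, per_feature_dists))
        for k in range(n_words)
    ]

def hard_assign(dists):
    """Hard assignment: index of the single closest candidate word."""
    return min(range(len(dists)), key=dists.__getitem__)

def soft_assign(dists, temperature=1.0):
    """Soft assignment: probabilities from a softmax over negative
    distances (an assumed choice, not taken from the paper)."""
    scores = [-d / temperature for d in dists]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Toy distances from one local region to three candidate words.
color_dists = [0.2, 0.9, 0.5]    # color alone favors word 0
texture_dists = [0.8, 0.1, 0.3]  # texture alone favors word 1
weights = [0.25, 0.75]           # hypothetical fusion weights

fused = fused_distances([color_dists, texture_dists], weights)
best_word = hard_assign(fused)   # hard assignment on the fused metric
word_probs = soft_assign(fused)  # soft assignment on the fused metric
```

In this toy case the color feature alone would select word 0, while the fused distance selects word 1, which shows how combining features can change an assignment that a single ambiguous feature would make.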

Keywords: Scene categorization, Image recognition, Feature fusion, Similarity-metric learning

Article history: Received 8 November 2010, Revised 24 August 2011, Accepted 20 September 2011, Available online 19 October 2011.

Official paper URL: https://doi.org/10.1016/j.patcog.2011.09.027