Semantic-associative visual content labelling and retrieval: A multimodal approach
作者:
Highlights:
•
摘要
A novel framework referred to as collaterally confirmed labelling (CCL) is proposed, aiming at localising the visual semantics to regions of interest in images with textual keywords. Both the primary image and collateral textual modalities are exploited in a mutually co-referencing and complementary fashion. The collateral content and context-based knowledge is used to bias the mapping from the low-level region-based visual primitives to the high-level visual concepts defined in a visual vocabulary. We introduce the notion of collateral context, which is represented as a co-occurrence matrix of the visual keywords. A collaborative mapping scheme is devised using statistical methods like Gaussian distribution or Euclidean distance together with collateral content and context-driven inference mechanism. We introduce a novel high-level visual content descriptor that is devised for performing semantic-based image classification and retrieval. The proposed image feature vector model is fundamentally underpinned by the CCL framework. Two different high-level image feature vector models are developed based on the CCL labelling of results for the purposes of image data clustering and retrieval, respectively. A subset of the Corel image collection has been used for evaluating our proposed method. The experimental results to-date already indicate that the proposed semantic-based visual content descriptors outperform both traditional visual and textual image feature models.
论文关键词:Automatic image annotation,Cross-modal indexing,Semantic-level visual content descriptor,Multi-modal data modelling
论文评审过程:Received 29 May 2007, Accepted 30 May 2007, Available online 12 June 2007.
论文官网地址:https://doi.org/10.1016/j.image.2007.05.011