ObjectPatchNet: Towards scalable and semantic image annotation and retrieval

Authors:

Highlights:

Abstract

The ever-increasing collection of Internet images densely samples real-world objects and scenes, and is commonly accompanied by metadata such as textual descriptions and user comments. Such image data has the potential to serve as a knowledge source for large-scale image applications. Building on this publicly available, loosely annotated image data, we propose a scalable data-driven solution for annotating and retrieving Web-scale image data. From large-scale loosely annotated images we derive a compact and informative representation, namely ObjectPatchNet. Each vertex in ObjectPatchNet, called an ObjectPatchNode, is defined as a collection of discriminative image patches annotated with object category labels. An edge linking two ObjectPatchNodes models the co-occurrence relationship between different objects in the same image. ObjectPatchNet therefore models not only probabilistically labeled image patches but also the contextual relationships between objects, which makes it well suited to scalable image annotation. We further treat ObjectPatchNet as a visual vocabulary with semantic labels, which makes it straightforward to build an inverted file index for efficient semantic image retrieval. ObjectPatchNet is tested on both large-scale image annotation and large-scale image retrieval. Experimental results show that ObjectPatchNet is both discriminative and efficient in these applications.
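To make the structure described in the abstract more concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes image patches have already been assigned to integer node ids; the abstract does not say how that assignment, the edge weights, or the scoring are computed. The class name `ObjectPatchNetSketch`, its methods, and the simple vote counting are all hypothetical choices made for illustration.

```python
from collections import defaultdict
from itertools import combinations


class ObjectPatchNetSketch:
    """Toy stand-in (hypothetical): labeled nodes, co-occurrence edges, inverted index."""

    def __init__(self):
        # node id -> {object label: probability}; models probabilistically labeled patches
        self.node_labels = {}
        # (node_a, node_b) with node_a < node_b -> number of images containing both
        self.cooccurrence = defaultdict(int)
        # node id -> set of image ids containing that node (inverted file)
        self.inverted_index = defaultdict(set)

    def add_node(self, node_id, label_probs):
        self.node_labels[node_id] = dict(label_probs)

    def add_image(self, image_id, node_ids):
        """Register an image by the (assumed precomputed) nodes its patches map to."""
        nodes = sorted(set(node_ids))
        for node_id in nodes:
            self.inverted_index[node_id].add(image_id)
        for a, b in combinations(nodes, 2):
            self.cooccurrence[(a, b)] += 1

    def annotate(self, node_ids):
        """Sum label probabilities over the matched nodes (an assumed scoring rule)."""
        scores = defaultdict(float)
        for node_id in set(node_ids):
            for label, prob in self.node_labels.get(node_id, {}).items():
                scores[label] += prob
        return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))

    def retrieve(self, node_ids):
        """Rank indexed images by how many query nodes they share (assumed voting)."""
        votes = defaultdict(int)
        for node_id in set(node_ids):
            for image_id in self.inverted_index.get(node_id, ()):
                votes[image_id] += 1
        return sorted(votes.items(), key=lambda kv: (-kv[1], kv[0]))


if __name__ == "__main__":
    net = ObjectPatchNetSketch()
    net.add_node(0, {"car": 0.75, "road": 0.25})
    net.add_node(1, {"road": 1.0})
    net.add_image("img1", [0, 1])
    net.add_image("img2", [1])
    print(net.annotate([0, 1]))
    print(net.retrieve([0, 1]))
```

Because each node maps directly to the images that contain it, a query only touches the posting lists of its own nodes, which is the property that makes inverted-file retrieval efficient at scale.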

Keywords:

Review history: Available online 30 August 2013.

Official paper URL: https://doi.org/10.1016/j.cviu.2013.03.008