A novel methodology for retrieving infographics utilizing structure and message content

Authors:

Highlights:

Abstract

Information graphics (infographics) in popular media are highly structured knowledge representations that are generally designed to convey an intended message. This paper presents a novel methodology for retrieving infographics from a digital library that takes into account a graphic's structural and message content. The retrieval methodology can be summarized thus: 1) hypothesize requisite structural and message content from a natural language query, 2) measure the relevance of each candidate infographic to the requisite structural and message content hypothesized from the user query, and 3) integrate these relevance measurements via a linear combination model in order to produce a ranked list of infographics in response to the user query. The methodology has been implemented and evaluated, and it significantly outperforms a baseline method that treats queries and graphics as bags of words.
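Step 3 of the methodology, integrating the relevance measurements through a linear combination, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the `Candidate` class, the two relevance scores (taken as already computed and lying in [0, 1]), the equal default weights, and the sample data are all hypothetical. The abstract does not say how relevance is measured or how the weights are chosen.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """A candidate infographic with precomputed relevance scores (hypothetical)."""
    name: str
    structural_relevance: float  # assumed: match to the hypothesized structural content
    message_relevance: float     # assumed: match to the hypothesized message content


def combined_score(c: Candidate, w_structure: float = 0.5, w_message: float = 0.5) -> float:
    """Linear combination of the two relevance measurements.

    The weights are illustrative placeholders, not values from the paper.
    """
    return w_structure * c.structural_relevance + w_message * c.message_relevance


def rank(candidates, w_structure: float = 0.5, w_message: float = 0.5):
    """Return candidates sorted by combined score, highest first."""
    return sorted(
        candidates,
        key=lambda c: combined_score(c, w_structure, w_message),
        reverse=True,
    )


# Made-up scores, purely for demonstration.
candidates = [
    Candidate("bar_chart_a", structural_relevance=0.9, message_relevance=0.2),
    Candidate("line_graph_b", structural_relevance=0.4, message_relevance=0.8),
    Candidate("bar_chart_c", structural_relevance=0.7, message_relevance=0.7),
]

ranked = rank(candidates)
ranked_names = [c.name for c in ranked]
```

With equal weights, a graphic that is moderately relevant on both dimensions ranks above one that is strong on only one. Shifting the weights toward either dimension changes the ordering, which is why such a model would typically have its weights tuned on held-out queries.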

Keywords: Semi-structured data and XML, Information retrieval, Digital libraries, Query, Graphic retrieval, Natural language query processing, Short document expansion, Linear combination ranking model

Article history: Received 10 January 2015, Accepted 28 May 2015, Available online 25 June 2015, Version of Record 6 November 2015.

Official paper URL: https://doi.org/10.1016/j.datak.2015.05.005