Using information gain to improve multi-modal information retrieval systems

Authors:

Abstract

Access to information now depends on managing multimedia databases effectively, so multi-modal retrieval techniques (image retrieval in particular) have become an active research direction. Many content-based image retrieval (CBIR) systems have been developed in the past few years. Despite this progress, the retrieval accuracy of current CBIR systems remains limited and is often worse than that of text-only information retrieval systems. In this paper, we propose combining content-based and text-based approaches to multi-modal retrieval in order to achieve better results and to overcome the shortcomings each technique shows when used on its own. For this purpose, we use a medical collection that includes both images and unstructured text. We retrieve images with a CBIR system and textual information with a traditional information retrieval system, and then combine the results obtained from both systems to improve the final performance. Furthermore, we use the information gain (IG) measure to reduce and refine the textual information included in multi-modal information retrieval systems. We have carried out several experiments that combine this reduction technique with the merging of visual and textual information. The results are highly promising and show the benefit of managing textual information carefully in order to improve conventional multi-modal systems.
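
The two ideas named in the abstract, reducing textual information with information gain and merging textual and visual results, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. The abstract does not say which textual units are scored or how the two result lists are merged, so the sketch assumes binary features (a term or field is present or absent in a document), a class label per document, and a weighted sum of min-max-normalised scores. All function names, the threshold, and the weight are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels).values()
    return -sum((n / total) * math.log2(n / total) for n in counts)


def information_gain(has_feature, labels):
    """IG = H(labels) - H(labels | feature present or absent)."""
    present = [c for f, c in zip(has_feature, labels) if f]
    absent = [c for f, c in zip(has_feature, labels) if not f]
    total = len(labels)
    conditional = (len(present) / total) * entropy(present) \
        + (len(absent) / total) * entropy(absent)
    return entropy(labels) - conditional


def select_features(feature_table, labels, threshold):
    """Keep the features whose information gain reaches the threshold.

    Assumption: feature_table maps a feature name to one boolean per document.
    """
    return [name for name, column in feature_table.items()
            if information_gain(column, labels) >= threshold]


def normalise(scores):
    """Min-max normalise a {doc_id: score} mapping into [0, 1]."""
    if not scores:
        return {}
    low, high = min(scores.values()), max(scores.values())
    span = high - low
    return {d: (s - low) / span if span else 1.0 for d, s in scores.items()}


def merge_scores(text_scores, visual_scores, text_weight=0.5):
    """Rank documents by a weighted sum of normalised text and visual scores.

    Assumption: a document missing from one list contributes 0 for that list.
    """
    t, v = normalise(text_scores), normalise(visual_scores)
    merged = {d: text_weight * t.get(d, 0.0) + (1 - text_weight) * v.get(d, 0.0)
              for d in set(t) | set(v)}
    return sorted(merged.items(), key=lambda item: item[1], reverse=True)


# Usage with toy data (all values are made up for illustration).
labels = ["a", "a", "b", "b"]
features = {
    "diagnosis": [True, True, False, False],  # separates the classes perfectly
    "noise": [True, False, True, False],      # carries no class information
}
kept = select_features(features, labels, threshold=0.5)
ranking = merge_scores({"d1": 2.0, "d2": 1.0}, {"d2": 0.9, "d3": 0.1},
                       text_weight=0.7)
```

In this toy run only the feature that separates the two classes survives the filter. Setting `text_weight` above 0.5 expresses the abstract's observation that textual retrieval is often the more accurate of the two systems; the value 0.7 is arbitrary.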

Keywords: Multi-modal information retrieval, Information gain, Data fusion, Medical database access

Review history: Received 18 July 2007, Revised 20 September 2007, Accepted 25 September 2007, Available online 19 November 2007.

Official paper URL: https://doi.org/10.1016/j.ipm.2007.09.014