A hybrid computational model for an automated image descriptor for visually impaired users
Authors:
Abstract
Nowadays, with the development of high-quality software, most presentations contain images. This poses a problem for visually impaired people, because support exists for text-to-voice conversion but not for image-to-voice conversion. For documents that combine images and text, we propose a hybrid model that produces a meaningful and easily recognizable descriptor for images in three main categories (statistical, geometrical and non-geometrical). First, a neural classifier is trained by mining the associated text using advanced concepts, so that it can assign each document to a specific category. Then, similarity matching against that category's annotated templates is performed for images in every other category. We have built a classifier that uses novel features based on color projection and can differentiate geometrical images from ordinary images. This significantly improves the similarity matching and yields more accurate descriptions of images for visually impaired users. An important feature of the proposed model is that its category-specific matching techniques can be easily integrated and developed for other categories.
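The abstract does not specify how the color-projection features or the template matching are computed, so the following is only a minimal sketch under stated assumptions: images are small grids of pixel values, "projection" means counting non-background pixels per row and per column, and similarity is plain cosine similarity. The names `projection_features`, `cosine`, `describe`, and the `templates` structure are hypothetical and not taken from the paper.

```python
import math

def projection_features(image, background=0):
    """Row and column counts of non-background pixels, scaled by the
    number of foreground pixels (an assumed stand-in for the paper's
    color-projection features)."""
    rows = [sum(1 for p in row if p != background) for row in image]
    cols = [sum(1 for row in image if row[x] != background)
            for x in range(len(image[0]))]
    total = sum(rows) or 1  # avoid division by zero on empty images
    return [v / total for v in rows + cols]

def cosine(a, b):
    """Cosine similarity of two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def describe(image, templates):
    """Return the annotation of the most similar template.
    Assumes every template image has the same size as `image`."""
    feats = projection_features(image)
    best = max(
        templates,
        key=lambda t: cosine(feats, projection_features(t["image"])),
    )
    return best["annotation"]

# Hypothetical annotated templates for one category.
templates = [
    {"image": [[0, 0, 0], [1, 1, 1], [0, 0, 0]], "annotation": "horizontal bar"},
    {"image": [[0, 1, 0], [0, 1, 0], [0, 1, 0]], "annotation": "vertical bar"},
]
query = [[0, 0, 0], [1, 1, 0], [0, 0, 0]]
label = describe(query, templates)
```

In a pipeline like the one described, the returned annotation would then be passed to an existing text-to-voice component; the category classifier that selects which template set to search is outside this sketch.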
Keywords: Classification, Image analysis and descriptor
Review history: Available online 23 May 2010.
Official URL: https://doi.org/10.1016/j.chb.2010.04.018