3D object recognition from static 2D views using multiple coarse data channels

Authors:

Highlights:

Abstract

A 3D object recognition system is described that employs a novel multiresolution representation and coarse encoding of feature information. Classic feature extraction methods are modified by using wavelet transform maxima to direct the actions of the feature extraction modules. The reasons for adopting a multi-channel architecture are given, together with a description of the feature extraction and coarse coding modules. Because the targeted field of application is the automatic categorisation of natural objects, the proposed system is designed to run on ordinary hardware platforms and to process an input in a short timeframe. The system has been evaluated on a variety of 2D views of a set of 5 synthetic objects designed to present various degrees of similarity, as rated by a panel of human subjects. Parallels between these ratings and the system’s behaviour are drawn. Additionally, a small set of photomicrographs of fish larvae has been used to assess the system’s performance when presented with very similar, non-rigid shapes. For comparison, the parameters extracted from each image were fed into two categorisers: discriminant analysis and a multilayer feedforward neural network trained with backpropagation of error. Experimental evidence is presented which demonstrates the efficacy of the methods. The satisfactory categorisation performance of the system is reported, and conclusions are drawn about the system’s behaviour.

Keywords: Viewer-centred representation, A Trous transform, Coarse coding, Discriminant analysis, Neural networks, Self-organising maps

Review history: Received 17 November 1997, Revised 12 August 1998, Accepted 26 August 1998, Available online 11 August 1999.

Official paper URL: https://doi.org/10.1016/S0262-8856(98)00159-0