Real-time estimation of 3D scene geometry from a single image

Abstract

Significant advances have recently been made in computational methods for predicting 3D scene structure from a single monocular image. However, the computational complexity of these methods severely limits their adoption in computer vision and pattern recognition applications. In this paper, we address the problem of inferring 3D scene geometry from a single monocular image of a man-made environment. Our goal is to estimate the 3D structure of a scene in real time, with a level of accuracy that is useful in certain real applications. To this end, we decompose the three-dimensional world space into a set of geometrically inspired primitive subspaces. An important advantage of this approach is that the complex estimation problem can be systematically broken down, without loss of generality, into a sequence of subproblems that are easier to solve and more reliable, even in the presence of occlusion or clutter. The proposed algorithm also serves as a technical foundation for an effective representation of 3D scene geometry, based on a simple description of the textural patterns present in the image and their spatial arrangement. Extensive experiments have been conducted on a large-scale, challenging dataset of real-world images. The results demonstrate that the proposed method substantially outperforms recent state-of-the-art algorithms in both speed and accuracy.

Keywords: 3D scene geometry, Geometric scene categorization, Depth information, Monocular vision, Machine learning

Article history: Received 27 June 2011, Revised 10 February 2012, Accepted 21 February 2012, Available online 10 March 2012.

Official paper URL: https://doi.org/10.1016/j.patcog.2012.02.028