Perceptual Monocular Depth Estimation

作者:Janice Pan, Alan C. Bovik

摘要

Monocular depth estimation (MDE), which is the task of using a single image to predict scene depths, has gained considerable interest, in large part owing to the popularity of applying deep learning methods to solve “computer vision problems”. Monocular cues provide sufficient data for humans to instantaneously extract an understanding of scene geometries and relative depths, which is evidence of both the processing power of the human visual system and the predictive power of the monocular data. However, developing computational models to predict depth from monocular images remains challenging. Hand-designed MDE features do not perform particularly well, and even current “deep” models are still evolving. Here we propose a novel approach that uses perceptually-relevant natural scene statistics (NSS) features to predict depths from monocular images in a simple, scale-agnostic way that is competitive with state-of-the-art systems. While the statistics of natural photographic images have been successfully used in a variety of image and video processing, analysis, and quality assessment tasks, they have never been applied in a predictive end-to-end deep-learning model for monocular depth. Correspondingly, no previous work has explicitly incorporated perceptual features in a monocular depth-prediction approach. Here we accomplish this by developing a new closed-form bivariate model of image luminances and use features extracted from this model and from other NSS models to drive a novel deep learning framework for predicting depth given a single image.

论文关键词:Monocular depth estimation, Natural scene statistics, Depth estimation, Bivariate natural scene statistics, Neural networks

论文评审过程:

论文官网地址:https://doi.org/10.1007/s11063-021-10437-6