A two-streamed network for estimating fine-scaled depth maps from single RGB images

Authors:

Highlights:

Abstract:

Estimating depth from a single RGB image is an ill-posed and inherently ambiguous problem. State-of-the-art deep learning methods can now estimate accurate 2D depth maps, but when these maps are projected into 3D, they lack local detail and are often highly distorted. We propose a fast-to-train two-streamed CNN that predicts depth and depth gradients, which are then fused into an accurate and detailed depth map. To overcome the challenge of learning from datasets of limited size, we define a novel set loss over multiple images. By regularizing the estimation across a common set of images, the network is less prone to over-fitting and achieves better accuracy than competing methods. Our method is applicable to both entire scenes and individual objects, and we demonstrate this by evaluating on the NYU Depth v2 and ScanNet datasets for indoor scenes and on the ShapeNet dataset for single man-made objects. Experiments show that our depth predictions are competitive with the state of the art and lead to faithful 3D projections rich in detail and structure.
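The abstract describes the pipeline only at a high level. As a concrete illustration, below is a minimal PyTorch sketch of a two-streamed network with a depth stream, a depth-gradient stream, and a learned fusion step, plus a toy set-style loss. Everything here (the `TwoStreamDepthNet` class, the layer sizes, the fusion convolution, and the `set_loss` variant) is a hypothetical reading of the abstract, not the authors' actual architecture or their set loss.

```python
import torch
import torch.nn as nn

class TwoStreamDepthNet(nn.Module):
    """Hypothetical sketch of a two-streamed depth estimator:
    one stream predicts absolute depth, the other depth gradients
    (d/dx, d/dy); a small fusion head combines them into the final
    map. Layer sizes are placeholders, not the paper's design."""

    def __init__(self, feat_ch=64):
        super().__init__()
        # Shared convolutional encoder (an assumption; the paper may
        # use separate or pretrained backbones per stream).
        self.encoder = nn.Sequential(
            nn.Conv2d(3, feat_ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(feat_ch, feat_ch, 3, padding=1), nn.ReLU(inplace=True),
        )
        # Stream 1: per-pixel depth (1 channel).
        self.depth_head = nn.Conv2d(feat_ch, 1, 3, padding=1)
        # Stream 2: per-pixel depth gradients (2 channels).
        self.grad_head = nn.Conv2d(feat_ch, 2, 3, padding=1)
        # Fusion: refine the depth map using the predicted gradients.
        self.fusion = nn.Sequential(
            nn.Conv2d(1 + 2, feat_ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(feat_ch, 1, 3, padding=1),
        )

    def forward(self, rgb):
        f = self.encoder(rgb)
        depth = self.depth_head(f)   # coarse depth stream
        grads = self.grad_head(f)    # depth-gradient stream
        fused = self.fusion(torch.cat([depth, grads], dim=1))
        return depth, grads, fused   # fused = detailed final map


def set_loss(pred_depths, gt_depths):
    """Toy stand-in for a set loss over multiple images: penalize the
    mean per-image error plus its spread across the set, so that no
    single image dominates training. Illustrative only; the paper's
    actual set loss is defined differently."""
    per_image = [(p - g).abs().mean() for p, g in zip(pred_depths, gt_depths)]
    errs = torch.stack(per_image)
    return errs.mean() + errs.var(unbiased=False)
```

The fusion step is the key design choice implied by the abstract: the gradient stream carries the local detail that a plain depth regressor tends to smooth away, and concatenating both predictions lets a small network reconcile them into a single fine-scaled map.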

Keywords:

Article history: Received 11 July 2018; Revised 2 June 2019; Accepted 6 June 2019; Available online 10 June 2019; Version of Record 12 August 2019.

DOI: https://doi.org/10.1016/j.cviu.2019.06.002