Physics-Based Visual Understanding

Authors:

Highlights:

Abstract

An understanding of a scene's causal physics—how scene elements interact and respond to forces—is a precondition to reasoning about how the scene came to be, how it may evolve in time, and how it will respond to manipulation. We propose a computationally inexpensive method for recovering causal structure from images, in which a scene model is built incrementally through interleaved sensing and analysis. Reasoning uses generic qualitative knowledge about rigid-body interactions, reusable between domains and similar to concepts thought to be acquired or activated during child development. Causal constraint propagation reveals anomalous degrees of freedom in the scene model; prediction yields sensory plans to resolve them. Sensing operations are highly directed and local in scope, e.g., visual routines and proprioception. Inference depth and the number of pixels “touched” are bounded by the complexity of the scene. We present algorithms and semantics that have been successfully reused in several domains of highly structured scenes; in particular we detail a vision system that reverse-engineers machines.
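As a rough illustration of the loop the abstract describes (propagate causal constraints, find degrees of freedom the model cannot account for, then plan a local sensing action to resolve them), here is a minimal toy sketch. It is not the paper's algorithm or semantics: the `Scene` class, the four axis-aligned motion labels, the single "support" rule, and the functions `blocked_motions`, `anomalous_dofs`, and `sensing_plan` are all hypothetical names and simplifications introduced for this example. It assumes a static scene under gravity in which a part that is still free to fall signals a gap in the model.

```python
from dataclasses import dataclass, field

# Assumption: motion is reduced to four axis-aligned translations.
DIRECTIONS = ("+x", "-x", "+y", "-y")


@dataclass
class Scene:
    """Toy scene model (hypothetical): parts, fixed parts, and directed contacts."""
    parts: set = field(default_factory=set)
    fixed: set = field(default_factory=set)     # parts that cannot move at all
    contacts: set = field(default_factory=set)  # (part, direction, neighbor)

    def add_contact(self, part, direction, neighbor):
        # Record that `neighbor` touches `part` on the side facing `direction`.
        self.parts.update((part, neighbor))
        self.contacts.add((part, direction, neighbor))


def blocked_motions(scene):
    """Propagate blocking to a fixed point.

    A part cannot move in a direction if it touches, on that side,
    a neighbor that itself cannot move in that direction.
    """
    blocked = {(p, d) for p in scene.fixed for d in DIRECTIONS}
    changed = True
    while changed:
        changed = False
        for part, direction, neighbor in scene.contacts:
            if (neighbor, direction) in blocked and (part, direction) not in blocked:
                blocked.add((part, direction))
                changed = True
    return blocked


def anomalous_dofs(scene, gravity="-y"):
    """Parts the model says are free to fall, although the scene is static."""
    blocked = blocked_motions(scene)
    return sorted(p for p in scene.parts if (p, gravity) not in blocked)


def sensing_plan(scene, gravity="-y"):
    """One directed, local sensing request per unexplained degree of freedom."""
    return [
        f"inspect the region next to {p} in direction {gravity} for an unseen support"
        for p in anomalous_dofs(scene, gravity)
    ]


# Usage: B has no known support, so it is flagged until a contact is sensed.
scene = Scene(fixed={"ground"})
scene.add_contact("A", "-y", "ground")
scene.parts.add("B")
print(sensing_plan(scene))
scene.add_contact("B", "-y", "A")   # result of the sensing action
print(anomalous_dofs(scene))
```

In this sketch the sensing request is only issued for parts with an unexplained freedom, which mirrors, in a very reduced form, the idea that sensing stays local and is driven by what the current model fails to explain.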

Keywords:

Article history: Received 14 November 1995, Accepted 20 November 1996, Available online 18 April 2002.

Paper URL: https://doi.org/10.1006/cviu.1996.0572