Eyes in the interface

Authors:

Highlights:

Abstract

Computer vision has a significant role to play in the human-computer interaction (HCI) devices of the future. All computer input devices serve one essential purpose: they transduce some motion or energy from a human agent into machine-usable signals. One may therefore think of input devices as the ‘perceptual organs’ by which computers sense the intents of their human users. We outline the role computer vision will play, highlight the impediments to the development of vision-based interfaces, and propose an approach for overcoming these impediments. Prospective vision research areas for HCI include human face recognition, facial expression interpretation, lip reading, head orientation detection, eye gaze tracking, three-dimensional finger pointing, hand tracking, hand gesture interpretation and body pose tracking. For vision-based interfaces to make any impact, we will have to embark on an expansive approach, which begins with the study of the interaction modality we seek to implement. We illustrate our approach by discussing our work on vision-based hand gesture interfaces. This work is based on information from such varied disciplines as semiotics, anthropology, neurophysiology, neuropsychology and psycholinguistics. Concentrating on communicative (as opposed to manipulative) gestures, we argue that interpretation of a large number of gestures involves analysis of image dynamics to identify and characterize the gestural stroke, locating the stroke extrema in ordinal 3D space, and recognizing the hand pose at stroke extrema. We detail our dynamic image analysis algorithm, which enforces our constraints: directional variance, spatial cohesion, directional cohesion and path cohesion. The clustered vectors characterize the motion of a gesturing hand.
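The abstract's clustering of motion vectors under spatial and directional cohesion constraints could be sketched roughly as follows. This is a minimal illustrative assumption, not the paper's algorithm: the function name, the greedy grouping strategy, and the distance/angle thresholds are all hypothetical, and the paper's additional constraints (directional variance, path cohesion) are omitted for brevity.

```python
import math

def cluster_motion_vectors(vectors, max_dist=20.0, max_angle=math.pi / 6):
    """Greedily group (x, y, dx, dy) motion vectors.

    A vector joins an existing cluster when it lies near some member
    (spatial cohesion) and points in a similar direction (directional
    cohesion); otherwise it seeds a new cluster.
    """
    clusters = []
    for x, y, dx, dy in vectors:
        placed = False
        for cluster in clusters:
            for cx, cy, cdx, cdy in cluster:
                dist = math.hypot(x - cx, y - cy)
                ang = abs(math.atan2(dy, dx) - math.atan2(cdy, cdx))
                ang = min(ang, 2 * math.pi - ang)  # wrap angle difference
                if dist <= max_dist and ang <= max_angle:
                    cluster.append((x, y, dx, dy))
                    placed = True
                    break
            if placed:
                break
        if not placed:
            clusters.append([(x, y, dx, dy)])
    return clusters

# Two nearby rightward vectors (a gesturing hand) versus one distant
# leftward vector (background motion) yield two clusters.
clusters = cluster_motion_vectors([(10, 10, 5, 0), (12, 11, 5, 1),
                                   (100, 100, -5, 0)])
```

In this toy setting, the two coherent vectors form one cluster that could stand in for the moving hand, while the outlier is isolated.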

Keywords: computer vision, human-computer interaction

Article history: Received 12 July 1994, Revised 16 December 1994, Available online 16 December 1999.

DOI: https://doi.org/10.1016/0262-8856(95)94384-C