iccv12

ICCV 2011 paper list

IEEE International Conference on Computer Vision Workshops, ICCV 2011 Workshops, Barcelona, Spain, November 6-13, 2011.

Object representation based on Gabor wave vector binning: An application to human head pose detection.
Criteria and metrics for thresholded AU detection.
Face detection using SURF cascade.
VADANA: A dense dataset for facial image analysis.
High quality facial expression recognition in video streams using shape related information only.
Evaluation of face recognition system in heterogeneous environments (visible vs NIR).
Single- and cross- database benchmarks for gender classification under unconstrained settings.
Annotated Facial Landmarks in the Wild: A large-scale, real-world database for facial landmark localization.
Manifold based Sparse Representation for robust expression recognition without neutral subtraction.
High-resolution comprehensive 3-D dynamic database for facial articulation analysis.
A face biometric benchmarking review and characterisation.
UMB-DB: A database of partially occluded 3D faces.
Static facial expression analysis in tough conditions: Data, evaluation protocol and benchmark.
3D Twins and Expression Challenge.
Facial action unit detection using kernel partial least squares.
Illumination-free gaze estimation method for first-person vision wearable device.
Unsupervised sub-categorization for object detection: Finding cars from a driving vehicle.
Learning multi-lane trajectories using vehicle-based vision.
Showing vehicles at blind corners from mixed-dimensional multi-view geometry.
Contrast restoration of road images taken in foggy weather.
Direct Iterative Closest Point for real-time visual odometry.
A confidence measure for assessing optical flow accuracy in the absence of ground truth.
A framework for global vehicle localization using stereo images and satellite and road maps.
Stereo estimation of depth along virtual cut planes.
Monocular Camera Trajectory Optimization using LiDAR data.
Stixels estimation without depth map computation.
Constrained UAV mission planning: A comparison of approaches.
SPARTAN system: Towards a low-cost and high-performance vision architecture for space exploratory rovers.
Automatic real-time FACS-coder to anonymise drivers in eye tracker videos.
A real-time multi-cue framework for determining optical flow confidence.
Reversible video stream anonymization for video surveillance systems based on pixels relocation and watermarking.
Semantic video event search for surveillance video.
Operator attention based video surveillance.
A surveillance video analysis and storage scheme for scalable synopsis browsing.
Slice matching for accurate spatio-temporal alignment.
Multi-cue learning and visualization of unusual events.
Video-based traffic queue length estimation.
Self-calibrating 3D context for retrieving people with luggage.
Unsupervised workflow discovery in industrial environments.
A group sparsity-driven approach to 3-D action recognition.
An X-T slice based method for action recognition.
Full-motion recovery from multiple video cameras applied to face tracking and recognition.
Human gait characteristics from unconstrained walks and viewpoints.
Re-identification of pedestrians with variable occlusion and scale.
Flexible tracklet association for complex scenarios using a Markov Logic Network.
Robust object tracking via online learning of adaptive appearance manifold.
Globally optimal target tracking in real time using max-flow network.
Tracking visual and infrared objects using joint Riemannian manifold appearance and affine shape modeling.
An analytical formulation of global occlusion reasoning for multi-target tracking.
Efficient framework for extended visual object tracking.
Branch and bound global optima search for tracking a single object in a network of non-overlapping cameras.
Multi-view people surveillance using 3D information.
Eigenshape kernel based mean shift for human tracking.
Improving object localization using macrofeature layout selection.
Person detection in surveillance environment with HoGG: Gabor filters and Histogram of Oriented Gradient.
Relational HOG feature with wild-card for object detection.
Efficient pedestrian detection with group lasso.
Self-adaptive Gaussian mixture model for urban traffic monitoring system.
Automatic surveillance video matting using a shape prior.
Multi-scale multi-feature codebook-based background subtraction.
Spatially adaptive illumination modeling for background subtraction.
Transferring activities: Updating human behavior analysis.
A discriminative key pose sequence model for recognizing human interactions.
Multiple person re-identification using part based spatio-temporal color appearance model.
Appearance-based head pose estimation with scene-specific adaptation.
Approximate techniques in solving optimal camera placement problems.
Camera auto-calibration using pedestrians and zebra-crossings.
Multi-camera multi-object tracking by robust Hough-based homography projections.
Tensor-based covariance matrices for object tracking.
Detection-based multi-human tracking using a CRF model.
Robust object tracking with boosted discriminative model via graph embedding.
The eleventh IEEE international workshop on visual surveillance.
Fast volumetric visual hull computation.
A pixel-based approach to template-based monocular 3D reconstruction of deformable surfaces.
Facial expression recognition with temporal modeling of shapes.
Motion capture from dynamic orthographic cameras.
The wave kernel signature: A quantum mechanical approach to shape analysis.
3D reconstruction of bat flight kinematics from sparse multiple views.
3D dynamics analysis in Teichmüller space.
Multiview projectors/cameras system for 3D reconstruction of dynamic scenes.
4D facial expression recognition.
FEM models to code non-rigid EKF monocular SLAM.
Joint reconstruction of 3D shape and non-rigid motion in a region-growing framework.
Event-driven feature analysis in a 4D spatiotemporal representation for ambient assisted living.
Incorporating temporal context in Bag-of-Words models.
Recognizing manipulation actions in arts and crafts shows using domain-specific visual and textual cues.
Transductive transfer learning for action recognition in tennis games.
Correcting cuboid corruption for action recognition in complex environment.
Detection of activities and events without explicit categorization.
Combining sparse and dense descriptors with temporal semantic structures for robust human action recognition.
YouTubeEvent: On large-scale video event classification.
A hybrid framework for event detection using multi-modal features.
Individuals, groups, and crowds: Modelling complex, multi-object behaviour in phase space.
Structure context of local features in realistic human action recognition.
Fine-grained categorization of fish motion patterns in underwater videos.
Human Focused Video Description.
A generative framework to investigate the underlying patterns in human activities.
Video event detection based on over-segmented STV regions.
Inferring social roles in long timespan video sequence.
Towards a theory of compositional learning and encoding of objects.
Tensor-based total bregman divergences between graphs.
An information geometry approach to shape density Minimum Description Length model selection.
An information theoretic approach to gender feature selection.
Information theoretic preattentive saliency: A closed-form solution.
Exploring the representation capabilities of the HOG descriptor.
Bayesian online learning on Riemannian manifolds using a dual model with applications to video object tracking.
Evaluating image segments by applying the description length to sets of superpixels.
Supervised feature quantization with entropy optimization.
An invertible dimension reduction of curves on a manifold.
Probabilistic shape-based segmentation using level sets.
Computing importance of 2D contour parts by reconstructability.
Search pruning in video surveillance systems: Efficiency-reliability tradeoff.
A multi-affine model for tensor decomposition.
STARS: A new ensemble partitioning approach.
Efficient variational inference in large-scale Bayesian compressed sensing.
Iteratively merging information from a pair of flash/no-flash images using nonlinear diffusion.
Human action silhouette recognition based on tensor analysis using synthetic silhouette data.
Temporal key poses for human action recognition.
Action recognition by learning discriminative key poses.
A discriminative prototype selection approach for graph embedding in human action recognition.
Modeling vs. learning approaches for monocular 3D human pose estimation.
Probabilistic subspace-based learning of shape dynamics modes for multi-view action recognition.
Automatic user interaction correction via Multi-label Graph cuts.
Expressive Maps for 3D Facial Expression Recognition.
UMPM benchmark: A multi-person dataset with synchronized video and motion capture data for evaluation of articulated human motion and interaction.
Interactive object segmentation for mono and stereo applications: Geodesic prior induced graph cut energy minimization.
Scene specific people detection by simple human interaction.
Cultural factors in the regression of non-verbal communication perception.
Towards robust cross-user hand tracking and shape recognition.
Real time hand pose estimation using depth sensors.
Real-time preprocessing for dense 3-D range imaging on the GPU: Defect interpolation, bilateral temporal averaging and guided filtering.
A cluster-based strategy for active learning of RGB-D object detectors.
I spy with my little eye: Learning optimal filters for cross-modal stereo under projected patterns.
Learning shape models for monocular human pose estimation from the Microsoft Xbox Kinect.
Putting the pieces together: Connected Poselets for human pose estimation.
Going into depth: Evaluating 2D and 3D cues for object classification on a new, large-scale object dataset.
Featureweighting in dynamic timewarping for gesture recognition in depth data.
Multi-modal surface registration for markerless initial patient setup in radiation therapy using Microsoft's Kinect sensor.
A category-level 3-D object dataset: Putting the Kinect to work.
Real-time RGB-D mapping and 3-D modeling on the GPU using the random ball cover data structure.
3D with Kinect.
RGBD-HuDaAct: A color-depth video database for human daily activity recognition.
A brute force approach to depth camera odometry.
Fast and accurate environment modeling using three-dimensional occupancy grids.
Humanising GrabCut: Learning to segment humans using the Kinect.
Consolidation of multiple depth maps.
Spelling it out: Real-time ASL fingerspelling recognition.
The capturing of turbulent gas flows using multiple Kinects.
A novel stereoscopic cue for figure-ground segregation of semi-transparent objects.
Visual object classification by robots, using on-line, self-supervised learning.
Real-time plane extraction from depth images with the Randomized Hough Transform.
Detecting and tracking people using an RGB-D camera via multiple detector fusion.
More accurate pinhole camera calibration with imperfect planar target.
Visual estimation of pointed targets for robot guidance via fusion of face pose and hand orientation.
Deformable part models revisited: A performance evaluation for object category pose estimation.
Visual control of a multi-robot coupled system: Application to collision avoidance in human-robot interaction.
Unsupervised discovery of object classes in 3D outdoor scenarios.
Topological localization using optical flow descriptors.
Driving me around the bend: Learning to drive from visual gist.
High resolution visual terrain classification for outdoor robots.
Robot localization using 3D-models and an off-board monocular camera.
Visual grasp affordances from appearance-based cues.
Evaluation of stereo algorithms for 3D object recognition.
Semantic structure from motion with object and point interactions.
Real-time multi-person tracking with detector assisted structure propagation.
Efficient object detection and segmentation with a cascaded Hough Forest ISM.
Learning temporal signatures for Lip Reading.
Video summarization guiding evaluative rectification for industrial activity recognition.
Video scene classification based on natural language description.
Detecting individual in crowd with moving feature's structure consistency.
Spatiotemporally localized new event detection in crowds.
People appearance tracing in video by spectral graph transduction.
Reading the signs: A video based sign dictionary.
On the effect of temporal information on monocular 3D human pose estimation.
Automatic crowd density and motion analysis in airborne image sequences based on a probabilistic framework.
Automatic analysis of composite activities in video sequences using Key Action Discovery and hierarchical graphical models.
High-level situation recognition using Fuzzy Metric Temporal Logic, case studies in surveillance and smart environments.
On improving the robustness of differential optical flow.
Differentiating spontaneous from posed facial expressions within a generic facial expression recognition framework.
A joint estimation of head and body orientation cues in surveillance video.
An efficient IP approach to constrained multiple face tracking and recognition.
Human pose estimation using structural support vector machines.
Do they like me? Using video cues to predict desires during speed-dates.
Who knows who - Inverting the Social Force Model for finding groups.
Inverse rendering in SUV space with a linear texture model.
Accuracy of the spider model in decomposing layered surfaces.
A theory of color barcodes.
Illumination estimation from shadow borders.
General p constrained approach for colour constancy.
Illuminant color estimation for real-world mixed-illuminant scenes.
Color constancy and non-uniform illumination: Can existing algorithms work?
Estimating the unknown poses of a reference plane for specular shape recovery.
A new approach of photometric stereo from linear image representation under close lighting.
Photometric stereo with auto-radiometric calibration.
Deblurring shaken and partially saturated images.
Color correction using rotation matrix for HDR rendering in iCAM06.
Perceptually motivated automatic sharpness enhancement using hierarchy of non-local means.
Live 3D shape reconstruction, recognition and registration.
Real-time structure and motion recovery from two views of a multiplanar scene.
Surface reconstruction for RGB-D data using real-time depth propagation.
Real-time visual odometry from dense RGB-D images.
GEA optimization for live structureless motion estimation.
Recursive Live Dense Reconstruction: Some comments on established and imaginable new approaches.
Online 3D reconstruction using convex optimization.
Real-time global prediction for temporally stable stereo.
An asymmetric real-time dense visual localisation and mapping system.
Efficient edge-preserving stereo matching.
Stochastic models for semantic parsing, multi-faceted topic discovery, and causal event inference: Perspectives from natural language processing.
Learning and the language of thought.
Information theoretic methods for learning generative models for relational structures.
Object detection grammars.
Sum-product networks: A new deep architecture.
The evolution of stochastic grammars for representation and recognition of activities in videos.
PEL-CNF: Probabilistic event logic conjunctive normal form for video interpretation.
Interactive learning of human activities using active video composition.
Towards coherent natural language description of video streams.
An ontology for generating descriptions about natural outdoor scenes.
Unsupervised learning of stochastic AND-OR templates for object modeling.
Human parsing using stochastic and-or grammars and rich appearances.
Modeling symmetries for stochastic structural recognition.
Trainable 3D recognition using stereo matching.
Scale and rotation invariant color features for weakly-supervised object Learning in 3D space.
Scale-space representation of scalar functions on 2D manifolds.
Indoor scene segmentation using a structured light sensor.
Holistic 3D reconstruction of urban structures from low-rank textures.
CAD-model recognition and 6DOF pose estimation using 3D cues.
Projectable classifiers for multi-view object class recognition.
Revisiting 3D geometric models for accurate object shape and pose.
A compositional approach to learning part-based models of objects.
A hypothesize-and-bound algorithm for simultaneous object classification, pose estimation and 3D reconstruction from a single 2D image.
Automatic alignment of paintings and photographs depicting a 3D scene.
A GPU-assisted personal video organizing system.
Real-time GPU-based face detection in HD video sequences.
Variational Depth from Defocus in real-time.
Real-time semi-global matching disparity estimation on the GPU.
GPU & CPU cooperative accelerated pedestrian and vehicle detection.
Accelerating multi-scale flows for LDDKBM diffeomorphic registration.
Rapid weak-perspective Structure from Motion with missing data.
Learning a dictionary of deformable patches using GPUs.
Long term video segmentation through pixel level spectral clustering on GPUs.
On building an accurate stereo matching system on graphics hardware.
Mid-air interactive display using modulated display light.
MoPaCo: High telepresence video communication system using motion parallax with monocular camera.
Demonstrations and live evaluation for the gesture recognition challenge.
Real time feature point tracking with automatic model selection.
Freehand 3D scanning in a mobile environment using video.
Developer-centred interface design for computer vision.
Monocular omnidirectional head motion capture in the visible light spectrum.
Measuring and reducing observational latency when recognizing actions.
Robust validation of Visual Focus of Attention using adaptive fusion of head and eye gaze patterns.
Using segmented 3D point clouds for accurate likelihood approximation in human pose tracking.
Volumetric 3D graphics on commodity displays using active gaze tracking.
Real-time upper body tracking with online initialization using a range sensor.
Real-time sign language letter and word recognition from depth data.
Structure from Motion using full spherical panoramic cameras.
Multiple Hypothesis Tracking in camera networks.
Scene structure recovery from a single omnidirectional image.
Optimum alignment of panoramic images for stereoscopic navigation in image-based telepresence systems.
Adapting a real-time monocular visual SLAM from conventional to omnidirectional cameras.
A spherical representation for efficient visual loop closing.
Tracking moving objects with a catadioptric sensor using particle filter.
Matching cylindrical panorama sequences using planar reprojections.
An insect-inspired omnidirectional vision system including UV-sensitivity and polarisation.
Underwater sensing with omni-directional stereo camera.
3D environment measurement using binocular stereo and motion stereo by mobile robot with omnidirectional stereo camera.
Calibration of radially symmetric distortion based on linearity in the calibrated image.
Indoor SLAM using a range-augmented omnidirectional vision.
Self-localization of mobile robot equipped with omnidirectional camera using image matching and 3D-2D edge matching.
Non-sequential structure from motion.
Classification and reconstruction of surfaces from point clouds of man-made objects.
Monitoring changes of 3D building elements from unordered photo collections.
Incremental import vector machines for large area land cover classification.
Classification of multitemporal remote sensing data of different resolution using Conditional Random Fields.
Text detection and recognition in urban scenes.
Computer vision for the remote sensing of atmospheric visibility.
3D roof details by 3D aerial vision.
Graph matching in 3D space for structural seismic damage assessment.
A hierarchical conditional random field model for labeling and classifying images of man-made scenes.
Multi-view manhole detection, recognition, and 3D localisation.
Robust outliers detection in image point matching.
Lane formation in a microscopic model and the corresponding partial differential equation.
Calibrating dynamic pedestrian route choice with an Extended Range Telepresence System.
T-junction: Experiments, trajectory collection, and analysis.
Integrating pedestrian simulation, tracking and event detection for crowd analysis.
Analyzing pedestrian behavior in crowds for automatic detection of congestions.
Optimizing interaction force for global anomaly detection in crowded scenes.
Virtual Tawaf: A case study in simulating the behavior of dense, heterogeneous crowds.
Everybody needs somebody: Modeling social and grouping behavior on a linear programming multiple people tracker.
MCMC-based tracking and identification of leaders in groups.
"Dancing icons" detection.
Visual localization by linear combination of image descriptors.
A mobile structured light system for food volume estimation.
Wi-Fi and keygraphs for localization with cell phones.
A fast and accurate cascade subspace face/eye detector on mobile devices.
FaceSimile: A mobile application for face image search based on interactive shape manipulation.
Visual localization for mobile surveillance.
Good features to track: A view geometric approach.
Towards robust and efficient text sign reading from a mobile phone.
Epipolar geometry estimation for wide-baseline omnidirectional street view images.
Automatic text detection for mobile augmented reality translation.
Face authentication using graph-based low-rank representation of facial local structures for mobile vision applications.
Compressing Feature Sets with Digital Search Trees.
Augmented faces.
Structure and motion estimation from rolling shutter video.
High-quality video denoising for motion-based exposure control.
Stabilizing cell phone video using inertial measurement sensors.