iccv20

ICCV 2019 paper list

2019 IEEE/CVF International Conference on Computer Vision Workshops, ICCV Workshops 2019, Seoul, Korea (South), October 27-28, 2019.

Content-Consistent Generation of Realistic Eyes with Style.
ASSR: A Lightweight Super Resolution Network with Aggregative Structure.
Access Characteristic-based Cache Replacement Policy in an SSD.
Towards Generalizable Distance Estimation By Leveraging Graph Information.
Extending Convolutional Pose Machines for Facial Landmark Localization in 3D Point Clouds.
An Empirical Study of the Relation Between Network Architecture and Complexity.
Learning to Inpaint by Progressively Growing the Mask Regions.
Learning Representational Invariance Instead of Categorization.
Scene Graph Contextualization in Visual Commonsense Reasoning.
Lossy GIF Compression Using Deep Intrinsic Parameterization.
An Adversarial Approach to Discriminative Modality Distillation for Remote Sensing Image Classification.
Joint Wasserstein Autoencoders for Aligning Multimodal Embeddings.
Aesthetic Image Captioning From Weakly-Labelled Photographs.
Distance Based Training for Cross-Modality Person Re-Identification.
Sequential Learning for Cross-Modal Retrieval.
DECCNet: Depth Enhanced Crowd Counting.
Two-Stream Video Classification with Cross-Modality Attention.
Do Cross Modal Systems Leverage Semantic Relationships?
Smile, Be Happy : ) Emoji Embedding for Visual Sentiment Analysis.
Harvesting Information from Captions for Weakly Supervised Semantic Segmentation.
Seeing and Hearing Egocentric Actions: How Much Can We Learn?
EPIC-Tent: An Egocentric Video Dataset for Camping Tent Assembly.
Assessment of Optical See-Through Head Mounted Display Calibration for Interactive Augmented Reality.
An Analysis of How Driver Experience Affects Eye-Gaze Behavior for Robotic Wheelchair Operation.
First-Person Camera System to Evaluate Tender Dementia-Care Skill.
Learning Spatiotemporal Attention for Egocentric Action Recognition.
Weakly-Supervised Degree of Eye-Closeness Estimation.
The Applicability of Cycle GANs for Pupil and Eyelid Segmentation, Data Generation and Image Refinement.
Multitask Learning to Improve Egocentric Action Recognition.
Manipulation-Skill Assessment from Videos with Spatial Attention Network.
Ego-Semantic Labeling of Scene from Depth Image for Visually Impaired and Blind People.
Simultaneous Segmentation and Recognition: Towards More Accurate Ego Gesture Recognition.
EgoVQA - An Egocentric Video Question Answering Benchmark Dataset.
Multispectral Reconstruction From Reference Objects in the Scene.
Learning From Synthetic Photorealistic Raindrop for Single Image Raindrop Removal.
Multi-Level and Multi-Scale Spatial and Spectral Fusion CNN for Hyperspectral Image Super-Resolution.
Event-Driven Video Frame Synthesis.
Single Image Intrinsic Decomposition with Discriminative Feature Encoding.
Large Scale Multimodal Data Capture, Evaluation and Maintenance Framework for Autonomous Driving Datasets.
On Generalizing Detection Models for Unconstrained Environments.
Real-Time Vehicle Localization using on-Board Visual SLAM for Detection and Tracking.
Motion Segmentation via Synchronization.
Desoiling Dataset: Restoring Soiled Areas on Automotive Fisheye Cameras.
Multi-Task Learning via Scale Aware Feature Pyramid Networks and Effective Joint Head.
Class Feature Pyramids for Video Explanation.
Interpreting Undesirable Pixels for Image Classification on Black-Box Models.
Free-Lunch Saliency via Attention in Atari Agents.
Decision explanation and feature importance for invertible networks.
Propagated Perturbation of Adversarial Attack for well-known CNNs: Empirical Study and its Explanation.
Assisting human experts in the interpretation of their visual process: A case study on assessing copper surface adhesive potency.
Semantically Interpretable Activation Maps: what-where-how explanations within CNNs.
Characterizing Sources of Uncertainty to Proxy Calibration and Disambiguate Annotator and Data Bias.
Towards A Rigorous Evaluation Of XAI Methods On Time Series.
Bin-wise Temperature Scaling (BTS): Improvement in Confidence Calibration Performance through Simple Scaling Techniques.
Understanding Convolutional Networks Using Linear Interpreters - Extended Abstract.
Explaining Convolutional Neural Networks using Softmax Gradient Layer-wise Relevance Propagation.
Explaining Visual Models by Causal Attribution.
Occlusions for Effective Data Augmentation in Image Classification.
Why are Saliency Maps Noisy? Cause of and Solution to Noisy Saliency Maps.
SpiralNet++: A Fast and Highly Efficient Mesh Convolution Operator.
Learning to Reconstruct Symmetric Shapes using Planar Parameterization of 3D Surface.
Residual Attention Graph Convolutional Network for Geometric 3D Scene Classification.
Render4Completion: Synthesizing Multi-View Depth Maps for 3D Shape Completion.
Rethinking Task and Metrics of Instance Segmentation on 3D Point Clouds.
A Geometric Approach to Obtain a Bird's Eye View From an Image.
Momen(e)t: Flavor the Moments in Learning to Classify Shapes.
Hyperparameter-Free Losses for Model-Based Monocular Reconstruction.
Floors are Flat: Leveraging Semantics for Real-Time Surface Normal Prediction.
Lifting AutoEncoders: Unsupervised Learning of a Fully-Disentangled 3D Morphable Model Using Deep Non-Rigid Structure From Motion.
Self-Supervised Learning of Depth and Motion Under Photometric Inconsistency.
Patch-Based Reconstruction of a Textureless Deformable 3D Surface from a Single RGB Image.
Generalizing Monocular 3D Human Pose Estimation in the Wild.
Auto-Encoding Meshes of any Topology with the Current-Splatting and Exponentiation Layers.
Parametric Human Shape Reconstruction via Bidirectional Silhouette Guidance.
Multi-View PointNet for 3D Scene Understanding.
A Decoder-Free Approach for Unsupervised Clustering and Manifold Learning with Random Triplet Mining.
Deep Compressive Sensing for Visual Privacy Protection in FlatCam Imaging.
Deep Learning-Based Imaging using Single-Lens and Multi-Aperture Diffractive Optical Systems.
SUPER Learning: A Supervised-Unsupervised Framework for Low-Dose CT Image Reconstruction.
Deep Plug-and-Play Prior for Parallel MRI Reconstruction.
A Simple and Robust Deep Convolutional Approach to Blind Image Denoising.
Blind Unitary Transform Learning for Inverse Problems in Light-Field Imaging.
Integrating Data and Image Domain Deep Learning for Limited Angle Tomography using Consensus Equilibrium.
CNN-Based Cross-Dataset No-Reference Image Quality Assessment.
A HVS-Inspired Attention to Improve Loss Metrics for CNN-Based Perception-Oriented Super-Resolution.
RIDNet: Recursive Information Distillation Network for Color Image Denoising.
Online Regularization by Denoising with Applications to Phase Retrieval.
Image Super-Resolution via Residual Block Attention Networks.
Deep Camera: A Fully Convolutional Neural Network for Image Signal Processing.
Semi-Supervised Eye Makeup Transfer by Swapping Learned Representation.
Flickr1024: A Large-Scale Dataset for Stereo Image Super-Resolution.
Deep Hyperspectral Prior: Single-Image Denoising, Inpainting, Super-Resolution.
Adaptive Ptych: Leveraging Image Adaptive Generative Priors for Subsampled Fourier Ptychography.
Deep Video Deblurring: The Devil is in the Details.
Lightweight and Accurate Recursive Fractal Network for Image Super-Resolution.
Removing Imaging Artifacts in Electron Microscopy using an Asymmetrically Cyclic Adversarial Network without Paired Training Data.
A System Framework for Localization and Mapping using High Resolution Cameras of Mobile Devices.
How to Improve CNN-Based 6-DoF Camera Pose Estimation.
Adversarial Networks for Camera Pose Regression and Refinement.
Camera Relocalization by Exploiting Multi-View Constraints for Scene Coordinates Regression.
SLAMANTIC - Leveraging Semantics to Improve VSLAM in Dynamic Environments.
Spatial Perception by Object-Aware Visual Scene Representation.
TriDepth: Triangular Patch-Based Deep Depth Prediction.
Multi-Modal Pyramid Feature Combination for Human Action Recognition.
Audio-Video Based Emotion Recognition Using Minimum Cost Flow Algorithm.
DIFRINT: Deep Iterative Frame Interpolation for Full-Frame Video Stabilization.
Summarizing Long-Length Videos with GAN-Enhanced Audio/Visual Features.
Multi-Modal Domain Adaptation for Fine-Grained Action Recognition.
Supplementary Material: AVA-ActiveSpeaker: An Audio-Visual Dataset for Active Speaker Detection.
Learning to Detect and Retrieve Objects From Unlabeled Videos.
A Tale of Two Modalities for Video Captioning.
FaceSyncNet: A Deep Learning-Based Approach for Non-Linear Synchronization of Facial Performance Videos.
RITnet: Real-time Semantic Segmentation of the Eye for Gaze Tracking.
Eye Semantic Segmentation with A Lightweight Model.
EyeNet: Attention Based Convolutional Encoder-Decoder Network for Eye Region Segmentation.
EyeNet: A Multi-Task Deep Network for Off-Axis Eye Gaze Estimation.
D-ID-Net: Two-Stage Domain and Identity Learning for Identity-Preserving Image Generation From Semantic Segmentation.
MinENet: A Dilated CNN for Semantic Segmentation of Eye Features.
Eye-MMS: Miniature Multi-Scale Segmentation Network of Key Eye-Regions in Embedded Applications.
U2Eyes: A Binocular Dataset for Eye Tracking and Gaze Estimation.
Assessment of Shift-Invariant CNN Gaze Mappings for PS-OG Eye Movement Sensors.
Dual Reconstruction with Densely Connected Residual Network for Single Image Super-Resolution.
W-Net: Two-Stage U-Net With Misaligned Data for Raw-to-RGB Mapping.
AI Benchmark: All About Deep Learning on Smartphones in 2019.
The Vid3oC and IntVID Datasets for Video Super Resolution and Quality Mapping.
Frequency Separation for Real-World Super-Resolution.
AIM 2019 Challenge on Bokeh Effect Synthesis: Methods and Results.
AIM 2019 Challenge on RAW to RGB Mapping: Methods and Results.
AIM 2019 Challenge on Real-World Image Super-Resolution: Methods and Results.
AIM 2019 Challenge on Constrained Super-Resolution: Methods and Results.
AIM 2019 Challenge on Image Extreme Super-Resolution: Methods and Results.
Towards Spectral Estimation from a Single RGB Image in the Wild.
AIM 2019 Challenge on Image Demoireing: Methods and Results.
AIM 2019 Challenge on Image Demoireing: Dataset and Study.
Image Super-Resolution via Attention Based Back Projection Networks.
DIV8K: DIVerse 8K Resolution Image Dataset.
PoSNet: 4x Video Frame Interpolation Using Position-Specific Flow.
Robust Temporal Super-Resolution for Dynamic Motion Videos.
Multi-Scale Dynamic Feature Encoding Network for Image Demoiréing.
Efficient Video Super-Resolution through Recurrent Latent Space Propagation.
AIM 2019 Challenge on Video Extreme Super-Resolution: Methods and Results.
Un-Paired Real World Super-Resolution with Degradation Consistency.
Saliency Map-Aided Generative Adversarial Network for RAW to RGB Mapping.
HighEr-Resolution Network for Image Demosaicing and Enhancing.
Adaptive Densely Connected Single Image Super-Resolution.
Quadratic Video Interpolation for VTSR Challenge.
Depth-Guided Dense Dynamic Filtering Network for Bokeh Effect Rendering.
Unsupervised Learning for Real-World Super-Resolution.
MGBPv2: Scaling Up Multi-Grid Back-Projection Networks.
AIM 2019 Challenge on Video Temporal Super-Resolution: Methods and Results.
NoUCSR: Efficient Super-Resolution Network without Upsampling Convolution.
Extremely Weak Supervised Image-to-Image Translation for Semantic Segmentation.
Augmented Reality Based Recommendations Based on Perceptual Shape Style Compatibility with Objects in the Viewpoint and Color Compatibility with the Background.
Quotienting Impertinent Camera Kinematics for 3D Video Stabilization.
Can Generative Adversarial Networks Teach Themselves Text Segmentation?
PFAGAN: An Aesthetics-Conditional GAN for Generating Photographic Fine Art.
Image Disentanglement and Uncooperative Re-Entanglement for High-Fidelity Image-to-Image Translation.
Blind Single Image Reflection Suppression for Face Images using Deep Generative Priors.
3SGAN: 3D Shape Embedded Generative Adversarial Networks.
SteReFo: Efficient Image Refocusing with Stereo Vision.
SMIT: Stochastic Multi-Label Image-to-Image Translation.
Edge-Informed Single Image Super-Resolution.
EdgeConnect: Structure Guided Image Inpainting using Edge Prediction.
Unsupervised Domain Adaptation using Deep Networks with Cross-Grafted Stacks.
Cross Domain Image Matching in Presence of Outliers.
Towards Efficient Instance Segmentation with Hierarchical Distillation.
Domain Adaptation for Vehicle Detection from Bird's Eye View LiDAR Point Cloud Data.
Hallucinating Agnostic Images to Generalize Across Domains.
Improving CNN Classifiers by Estimating Test-Time Priors.
Multi-Level Domain Adaptive Learning for Cross-Domain Detection.
Incremental Learning Techniques for Semantic Segmentation.
DeepMark: One-Shot Clothing Detection.
Leveraging Class Hierarchy in Fashion Classification.
Powering Virtual Try-On via Auxiliary Human Segmentation Learning.
Class-Based Styling: Real-Time Localized Style Transfer with Semantic Segmentation.
Walking Through Shanshui: Generating Chinese Shanshui Paintings via Real-Time Tracking of Human Position.
Unsupervised Image-to-Video Clothing Transfer.
A Weakly Supervised Adaptive Triplet Loss for Deep Metric Learning.
Fourier-CPPNs for Image Synthesis.
Generative Modelling of Semantic Segmentation Data in the Fashion Domain.
Deep Metric Learning for Cross-Domain Fashion Instance Retrieval.
Generating High-Resolution Fashion Model Images Wearing Custom Outfits.
Artist-Guided Semiautomatic Animation Colorization.
A Global-Local Embedding Module for Fashion Landmark Detection.
Regularized Adversarial Training for Single-Shot Virtual Try-On.
Clothing Recognition in the Wild using the Amazon Catalog.
Deep Garment Image Matting for a Virtual Try-on System.
Dropout Induced Noise for Co-Creative GAN Systems.
Semantic Segmentation of Fashion Images Using Feature Pyramid Networks.
LA-VITON: A Network for Looking-Attractive Virtual Try-On.
Pose Guided Attention for Multi-Label Fashion Image Classification.
Semantically Consistent Hierarchical Text to Fashion Image Synthesis with an Enhanced-Attentional Generative Adversarial Network.
Translating Visual Art Into Music.
The iMaterialist Fashion Attribute Dataset.
Fashion Image Retrieval with Capsule Networks.
UVTON: UV Mapping to Consider the 3D Structure of a Human in Image-Based Virtual Try-On Network.
Improving Fashion Landmark Detection by Dual Attention Feature Enhancement.
Multi-View 3D Face Reconstruction in the Wild Using Siamese Networks.
3D Face Shape Regression From 2D Videos with Multi-Reconstruction and Mesh Retrieval.
The 2nd 3D Face Alignment in the Wild Challenge (3DFAW-Video): Dense Reconstruction From Video.
Efficient Learning on Point Clouds with Basis Point Sets.
Sparse Generative Adversarial Network.
DHA: Supervised Deep Learning to Hash with an Adaptive Loss Function.
Dynamic Block Sparse Reparameterization of Convolutional Neural Networks.
Age Estimation From Facial Parts Using Compact Multi-Stream Convolutional Neural Networks.
Deep Total Variation Support Vector Networks.
Self-Supervised Learning of Class Embeddings from Video.
Low-bit Quantization of Neural Networks for Efficient Inference.
Differential-Evolution-Based Generative Adversarial Networks for Edge Detection.
Event-Based Incremental Broad Learning System for Object Classification.
VACL: Variance-Aware Cross-Layer Regularization for Pruning Deep Residual Networks.
Large Scale Near-Duplicate Image Retrieval via Patch Embedding.
Augmentation Invariant Training.
Beyond Attributes: High-Order Attribute Features for Zero-Shot Learning.
MuffNet: Multi-Layer Feature Federation for Mobile Deep Learning.
Compact and Efficient Multitask Learning in Vision, Language and Speech.
More About Covariance Descriptors for Image Set Coding: Log-Euclidean Framework Based Kernel Matrix Representation.
DAME WEB: DynAmic MEan with Whitening Ensemble Binarization for Landmark Retrieval without Human Annotation.
Metric-Based Regularization and Temporal Ensemble for Multi-Task Learning using Heterogeneous Unsupervised Tasks.
Unsupervised Extraction of Local Image Descriptors via Relative Distance Ranking Loss.
Embarrassingly Simple Binary Representation Learning.
The Jester Dataset: A Large-Scale Video Dataset of Human Gestures.
3D Hand Pose Estimation from RGB Using Privileged Learning with Depth Data.
Hand Pose Ensemble Learning Based on Grouping Features of Hand Point Sets.
Disentangling Pose from Appearance in Monochrome Hand Images.
Talking With Your Hands: Scaling Hand Gestures and Recognition With CNNs.
Explicit Spatiotemporal Joint Relation Learning for Tracking Human Pose.
Satellite Pose Estimation with Deep Landmark Regression and Nonlinear Pose Refinement.
CorNet: Generic 3D Corners for 6D Pose Estimation of New Objects without Retraining.
A Refined 3D Pose Dataset for Fine-Grained Object Categories.
An Annotation Saved is an Annotation Earned: Using Fully Synthetic Training for Object Detection.
Unsupervised Joint 3D Object Model Learning and 6D Pose Estimation for Depth-Based Instance Segmentation.
HomebrewedDB: RGB-D Dataset for 6D Pose Estimation of 3D Objects.
CullNet: Calibrated and Pose Aware Confidence Scores for Object Pose Estimation.
5G-CAGE: A Context and Situational Awareness System for City Public Safety with Video Processing at a Virtualized Ecosystem.
LIP: Learning Instance Propagation for Video Object Segmentation.
Multi-Video Temporal Synchronization by Matching Pose Features of Shared Moving Subjects.
ShuffleFaceNet: A Lightweight Face Architecture for Efficient and Highly-Accurate Face Recognition.
A Graph Based Unsupervised Feature Aggregation for Face Recognition.
A Progressive Learning Framework for Unconstrained Face Recognition.
Towards Flops-Constrained Face Recognition.
Factorizing and Reconstituting Large-Kernel MBConv for Lightweight Face Recognition.
Effective Methods for Lightweight Image-Based and Video-Based Face Recognition.
AirFace: Lightweight and Efficient Model for Face Recognition.
Geometry Guided Feature Aggregation in Video Face Recognition.
Unknown Identity Rejection Loss: Utilizing Unlabeled Data for Face Recognition.
Improved Knowledge Distillation for Training Fast Low Resolution Face Recognition Model.
VarGFaceNet: An Efficient Variable Group Convolutional Neural Network for Lightweight Face Recognition.
Lightweight Face Recognition Challenge.
Joint Trajectory and Fatigue Analysis in Wheelchair Users.
Home-Based Physical Therapy with an Interactive Computer Vision System.
Video Indexing Using Face Appearance and Shot Transition Detection.
Active 3D Classification of Multiple Objects in Cluttered Scenes.
Street Crossing Aid Using Light-Weight CNNs for the Visually Impaired.
Object Captioning and Retrieval with Natural Language.
A Realistic Face-to-Face Conversation System Based on Deep Neural Networks.
Social and Scene-Aware Trajectory Prediction in Crowded Spaces.
Dynamic Subtitles: A Multimodal Video Accessibility Enhancement Dedicated to Deaf and Hearing Impaired Users.
Deep Learning Based Wearable Assistive System for Visually Impaired People.
Salient Contour-Aware Based Twice Learning Strategy for Saliency Detection.
Deep Learning Performance in the Presence of Significant Occlusions - An Intelligent Household Refrigerator Case.
Learning to Navigate Robotic Wheelchairs from Demonstration: Is Training in Simulation Viable?
Forced Spatial Attention for Driver Foot Activity Classification.
On-Device Image Classification with Proxyless Neural Architecture Search and Quantization-Aware Fine-Tuning.
Automated Multi-Stage Compression of Neural Networks.
512KiB RAM Is Enough! Live Camera Face Recognition DNN on MCU.
Real-Time Object Detection On Low Power Embedded Platforms.
Enriching Variety of Layer-Wise Learning Information by Gradient Combination.
Low-Power Neural Networks for Semantic Segmentation of Satellite Images.
A System-Level Solution for Low-Power Object Detection.
Efficient Single Image Super-Resolution via Hybrid Residual Feature Learning with Compact Back-Projection Network.
Direct Feedback Alignment Based Convolutional Neural Network Training for Low-Power Online Learning Processor.
DBUS: Human Driving Behavior Understanding System.
Towards Learning Multi-Agent Negotiations via Self-Play.
On Control Transitions in Autonomous Driving: A Framework and Analysis for Characterizing Scene Complexity.
NADS-Net: A Nimble Architecture for Driver and Seat Belt Detection via Convolutional Neural Networks.
Fishyscapes: A Benchmark for Safe Semantic Segmentation in Autonomous Driving.
FuseMODNet: Real-Time Camera and LiDAR Based Moving Object Detection for Robust Low-Light Autonomous Driving.
Soft Prototyping Camera Designs for Car Detection Based on a Convolutional Neural Network.
RotInvMTL: Rotation Invariant MultiNet on Fisheye Images for Autonomous Driving Applications.
Advanced Pedestrian Dataset Augmentation for Autonomous Driving.
Multi-View Reprojection Architecture for Orientation Estimation.
Spatio-Temporal Action Graph Networks.
DeepTrailerAssist: Deep Learning Based Trailer Detection, Tracking and Articulation Angle Estimation on Automotive Rear-View Camera.
Adherent Raindrop Removal with Self-Supervised Attention Maps and Spatio-Temporal Generative Adversarial Networks.
Range Adaptation for 3D Object Detection in LiDAR.
Conditional Vehicle Trajectories Prediction in CARLA Urban Environment.
Intra-Frame Object Tracking by Deblatting.
Visual Tracking by Means of Deep Reinforcement Learning and an Expert Demonstrator.
Fast Visual Object Tracking using Ellipse Fitting for Rotated Bounding Boxes.
Visual Object Tracking by Using Ranking Loss.
Multi-Adapter RGBT Tracking.
Multi-Modal Fusion for End-to-End RGB-T Tracking.
Semi-Automatic Annotation of Objects in Visual-Thermal Video.
The Seventh Visual Object Tracking VOT2019 Challenge Results.

Matej Kristan Amanda Berg Linyu Zheng Litu Rout Luc Van Gool Luca Bertinetto Martin Danelljan Matteo Dunnhofer Meng Ni Min Young Kim Ming Tang Ming-Hsuan Yang Abdelrahman Eldesokey Naveen Paluru Niki Martinel Pengfei Xu Pengfei Zhang Pengkun Zheng Pengyu Zhang Philip H. S. Torr Qi Zhang Qiang Wang Qing Guo Radu Timofte Jani Käpylä Rama Krishna Sai Subrahmanyam Gorthi Richard M. Everson Ruize Han Ruohan Zhang Shan You Shao-Chuan Zhao Shengwei Zhao Shihu Li Shikun Li Shiming Ge Gustavo Fernández Shuai Bai Shuosen Guan Tengfei Xing Tianyang Xu Tianyu Yang Ting Zhang Tomás Vojír Wei Feng Weiming Hu Weizhao Wang Abel Gonzalez-Garcia Wenjie Tang Wenjun Zeng Wenyu Liu Xi Chen Xi Qiu Xiang Bai Xiao-Jun Wu Xiaoyun Yang Xier Chen Xin Li Alireza Memarmoghadam Xing Sun Xingyu Chen Xinmei Tian Xu Tang Xuefeng Zhu Yan Huang Yanan Chen Yanchao Lian Yang Gu Yang Liu Andong Lu Yanjie Chen Yi Zhang Yinda Xu Yingming Wang Yingping Li Yu Zhou Yuan Dong Yufei Xu Yunhua Zhang Yunkun Li Anfeng He Zeyu Wang Zhao Luo Zhaoliang Zhang Zhen-Hua Feng Zhenyu He Zhichao Song Zhihao Chen Zhipeng Zhang Zhirong Wu Zhiwei Xiong Zhongjian Huang Anton Varfolomieiev Zhu Teng Zihan Ni Antoni B. Chan Jirí Matas Ardhendu Shekhar Tripathi Arnold W. M. Smeulders Bala Suraj Pedasingu Bao Xin Chen Baopeng Zhang Baoyuan Wu Bi Li Bin He Bin Yan Bing Bai Ales Leonardis Bing Li Bo Li Byeong Hak Kim Chao Ma Chen Fang Chen Qian Cheng Chen Chenglong Li Chengquan Zhang Chi-Yi Tsai Michael Felsberg Chong Luo Christian Micheloni Chunhui Zhang Dacheng Tao Deepak Gupta Dejia Song Dong Wang Efstratios Gavves Eunu Yi Fahad Shahbaz Khan Roman P. Pflugfelder Fangyi Zhang Fei Wang Fei Zhao George De Ath Goutam Bhat Guangqi Chen Guangting Wang Guoxuan Li Hakan Cevikalp Hao Du Joni-Kristian Kämäräinen Haojie Zhao Hasan Saribas Ho Min Jung Hongliang Bai Hongyuan Yu Houwen Peng Huchuan Lu Hui Li Jiakun Li Luka Cehovin Zajc Jianhua Li Jianlong Fu Jie Chen Jie Gao Jie Zhao Jin Tang Jing Li Jingjing Wu Jingtuo Liu Jinqiao Wang Ondrej Drbohlav Jinqing Qi Jinyue Zhang John K. Tsotsos Jong Hyuk Lee Joost van de Weijer Josef Kittler Jun Ha Lee Junfei Zhuang Kangkai Zhang Kangkang Wang Alan Lukezic Kenan Dai Lei Chen Lei Liu Leida Guo Li Zhang Liang Wang Liangliang Wang Lichao Zhang Lijun Wang Lijun Zhou

3D Texturing From Multi-Date Satellite Images.
To Bundle Adjust or Not: A Comparison of Relative Geolocation Correction Strategies for Satellite Multi-View Stereo.
HumanMeshNet: Polygonal Mesh Recovery of Humans.
FullFusion: A Framework for Semantic Reconstruction of Dynamic Scenes.
Learning Dense Wide Baseline Stereo Matching for People.
3D Shape Reconstruction of Plant Roots in a Cylindrical Tank From Multiview Images.
Leveraging Vision Reconstruction Pipelines for Satellite Imagery.
Counterfactual Depth from a Single RGB Image.
A Direct Least-Squares Solution to Multi-View Absolute and Relative Pose from 2D-3D Perspective Line Pairs.
SharpNet: Fast and Accurate Recovery of Occluding Contours in Monocular Depth Estimation.
Landmark-Guided Deformation Transfer of Template Facial Expressions for Automatic Generation of Avatar Blendshapes.
Learned Semantic Multi-Sensor Depth Map Fusion.
Silhouette-Assisted 3D Object Instance Reconstruction from a Cluttered Scene.
CNN-Based Cost Volume Analysis as Confidence Measure for Dense Matching.
Towards Dense 3D Reconstruction for Mixed Reality in Healthcare: Classical Multi-View Stereo vs Deep Learning.
Why Does Data-Driven Beat Theory-Driven Computer Vision?
Investigating Convolutional Neural Networks using Spatial Orderness.
Layer-Wise Invertibility for Extreme Memory Cost Reduction of CNN Training.
AdvGAN++: Harnessing Latent Layers for Adversary Generation.
Searching for Accurate Binary Neural Architectures.
HoloGAN: Unsupervised Learning of 3D Representations From Natural Images.
MSNet: Structural Wired Neural Architecture Search for Internet of Things.
Localizing Occluders with Compositional Convolutional Networks.
Matrix Nets: A New Deep Architecture for Object Detection.
SqueezeNAS: Fast Neural Architecture Search for Faster Semantic Segmentation.
Adaptive Activation Functions Using Fractional Calculus.
Adaptive Convolutional Kernels.
4-Connected Shift Residual Networks.
Attention Routing Between Capsules.
GCNet: Non-Local Networks Meet Squeeze-Excitation Networks and Beyond.
Efficient Structured Pruning and Architecture Searching for Group Convolution.
BinaryDenseNet: Developing an Architecture for Binary Neural Networks.
HM-NAS: Efficient Neural Architecture Search via Hierarchical Masking.
Understanding the Effects of Pre-Training for Object Detectors via Eigenspectrum.
Cross-Granularity Attention Network for Semantic Segmentation.
Resource Efficient 3D Convolutional Neural Networks.
Sequentially Aggregated Convolutional Networks.
EENA: Efficient Evolution of Neural Architecture.
Evaluating Text-to-Image Matching using Binary Image Selection (BISON).
SUN-Spot: An RGB-D Dataset With Spatial Referring Expressions.
Are we Asking the Right Questions in MovieQA?
Instance-Based Video Search via Multi-Task Retrieval and Re-Ranking.
Fusion of Multimodal Embeddings for Ad-Hoc Video Search.
Unsupervised Teacher-Student Model for Large-Scale Video Retrieval.
Cast Search via Two-Stream Label Propagation.
A Semi-Supervised Maximum Margin Metric Learning Approach for Small Scale Person Re-Identification.
A Densenet Based Robust Face Detection Framework.
Single-Stage Joint Face Detection and Alignment.
Bayesian Gait-Based Gender Identification (BGGI) Network on Individuals Wearing Loosely Fitted Clothing.
What Face and Body Shapes Can Tell Us About Height.
Fusing Two Directions in Cross-Domain Adaption for Real Life Person Search by Language.
Part Matching with Multi-Level Attention for Person Re-Identification.
Cross-Modal Person Search: A Coarse-to-Fine Framework using Bi-Directional Text-Image Matching.
A Topological Graph-Based Representation for Denoising Low Quality Binary Images.
Triplet-Aware Scene Graph Embeddings.
Scene Graph Prediction with Limited Labels.
SynthRel0: Towards a Diagnostic Dataset for Relational Representation Learning.
Attention-Translation-Relation Network for Scalable Scene Graph Generation.
Detecting Visual Relationships Using Box Attention.
Spatial Residual Layer and Dense Connection Block Enhanced Spatial Temporal Graph Convolutional Network for Skeleton-Based Action Recognition.
Visual Relationships as Functions: Enabling Few-Shot Scene Graph Prediction.
Predicting Heart Rate Variations of Deepfake Videos using Neural ODE.
Modeling on the Feasibility of Camera-Based Blood Glucose Measurement.
Combating the Impact of Video Compression on Non-Contact Vital Sign Measurement Using Supervised Learning.
Impact of Sympathetic Activation in Imaging Photoplethysmography.
Architectural Tricks for Deep Learning in Remote Photoplethysmography.
A Thermal Camera Based Continuous Body Temperature Measurement System.
Micro Expression Classification using Facial Color and Deep Learning Methods.
Contact-Free Monitoring of Physiological Parameters in People With Profound Intellectual and Multiple Disabilities.
Camera-Based On-Line Short Cessation of Breathing Detection.
Performance Evaluation of Visual Object Detection and Tracking Algorithms Used in Remote Photoplethysmography.
Clinical Scene Segmentation with Tiny Datasets.
Single-Image Facial Expression Recognition Using Deep 3D Re-Centralization.
Multimodal Deep Models for Predicting Affective Responses Evoked by Movies.
Dynamic Facial Models for Video-Based Dimensional Affect Estimation.
Who Goes There? Exploiting Silhouettes and Wearable Signals for Subject Identification in Multi-Person Environments.
A-MAL: Automatic Motion Assessment Learning from Properly Performed Motions in 3D Skeleton Videos.
On the Vector Space in Photoplethysmography Imaging.
Efficient Real-Time Camera Based Estimation of Heart Rate and Its Variability.
Comprehensive Video Understanding: Video Summarization with Content-Based Video Recommender Design.
Video Multitask Transformer Network.
Video Summarization by Learning Relationships between Action and Scene.
Temporal U-Nets for Video Summarization with Scene and Action Recognition.
Enhancing Temporal Action Localization with Transfer Learning from Action Recognition.
Markov Decision Process for Video Generation.
Interpretable Spatio-Temporal Attention for Video Action Recognition.
Video-Text Compliance: Activity Verification Based on Natural Language Instructions.
Towards Segmenting Anything That Moves.
Video Representation Learning by Dense Predictive Coding.
End-to-End Video Captioning.
Level Selector Network for Optimizing Accuracy-Specificity Trade-Offs.
Recurrent Convolutions for Causal 3D CNNs.
End-to-End Partial Convolutions Neural Networks for Dunhuang Grottoes Wall-Painting Restoration.
Attention-Aware Age-Agnostic Visual Place Recognition.
Craquelure as a Graph: Application of Image Processing and Graph Neural Networks to the Description of Fracture Patterns.
PotSAC: A Robust Axis Estimator for Axially Symmetric Pot Fragments.
Zero-Shot Hyperspectral Image Denoising With Separable Image Prior.
Object Grounding via Iterative Context Reasoning.
Weakly Supervised One Shot Segmentation.
Adversarial Joint-Distribution Learning for Novel Class Sketch-Based Image Retrieval.
Meta Module Generation for Fast Few-Shot Incremental Learning.
Retro-Actions: Learning 'Close' by Time-Reversing 'Open' Videos.
Zero-Shot Semantic Segmentation via Variational Mapping.
Picking Groups Instead of Samples: A Close Look at Static Pool-Based Meta-Active Learning.
Input and Weight Space Smoothing for Semi-Supervised Learning.
Bayesian 3D ConvNets for Action Recognition from Few Examples.
Task-Discriminative Domain Alignment for Unsupervised Domain Adaptation.
Deep Metric Transfer for Label Propagation with Limited Annotated Data.
ProtoGAN: Towards Few Shot Learning for Action Recognition.
Enhancing Visual Embeddings through Weakly Supervised Captioning for Zero-Shot Learning.
Temporal Accumulative Features for Sign Language Recognition.
FakeTalkerDetect: Effective and Practical Realistic Neural Talking Head Detection with a Highly Unbalanced Dataset.
The Instantaneous Accuracy: a Novel Metric for the Problem of Online Human Behaviour Recognition in Untrimmed Videos.
Robust Cloth Warping via Multi-Scale Patch Adversarial Loss for Virtual Try-On Framework.
Pose and Expression Robust Age Estimation via 3D Face Reconstruction from a Single Image.
Voice Activity Detection by Upper Body Motion Analysis and Unsupervised Domain Adaptation.
Falls Prediction Based on Body Keypoints and Seq2Seq Architecture.
Fitting, Comparison, and Alignment of Trajectories on Positive Semi-Definite Matrices with Application to Action Recognition.
Facial Pose Estimation by Deep Learning from Label Distributions.
Measuring Crowd Collectiveness via Global Motion Correlation.
Attributes Preserving Face De-Identification.
Dance Dance Generation: Motion Transfer for Internet Videos.
Deepfake Video Detection through Optical Flow Based CNN.
Uncertainty-Aware Anticipation of Activities.
Weakly-Supervised Completion Moment Detection using Temporal Attention.
Neighbourhood Context Embeddings in Deep Inverse Reinforcement Learning for Predicting Pedestrian Motion Over Long Time Horizons.
SalGaze: Personalizing Gaze Estimation using Visual Saliency.
RT-BENE: A Dataset and Baselines for Real-Time Blink Estimation in Natural Environments.
On-Device Few-Shot Personalization for Real-Time Gaze Estimation.
Learning to Personalize in Appearance-Based Gaze Tracking.
A Generalized and Robust Method Towards Practical Gaze Estimation on Smart Phone.
Recognizing Tiny Faces.
Real-Time Age-Invariant Face Recognition in Videos Using the ScatterNet Inception Hybrid Network (SIHN).
State-of-the-Art in Action: Unconstrained Text Detection.
GSR-MAR: Global Super-Resolution for Person Multi-Attribute Recognition.
Unsupervised Outlier Detection in Appearance-Based Gaze Estimation.
Intra-Camera Supervised Person Re-Identification: A New Benchmark.
Indoor Depth Completion with Boundary Consistency and Self-Attention.
Unsupervised Deep Feature Transfer for Low Resolution Image Classification.
Generatively Inferential Co-Training for Unsupervised Domain Adaptation.
Are Adversarial Robustness and Common Perturbation Robustness Independent Attributes?
Non-Discriminative Data or Weak Model? On the Relative Importance of Data and Model Resolution.
Low Quality Video Face Recognition: Multi-Mode Aggregation Recurrent Network (MARN).
SNIDER: Single Noisy Image Denoising and Rectification for Improving License Plate Recognition.
Evidence Based Feature Selection and Collaborative Representation Towards Learning Based PSF Estimation for Motion Deblurring.
Recognizing Compressed Videos: Challenges and Promises.
Feature Aggregation Network for Video Face Recognition.
Image Deconvolution with Deep Image and Kernel Priors.
Online Multi-Task Clustering for Human Motion Segmentation.
Extreme Low Resolution Action Recognition with Spatial-Temporal Multi-Head Self-Attention and Knowledge Distillation.
I Bet You Are Wrong: Gambling Adversarial Networks for Structured Semantic Segmentation.
LU-Net: An Efficient Network for 3D LiDAR Point Cloud Semantic Segmentation Based on End-to-End-Learned 3D Features and U-Net.
Exploiting Temporality for Semi-Supervised Video Segmentation.
Vehicle Detection With Automotive Radar Using Deep Learning on Range-Azimuth-Doppler Tensors.
Temporal Coherence for Active Learning in Videos.
End-to-End Lane Detection through Differentiable Least-Squares Fitting.
Robust Absolute and Relative Pose Estimation of a Central Camera System from 2D-3D Line Correspondences.
Small Obstacle Avoidance Based on RGB-D Semantic Segmentation.
Reverse and Boundary Attention Network for Road Segmentation.
Road Scene Understanding by Occupancy Grid Learning from Sparse Radar Clusters using Semantic Segmentation.
Monocular 3D Object Detection with Pseudo-LiDAR Point Cloud.
ShelfNet for Fast Semantic Segmentation.
Boxy Vehicle Detection in Large Images.
Unsupervised Labeled Lane Markers Using Maps.
Probabilistic Vehicle Reconstruction Using a Multi-Task CNN.
Short-Term Prediction and Multi-Camera Fusion on Semantic Grids.
Estimation of Absolute Scale in Monocular SLAM Using Synthetic Data.
Attack Agnostic Statistical Method for Adversarial Detection.
On the Geometry of Rectifier Convolutional Neural Networks.
Stochastic Relational Network.
OpenVINO Deep Learning Workbench: Comprehensive Analysis and Tuning of Neural Networks Inference.
UGLLI Face Alignment: Estimating Uncertainty with Gaussian Log-Likelihood Loss.
Efficient Priors for Scalable Variational Inference in Bayesian Deep Neural Networks.
A Novel Adversarial Inference Framework for Video Prediction with Action Control.
Lautum Regularization for Semi-Supervised Transfer Learning.
Direct Validation of the Information Bottleneck Principle for Deep Nets.
Open Set Recognition Through Deep Neural Network Uncertainty: Does Out-of-Distribution Detection Require Generative Classifiers?
Function Norms for Neural Networks.
Interpreting Intentionally Flawed Models with Linear Probes.
Exploring Dynamic Routing As A Pooling Layer.
Learning to Find Correlated Features by Maximizing Information Flow in Convolutional Neural Networks.
Squeezed Bilinear Pooling for Fine-Grained Visual Categorization.
Spatio-Temporal Attention Network for Video Instance Segmentation.
Temporal Feature Augmented Network for Video Instance Segmentation.
Dual Embedding Learning for Video Instance Segmentation.
An Empirical Study of Detection-Based Video Instance Segmentation.
Video Instance Segmentation 2019: A Winning Approach for Combined Detection, Segmentation, Classification and Tracking.
Exploring the Combination of PReMVOS, BoLTVOS and UnOVOST for the 2019 YouTube-VOS Challenge.
Towards Good Practices for Video Object Segmentation.
Going Deeper Into Embedding Learning for Video Object Segmentation.
Motion-Guided Spatial Time Attention for Video Object Segmentation.
Enhanced Memory Network for Video Segmentation.
Adaptive Online k-Subspaces with Cooperative Re-Initialization.
Classifying and Comparing Approaches to Subspace Clustering with Missing Data.
Complete Moving Object Detection in the Context of Robust Subspace Learning.
Topological Labelling of Scene using Background/Foreground Separation and Epipolar Geometry.
Panoramic Video Separation with Online Grassmannian Robust Subspace Estimation.
Deep Closed-Form Subspace Clustering.
Structure-Constrained Feature Extraction by Autoencoders for Subspace Clustering.
Structuring Autoencoders.
Low-Rank Tensor Tracking.
Tensor Train Decomposition for Efficient Memory Saving in Perceptual Feature-Maps.
Tensor Subspace Learning and Classification: Tensor Local Discriminant Embedding for Hyperspectral Image.
Robust Discrimination and Generation of Faces using Compact, Disentangled Embeddings.
Uncalibrated Non-Rigid Factorisation by Independent Subspace Analysis.
Learning Disentangled Representations via Independent Subspaces.
Face Representation Learning using Composite Mini-Batches.
Disguised Faces in the Wild 2019.
Feature Ensemble Networks with Re-Ranking for Recognizing Disguised Faces in the Wild.
Tensor Linear Regression and Its Application to Color Face Recognition.
Enhancing Human Face Recognition with an Interpretable Neural Network.
Dual Attention MobDenseNet (DAMDNet) for Robust 3D Face Alignment.
BASN: Enriching Feature Representation Using Bipartite Auxiliary Supervisions for Face Anti-Spoofing.
ArcFace for Disguised Face Recognition.
WhiteNNer - Blind Image Denoising via Noise Whiteness Priors.
Branding - Fusion of Meta Data and Musculoskeletal Radiographs for Multi-Modal Diagnostic Recognition.
Photometric Transformer Networks and Label Adjustment for Breast Density Prediction.
Improving Robustness of Deep Learning Based Knee MRI Segmentation: Mixup and Adversarial Domain Adaptation.
Building a Breast-Sentence Dataset: Its Usefulness for Computer-Aided Diagnosis.
Prostate Cancer Inference via Weakly-Supervised Learning using a Large Collection of Negative MRI.
RethNet: Object-by-Object Learning for Detecting Facial Skin Problems.
UPI-Net: Semantic Contour Detection in Placental Ultrasound.
Bi-Directional ConvLSTM U-Net with Densely Connected Convolutions.
Using the Triplet Loss for Domain Adaptation in WCE.
CGC-Net: Cell Graph Convolutional Network for Grading of Colorectal Cancer Histology Images.
Retinal Image Classification via Vasculature-Guided Sequential Attention.
Breast Tumor Cellularity Assessment Using Deep Neural Networks.
DeepLIMa: Deep Learning Based Lesion Identification in Mammograms.
KNEEL: Knee Anatomical Landmark Localization Using Hourglass Networks.
Deep Multiresolution Cellular Communities for Semantic Segmentation of Multi-Gigapixel Histology Images.
Unimodal-Uniform Constrained Wasserstein Training for Medical Diagnosis.
Domain-Agnostic Learning With Anatomy-Consistent Embedding for Cross-Modality Liver Segmentation.
Part-Pose Guided Amur Tiger Re-Identification.
Fast and Efficient Model for Real-Time Tiger Detection In The Wild.
A Strong Baseline for Tiger Re-ID and its Bag of Tricks.
A Hybrid Approach to Tiger Re-Identification.
Pose-Guided Complementary Features Learning for Amur Tiger Re-Identification.
Learning Deep Features for Giant Panda Gender Classification using Face Images.
DeepBees - Building and Scaling Convolutional Neuronal Nets For Fast and Large-Scale Visual Monitoring of Bee Hives.
ELPephants: A Fine-Grained Dataset for Elephant Re-Identification.
Great Ape Detection in Challenging Jungle Camera Trap Footage via Attention-Based Spatial and Temporal Feature Blending.
Geo-Aware Networks for Fine-Grained Recognition.
Count, Crop and Recognise: Fine-Grained Recognition in the Wild.
VisDrone-VID2019: The Vision Meets Drone Object Detection in Video Challenge Results.
VisDrone-DET2019: The Vision Meets Drone Object Detection in Image Challenge Results.
VisDrone-SOT2019: The Vision Meets Drone Single Object Tracking Challenge Results.
VisDrone-MOT2019: The Vision Meets Drone Multiple Object Tracking Challenge Results.
A Novel Spatial and Temporal Context-Aware Approach for Drone-Based Video Object Detection.
Flow Guided Short-Term Trackers with Cascade Detection for Long-Term Tracking.
Multiple Object Tracking with Motion and Appearance Cues.
Vision-Based Online Localization and Trajectory Smoothing for Fixed-Wing UAV Tracking a Moving Target.
Real-Time UAV Tracking Based on PSR Stability.
Multi-Object Tracking Hierarchically in Visual Data Taken From Drones.
Patch-Level Augmentation for Object Detection in Aerial Images.
Dense and Small Object Detection in UAV Vision Based on Cascade Network.
Accuracy and Long-Term Tracking via Overlap Maximization Integrated with Motion Continuity.
RRNet: A Hybrid Detector for Object Detection in Drone-Captured Images.
Deep Adaptive Fusion Network for High Performance RGBT Tracking.
An Indoor Crowd Detection Network Framework Based on Feature Aggregation Module and Hybrid Attention Selection Module.
Real-Time Aerial Suspicious Analysis (ASANA) System for the Identification and Re-Identification of Suspicious Individuals using the Bayesian ScatterNet Hybrid (BSH) Network.
Spatial Attention for Multi-Scale Feature Refinement for Object Detection.
i-Siam: Improving Siamese Tracker with Distractors Suppression and Long-Term Strategies.
Multi Target Tracking from Drones by Learning from Generalized Graph Differences.
SlimYOLOv3: Narrower, Faster and Better for Real-Time UAV Applications.
Learning Cascaded Context-Aware Framework for Robust Visual Tracking.
Crowd Counting on Images with Scale Variation and Isolated Clusters.
Few-Shot Structured Domain Adaptation for Virtual-to-Real Scene Parsing.
How to Fully Exploit The Abilities of Aerial Image Detectors.