
NeurIPS (NIPS) 2018 Paper List

Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, NeurIPS 2018, December 3-8, 2018, Montréal, Canada.

GroupReduce: Block-Wise Low-Rank Approximation for Neural Language Model Shrinking.
The Price of Fair PCA: One Extra dimension.
Transfer of Deep Reactive Policies for MDP Planning.
Multiple Instance Learning for Efficient Sequential Data Classification on Resource-constrained Devices.
Sparse PCA from Sparse Linear Regression.
Computationally and statistically efficient learning of causal Bayes nets using path queries.
Point process latent variable models of larval zebrafish behavior.
Contrastive Learning from Pairwise Measurements.
Topkapi: Parallel and Fast Sketches for Finding Top-K Frequent Elements.
Removing Hidden Confounding by Experimental Grounding.
Semidefinite relaxations for certifying robustness to adversarial examples.
MixLasso: Generalized Mixed Regression via Convex Atomic-Norm Regularization.
Smoothed Analysis of Discrete Tensor Decomposition and Assemblies of Neurons.
Domain Adaptation by Using Causal Inference to Predict Invariant Conditional Distributions.
Multi-value Rule Sets for Interpretable Classification with Feature-Efficient Representations.
Differentially Private Change-Point Detection.
Support Recovery for Orthogonal Matching Pursuit: Upper and Lower bounds.
Fast and Effective Robustness Certification.
Bias and Generalization in Deep Generative Models: An Empirical Study.
Learning Temporal Point Processes via Reinforcement Learning.
Joint Autoregressive and Hierarchical Priors for Learned Image Compression.
Decentralize and Randomize: Faster Algorithm for Wasserstein Barycenters.
With Friends Like These, Who Needs Adversaries?
Forward Modeling for Partial Observation Strategy Games - A StarCraft Defogger.
DropBlock: A regularization method for convolutional networks.
Autoconj: Recognizing and Exploiting Conjugacy Without a Domain-Specific Language.
Analysis of Krylov Subspace Solutions of Regularized Non-Convex Quadratic Problems.
Mean Field for the Stochastic Blockmodel: Optimization Landscape and Convergence Issues.
Robust Subspace Approximation in a Stream.
Thermostat-assisted continuously-tempered Hamiltonian Monte Carlo for Bayesian learning.
Benefits of over-parameterization with EM.
Learning Beam Search Policies via Imitation Learning.
Data-Driven Clustering via Parameterized Lloyd's Families.
Understanding Regularized Spectral Clustering via Graph Conductance.
Fully Neural Network Based Speech Recognition on Mobile and Embedded Devices.
Connecting Optimization and Regularization Paths.
Sketching Method for Large Scale Combinatorial Inference.
Regret Bounds for Online Portfolio Selection with a Cardinality Constraint.
Improved Network Robustness with Adversary Critic.
Fast deep reinforcement learning using online adjustments from the past.
Streamlining Variational Inference for Constraint Satisfaction Problems.
Learning a Warping Distance from Unlabeled Time Series Using Sequence Autoencoders.
Complex Gated Recurrent Neural Networks.
Bayesian Structure Learning by Recursive Bootstrap.
The Sparse Manifold Transform.
Deep Generative Models with Learnable Knowledge Constraints.
Diversity-Driven Exploration Strategy for Deep Reinforcement Learning.
Regret bounds for meta Bayesian optimization with an unknown Gaussian process prior.
Discretely Relaxing Continuous Variables for tractable Variational Inference.
Using Trusted Data to Train Deep Networks on Labels Corrupted by Severe Noise.
Temporal alignment and latent Gaussian process factor inference in population spike trains.
Bounded-Loss Private Prediction Markets.
Learning Abstract Options.
Mesh-TensorFlow: Deep Learning for Supercomputers.
Convex Elicitation of Continuous Properties.
Context-aware Synthesis and Placement of Object Instances.
3D Steerable CNNs: Learning Rotationally Equivariant Features in Volumetric Data.
Gaussian Process Prior Variational Autoencoders.
Adversarial Risk and Robustness: General Definitions and Implications for the Uniform Distribution.
Unsupervised Image-to-Image Translation Using Domain-Specific Variational Information Bound.
Learning and Inference in Hilbert Space with Quantum Graphical Models.
Lifted Weighted Mini-Bucket.
Learning to Solve SMT Formulas.
PCA of high dimensional random walks with comparison to neural network training.
Improving Simple Models with Confidence Profiles.
Robust Learning of Fixed-Structure Bayesian Networks.
Robustness of conditional GANs to noisy labels.
Predictive Approximate Bayesian Computation via Saddle Points.
Learning to Share and Hide Intentions using Information Regularization.
Generalizing Point Embeddings using the Wasserstein Space of Elliptical Distributions.
Nonparametric Density Estimation under Adversarial Losses.
Glow: Generative Flow with Invertible 1x1 Convolutions.
Total stochastic gradient algorithms and applications in reinforcement learning.
Learning with SGD and Random Features.
Backpropagation with Callbacks: Foundations for Efficient and Expressive Differentiable Programming.
Learning To Learn Around A Common Mean.
Human-in-the-Loop Interpretability Prior.
Synaptic Strength For Convolutional Neural Network.
A Spectral View of Adversarially Robust Features.
Bayesian Nonparametric Spectral Estimation.
Clebsch-Gordan Nets: a Fully Fourier Space Spherical Convolutional Neural Network.
A Simple Cache Model for Image Recognition.
Low-Rank Tucker Decomposition of Large Tensors Using TensorSketch.
Blockwise Parallel Decoding for Deep Autoregressive Models.
Thwarting Adversarial Examples: An L_0-Robust Sparse Fourier Transform.
Testing for Families of Distributions via the Fourier Transform.
A Retrieve-and-Edit Framework for Predicting Structured Outputs.
Scalable Laplacian K-modes.
Blind Deconvolutional Phase Retrieval via Convex Programming.
Neural Voice Cloning with a Few Samples.
Persistence Fisher Kernel: A Riemannian Manifold Kernel for Persistence Diagrams.
Memory Augmented Policy Optimization for Program Synthesis and Semantic Parsing.
Learning to Reason with Third Order Tensor Products.
Post: Device Placement with Cross-Entropy Minimization and Proximal Policy Optimization.
Using Large Ensembles of Control Variates for Variational Inference.
Non-delusional Q-learning and value-iteration.
Learning Invariances using the Marginal Likelihood.
Uplift Modeling from Separate Labels.
Online Robust Policy Learning in the Presence of Unknown Adversaries.
Variance-Reduced Stochastic Gradient Descent on Streaming Data.
On Markov Chain Gradient Descent.
Maximizing acquisition functions for Bayesian optimization.
Life-Long Disentangled Representation Learning with Cross-Domain Latent Homologies.
Dynamic Network Model from Partial Observations.
ATOMO: Communication-efficient Learning via Atomic Sparsification.
Reinforcement Learning for Solving the Vehicle Routing Problem.
Scalable End-to-End Autonomous Vehicle Testing via Rare-event Simulation.
Adaptive Skip Intervals: Temporal Abstraction for Recurrent Dynamical Models.
Object-Oriented Dynamics Predictor.
Adaptive Methods for Nonconvex Optimization.
Entropy Rate Estimation for Markov Chains with Large State Space.
Large Scale computation of Means and Clusters for Persistence Diagrams using Optimal Transport.
Deep Anomaly Detection Using Geometric Transformations.
Generalization Bounds for Uniformly Stable Algorithms.
Unsupervised Depth Estimation, 3D Face Rotation and Replacement.
Towards Deep Conversational Recommendations.
Latent Alignment and Variational Attention.
Improving Explorability in Variational Inference with Annealed Variational Objectives.
Coupled Variational Bayes via Optimization Embedding.
Theoretical guarantees for EM under misspecified Gaussian mixture models.
Global Non-convex Optimization with Discretized Diffusions.
Improving Online Algorithms via ML Predictions.
Multi-Agent Reinforcement Learning via Double Averaging Primal-Dual Optimization.
Ex ante coordination and collusion in zero-sum multi-player extensive-form games.
Invertibility of Convolutional Generative Networks from Partial Measurements.
Trading robust representations for sample complexity through self-supervised visual experience.
An intriguing failing of convolutional neural networks and the CoordConv solution.
Optimal Algorithms for Continuous Non-monotone Submodular and DR-Submodular Maximization.
Towards Understanding Learning Representations: To What Extent Do Different Neural Networks Learn the Same Representation.
Neural Proximal Gradient Descent for Compressive Imaging.
Learning convex bounds for linear quadratic control policy synthesis.
Fast Approximate Natural Gradient Descent in a Kronecker Factored Eigenbasis.
e-SNLI: Natural Language Inference with Natural Language Explanations.
Reinforcement Learning with Multiple Experts: A Bayesian Model Combination Approach.
Probabilistic Model-Agnostic Meta-Learning.
Sanity Checks for Saliency Maps.
Multi-objective Maximization of Monotone Submodular Functions with Cardinality Constraint.
PAC-Bayes Tree: weighted subtrees with guarantees.
DAGs with NO TEARS: Continuous Optimization for Structure Learning.
Implicit Bias of Gradient Descent on Linear Convolutional Networks.
Learning and Testing Causal Models with Interventions.
Deepcode: Feedback Codes via Deep Learning.
Identification and Estimation of Causal Effects from Dependent Data.
The Global Anchor Method for Quantifying Linguistic Shifts and Domain Adaptation.
Gather-Excite: Exploiting Feature Context in Convolutional Neural Networks.
The emergence of multiple retinal cell types through efficient coding of natural movies.
Learning Attractor Dynamics for Generative Memory.
Assessing the Scalability of Biologically-Motivated Deep Learning Algorithms and Architectures.
Statistical and Computational Trade-Offs in Kernel K-Means.
Co-regularized Alignment for Unsupervised Domain Adaptation.
Hardware Conditioned Policies for Multi-Robot Transfer Learning.
The Sample Complexity of Semi-Supervised Learning with Nonparametric Mixture Models.
SNIPER: Efficient Multi-Scale Training.
The Effect of Network Width on the Performance of Large-batch Training.
Representer Point Selection for Explaining Deep Neural Networks.
The Importance of Sampling in Meta-Reinforcement Learning.
Confounding-Robust Policy Improvement.
Deep Dynamical Modeling and Control of Unsteady Fluid Flows.
Coordinate Descent with Bandit Sampling.
The Limit Points of (Optimistic) Gradient Descent in Min-Max Optimization.
Beyond Grids: Learning Graph Representations for Visual Recognition.
PAC-Bayes bounds for stable algorithms with instance-dependent priors.
Deep Predictive Coding Network with Local Recurrent Processing for Object Recognition.
Visual Reinforcement Learning with Imagined Goals.
Watch Your Step: Learning Node Embeddings via Graph Attention.
A Stein variational Newton method.
Reducing Network Agnostophobia.
Quadrature-based features for kernel approximation.
Phase Retrieval Under a Generative Prior.
Learning SMaLL Predictors.
Bayesian multi-domain learning for cancer subtype discovery from next-generation sequencing count data.
Learning Safe Policies with Expert Guidance.
Robot Learning in Homes: Improving Generalization and Reducing Dataset Bias.
Invariant Representations without Adversarial Training.
Iterative Value-Aware Model Learning.
Theoretical Linear Convergence of Unfolded ISTA and Its Practical Weights and Thresholds.
Learning Compressed Transforms with Low Displacement Rank.
SING: Symbol-to-Instrument Neural Generator.
Reversible Recurrent Neural Networks.
FastGRNN: A Fast, Accurate, Stable and Tiny Kilobyte Sized Gated Recurrent Neural Network.
Efficient High Dimensional Bayesian Optimization with Additivity and Quadrature Fourier Features.
A Structured Prediction Approach for Label Ranking.
Inferring Latent Velocities from Weather Radar Data using Gaussian Processes.
Wavelet regression and additive models for irregularly spaced data.
Online Learning of Quantum States.
GLoMo: Unsupervised Learning of Transferable Relational Graphs.
Policy-Conditioned Uncertainty Sets for Robust Markov Decision Processes.
Adaptive Path-Integral Autoencoders: Representation Learning and Planning for Dynamical Systems.
Improving Neural Program Synthesis with Inferred Execution Traces.
Distributed Multitask Reinforcement Learning with Quadratic Convergence.
Balanced Policy Evaluation and Learning.
A Statistical Recurrent Model on the Manifold of Symmetric Positive Definite Matrices.
Exploration in Structured Reinforcement Learning.
Differential Privacy for Growing Databases.
Stein Variational Gradient Descent as Moment Matching.
Group Equivariant Capsule Networks.
Data Amplification: A Unified and Competitive Approach to Property Estimation.
Reinforcement Learning of Theorem Proving.
Legendre Decomposition for Tensors.
Flexible neural representation for physics prediction.
Loss Surfaces, Mode Connectivity, and Fast Ensembling of DNNs.
Generalized Cross Entropy Loss for Training Deep Neural Networks with Noisy Labels.
A Bayesian Nonparametric View on Count-Min Sketch.
Automatic differentiation in ML: Where we are and where we should be going.
Uniform Convergence of Gradients for Non-Convex Learning and Optimization.
Learning Plannable Representations with Causal InfoGAN.
Dendritic cortical microcircuits approximate the backpropagation algorithm.
Orthogonally Decoupled Variational Gaussian Processes.
Searching for Efficient Multi-Scale Architectures for Dense Image Prediction.
HOUDINI: Lifelong Learning as Program Synthesis.
DeepPINK: reproducible feature selection in deep neural networks.
Estimators for Multivariate Information Measures in General Probability Spaces.
Multilingual Anchoring: Interactive Topic Modeling and Alignment Across Languages.
Learning without the Phase: Regularized PhaseMax Achieves Optimal Sample Complexity.
Compact Representation of Uncertainty in Clustering.
Randomized Prior Functions for Deep Reinforcement Learning.
Sequential Attend, Infer, Repeat: Generative Modelling of Moving Objects.
A Likelihood-Free Inference Framework for Population Genetic Data using Exchangeable Neural Networks.
Contextual Stochastic Block Models.
Neural Tangent Kernel: Convergence and Generalization in Neural Networks.
Adversarial Multiple Source Domain Adaptation.
A convex program for bilinear inversion of sparse vectors.
Variational Inverse Control with Events: A General Framework for Data-Driven Reward Definition.
Co-teaching: Robust training of deep neural networks with extremely noisy labels.
Clustering Redemption - Beyond the Impossibility of Kleinberg's Axioms.
Adversarial Regularizers in Inverse Problems.
Graph Oracle Models, Lower Bounds, and Gaps for Parallel Stochastic Optimization.
Generalisation of structural knowledge in the hippocampal-entorhinal system.
Wasserstein Distributionally Robust Kalman Filtering.
Teaching Inverse Reinforcement Learners via Features and Demonstrations.
Dimensionality Reduction has Quantifiable Imperfections: Two Geometric Bounds.
Deep Poisson gamma dynamical systems.
Data-dependent PAC-Bayes priors via differential privacy.
Almost Optimal Algorithms for Linear Stochastic Bandits with Heavy-Tailed Payoffs.
Deep Network for the Integrated 3D Sensing of Multiple People in Natural Images.
Scaling provable adversarial defenses.
Learning to Play With Intrinsically-Motivated, Self-Aware Agents.
On preserving non-discrimination when combining expert advice.
Stochastic Primal-Dual Method for Empirical Risk Minimization with O(1) Per-Iteration Complexity.
Transfer Learning with Neural AutoML.
Distributionally Robust Graphical Models.
Learning Conditioned Graph Structures for Interpretable Visual Question Answering.
Information-theoretic Limits for Community Detection in Network Models.
Constructing Unrestricted Adversarial Examples with Generative Models.
Bilevel learning of the Group Lasso structure.
Differentiable MPC for End-to-end Planning and Control.
How SGD Selects the Global Minima in Over-parameterized Learning: A Dynamical Stability Perspective.
The promises and pitfalls of Stochastic Gradient Langevin Dynamics.
Online Reciprocal Recommendation with Theoretical Performance Guarantees.
Algorithms and Theory for Multiple-Source Adaptation.
Efficient Online Portfolio with Logarithmic Regret.
Sample-Efficient Reinforcement Learning with Stochastic Ensemble Value Expansion.
Variational Bayesian Monte Carlo.
Statistical mechanics of low-rank tensor decomposition.
Graphical model inference: Sequential Monte Carlo meets deterministic approximations.
Modelling and unsupervised learning of symmetric deformable object categories.
Hamiltonian Variational Auto-Encoder.
Learning Overparameterized Neural Networks via Stochastic Gradient Descent on Structured Data.
Bayesian Control of Large MDPs with Unknown Dynamics in Data-Poor Environments.
Proximal Graphical Event Models.
Does mitigating ML's impact disparity require treatment disparity?
Statistical Optimality of Stochastic Gradient Descent on Hard Learning Problems through Multiple Passes.
Credit Assignment For Collective Multiagent RL With Global Rewards.
A Lyapunov-based Approach to Safe Reinforcement Learning.
Learning to Specialize with Knowledge Distillation for Visual Question Answering.
Efficient Anomaly Detection via Matrix Sketching.
Improved Expressivity Through Dendritic Neural Networks.
Pipe-SGD: A Decentralized Pipelined SGD Framework for Distributed Deep Net Training.
Neural Arithmetic Logic Units.
Approximate Knowledge Compilation by Online Collapsed Importance Sampling.
Reward learning from human preferences and demonstrations in Atari.
Spectral Signatures in Backdoor Attacks.
The challenge of realistic music generation: modelling raw audio at scale.
Submodular Maximization via Gradient Ascent: The Case of Deep Submodular Functions.
Stochastic Expectation Maximization with Variance Reduction.
Dirichlet belief networks for topic structure learning.
Layer-Wise Coordination between Encoder and Decoder for Neural Machine Translation.
Learning to Repair Software Vulnerabilities with Generative Adversarial Networks.
Monte-Carlo Tree Search for Constrained POMDPs.
Robust Detection of Adversarial Attacks by Modeling the Intrinsic Properties of Deep Neural Networks.
Robust Hypothesis Testing Using Wasserstein Uncertainty Sets.
RenderNet: A deep convolutional network for differentiable rendering from 3D shapes.
Cluster Variational Approximations for Structure Learning of Continuous-Time Bayesian Networks from Incomplete Data.
A Reduction for Efficient LDA Topic Reconstruction.
A General Method for Amortizing Variational Filtering.
Beyond Log-concavity: Provable Guarantees for Sampling Multi-modal Distributions using Simulated Tempering Langevin Monte Carlo.
Distributed k-Clustering for Data with Heavy Noise.
Preference Based Adaptation for Learning Objectives.
Neural Architecture Optimization.
Learning Libraries of Subroutines for Neurally-Guided Bayesian Program Induction.
Constrained Graph Variational Autoencoders for Molecule Design.
Deep State Space Models for Time Series Forecasting.
Towards Robust Interpretability with Self-Explaining Neural Networks.
Cooperative Learning of Audio and Video Models from Self-Supervised Synchronization.
Learning Loop Invariants for Program Verification.
Breaking the Activation Function Bottleneck through Adaptive Parameterization.
On Neuronal Capacity.
Attacks Meet Interpretability: Attribute-steered Detection of Adversarial Samples.
Adversarial Scene Editing: Automatic Object Removal from Weak Supervision.
Understanding Batch Normalization.
Scalar Posterior Sampling with Applications.
Training Deep Neural Networks with 8-bit Floating Point Numbers.
Depth-Limited Solving for Imperfect-Information Games.
Communication Compression for Decentralized Training.
Sparse Attentive Backtracking: Temporal Credit Assignment Through Reminding.
Improved Algorithms for Collaborative PAC Learning.
Rectangular Bounding Process.
VideoCapsuleNet: A Simplified Network for Action Detection.
Simple, Distributed, and Accelerated Probabilistic Programming.
Diffusion Maps for Textual Network Embedding.
GPyTorch: Blackbox Matrix-Matrix Gaussian Process Inference with GPU Acceleration.
cpSGD: Communication-efficient and differentially-private distributed SGD.
Towards Text Generation with Adversarially Learned Neural Outlines.
Generalisation in humans and deep neural networks.
Non-Adversarial Mapping with VAEs.
Knowledge Distillation by On-the-Fly Native Ensemble.
Inference in Deep Gaussian Processes using Stochastic Gradient Hamiltonian Monte Carlo.
Generative modeling for protein structures.
Forecasting Treatment Responses Over Time Using Recurrent Marginal Structural Networks.
Adaptive Learning with Unknown Information Flows.
Multi-Agent Generative Adversarial Imitation Learning.
Constrained Cross-Entropy Method for Safe Reinforcement Learning.
Plug-in Estimation in High-Dimensional Linear Inverse Problems: A Rigorous Analysis.
A Bayesian Approach to Generative Adversarial Imitation Learning.
Constant Regret, Generalized Mixability, and Mirror Descent.
How to tell when a clustering is (approximately) correct using convex relaxations.
Revisiting (\epsilon, \gamma, \tau)-similarity learning for domain adaptation.
Stochastic Chebyshev Gradient Descent for Spectral Optimization.
Out-of-Distribution Detection using Multiple Semantic Label Representations.
Learning Signed Determinantal Point Processes through the Principal Minor Assignment Problem.
Unsupervised Cross-Modal Alignment of Speech and Text Embedding Spaces.
Disconnected Manifold Learning for Generative Adversarial Networks.
Bayesian Model-Agnostic Meta-Learning.
REFUEL: Exploring Sparse Features in Deep Reinforcement Learning for Fast Disease Diagnosis.
Streaming Kernel PCA with \tilde{O}(\sqrt{n}) Random Features.
Relational recurrent neural networks.
Unsupervised Text Style Transfer using Language Models as Discriminators.
Bandit Learning with Implicit Feedback.
Training Deep Models Faster with Robust, Approximate Importance Sampling.
Learning Attentional Communication for Multi-Agent Cooperation.
Implicit Probabilistic Integrators for ODEs.
Chaining Mutual Information and Tightening Generalization Bounds.
Efficient Loss-Based Decoding on Graphs for Extreme Classification.
Distributed Multi-Player Bandits - a Game of Thrones Approach.
Mapping Images to Scene Graphs with Permutation-Invariant Structured Prediction.
Stimulus domain transfer in recurrent models for large scale cortical population prediction on video.
BRUNO: A Deep Recurrent Model for Exchangeable Data.
End-to-End Differentiable Physics for Learning and Control.
A Simple Unified Framework for Detecting Out-of-Distribution Samples and Adversarial Attacks.
Hierarchical Reinforcement Learning for Zero-shot Generalization with Subtask Dependencies.
Parameters as interacting particles: long time convergence and asymptotic error scaling of neural networks.
Deep Homogeneous Mixture Models: Representation, Separation, and Approximation.
Provably Correct Automatic Sub-Differentiation for Qualified Programs.
Constrained Generation of Semantically Valid Graphs via Regularizing Variational Autoencoders.
Model-Agnostic Private Learning.
On the Convergence and Robustness of Training GANs with Regularized Optimal Transport.
Manifold-tiling Localized Receptive Fields are Optimal in Similarity-preserving Neural Networks.
A probabilistic population code based on neural samples.
Dual Policy Iteration.
Predictive Uncertainty Estimation via Prior Networks.
GILBO: One Metric to Measure Them All.
Efficient online algorithms for fast-rate regret bounds under sparsity.
Gen-Oja: Simple & Efficient Algorithm for Streaming Generalized Eigenvector Computation.
Hybrid Macro/Micro Level Backpropagation for Training Deep Spiking Neural Networks.
Bayesian Alignments of Warped Multi-Output Gaussian Processes.
Causal Inference via Kernel Deviance Measures.
Unorganized Malicious Attacks Detection.
A Probabilistic U-Net for Segmentation of Ambiguous Images.
Uncertainty Sampling is Preconditioned Stochastic Gradient Descent on Zero-One Loss.
Online Structure Learning for Feed-Forward and Recurrent Sum-Product Networks.
rho-POMDPs have Lipschitz-Continuous epsilon-Optimal Value Functions.
Causal Inference with Noisy and Missing Covariates via Matrix Factorization.
Maximizing Induced Cardinality Under a Determinantal Point Process.
Efficient Convex Completion of Coupled Tensors using Coupled Nuclear Norms.
Bayesian Adversarial Learning.
Differentially Private Testing of Identity and Closeness of Discrete Distributions.
Scaling Gaussian Process Regression with Derivatives.
Stochastic Nonparametric Event-Tensor Decomposition.
Scalable Hyperparameter Transfer Learning.
Diminishing Returns Shape Constraints for Interpretability and Regularization.
Generative Probabilistic Novelty Detection with Adversarial Autoencoders.
Efficient Gradient Computation for Structured Output Learning with Rational and Tropical Losses.
Extracting Relationships by Multi-Domain Matching.
M-Walk: Learning to Walk over Graphs using Monte Carlo Tree Search.
BRITS: Bidirectional Recurrent Imputation for Time Series.
Provable Gaussian Embedding with One Observation.
Banach Wasserstein GAN.
A Theory-Based Evaluation of Nearest Neighbor Models Put Into Practice.
Policy Regret in Repeated Games.
Large-Scale Stochastic Sampling from the Probability Simplex.
Heterogeneous Multi-output Gaussian Process Prediction.
On gradient regularizers for MMD GANs.
Model-based targeted dimensionality reduction for neuronal population data.
Representation Learning of Compositional Data.
Modeling Dynamic Missingness of Implicit Feedback for Recommendation.
Training Neural Networks Using Features Replay.
Query K-means Clustering and the Double Dixie Cup Problem.
CatBoost: unbiased boosting with categorical features.
Incorporating Context into Language Encoding Models for fMRI.
An Improved Analysis of Alternating Minimization for Structured Multi-Response Regression.
Contamination Attacks and Mitigation in Multi-Party Machine Learning.
Approximating Real-Time Recurrent Learning with Random Kronecker Factors.
Unsupervised Learning of Artistic Styles with Archetypal Style Analysis.
Neural Ordinary Differential Equations.
On Coresets for Logistic Regression.
Proximal SCOPE for Distributed Sparse Learning.
Lipschitz-Margin Training: Scalable Certification of Perturbation Invariance for Deep Neural Networks.
The Everlasting Database: Statistical Validity at a Fair Price.
On the Local Hessian in Back-propagation.
Compact Generalized Non-local Network.
Online Adaptive Methods, Universality and Acceleration.
Size-Noise Tradeoffs in Generative Networks.
Multi-View Silhouette and Depth Decomposition for High Resolution 3D Object Representation.
Learning to Teach with Dynamic Loss Functions.
Turbo Learning for CaptionBot and DrawingBot.
Learning Latent Subspaces in Variational Autoencoders.
L4: Practical loss-based stepsize adaptation for deep learning.
On Controllable Sparse Alternatives to Softmax.
Graph Convolutional Policy Network for Goal-Directed Molecular Graph Generation.
The Limits of Post-Selection Generalization.
Visualizing the Loss Landscape of Neural Nets.
Bayesian Distributed Stochastic Gradient Descent.
Efficient Formal Safety Analysis of Neural Networks.
A no-regret generalization of hierarchical softmax to extreme multi-label classification.
Distributed Learning without Distress: Privacy-Preserving Empirical Risk Minimization.
Sequential Test for the Lowest Mean: From Thompson to Murphy Sampling.
Deep Structured Prediction with Nonlinear Output Transformations.
Navigating with Graph Representations for Fast and Scalable Decoding of Neural Language Models.
Algebraic tests of general Gaussian latent tree models.
Exponentially Weighted Imitation Learning for Batched Historical Data.
Privacy Amplification by Subsampling: Tight Analyses via Couplings and Divergences.
Multi-domain Causal Structure Learning in Linear Systems.
Tangent: Automatic differentiation using source-code transformation for dynamically typed array programming.
SLANG: Fast Structured Covariance Approximations for Bayesian Deep Learning with Natural Gradient.
LF-Net: Learning Local Features from Images.
Learning towards Minimum Hyperspherical Energy.
Deep Neural Networks with Box Convolutions.
Sharp Bounds for Generalized Uniformity Testing.
The Cluster Description Problem - Complexity Results, Formulations and Approximations.
Transfer of Value Functions via Variational Methods.
ResNet with one-neuron hidden layers is a Universal Approximator.
Deep State Space Models for Unconditional Word Generation.
Predict Responsibly: Improving Fairness and Accuracy by Learning to Defer.
Online convex optimization for cumulative constraints.
Recurrent Transformer Networks for Semantic Correspondence.
Information Constraints on Auto-Encoding Variational Bayes.
Poison Frogs! Targeted Clean-Label Poisoning Attacks on Neural Networks.
MacNet: Transferring Knowledge from Machine Comprehension to Sequence-to-Sequence Models.
Variational Learning on Aggregate Outputs with Gaussian Processes.
Graphical Generative Adversarial Networks.
Learning to Infer Graphics Programs from Hand-Drawn Images.
Evolutionary Stochastic Gradient Descent for Optimization of Deep Neural Networks.
Approximation algorithms for stochastic clustering.
Dimensionally Tight Bounds for Second-Order Hamiltonian Monte Carlo.
Multi-Task Zipping via Layer-wise Neuron Sharing.
Dirichlet-based Gaussian Processes for Large-scale Calibrated Classification.
Stacked Semantics-Guided Attention Model for Fine-Grained Zero-Shot Learning.
Automating Bayesian optimization with Bayesian optimization.
The Convergence of Sparsified Gradient Methods.
Memory Replay GANs: Learning to Generate New Categories without Forgetting.
Constructing Fast Network through Deconstruction of Convolution.
Exact natural gradient in deep linear networks and its application to the nonlinear case.
Deep Generative Models for Distribution-Preserving Lossy Compression.
Binary Classification from Positive-Confidence Data.
Diverse Ensemble Evolution: Curriculum Data-Model Marriage.
Dual Swap Disentangling.
A Bayes-Sard Cubature Method.
Practical Deep Stereo (PDS): Toward applications-friendly deep stereo matching.
Differential Properties of Sinkhorn Approximation for Learning with Wasserstein Distance.
On GANs and GMMs.
Masking: A New Perspective of Noisy Supervision.
Gamma-Poisson Dynamic Matrix Factorization Embedded with Metadata Influence.
CapProNet: Deep Feature Learning via Orthogonal Projections onto Capsule Subspaces.
Neural Interaction Transparency (NIT): Disentangling Learned Interactions for Improved Interpretability.
Computing Kantorovich-Wasserstein Distances on d-dimensional histograms using (d+1)-partite graphs.
Loss Functions for Multiset Prediction.
Learning to Multitask.
Adversarially Robust Optimization with Gaussian Processes.
Mental Sampling in Multimodal Representations.
Variational Inference with Tail-adaptive f-Divergence.
Insights on representational similarity in neural networks with canonical correlation.
Critical initialisation for deep signal propagation in noisy rectifier neural networks.
Learning convex polytopes with margin.
Efficient inference for time-varying behavior during learning.
Unsupervised Video Object Segmentation for Deep Reinforcement Learning.
On Fast Leverage Score Sampling and Optimal Learning.
Bandit Learning in Concave N-Person Games.
Online Improper Learning with an Approximation Oracle.
Contextual Pricing for Lipschitz Buyers.
Learning Others' Intentional Models in Multi-Agent Settings Using Interactive POMDPs.
Fast Greedy MAP Inference for Determinantal Point Process to Improve Recommendation Diversity.
Manifold Structured Prediction.
Occam's razor is insufficient to infer the preferences of irrational agents.
How Much Restricted Isometry is Needed In Nonconvex Matrix Recovery?
Multimodal Generative Models for Scalable Weakly-Supervised Learning.
A Simple Proximal Stochastic Gradient Method for Nonsmooth Nonconvex Optimization.
Reparameterization Gradient for Non-differentiable Models.
To Trust Or Not To Trust A Classifier.
First-order Stochastic Algorithms for Escaping From Saddle Points in Almost Linear Time.
Middle-Out Decoding.
Inference Aided Reinforcement Learning for Incentive Mechanism Design in Crowdsourcing.
Low-rank Interaction with Sparse Additive Effects Model for Large Data Frames.
A Dual Framework for Low-rank Tensor Completion.
Community Exploration: From Offline Optimization to Online Learning.
Algorithmic Assurance: An Active Approach to Algorithmic Testing using Bayesian Optimisation.
Estimating Learnability in the Sublinear Data Regime.
Policy Optimization via Importance Sampling.
Differentially Private k-Means with Constant Multiplicative Error.
Learning Concave Conditional Likelihood Models for Improved Analysis of Tandem Mass Spectra.
The Spectrum of the Fisher Information Matrix of a Single-Hidden-Layer Neural Network.
Evolved Policy Gradients.
Fully Understanding The Hashing Trick.
Learning a latent manifold of odor representations from neural responses in piriform cortex.
Learning Task Specifications from Demonstrations.
Breaking the Curse of Horizon: Infinite-Horizon Off-Policy Estimation.
Hyperbolic Neural Networks.
Generalizing to Unseen Domains via Adversarial Data Augmentation.
Semi-supervised Deep Kernel Learning: Regression with Unlabeled Data by Minimizing Predictive Variance.
Sample Efficient Stochastic Gradient Iterative Hard Thresholding Method for Stochastic Sparse Linear Regression with Limited Attribute Observation.
Meta-Reinforcement Learning of Structured Exploration Strategies.
Task-Driven Convolutional Recurrent Models of the Visual System.
Experimental Design for Cost-Aware Learning of Causal Graphs.
Exploiting Numerical Sparsity for Efficient Learning: Faster Eigenvector Computation and Regression.
Horizon-Independent Minimax Linear Regression.
A Convex Duality Framework for GANs.
Multiple-Step Greedy Policies in Approximate and Online Reinforcement Learning.
Assessing Generative Models via Precision and Recall.
Contour location via entropy reduction leveraging multiple information sources.
Causal Inference and Mechanism Clustering of A Mixture of Additive Noise Models.
ChannelNets: Compact and Efficient Convolutional Neural Networks via Channel-Wise Convolutions.
Near-Optimal Time and Sample Complexities for Solving Markov Decision Processes with a Generative Model.
Why so gloomy? A Bayesian explanation of human pessimism bias in the multi-armed bandit task.
Link Prediction Based on Graph Neural Networks.
Dropping Symmetry for Fast Symmetric Nonnegative Matrix Factorization.
Scalable methods for 8-bit training of neural networks.
Learning in Games with Lossy Feedback.
GradiVeQ: Vector Quantization for Bandwidth-Efficient Gradient Aggregation in Distributed CNN Training.
Multi-armed Bandits with Compensation.
Content preserving text generation with attribute controls.
Unsupervised Adversarial Invariance.
Geometry-Aware Recurrent Neural Networks for Active Visual Recognition.
Power-law efficient neural codes provide general link between perceptual bias and discriminability.
Scalable Robust Matrix Factorization with Nonconvex Loss.
LAG: Lazily Aggregated Gradient for Communication-Efficient Distributed Learning.
Practical exact algorithm for trembling-hand equilibrium refinements in games.
Improving Exploration in Evolution Strategies for Deep Reinforcement Learning via a Population of Novelty-Seeking Agents.
Adversarially Robust Generalization Requires More Data.
Nonparametric Bayesian Lomax delegate racing for survival analysis with competing risks.
Supervising Unsupervised Learning.
Learning from Group Comparisons: Exploiting Higher Order Interactions.
Objective and efficient inference for couplings in neuronal networks.
Neural Edit Operations for Biological Sequences.
Hessian-based Analysis of Large Batch Training and Robustness to Adversaries.
Efficient Neural Network Robustness Certification with General Activation Functions.
Learning Confidence Sets using Support Vector Machines.
Bandit Learning with Positive Externalities.
Densely Connected Attention Propagation for Reading Comprehension.
On the Local Minima of the Empirical Risk.
Measures of distortion for machine learning.
Interpreting Neural Network Judgments via Minimal, Stable, and Symbolic Corrections.
Is Q-Learning Provably Efficient?
Adaptive Negative Curvature Descent with Applications in Non-convex Optimization.
Fairness Through Computationally-Bounded Awareness.
Porcupine Neural Networks: Approximating Neural Network Landscapes.
Information-based Adaptive Stimulus Selection to Optimize Communication Efficiency in Brain-Computer Interfaces.
Non-Ergodic Alternating Proximal Augmented Lagrangian Algorithms with Optimal Rates.
Hierarchical Graph Representation Learning with Differentiable Pooling.
A Unified View of Piecewise Linear Neural Network Verification.
Context-dependent upper-confidence bounds for directed exploration.
A Smoother Way to Train Structured Prediction Models.
Deep Reinforcement Learning in a Handful of Trials using Probabilistic Dynamics Models.
Fast greedy algorithms for dictionary selection with generalized sparsity constraints.
Recurrently Controlled Recurrent Networks.
Non-metric Similarity Graphs for Maximum Inner Product Search.
Negotiable Reinforcement Learning for Pareto Optimal Sequential Decision-Making.
A Mathematical Model For Optimal Decisions In A Representative Democracy.
Learning Bounds for Greedy Approximation with Explicit Feature Maps from Multiple Kernels.
Fast Rates of ERM and Stochastic Approximation: Adaptive to Error Bound Conditions.
Adversarial Text Generation via Feature-Mover's Distance.
Boolean Decision Rules via Column Generation.
On Learning Intrinsic Rewards for Policy Gradient Methods.
Spectral Filtering for General Linear Dynamical Systems.
PG-TS: Improved Thompson Sampling for Logistic Contextual Bandits.
Byzantine Stochastic Gradient Descent.
Learning filter widths of spectral decompositions with wavelets.
Active Matting.
Towards Robust Detection of Adversarial Examples.
Hunting for Discriminatory Proxies in Linear Regression Models.
Adaptive Sampling Towards Fast Graph Representation Learning.
MiME: Multilevel Medical Embedding of Electronic Health Records for Predictive Healthcare.
COLA: Decentralized Linear Learning.
Third-order Smoothness Helps: Faster Stochastic Optimization Algorithms for Finding Local Minima.
Explaining Deep Learning Models - A Bayesian Non-parametric Approach.
Lifelong Inverse Reinforcement Learning.
Expanding Holographic Embeddings for Knowledge Completion.
Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis.
Importance Weighting and Variational Inference.
Exponentiated Strongly Rayleigh Distributions.
Sparsified SGD with Memory.
End-to-end Symmetry Preserving Inter-atomic Potential Energy Model for Finite and Extended Systems.
Semi-Supervised Learning with Declaratively Specified Entropy Constraints.
Limited Memory Kelley's Method Converges for Composite Convex and Submodular Objectives.
Maximum Causal Tsallis Entropy Imitation Learning.
Amortized Inference Regularization.
Mallows Models for Top-k Lists.
The Physical Systems Behind Optimization Algorithms.
Mean-field theory of graph neural networks in graph partitioning.
Adding One Neuron Can Eliminate All Bad Local Minima.
Optimization of Smooth Functions with Noisy Observations: Local Minimax Rates.
Completing State Representations using Spectral Learning.
A Bridging Framework for Model Optimization and Deep Propagation.
Submodular Field Grammars: Representation, Inference, and Application to Image Parsing.
Differentially Private Contextual Linear Bandits.
SimplE Embedding for Link Prediction in Knowledge Graphs.
Binary Rating Estimation with Graph Side Information.
Can We Gain More from Orthogonality Regularizations in Training Deep Networks?
Inexact trust-region algorithms on Riemannian manifolds.
BML: A High-performance, Low-cost Gradient Synchronization Algorithm for DML Training.
Integrated accounts of behavioral and neuroimaging data using flexible recurrent neural network models.
Scalable Coordinated Exploration in Concurrent Reinforcement Learning.
Differentially Private Uniformly Most Powerful Tests for Binomial Data.
Bilevel Distance Metric Learning for Robust Image Recognition.
Regret Bounds for Robust Adaptive Control of the Linear Quadratic Regulator.
The Price of Privacy for Low-rank Factorization.
Flexible and accurate inference and learning for deep generative models.
An Information-Theoretic Analysis for Thompson Sampling with Many Actions.
Meta-Learning MCMC Proposals.
Differentially Private Robust Low-Rank Approximation.
Cooperative neural networks (CoNN): Exploiting prior independence structure for improved classification.
TETRIS: TilE-matching the TRemendous Irregular Sparsity.
Efficient Projection onto the Perfect Phylogeny Model.
Distributed Weight Consolidation: A Brain Segmentation Case Study.
Beauty-in-averageness and its contextual modulations: A Bayesian statistical account.
Neural Networks Trained to Solve Differential Equations Learn General Representations.
GumBolt: Extending Gumbel trick to Boltzmann priors.
KONG: Kernels for ordered-neighborhood graphs.
The streaming rollout of deep networks - towards fully model-parallel execution.
Probabilistic Neural Programmed Networks for Scene Generation.
Unsupervised Learning of Object Landmarks through Conditional Image Generation.
Heterogeneous Bitwidth Binarization in Convolutional Neural Networks.
Solving Non-smooth Constrained Programs with Lower Complexity than O(1/ε): A Primal-Dual Homotopy Smoothing Approach.
Early Stopping for Nonparametric Testing.
Deep Generative Markov State Models.
RetGK: Graph Kernels based on Return Probabilities of Random Walks.
Learning from discriminative feature feedback.
TopRank: A practical algorithm for online stochastic ranking.
Faster Neural Networks Straight from JPEG.
Stochastic Nested Variance Reduced Gradient Descent for Nonconvex Optimization.
Adversarial Examples that Fool both Computer Vision and Time-Limited Humans.
Direct Runge-Kutta Discretization Achieves Acceleration.
Faster Online Learning of Optimal Threshold for Consistent F-measure Optimization.
Learning sparse neural networks via sensitivity-driven regularization.
Bipartite Stochastic Block Models with Tiny Clusters.
Leveraging the Exact Likelihood of Deep Latent Variable Models.
Minimax Estimation of Neural Net Distance.
Lipschitz regularity of deep neural networks: analysis and efficient estimation.
Acceleration through Optimistic No-Regret Dynamics.
Data center cooling using model-predictive control.
Bayesian Inference of Temporal Task Specifications from Demonstrations.
Variational PDEs for Acceleration on Manifolds and Application to Diffeomorphisms.
Sublinear Time Low-Rank Approximation of Distance Matrices.
Direct Estimation of Differences in Causal Graphs.
Convergence of Cubic Regularization for Nonconvex Optimization under KL Property.
DeepProbLog: Neural Probabilistic Logic Programming.
Online Structured Laplace Approximations for Overcoming Catastrophic Forgetting.
Zeroth-Order Stochastic Variance Reduction for Nonconvex Optimization.
NEON2: Finding Local Minima via First-Order Oracles.
Inferring Networks From Random Walk-Based Node Similarities.
Unsupervised Attention-guided Image-to-Image Translation.
Towards Understanding Acceleration Tradeoff between Momentum and Asynchrony in Nonconvex Stochastic Optimization.
Equality of Opportunity in Classification: A Causal Approach.
A Bandit Approach to Sequential Experimental Design with False Discovery Control.
Optimal Subsampling with Influence Functions.
Adversarial Attacks on Stochastic Bandits.
Escaping Saddle Points in Constrained Optimization.
Modern Neural Networks Generalize on Small Data Sets.
BinGAN: Learning Compact Binary Descriptors with a Regularized GAN.
Tight Bounds for Collaborative PAC Learning via Multiplicative Weights.
Neural Code Comprehension: A Learnable Representation of Code Semantics.
Communication Efficient Parallel Algorithms for Optimization on Manifolds.
Learn What Not to Learn: Action Elimination with Deep Reinforcement Learning.
Multi-Layered Gradient Boosting Decision Trees.
Why Is My Classifier Discriminatory?
Multiplicative Weights Updates with Constant Step-Size in Graphical Constant-Sum Games.
Scaling the Poisson GLM to massive neural datasets through polynomial approximations.
Sequence-to-Segment Networks for Segment Detection.
Dimensionality Reduction for Stationary Time Series via Stochastic Nonconvex Optimization.
Infinite-Horizon Gaussian Processes.
Hybrid-MST: A Hybrid Active Sampling Strategy for Pairwise Preference Aggregation.
Latent Gaussian Activity Propagation: Using Smoothness and Structure to Separate and Localize Sounds in Large Noisy Environments.
Zeroth-order (Non)-Convex Stochastic Optimization via Conditional Gradient and Gradient Updates.
Derivative Estimation in Random Design.
Step Size Matters in Deep Learning.
Actor-Critic Policy Optimization in Partially Observable Multiagent Environments.
Nearly tight sample complexity bounds for learning mixtures of Gaussians via sample compression schemes.
Boosting Black Box Variational Inference.
Learning to Optimize Tensor Programs.
But How Does It Work in Theory? Linear SVM with Random Features.
Recurrent Relational Networks.
Stochastic Spectral and Conjugate Descent Methods.
Probabilistic Matrix Factorization for Automated Machine Learning.
Learning Gaussian Processes by Minimizing PAC-Bayesian Generalization Bounds.
Inequity aversion improves cooperation in intertemporal social dilemmas.
Speaker-Follower Models for Vision-and-Language Navigation.
Data-Efficient Hierarchical Reinforcement Learning.
Multivariate Convolutional Sparse Coding for Electromagnetic Brain Signals.
Deep, complex, invertible networks for inversion of transmission effects in multimode optical fibres.
Re-evaluating evaluation.
Training deep learning based denoisers without ground truth data.
Contextual Combinatorial Multi-armed Bandits with Volatile Arms and Submodular Reward.
Realistic Evaluation of Deep Semi-Supervised Learning Algorithms.
The committee machine: Computational to statistical gaps in learning a two-layers neural network.
Semi-crowdsourced Clustering with Deep Generative Models.
Single-Agent Policy Tree Search With Guarantees.
Parsimonious Bayesian deep networks.
Evidential Deep Learning to Quantify Classification Uncertainty.
Deep Reinforcement Learning of Marked Temporal Point Processes.
The Nearest Neighbor Information Estimator is Adaptively Near Minimax Rate-Optimal.
Learning latent variable structured prediction models with Gaussian perturbations.
Asymptotic optimality of adaptive importance sampling.
Global Convergence of Langevin Dynamics Based Algorithms for Nonconvex Optimization.
Q-learning with Nearest Neighbors.
Near-Optimal Policies for Dynamic Multinomial Logit Assortment Selection Models.
On Binary Classification in Extreme Regions.
From Stochastic Planning to Marginal MAP.
Faithful Inversion of Generative Models for Effective Amortized Inference.
Weakly Supervised Dense Event Captioning in Videos.
Constructing Deep Neural Networks by Bayesian Network Structure Learning.
On the Global Convergence of Gradient Descent for Over-parameterized Models using Optimal Transport.
NAIS-Net: Stable Deep Networks from Non-Autonomous Differential Equations.
Practical Methods for Graph Two-Sample Testing.
Optimistic optimization of a Brownian.
Near Optimal Exploration-Exploitation in Non-Communicating Markov Decision Processes.
When do random forests fail?
Fast Estimation of Causal Interactions using Wold Processes.
Optimization over Continuous and Multi-dimensional Decisions with Observational Data.
Norm-Ranging LSH for Maximum Inner Product Search.
Dialog-to-Action: Conversational Question Answering Over a Large-Scale Knowledge Base.
Playing hard exploration games by watching YouTube.
Differentially Private Bayesian Inference for Exponential Families.
Adaptation to Easy Data in Prediction with Limited Advice.
Stochastic Cubic Regularization for Fast Nonconvex Optimization.
Moonshine: Distilling with Cheap Convolutions.
Mirrored Langevin Dynamics.
Learning a High Fidelity Pose Invariant Model for High-resolution Face Frontalization.
Metric on Nonlinear Dynamical Systems with Perron-Frobenius Operators.
Delta-encoder: an effective sample synthesis method for few-shot object recognition.
Factored Bandits.
Gradient Descent Meets Shift-and-Invert Preconditioning for Eigenvector Computation.
Continuous-time Value Function Approximation in Reproducing Kernel Hilbert Spaces.
Unsupervised Learning of Shape and Pose with Differentiable Point Clouds.
Empirical Risk Minimization Under Fairness Constraints.
Demystifying excessively volatile human learning: A Bayesian persistent prior and a neural approximation.
Analytic solution and stationary phase approximation for the Bayesian lasso and elastic net.
Paraphrasing Complex Network: Network Compression via Factor Transfer.
Computing Higher Order Derivatives of Matrix and Tensor Expressions.
Optimal Algorithms for Non-Smooth Distributed Optimization in Networks.
Safe Active Learning for Time-Series Modeling with Gaussian Processes.
Processing of missing data by neural networks.
Learning Hierarchical Semantic Image Manipulation through Structured Representations.
Provable Variational Inference for Constrained Log-Submodular Models.
Minimax Statistical Learning with Wasserstein distances.
Natasha 2: Faster Non-Convex Optimization Than SGD.
Causal Discovery from Discrete Data using Hidden Compact Representation.
Out of the Box: Reasoning with Graph Convolution Nets for Factual Visual Question Answering.
Representation Balancing MDPs for Off-policy Policy Evaluation.
Representation Learning for Treatment Effect Estimation from Observational Data.
Contextual bandits with surrogate losses: Margin bounds and efficient algorithms.
Isolating Sources of Disentanglement in Variational Autoencoders.
Online Learning with an Unknown Fairness Metric.
A Unified Feature Disentangler for Multi-Domain Image Translation and Manipulation.
Answerer in Questioner's Mind: Information Theoretic Approach to Goal-Oriented Visual Dialog.
Structural Causal Bandits: Where to Intervene?
Batch-Instance Normalization for Adaptively Style-Invariant Neural Networks.
Tree-to-tree Neural Networks for Program Translation.
Active Learning for Non-Parametric Regression Using Purely Random Trees.
A Linear Speedup Analysis of Distributed Deep Learning with Sparse and Quantized Communication.
Model Agnostic Supervised Local Explanations.
Leveraged volume sampling for linear regression.
Verifiable Reinforcement Learning via Policy Extraction.
How Does Batch Normalization Help Optimization?
Wasserstein Variational Inference.
Ridge Regression and Provable Deterministic Ridge Leverage Score Sampling.
Recurrent World Models Facilitate Policy Evolution.
A theory on the absence of spurious solutions for nonconvex and nonsmooth optimization.
Query Complexity of Bayesian Private Learning.
Learning to Navigate in Cities Without a Map.
Modular Networks: Learning to Decompose Neural Computation.
Meta-Gradient Reinforcement Learning.
Gaussian Process Conditional Density Estimation.
Local Differential Privacy for Evolving Data.
MetaGAN: An Adversarial Approach to Few-Shot Learning.
Non-monotone Submodular Maximization in Exponentially Fewer Iterations.
Modelling sparsity, heterogeneity, reciprocity and community structure in temporal interaction data.
GIANT: Globally Improved Approximate Newton Method for Distributed Optimization.
Structured Local Minima in Sparse Blind Deconvolution.
Breaking the Span Assumption Yields Fast Finite-Sum Minimization.
Overfitting or perfect fitting? Risk bounds for classification and regression rules that interpolate.
Zero-Shot Transfer with Deictic Object-Oriented Representation in Reinforcement Learning.
Smoothed analysis of the low-rank approach for smooth semidefinite programs.
BourGAN: Generative Networks with Metric Embeddings.
Learning to Reconstruct Shapes from Unseen Classes.
A Practical Algorithm for Distributed Clustering and Outlier Detection.
Revisiting Decomposable Submodular Function Minimization with Incidence Relations.
A Smoothed Analysis of the Greedy Algorithm for the Linear Contextual Bandit Problem.
The Description Length of Deep Learning models.
Trajectory Convolution for Action Recognition.
Mixture Matrix Completion.
MULAN: A Blind and Off-Grid Method for Multichannel Echo Retrieval.
Dual Principal Component Pursuit: Improved Analysis and Efficient Algorithms.
Norm matters: efficient and accurate normalization schemes in deep networks.
DeepExposure: Learning to Expose Photos with Asynchronously Reinforced Adversarial Learning.
Algorithmic Linearly Constrained Gaussian Processes.
Overlapping Clustering Models, and One (class) SVM to Bind Them All.
Regularizing by the Variance of the Activations' Sample-Variances.
One-Shot Unsupervised Cross Domain Translation.
Automatic Program Synthesis of Long Programs with a Learned Garbage Collector.
SEGA: Variance Reduction via Gradient Sketching.
Nonparametric learning from Bayesian models with randomized objective functions.
Discovery of Latent 3D Keypoints via End-to-end Geometric Reasoning.
Sequential Context Encoding for Duplicate Removal.
Learning Optimal Reserve Price against Non-myopic Bidders.
Embedding Logical Queries on Knowledge Graphs.
Neural Architecture Search with Bayesian Optimisation and Optimal Transport.
Generalized Zero-Shot Learning with Deep Calibration Network.
SplineNets: Continuous Neural Decision Graphs.
Efficient Stochastic Gradient Hard Thresholding.
Bayesian Model Selection Approach to Boundary Detection with Non-Local Priors.
Universal Growth in Production Economies.
Pelee: A Real-Time Object Detection System on Mobile Devices.
Attention in Convolutional LSTM for Gesture Recognition.
Virtual Class Enhanced Discriminative Embedding Learning.
Deep Attentive Tracking via Reciprocative Learning.
Precision and Recall for Time Series.
Distributed Stochastic Optimization via Adaptive SGD.
Random Feature Stein Discrepancies.
3D-Aware Scene Manipulation via Inverse Graphics.
Partially-Supervised Image Captioning.
DVAE#: Discrete Variational Autoencoders with Relaxed Boltzmann Priors.
Symbolic Graph Reasoning Meets Convolutions.
High Dimensional Linear Regression using Lattice Basis Reduction.
Collaborative Learning for Deep Neural Networks.
Entropy and mutual information in models of deep neural networks.
Generating Informative and Diverse Conversational Responses via Adversarial Information Maximization.
Simple random search of static linear policies is competitive for reinforcement learning.
The Pessimistic Limits and Possibilities of Margin-based Losses in Semi-supervised Learning.
Temporal Regularization for Markov Decision Process.
Enhancing the Accuracy and Fairness of Human Decision Making.
Fighting Boredom in Recommender Systems with Linear Reinforcement Learning.
Genetic-Gated Networks for Deep Reinforcement Learning.
Neural Guided Constraint Logic Programming for Program Synthesis.
Learning to Exploit Stability for 3D Scene Parsing.
Distilled Wasserstein Learning for Word Embedding and Topic Modeling.
Video Prediction via Selective Sampling.
Foreground Clustering for Joint Segmentation and Localization in Videos and Images.
Bayesian Semi-supervised Learning with Graph Gaussian Processes.
Non-Local Recurrent Network for Image Restoration.
Relating Leverage Scores and Density using Regularized Christoffel Functions.
Neighbourhood Consensus Networks.
Conditional Adversarial Domain Adaptation.
DifNet: Semantic Segmentation by Diffusion Networks.
Accelerated Stochastic Matrix Inversion: General Theory and Speeding up BFGS Rules for Faster Second-Order Optimization.
Learning Versatile Filters for Efficient Convolutional Neural Networks.
Multivariate Time Series Imputation with Generative Adversarial Networks.
Multi-Class Learning: From Theory to Algorithm.
Parsimonious Quantile Regression of Financial Asset Tail Dynamics via Sequential Learning.
Bilinear Attention Networks.
Hybrid Knowledge Routed Modules for Large-scale Object Detection.
Overcoming Language Priors in Visual Question Answering with Adversarial Regularization.
Hybrid Retrieval-Generation Reinforced Agent for Medical Image Report Generation.
Stochastic Composite Mirror Descent: Optimal Bounds with High Probabilities.
Variational Memory Encoder-Decoder.
PacGAN: The power of two samples in generative adversarial networks.
A loss framework for calibrated anomaly detection.

Understanding the Role of Adaptivity in Machine Teaching: The Case of Version Space Learners.
Designing by Training: Acceleration Neural Network for Fast High-Dimensional Convolution.
Where Do You Think You're Going?: Inferring Beliefs about Dynamics from Behavior.
Generalizing Tree Probability Estimation via Bayesian Networks.
Gradient Descent for Spiking Neural Networks.
On Oracle-Efficient PAC RL with Rich Observations.
SLAYER: Spike Layer Error Reassignment in Time.
Geometry Based Data Generation.
Multitask Boosting for Survival Analysis with Competing Risks.
Regularization Learning Networks: Deep Learning for Tabular Datasets.
Joint Active Feature Acquisition and Classification with Variable-Size Set Encoding.
Found Graph Data and Planted Vertex Covers.
Generative Neural Machine Translation.
FRAGE: Frequency-Agnostic Word Representation.
Adaptive Online Learning in Dynamic Environments.
Revisiting Multi-Task Learning with ROCK: a Deep Residual Auxiliary Block for Visual Detection.
Gradient Sparsification for Communication-Efficient Distributed Optimization.
Image-to-image translation for cross-domain disentanglement.
Global Gated Mixture of Second-order Pooling for Improving Deep Convolutional Neural Networks.
Fairness Behind a Veil of Ignorance: A Welfare Analysis for Automated Decision Making.
Unsupervised Learning of View-invariant Action Representations.
The Lingering of Gradients: How to Reuse Gradients Over Time.
New Insight into Hybrid Stochastic Gradient Descent: Beyond With-Replacement Sampling and Convexity.
FD-GAN: Pose-guided Feature Distilling GAN for Robust Person Re-identification.
Alternating optimization of decision trees, with application to learning sparse oblique trees.
Toddler-Inspired Visual Object Learning.
Evolution-Guided Policy Gradient in Reinforcement Learning.
Adversarial vulnerability for any classifier.
Synthesize Policies for Transfer and Adaptation across Tasks and Environments.
How To Make the Gradients Small Stochastically: Even Faster Convex and Nonconvex SGD.
Video-to-Video Synthesis.
Global Geometry of Multichannel Sparse Blind Deconvolution on the Sphere.
Interactive Structure Learning with Structural Query-by-Committee.
A Game-Theoretic Approach to Recommendation Systems with Strategic Content Providers.
Efficient nonmyopic batch active search.
Neural Nearest Neighbors Networks.
ℓ1-regression with Heavy-tailed Distributions.
A Block Coordinate Ascent Algorithm for Mean-Variance Optimization.
Quadratic Decomposable Submodular Function Minimization.
Frequency-Domain Dynamic Pruning for Convolutional Neural Networks.
Neural-Symbolic VQA: Disentangling Reasoning from Vision and Language Understanding.
Domain-Invariant Projection Learning for Zero-Shot Recognition.
Boosted Sparse and Low-Rank Tensor Regression.
MetaReg: Towards Domain Generalization using Meta-Regularization.
Learning semantic similarity in a continuous space.
Low-shot Learning via Covariance-Preserving Adversarial Augmentation Networks.
Empirical Risk Minimization in Non-interactive Local Differential Privacy Revisited.
A Deep Bayesian Policy Reuse Approach Against Non-Stationary Agents.
A flexible model for training action localization with varying levels of supervision.
Posterior Concentration for Sparse Deep Learning.
DropMax: Adaptive Variational Softmax.
Uncertainty-Aware Attention for Reliable Interpretation and Prediction.
Reinforced Continual Learning.
On the Dimensionality of Word Embedding.
Discrimination-aware Channel Pruning for Deep Neural Networks.
Solving Large Sequential Games with the Excessive Gap Technique.
Generalizing Graph Matching beyond Quadratic Assignment Model.
Large Margin Deep Networks for Classification.
Connectionist Temporal Classification with Maximum Entropy Regularization.
PointCNN: Convolution On X-Transformed Points.
Informative Features for Model Comparison.
Greedy Hash: Towards Fast Optimization for Accurate Hash Coding in CNN.
Long short-term memory and Learning-to-learn in networks of spiking neurons.
KDGAN: Knowledge Distillation with Generative Adversarial Networks.
Visual Memory for Robust Path Following.
FishNet: A Versatile Backbone for Image, Region, and Pixel Level Prediction.
Deep Neural Nets with Interpolating Function as Output Activation.
Sparse Covariance Modeling in High Dimensions with Gaussian Processes.
Do Less, Get More: Streaming Submodular Maximization with Subsampling.
TADAM: Task dependent adaptive metric for improved few-shot learning.
Learning Disentangled Joint Continuous and Discrete Representations.
Are GANs Created Equal? A Large-Scale Study.
SPIDER: Near-Optimal Non-Convex Optimization via Stochastic Path-Integrated Differential Estimator.
Dialog-based Interactive Image Retrieval.
Quantifying Learning Guarantees for Convex but Inconsistent Surrogates.
A Neural Compositional Paradigm for Image Captioning.
On Learning Markov Chains.
Maximum-Entropy Fine Grained Classification.
Removing the Feature Correlation Effect of Multiplicative Noise.
A Unified Framework for Extensive-Form Game Abstraction with Bounds.
HitNet: Hybrid Ternary Recurrent Neural Network.
Explanations based on the Missing: Towards Contrastive Explanations with Pertinent Negatives.
Which Neural Net Architectures Give Rise to Exploding and Vanishing Gradients?
How to Start Training: The Effect of Initialization and Architecture.
LinkNet: Relational Embedding for Scene Graph.
Self-Erasing Network for Integral Object Attention.
Combinatorial Optimization with Graph Convolutional Networks and Guided Tree Search.
Multi-Task Learning as Multi-Objective Optimization.
Learning to Decompose and Disentangle Representations for Video Prediction.
Are ResNets Provably Better than Linear Predictors?
Nonlocal Neural Networks, Nonlocal Diffusion and Nonlocal Modeling.
Deep Functional Dictionaries: Learning Consistent Semantic Structures on 3D Models from Functions.
Soft-Gated Warping-GAN for Pose-Guided Person Image Synthesis.
A Model for Learned Bloom Filters and Optimizing by Sandwiching.
Training DNNs with Hybrid Block Floating Point.
Implicit Reparameterization Gradients.
Rest-Katyusha: Exploiting the Solution's Structure via Scheduled Restart Schemes.
Deep Defense: Training DNNs with Improved Adversarial Robustness.
(Probably) Concave Graph Matching.
Optimization for Approximate Submodularity.
Algorithmic Regularization in Learning Deep Homogeneous Models: Layers are Automatically Balanced.
How Many Samples are Needed to Estimate a Convolutional Neural Network?
Self-Supervised Generation of Spatial Audio for 360° Video.
A^2-Nets: Double Attention Networks.
On Misinformation Containment in Online Social Networks.
Image Inpainting via Generative Multi-column Convolutional Neural Networks.
MetaAnchor: Learning to Detect Objects with Customized Anchors.
Bayesian Pose Graph Optimization via Bingham Distributions and Tempered Geodesic MCMC.
Deep Non-Blind Deconvolution via Generalized Low-Rank Approximation.
Sigsoftmax: Reanalysis of the Softmax Bottleneck.
Chain of Reasoning for Visual Question Answering.
See and Think: Disentangling Semantic Scene Completion.
Snap ML: A Hierarchical Framework for Machine Learning.
Sparse DNNs with Improved Adversarial Robustness.
PAC-learning in the presence of adversaries.
An Efficient Pruning Algorithm for Robust Isotonic Regression.
Cooperative Holistic Scene Understanding: Unifying 3D Object, Layout, and Camera Pose Estimation.
Geometrically Coupled Monte Carlo Sampling.
Learning Deep Disentangled Embeddings With the F-Statistic Loss.
Fast Similarity Search via Optimal Sparse Lifting.
Joint Sub-bands Learning with Clique Structures for Wavelet Domain Super-Resolution.
Learning long-range spatial dependencies with horizontal gated recurrent units.
Learning Pipelines with Limited Data and Domain Knowledge: A Study in Parsing Physics Problems.
Understanding Weight Normalized Deep Neural Networks with Rectified Linear Units.
Visual Object Networks: Image Generation with Disentangled 3D Representations.
Supervised autoencoders: Improving generalization performance with unsupervised regularizers.
An Off-policy Policy Gradient Theorem Using Emphatic Weightings.
Generalized Inverse Optimization through Online Learning.
Adapted Deep Embeddings: A Synthesis of Methods for k-Shot Inductive Transfer Learning.
Doubly Robust Bayesian Inference for Non-Stationary Streaming Data with β-Divergences.
IntroVAE: Introspective Variational Autoencoders for Photographic Image Synthesis.
Text-Adaptive Generative Adversarial Networks: Manipulating Images with Natural Language.
HOGWILD!-Gibbs can be PanAccurate.
Kalman Normalization: Normalizing Internal Representations Across Network Layers.
Structure-Aware Convolutional Neural Networks.
Efficient Algorithms for Non-convex Isotonic Regression through Submodular Optimization.